Training a Deep Neural Network 101

AI and Deep Learning are regularly mentioned in the popular press and increasingly cited in medical literature; this year alone there will be about 5,000 papers in Medline on neural networks or deep learning! In this post we'll build an intuition (without maths) for how Deep Neural Networks (DNNs) work, so we can better understand their benefits and limitations, particularly when it comes to healthcare.


Imagine being able to accurately predict who is at risk of chronic diseases early enough for effective interventions, or having X-rays read at an expert level within seconds of the image being taken, in any country in the world. These are exactly the kinds of tasks DNNs are good at, yet today we often can't solve them with machine learning because we lack the large volumes of labelled data required to train the machines.

Figure 1. A labelled image used to train neural networks.
An expert has outlined areas of cancer in a breast cancer histopathology slide.

 

An example of a labelled image is shown in Figure 1 (from a study on using DNNs in breast cancer detection [1]). The label, in this case, is the area on the slide with cancer cells, and it has been outlined by an expert. The outline is on a layer separate from the image so it can be made invisible during training.
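To make this concrete, here is a minimal sketch in Python (using NumPy) of how an image and its expert outline can be paired up for training. The tiny 4x4 'slide', its pixel values, and the mask are invented purely for illustration; real slides are huge images loaded from files.

```python
import numpy as np

# Invented example: a tiny 4x4 greyscale "slide" (0 = dark, 1 = bright).
# In practice this would be loaded from an image file, e.g. with Pillow.
slide = np.array([
    [0.1, 0.2, 0.8, 0.9],
    [0.1, 0.3, 0.9, 0.7],
    [0.2, 0.2, 0.3, 0.2],
    [0.1, 0.1, 0.2, 0.1],
])

# The expert's outline, kept on a separate layer: 1 = cancer, 0 = normal.
label_mask = np.array([
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
])

# During training the network only ever sees `slide` as input;
# `label_mask` is revealed afterwards to score the network's guess.
training_pair = (slide, label_mask)
```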


Figure 2. A simple example of a DNN for image recognition [2].

 

The DNN in Figure 2 has an input layer in yellow, 8 hidden layers (pink and green), and an output layer in red. Each layer is made up of 'nodes' (the circles). To train the DNN on breast cancer pathology, such as the image in Figure 1, we feed the original image, without the cancer labels, into the yellow layer on the left. The input triggers a cascade of signals from left to right, ending in an output at the red nodes, such as a list of locations on the image where cancer is present. The black lines connecting nodes across layers each carry a numerical 'weight' that determines the strength of the signal passed from one layer to the next.
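To give a feel for that cascade, here is a toy forward pass in Python/NumPy. The layer sizes, random weights, and activation functions are invented for the sketch; this is not the network in Figure 2, just the same idea in code.

```python
import numpy as np

def relu(x):
    """A common node activation: pass positive signals, block negative ones."""
    return np.maximum(0.0, x)

# A toy network: 16 inputs (a 4x4 image, flattened), two hidden layers,
# and 16 outputs (one "is this pixel cancer?" score per location).
layer_sizes = [16, 12, 12, 16]

rng = np.random.default_rng(0)
# The "black lines" between layers: one weight matrix per pair of layers.
weights = [rng.normal(0.0, 0.5, (n_in, n_out))
           for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

def forward(x, weights):
    """Cascade the signal left to right through every layer."""
    for w in weights[:-1]:
        x = relu(x @ w)                           # each weight scales a signal
    return 1 / (1 + np.exp(-(x @ weights[-1])))   # squash outputs into 0..1

scores = forward(rng.random(16), weights)   # untrained: close to random guesses
```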

When we start training, the output will be pretty much a random guess. We improve the network by making it learn from its mistakes. After the network generates an output, we reveal the labels (where the cancer actually is) and compare the DNN's guess to the labelled data. The further the guess is from the 'truth', the bigger the error score. The error is then propagated back through the network, from right to left. This 'backpropagation' updates the weights so that the next guess should be better. The bigger the error, the more the weights are shifted.
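Here is a minimal hand-rolled sketch of one such training loop for a one-hidden-layer network (NumPy again, with invented sizes and a made-up label; real frameworks like TensorFlow or PyTorch derive these weight updates automatically):

```python
import numpy as np

rng = np.random.default_rng(0)

# One invented training example: a flattened 4x4 "slide" and its true mask.
x = rng.random(16)
y_true = np.zeros(16)
y_true[2:4] = 1.0                      # 1 where the expert marked cancer

w1 = rng.normal(0.0, 0.5, (16, 12))    # input -> hidden weights
w2 = rng.normal(0.0, 0.5, (12, 16))    # hidden -> output weights
lr = 0.1                               # learning rate: how far errors shift weights

for step in range(100):
    # Forward pass: the signal cascades left to right.
    h = np.maximum(0.0, x @ w1)            # hidden layer (ReLU nodes)
    y_hat = 1 / (1 + np.exp(-(h @ w2)))    # output guess, squashed into 0..1

    # Reveal the labels and score the guess: bigger miss, bigger error.
    error = y_hat - y_true

    # Backpropagation: push the error right to left, nudging each weight
    # in proportion to its share of the blame.
    grad_w2 = np.outer(h, error)
    grad_h = (w2 @ error) * (h > 0)        # ReLU only passes blame where it fired
    grad_w1 = np.outer(x, grad_h)
    w2 -= lr * grad_w2
    w1 -= lr * grad_w1
```

The key line is `error = y_hat - y_true`: the size of the miss directly scales how far every upstream weight gets nudged.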

The network’s ‘knowledge’ of breast cancer pathology is represented by the numerical weights in the network. Explaining how a network came to an answer is very hard, because concepts that are meaningful to us, like the degree of cell necrosis, are spread across hundreds or thousands of weights. In fact, some networks have millions of connections.
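A quick back-of-the-envelope sketch (with invented layer sizes) shows how the connection count explodes: in a fully connected network, every node links to every node in the next layer, so the weights multiply up fast.

```python
# Invented layer sizes for a modest image network:
# a 224x224 greyscale input, three hidden layers, two output classes.
layer_sizes = [224 * 224, 512, 512, 256, 2]

# Between each pair of adjacent fully connected layers there are
# (nodes in) x (nodes out) weights.
total_weights = sum(n_in * n_out
                    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))

print(f"{total_weights:,} weights")   # about 26 million for this small example
```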

How do we choose the number of layers, the number of nodes in a layer, or the type of nodes? It's pretty much guesswork at the moment.
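One common, if unglamorous, answer is to systematise the guessing. The sketch below runs a random search over architectures; the ranges, options, and the placeholder evaluate() function are illustrative only, not a recipe.

```python
import random

random.seed(0)

def evaluate(depth, width, activation):
    """Placeholder: in reality, train a network with this shape and
    return its accuracy on held-out validation data."""
    return random.random()

best_score, best_config = -1.0, None
for trial in range(20):
    # Sample an architecture at random -- there is little theory to guide us.
    config = {
        "depth": random.randint(2, 10),                  # number of hidden layers
        "width": random.choice([64, 128, 256, 512]),     # nodes per layer
        "activation": random.choice(["relu", "tanh", "sigmoid"]),  # node type
    }
    score = evaluate(**config)
    if score > best_score:
        best_score, best_config = score, config

print("Best found so far:", best_config)
```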

Traditional studies in clinical medicine were considered pretty big if they enrolled over 200 patients. A ‘mega-trial’ is defined as having over 5,000 patients enrolled. Deep Learning requires serious amounts of data. For example, Microsoft used 1.2 million images to train a DNN to have better-than-human performance in image recognition [3].

And herein lies the problem for healthcare: we don't routinely capture structured data with labels at the level of detail required to train DNNs for many important problems. About 80% of the data in EMRs is unstructured free text, which is difficult to use for training machines. We have text reports for radiology and histopathology, but we don't have annotations on the images themselves. A lot of current work on predicting diseases uses ICD codes from billing data as the 'labels' for training networks, because we don't have reliable clinical outcome data.
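As a purely invented illustration of that workaround, here is how billing codes might be turned into crude disease labels; the patient records and the E11 (type 2 diabetes) prefix rule are made up for the sketch.

```python
# Invented billing records: patient ID -> ICD-10 codes claimed.
billing = {
    "patient_001": ["E11.9", "I10"],     # type 2 diabetes, hypertension
    "patient_002": ["I10"],              # hypertension only
    "patient_003": ["E11.65", "N18.3"],  # diabetes with hyperglycaemia, CKD
}

# Use "any E11.* code" as a stand-in label for diabetes -- a proxy for
# the clinical outcome we would rather have, but rarely do.
labels = {pid: int(any(code.startswith("E11") for code in codes))
          for pid, codes in billing.items()}

print(labels)  # {'patient_001': 1, 'patient_002': 0, 'patient_003': 1}
```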

Medical researchers are busy labelling data to be used for machine learning, and AI researchers are exploring ways to reduce the amount of data required for deep models. In the meantime, progress has been made in applying DNNs and AI in healthcare. Next time we’ll have a look at some examples in image detection, diagnostics, and disease prediction.

References:

[1] Cruz-Roa A, et al. Accurate and reproducible invasive breast cancer detection in whole-slide images: A Deep Learning approach for quantifying tumor extent. Sci Rep. 2017;7:46450.
[2] Asimov Institute. Neural Network Zoo. http://www.asimovinstitute.org/neural-network-zoo/
[3] Microsoft Machine Learning Blog. ImageNet deep neural network training using Microsoft R Server and Azure GPU VMs. http://blogs.technet.microsoft.com/machinelearning/2016/11/15/imagenet-deep-neural-network-training-using-microsoft-r-server-and-azure-gpu-vms/

Download our free whitepaper to find out how DNNs and deep learning can eliminate inaccurate patient records and enhance clinical and administrative workflows in your hospital.
