Many of the advancements that industries are seeing today are due to machine learning, the area of computer science that uses statistical techniques so that a computer can "teach" itself from data rather than being explicitly programmed. The field has grown so broad that it is now divided into many subsets, one of which is deep learning.
However, when most people hear the term deep learning, they bundle it up with every other computer science term, like artificial intelligence and machine learning. To create clarity on the topic and differentiate it from other related fields, here is a deep dive into deep learning.
The History Of Deep Learning
Deep learning only received its name in the mid-1980s, when Rina Dechter introduced the term to machine learning specialists. The concept itself, however, has been around for much longer. In 1965, Alexey Ivakhnenko and V.G. Lapa published the first functioning learning algorithm for multilayer perceptrons (a perceptron being a supervised learning algorithm for binary classification). Throughout the rest of the 1960s, 70s, and 80s, more deep learning architectures were developed and presented to the computer science community. As the science developed, it became increasingly applicable to repetitive manual tasks.
For example, in 1989 Yann LeCun and his research partners combined a deep neural network with the standard backpropagation algorithm to recognize handwritten ZIP codes on pieces of mail. Over the next few years, similar methods were used to recognize 2-D handwriting and 3-D objects. Then in the late 1990s, U.S. government funding allowed SRI International, a non-profit research organization, to study and find success with neural networks in speech processing and speaker recognition. It took several more years for computer scientists to publish a neural network that could handle speech recognition.
During the 2000s, the applications and potential of deep learning expanded, and around 2010 speech recognition and other deep learning applications came into regular use by businesses and individuals. Several developments made this transition possible. First, researchers extended deep learning beyond the basics and into large-vocabulary speech recognition. Second, hardware advanced significantly in the late 2000s. The most significant advance was the use of Nvidia's graphics processing units (GPUs), which by some estimates made deep learning systems roughly 100 times faster.
Because of these increases in speed and ability, deep learning was able to predict the biomolecular target of a drug in the "Merck Molecular Activity Challenge" in 2012. In 2014, scientists used deep learning to examine nutrients, drugs, and household products to find the toxic and off-target effects of the environmental chemicals they contain. The applications of deep learning go on and on, from cancer detection and driverless cars to the creation of image descriptions and voice control on smartphones, tablets, and TVs.
In short, deep learning has become a tool used across a wide range of industries and a technology that is affecting the lives of people around the world. Over the next few years, its use and integration into business and society is likely to increase.
What Deep Learning Is
At its core, deep learning is about giving computers the ability to learn the way humans do: by example. It is a machine learning technique that feeds massive sets of labeled data to a model built on a multi-layered neural network architecture, allowing the model to train itself to become better and better at a task.
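The idea of learning by example can be sketched in a few lines of Python. This is only an illustration, not a deep learning system: it uses a single perceptron (a far simpler model than a multi-layered network), and the data set, the starting weights, and the learning_rate value are all assumptions chosen for the example. The model is never told the rule for logical AND; it only sees labeled examples and adjusts itself after each mistake.

```python
# Labeled examples: each pair of inputs comes with the correct answer (logical AND).
examples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights = [0, 0]   # assumed starting point: the model knows nothing yet
bias = 0
learning_rate = 1  # assumed step size; an integer keeps the arithmetic exact


def predict(inputs):
    """Return 1 if the weighted sum of the inputs is positive, otherwise 0."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0


# Show the examples repeatedly and nudge the weights after every wrong answer.
for epoch in range(20):
    mistakes = 0
    for inputs, label in examples:
        error = label - predict(inputs)
        if error != 0:
            mistakes += 1
            for i, x in enumerate(inputs):
                weights[i] += learning_rate * error * x
            bias += learning_rate * error
    if mistakes == 0:
        break  # every example is now classified correctly

print([predict(inputs) for inputs, _ in examples])  # prints [0, 0, 0, 1]
```

Deep learning applies the same loop of predicting, comparing against the label, and adjusting, but with many more adjustable weights and far more data.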
An understanding of neural networks is also essential to fully grasping deep learning. Neural networks are computing systems that are based on the neural networks in animal brains. Just as a brain has neurons, an artificial neural network has artificial neurons, or nodes, that are connected to one another. And just as the neurons in the brain transmit messages to each other, an artificial neuron receives a signal, processes it, and transmits it to the other nodes that it is connected to. It is important to note that these artificial neurons are simply a software simulation: a group of mathematical equations that link algebraic variables together in a way that is meant to replicate how the brain makes decisions, identifies patterns, and learns both information and how to complete tasks.
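As a rough sketch of what one of these equations can look like, the Python snippet below models a single artificial neuron as a weighted sum followed by a sigmoid activation. The sigmoid is one common choice among several, and the input values, weights, and bias are made-up numbers used only for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron(inputs, weights, bias):
    """Receive signals, process them (weighted sum plus bias), and emit one output signal."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(total)


# Hypothetical values: three incoming signals and the strength of each connection.
signal = neuron(inputs=[0.5, -1.0, 2.0], weights=[0.4, 0.3, -0.2], bias=0.1)
print(signal)  # a value between 0 and 1 that would be passed on to connected nodes
```

In a full network, the output of this function would become one of the inputs to every neuron it is connected to in the next layer.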
To clarify, deep learning and neural networks are not the same thing. Most neural networks are not very big; they have only a handful of hidden layers of artificial neurons. Deep learning uses a neural network with many hidden layers, in some cases as many as 150. This deep neural network is then fed large data sets that are labeled. For example, the data set could be tens of thousands of pictures of windows, each labeled as 'window'. All the computer scientist has to do is provide the computer with the data. The scientist does not need to manually input the features that a window should have (e.g. 'rectangular', 'on a house', etc.). The computer then learns what a window looks like by analyzing the pictures that are labeled as 'window'.
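To make the idea of hidden layers concrete, here is a minimal Python sketch of data passing through a stack of layers. It is not a trained model: the weights are random, the layer sizes are arbitrary, the tanh activation is just one common choice, and the helper names dense_layer and make_layer are invented for this example. A real deep network would have far more layers and neurons, and its weights would be learned from labeled data.

```python
import math
import random


def dense_layer(inputs, weights, biases):
    """One layer: each neuron takes a weighted sum of all inputs, then applies tanh."""
    return [
        math.tanh(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]


def make_layer(n_inputs, n_neurons, rng):
    """Build random, untrained weights; training would replace these with learned values."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_neurons)]
    biases = [0.0] * n_neurons
    return weights, biases


rng = random.Random(0)
layer_sizes = [4, 8, 8, 8, 1]  # 4 inputs, three hidden layers of 8 neurons, 1 output
layers = [make_layer(n_in, n_out, rng) for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]

activations = [0.2, -0.5, 0.9, 0.1]  # stand-in for the pixel values of an image
for weights, biases in layers:
    # The output of each layer becomes the input of the next one.
    activations = dense_layer(activations, weights, biases)

print(len(layers) - 1, "hidden layers; output:", activations)
```

Making the network deeper in this sketch only requires adding more entries to layer_sizes.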
A convolutional neural network (CNN) is one of the most common neural networks used in deep learning. This type of neural network specializes in 2D information, such as handwriting and images. The network operates in layers. As a piece of data passes through the sequence of layers in the neural network, the artificial neurons extract more and more complex aspects of it. For example, if the piece of data is an image of a window, the first layer of the neural network would detect the edges in the picture, a layer in the middle might detect different tones of color in the image, and the last layer could detect reflections that are captured within the window panes.
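The edge detection performed by an early layer can be illustrated with a small Python sketch. The convolve2d helper, the tiny image, and the kernel below are assumptions made for this example. In a real CNN the kernel values are learned during training rather than set by hand, and, as in most deep learning libraries, the operation shown is technically a cross-correlation (the kernel is not flipped).

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum the products."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(k_rows)
                for j in range(k_cols)
            )
            for c in range(out_cols)
        ]
        for r in range(out_rows)
    ]


# A 4x6 grayscale image: dark (0) on the left, bright (1) on the right.
image = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
]

# A hand-set kernel that responds to dark-to-bright vertical edges.
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = convolve2d(image, kernel)
print(feature_map)  # prints [[0, 3, 3, 0], [0, 3, 3, 0]]
```

The large values in the middle of the output mark where the dark region meets the bright one, while the zeros correspond to flat regions with no edge.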
Why Deep Learning Is Popular Now
At first glance, many would assume that the neural networks deep learning relies on were developed only recently, which would explain why deep learning is only now being harnessed. This is not the case. Neural networks have been around for decades.
The reasons for their recent surge in popularity are twofold. First, deep learning requires more than just a deep neural network; it also requires vast amounts of labeled data. Driverless cars, for instance, could only be trained effectively once they were fed tens of thousands of hours of driving video and tens of millions of road images. Second, for deep learning to be valuable, it needed to be fast. That became possible only with greater computing power, which arrived when Nvidia released high-performance GPUs. As a result, instead of taking weeks or months, a deep neural network can in some cases be trained in a few hours or even minutes.
Deep Learning Vs. Machine Learning
Deep learning is a subset of machine learning, which is a subset of artificial intelligence, which is in turn a subset of computer science. The comparison that matters in practice is therefore between deep learning and traditional (non-deep) machine learning. While the two are closely related, there are two key differences.
The first is that traditional machine learning requires more human intervention and involvement than deep learning. In a traditional machine learning workflow, the relevant features must first be extracted from the images by hand. A model is then created from these features, allowing the computer to categorize images. In deep learning, all of this is automated. As long as the images are labeled, the artificial neural network extracts the images' features on its own. Deep learning is end-to-end: all the computer scientist has to do is provide the data and specify the task to be learned. It requires virtually no manual feature extraction, which is not the case for traditional machine learning.
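The difference in what each approach is given as input can be sketched in Python. Both functions below are hypothetical illustrations, and the two measurements in handcrafted_features (aspect ratio and mean brightness) are arbitrary examples of features a person might choose. Neither function is a model; they only show the input each kind of model would receive.

```python
def handcrafted_features(image):
    """Traditional approach: a person decides in advance which measurements matter."""
    height, width = len(image), len(image[0])
    pixels = [p for row in image for p in row]
    return {
        "aspect_ratio": width / height,
        "mean_brightness": sum(pixels) / len(pixels),
    }


def raw_pixels(image):
    """Deep learning approach: pass every pixel through and let the network find features."""
    return [p for row in image for p in row]


# A tiny made-up grayscale image (0.0 is black, 1.0 is white).
image = [
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
]

print(handcrafted_features(image))  # prints {'aspect_ratio': 2.0, 'mean_brightness': 0.5}
print(raw_pixels(image))            # prints [0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0]
```

In the first case the quality of the model depends on how well the features were chosen; in the second, the network must work out useful features for itself, which is part of why it needs so much more data.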
The second difference is potential performance. Traditional machine learning tends to hit a cap on how well it can perform. Because it relies on shallower learning methods, its performance tends to plateau even when more training data is added. Deep learning is far less prone to this. As more data is processed, deep networks scale and their accuracy generally continues to improve.
However, this does not mean that deep learning is always the optimal choice. The benefit of traditional machine learning is that it offers a wider range of techniques, which can be matched to the computing power available, the amount of data being processed, and the task being learned. Deep learning typically requires a very powerful GPU and vast amounts of data. If these are not available, traditional machine learning is the better choice. If they are available, though, deep learning will likely offer better results.