— Ch. 1 · Foundations And Definitions —
Deep learning.
In 1986, Rina Dechter introduced the term deep learning to the machine learning community. The field focuses on multilayered neural networks that perform tasks such as classification and regression. These systems stack artificial neurons into layers that transform data hierarchically; the adjective deep refers to the use of multiple layers, ranging from three to several hundred or even thousands. Training methods can be supervised, semi-supervised, or unsupervised. A deep learning model learns on its own which features to place at which level. Earlier techniques often relied on hand-crafted feature engineering to transform the data into a form suitable for a classification algorithm; in the deep learning approach, features are not hand-crafted, and the model discovers useful representations automatically. This does not eliminate the need for hand-tuning: for example, varying the number of layers and the layer sizes can provide different degrees of abstraction.
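The idea of stacking layers that transform data hierarchically can be sketched with a minimal forward pass. This is an illustrative example only, not drawn from the original text; the layer sizes and random weights are arbitrary assumptions:

```python
import numpy as np

def forward(x, layers):
    # Each layer applies an affine transform followed by a nonlinearity,
    # so the representation of x is transformed hierarchically.
    for W, b in layers:
        x = np.maximum(0.0, x @ W + b)  # ReLU nonlinearity
    return x

rng = np.random.default_rng(0)
# A small three-layer stack mapping 4 -> 8 -> 8 -> 2 features (arbitrary sizes).
dims = [4, 8, 8, 2]
layers = [(rng.normal(size=(m, n)), np.zeros(n)) for m, n in zip(dims, dims[1:])]

batch = rng.normal(size=(5, 4))   # five input examples with 4 features each
out = forward(batch, layers)
print(out.shape)                  # (5, 2)
```

In a real system the weights would be learned from data (e.g. by backpropagation) rather than drawn at random; the sketch only shows how stacked layers re-represent the input step by step.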
Historical Evolution Timeline
The first working deep learning algorithm was the Group method of data handling, published by Alexey Ivakhnenko and Lapa in 1965. They regarded it as a form of polynomial regression generalizing Rosenblatt's perceptron to handle more complex relationships. A 1971 paper described a deep network with eight layers trained by this method, based on layer-by-layer training through regression analysis. In 1972, Shun'ichi Amari made an early recurrent architecture adaptive; his learning recurrent neural network was republished by John Hopfield in 1982. Kunihiko Fukushima introduced the ReLU (rectified linear unit) activation function in 1969; it went on to become the most popular activation function for deep learning. Fukushima's Neocognitron, introduced in 1979, began the line of convolutional neural networks, though it was not trained by backpropagation. The terminology of backpropagation was introduced in 1962 by Rosenblatt, but he did not know how to implement it. The modern form was first published in Seppo Linnainmaa's 1970 master's thesis. Paul Werbos applied backpropagation to neural networks in 1982, while David E. Rumelhart et al. popularized it in 1986 without citing the original work.
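The ReLU mentioned above is a very simple function, max(0, x): it passes positive inputs through unchanged and clips negative inputs to zero. A minimal illustration:

```python
def relu(x):
    # ReLU: identity for positive inputs, zero for everything else.
    return x if x > 0 else 0.0

print([relu(v) for v in (-2.0, 0.0, 3.5)])  # [0.0, 0.0, 3.5]
```

Its simplicity (cheap to compute, with a gradient of exactly 1 on the positive side) is a large part of why it displaced smoother activations such as the sigmoid in deep networks.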