— Ch. 1 · Biological Inspiration And Origins —
Convolutional neural network.
In the 1950s and 1960s, neuroscientists David Hubel and Torsten Wiesel studied the visual cortex of cats to understand how brains process vision. They discovered that individual neurons respond only to stimuli within a restricted region of the visual field known as the receptive field, and that neighboring cells have similar, overlapping receptive fields that together systematically cover the entire visual space. Their 1968 paper identified two basic cell types: simple cells, whose output is maximized by straight edges at specific orientations, and complex cells, which have larger receptive fields and are insensitive to the exact position of an edge. This biological model inspired Kunihiko Fukushima to introduce, in 1969, a multilayer visual feature detection network in which all units in a layer shared identical interconnecting coefficients across homogeneous arrangements. The weights were not trainable at that stage, but the architecture laid the essential groundwork for future convolutional networks.
Architectural Evolution And Milestones
The neocognitron, introduced in 1980, established two fundamental layer types: S-layers, which act as shared-weight receptive-field layers, and C-layers, which function as downsampling layers. In 1987, Toshiteru Homma, Les Atlas, and Robert Marks II presented a paper at the first Conference on Neural Information Processing Systems that replaced multiplication with convolution in time. Also in 1987, Alex Waibel and colleagues developed the Time Delay Neural Network for phoneme recognition, trained by gradient descent. In 1989, Yann LeCun and his team at AT&T Bell Laboratories demonstrated backpropagation applied to handwritten ZIP code recognition. LeNet-5, a pioneering seven-level network, emerged in 1995 to classify handwritten digits on checks digitized into 32×32-pixel images; it was integrated into NCR check-reading systems and deployed in American banks beginning in June 1996.

GPU implementations brought dramatic acceleration starting in 2004, when K. S. Oh and K. Jung showed that standard neural networks could run twenty times faster on graphics processors. Dan Ciresan trained deep feedforward networks on GPUs in 2010 and extended this to CNNs in 2011, achieving a sixtyfold speedup over CPU training. In 2012, AlexNet won the ImageNet Large Scale Visual Recognition Challenge, an early catalytic event for the AI boom.
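The S-layer/C-layer pairing above can be sketched in a few lines of NumPy. This is a minimal illustration, not code from any of the systems named: the function names and shapes are my own. The convolution shows how one shared kernel gives every output unit a local receptive field with identical weights (the S-layer idea), and the max-pooling step shows how downsampling makes the response tolerant to small shifts, much like complex cells' position insensitivity (the C-layer idea).

```python
import numpy as np

def conv2d_valid(image, kernel):
    # Shared-weight "S-layer" sketch: the same kernel slides over every
    # position, so each output unit sees only a local receptive field
    # and all units share identical weights.
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def downsample(feature_map, size=2):
    # "C-layer" sketch: max-pooling over small blocks keeps the response
    # while discarding the exact position of the detected feature.
    h, w = feature_map.shape
    h, w = h - h % size, w - w % size          # crop to a multiple of size
    fm = feature_map[:h, :w]
    return fm.reshape(h // size, size, w // size, size).max(axis=(1, 3))

# Demo: a vertical edge between columns 2 and 3 of a 6x6 image.
edge_image = np.zeros((6, 6))
edge_image[:, 3:] = 1.0
fmap = conv2d_valid(edge_image, np.array([[-1.0, 1.0]]))  # edge detector
pooled = downsample(fmap)  # edge still detected, position coarsened
```

In `fmap`, only the column straddling the edge responds; after pooling, that response survives in a coarser grid, illustrating why the cascade detects features while tolerating small translations.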