— CH. 1 · BIOLOGICAL FOUNDATIONS OF VISION —

Visual perception

  • Light enters the eye through the cornea and is focused by the lens onto the retina, a light-sensitive membrane that sits at the back of the eye in most vertebrates. There, specialized photoreceptive cells act as transducers, converting light into neural impulses. Photoreceptors fall into two broad classes: cone cells, which support photopic (daylight) vision, and rod cells, which support scotopic (low-light) vision. Their signals travel via the optic nerve to central ganglia in the brain. The main route runs through the lateral geniculate nucleus, which relays information to the visual cortex; some signals bypass this route and go directly to the superior colliculus.

  • Ancient Greek scholars proposed two competing accounts of vision. Emission theory, championed in Euclid's Optics and Ptolemy's Optics, held that rays emanate from the eyes and are intercepted by objects. Intromission theory, advocated by Aristotle in De Sensu, held that something representing the object enters the eye; it resonates with modern accounts but remained speculation without experimental foundation. Ibn al-Haytham (Alhazen) challenged the earlier theories in his Book of Optics, completed around 1021, and used systematic experimentation to demonstrate that vision occurs when reflected light rays enter the eye. Roger Bacon later adopted his methods, and Kepler built upon these findings. Leonardo da Vinci (1452-1519) recognized the distinct optical qualities of the human eye, noting that clear vision exists only along the line of sight that ends at the fovea. In the late 1600s, Isaac Newton isolated individual colors using a prism.

  • Hermann von Helmholtz coined the term unconscious inference in 1867 after examining the human eye. He concluded that the retinal image alone carries too little information to determine what is seen, so the visual system must fill the gap with assumptions drawn from prior experience. Typical assumptions are that light comes from above, that objects are not normally viewed from below, that faces are seen upright, that closer objects can block distant ones but not vice versa, and that figures tend to have convex borders. Visual illusions reveal the cases where these assumptions fail. More recently, Bayesian studies of perception have revived this probability-based view of inference, with models of motion perception, depth perception, and figure-ground perception. The wholly empirical theory of perception is a related approach that rationalizes perception without explicitly invoking Bayesian formalisms. Colin Murray Turbayne argued for a language model of vision as an alternative to geometric explanations, citing the sculptor Naum Gabo's statement that lines and shapes possess their own language.

  • The lateral geniculate nucleus sends signals to the primary visual cortex, also called the striate cortex. The extrastriate cortex receives information from the striate cortex and from other cortical structures. Recent descriptions divide this visual association cortex into a ventral and a dorsal pathway, a division known as the two streams hypothesis: the dorsal stream handles spatial awareness, and the ventral stream handles object recognition. Evidence from patients suggests that faces and objects are processed separately. Prosopagnosic patients show deficits in face processing while object processing is spared, whereas patient C.K. showed object agnosia with intact face recognition. Doris Tsao described brain regions for face recognition in macaque monkeys using fMRI. The inferotemporal (IT) cortex plays a key role in differentiating objects. MIT researchers found that subregions of IT cortex handle specific objects: shutting off neural activity in small areas caused animals to confuse particular pairs of objects. Certain patches are more involved in face recognition than in general object identification.

  • David Marr developed a multi-level theory of vision during the 1970s. He identified three levels of analysis: computational, algorithmic, and implementational. Tomaso Poggio embraced these levels to characterize vision computationally. Marr proposed that vision proceeds from a two-dimensional retinal array to a three-dimensional description of the world, in three stages. The first is a primal sketch built by extracting features such as edges and regions. The second, the 2.5D sketch, acknowledges textures, much as an artist shades areas of a drawing to suggest depth. The final stage constructs a continuous three-dimensional map. Critics noted that stereoscopic perception indicates that the perception of 3D shape precedes the perception of the depth of individual points, rather than relying on it. An alternative framework proposes encoding, selection, and decoding stages instead. Encoding samples visual inputs and represents them as neural activities in the retina. Selection shifts gaze so that signals at specific locations are processed further. Decoding infers or recognizes the selected input signals, for example as faces. Attentional constraints impose a dichotomy between the central and peripheral visual fields for recognition.

  • Studies of people whose sight returns after long blindness show that they cannot always recognize objects or faces. They distinguish color, motion, and simple geometric shapes better than complex forms. Blindness during childhood prevents some higher-level visual abilities from developing properly. The general belief was that the critical period lasted until age five or six. A 2007 study challenged this belief by showing that older patients could improve with years of exposure. Out Of Darkness documented rare cases of restored vision that reveal how the brain learns to see. Maturation of the visual system depends heavily on early sensory experience, and deprivation during the formative years significantly impairs later capabilities. Research continues into whether extended training can overcome these developmental gaps.
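
The probability-based inference mentioned in the unconscious inference section above can be made concrete with a small worked example. The sketch below is only an illustration under stated assumptions: the prior values, the two-hypothesis setup, and the names `P_LIGHT`, `P_SHAPE`, `top_is_bright`, and `shape_posterior` are all hypothetical and do not come from any particular published model. It considers a shaded disc whose top edge is bright. That image is equally consistent with a convex bump lit from above and a concave dent lit from below, so the "light comes from above" assumption is what tips the interpretation.

```python
from itertools import product

# Assumed prior beliefs. The numbers are illustrative, not measured values.
P_LIGHT = {"above": 0.9, "below": 0.1}     # "light typically comes from above"
P_SHAPE = {"convex": 0.5, "concave": 0.5}  # no preference between bump and dent


def top_is_bright(shape, light):
    """Toy deterministic shading model.

    A bump lit from above, or a dent lit from below, is bright along its top.
    """
    return (shape == "convex") == (light == "above")


def shape_posterior(observed_top_bright):
    """Apply Bayes' rule, summing over the unknown light direction."""
    weights = {shape: 0.0 for shape in P_SHAPE}
    for shape, light in product(P_SHAPE, P_LIGHT):
        # Likelihood is 1 when the toy model predicts the observation, else 0.
        if top_is_bright(shape, light) == observed_top_bright:
            weights[shape] += P_SHAPE[shape] * P_LIGHT[light]
    total = sum(weights.values())
    return {shape: weight / total for shape, weight in weights.items()}


if __name__ == "__main__":
    print(shape_posterior(True))   # bright on top: "convex" is favored
    print(shape_posterior(False))  # dark on top: "concave" is favored
```

With these assumed priors, the same ambiguous image is read as a bump roughly nine times out of ten, and flipping the image upside down flips the preferred interpretation. This is one way to picture how a habit about lighting, rather than the retinal data alone, can settle what is perceived.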

Common questions

How does light enter the eye and reach the brain?

Light enters the eye through the cornea and is focused by the lens onto the retina. Specialized photoreceptive cells convert this light into neural impulses that travel via the optic nerve to central ganglia in the brain.

Who rejected emission theory and proved vision occurs when reflected light rays enter the eye?

Ibn al-Haytham, also known as Alhazen, completed his Book of Optics around 1021 and challenged the earlier theories in it. Using systematic experimentation, he demonstrated that vision occurs when reflected light rays enter the eye.

What are the two main streams of visual processing described in the two streams hypothesis?

The two streams are the dorsal and ventral pathways. The lateral geniculate nucleus sends signals to the primary visual cortex, also called the striate cortex, which passes information on to the extrastriate cortex; it is this later processing that divides into the two pathways. The dorsal stream handles spatial awareness, while the ventral stream handles object recognition.

When did Hermann von Helmholtz coin unconscious inference and what conclusion did he draw about retinal data?

Hermann von Helmholtz coined the term unconscious inference in 1867 after examining the human eye. He concluded that retinal data alone are insufficient, so vision must rely on assumptions based on prior experience, such as expecting faces to appear upright.

How does David Marr's multi-level theory of vision describe the process from retina to world description?

David Marr developed a multi-level theory of vision during the 1970s, identifying three levels of analysis: computational, algorithmic, and implementational. Within it, vision proceeds from a two-dimensional retinal array to a three-dimensional description of the world in three stages: a primal sketch based on extracting features such as edges and regions, a second stage that acknowledges textures, and a final stage that constructs a continuous three-dimensional map.
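
To give a feel for what "feature extraction like edges" in the primal sketch stage might involve, here is a minimal sketch. It is not Marr's own algorithm: a simple brightness-difference test between horizontal neighbours stands in for the feature extraction step, and the function name `find_edges`, the sample `image`, and the threshold value are all hypothetical choices made for illustration.

```python
def find_edges(image, threshold):
    """Return (row, col) positions where brightness jumps to the right.

    `image` is a list of rows of brightness values, a stand-in for the
    two-dimensional retinal array. Only horizontal neighbours are compared,
    so this toy detector finds vertical edges only.
    """
    edges = []
    for r, row in enumerate(image):
        for c in range(len(row) - 1):
            if abs(row[c + 1] - row[c]) >= threshold:
                edges.append((r, c))
    return edges


# A dark region on the left and a bright region on the right.
image = [
    [10, 10, 10, 200, 200],
    [10, 10, 10, 200, 200],
    [10, 10, 200, 200, 200],
]

if __name__ == "__main__":
    print(find_edges(image, threshold=50))  # [(0, 2), (1, 2), (2, 1)]
```

The output is a short list of boundary locations rather than a full image, which is the sense in which an early stage can reduce raw brightness values to a sparse description that later stages build on.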