The Constructive Nature of Visual Processing
Visual Perception Is a Constructive Process
Visual Perception Is Mediated by the Geniculostriate Pathway
Form, Color, Motion, and Depth Are Processed in Discrete Areas of the Cerebral Cortex
The Visual Cortex Is Organized into Columns of Specialized Neurons
Intrinsic Cortical Circuits Transform Neural Information
Visual Information Is Represented by a Variety of Neural Codes
We are so familiar with seeing, that it takes a leap of imagination to realize that there are problems to be solved. But consider it. We are given tiny distorted upside-down images in the eyes and we see separate solid objects in surrounding space. From the patterns of stimulation on the retina we perceive the world of objects and this is nothing short of a miracle.
—Richard L. Gregory, Eye and Brain, 1966
Most of our impressions of the world and our memories of it are based on sight. Yet the mechanisms that underlie vision are not at all obvious. How do we perceive form and movement? How do we distinguish colors? Identifying objects in complex visual environments is an extraordinary computational achievement that artificial vision systems have yet to duplicate. Vision is used not only for object recognition but also for guiding our movements, and these separate functions are mediated by at least two parallel and interacting pathways.
The existence of parallel pathways in the visual system raises one of the central questions of cognition, the binding problem. How are different types of information carried by discrete pathways brought together into a coherent visual image?
Visual Perception Is a Constructive Process
Vision is often incorrectly compared to the operation of a camera. Unlike a camera, however, the visual system is able to create a three-dimensional representation of the world from the two-dimensional images on the retina. In addition, an object is perceived as the same under strikingly different visual conditions.
A camera reproduces point-by-point the light intensities in one plane of the visual field. The brain, in contrast, parses scenes into distinct components, separating foreground from background, to determine which light stimuli belong to one object and which to others. In doing so it uses previously learned rules about the structure of the world. In analyzing the incoming stream of visual signals the brain guesses at the scene presented to the eyes based on past experience.
This constructive nature of visual perception has only recently been fully appreciated. Earlier thinking about sensory perception was greatly influenced by the British empiricist philosophers, notably John Locke, David Hume, and George Berkeley, who thought of perception as an atomistic process in which simple sensory elements, such as color, shape, and brightness, were assembled in an additive way, component by component. The modern view that perception is an active and creative process that involves more than just the information provided to the retina has its roots in the philosophy of Immanuel Kant and was developed in detail in the early 20th century by the German psychologists Max Wertheimer, Kurt Koffka, and Wolfgang Köhler, who founded the school of Gestalt psychology.
The German term Gestalt means configuration or form. The central idea of the Gestalt psychologists is that what we see about a stimulus—the perceptual interpretation we make of any visual object—depends not just on the properties of the stimulus but also on its context, on other features in the visual field. The Gestalt psychologists argued that the visual system processes sensory information about the shape, color, distance, and movement of objects according to computational rules inherent in the system. The brain has a way of looking at the world, a set of expectations that derives in part from experience and in part from built-in neural wiring.
Max Wertheimer wrote: “There are entities where the behavior of the whole cannot be derived from its individual elements nor from the way these elements fit together; rather the opposite is true: the properties of any of the parts are determined by the intrinsic structural laws of the whole.” In the early part of the 20th century the Gestalt psychologists worked out the laws of perception that determine how we see, including similarity, proximity, and good continuation.
We see a uniform six-by-six array of dots as either rows or columns because of the brain’s tendency to impose a pattern. Thus, if the dots in each row are similar, we are more likely to see a pattern of alternating rows (Figure 25–1A). If the dots in each column are closer together than those in the rows, we are more disposed to see a pattern of columns (Figure 25–1B). The principle of good continuation is an important basis for linking line elements into unified shapes (Figure 25–1C). It is also seen in the phenomenon of contour saliency, whereby smooth contours tend to pop out from complex backgrounds (Figure 25–1D).
Figure 25-1 Organizational rules of visual perception. To link the elements of a visual scene into unified percepts, the visual system relies on organizational rules such as similarity, proximity, and good continuation.
A. The dots in each row have the same color, and thus, an overall pattern of alternating blue and white rows is perceived.
B. The dots in the columns are closer together than those in the rows, leading to the perception of columns.
C. Line segments are perceptually linked when they are collinear. In the top set of lines, one is more likely to see line segment a as belonging with c rather than d. In the bottom set, a and c are perceptually linked because they maintain the same curvature, whereas a and b appear to be discontinuous.
D. The principle of good continuation is also seen in contour saliency. On the right a smooth contour of line elements pops out from the background, whereas the jagged contour on the left is lost in the background. (Adapted, with permission, from Field, Hayes, and Hess 1993.)
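The proximity rule illustrated in panel B can be made concrete with a toy computation. The sketch below is not a model taken from the text: it assumes, purely for illustration, that each dot is grouped with its nearest neighbor, and the names `make_grid` and `perceived_grouping` are hypothetical.

```python
import math


def make_grid(n, dx, dy):
    """Return an n-by-n array of dots.

    dx is the spacing between neighboring dots within a row (horizontal);
    dy is the spacing between neighboring dots within a column (vertical).
    """
    return [(c * dx, r * dy) for r in range(n) for c in range(n)]


def perceived_grouping(dots):
    """Toy proximity heuristic (an assumption, not a physiological model).

    Each dot votes for the direction of its nearest neighbor(s):
    a nearest neighbor in the same row votes for "rows",
    one in the same column votes for "columns".
    """
    votes = {"rows": 0, "columns": 0}
    for i, (x, y) in enumerate(dots):
        others = [d for j, d in enumerate(dots) if j != i]
        nearest = min(math.dist((x, y), d) for d in others)
        for nx, ny in others:
            if math.isclose(math.dist((x, y), (nx, ny)), nearest):
                if ny == y:
                    votes["rows"] += 1
                elif nx == x:
                    votes["columns"] += 1
    if votes["rows"] > votes["columns"]:
        return "rows"
    if votes["columns"] > votes["rows"]:
        return "columns"
    return "ambiguous"


if __name__ == "__main__":
    # Dots closer together within each column than within each row.
    print(perceived_grouping(make_grid(6, dx=1.0, dy=0.5)))
    # Uniform spacing: proximity alone does not favor either reading.
    print(perceived_grouping(make_grid(6, dx=1.0, dy=1.0)))
```

With uniform spacing the heuristic returns "ambiguous", which loosely mirrors the observation that a uniform array can be seen as either rows or columns until some other cue, such as similarity, tips the balance.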
An important step in object recognition is separating figure from background. At different moments the same elements in the visual field can be organized into a recognizable figure or serve as part of the background for other figures (Figure 25–2). Segmentation relies not only on certain geometric principles, but also on cognitive influences such as attention and expectation. Thus a priming stimulus or an internal representation of object shape can facilitate the association of visual elements into a unified percept (Figure 25–3).
Figure 25-2 Object recognition depends on the separation of foreground and background in a scene. Recognition of the white salamanders in this image depends on the brain’s segmentation of the image, situating the white salamanders in the foreground and the brown and black salamanders in the background. The image also illustrates the role of higher influences in segmentation: One can consciously select any of the three colors as the foreground. (Reproduced, with permission, from M.C. Escher’s “Symmetry Drawing E56” © 2010 The M.C. Escher Company-Holland. All rights reserved. www.mcescher.com.)
Figure 25-3 Expectation and perceptual task play a critical role in what is seen. It is difficult to segment the dark and white patches in this figure into foreground and background without additional information. After viewing the priming image on page 561, this figure immediately becomes recognizable. In this example higher-order representations of shape guide lower-order processes of segmentation. (Reproduced, with permission, from Porter 1954.)
The brain analyzes a visual scene at three levels: low, intermediate, and high (Figure 25–4). At the lowest level, which we consider in the next chapter, visual attributes such as local contrast, orientation, color, and movement are discriminated. The intermediate level involves analysis of the layout of scenes and of surface properties, parsing the visual image into surfaces and global contours, and distinguishing foreground from background (see Chapter 27). The highest level involves object recognition (see Chapter 28). Once a scene has been parsed by the brain and objects recognized, the objects can be matched with memories of shapes and their associated meanings. Vision also has an important role in guiding body movement, particularly hand movement (see Chapter 29).
Figure 25-4 A visual scene is analyzed at three levels. First, simple attributes of the visual environment are analyzed (low-level processing). These low-level features are used to parse the visual scene (intermediate-level processing): Local visual features are assembled into surfaces, objects are segregated from background (surface segmentation), local orientation is integrated into global contours (contour integration), and surface shape is identified from shading and kinematic cues. Finally, surfaces and contours are used to identify the object (high-level processing). (Images of horses reproduced, with permission, from Pintos, © Bev Doolittle, courtesy of The Greenwich Workshop, Inc., www.greenwichworkshop.com.)
In vision as in other cognitive operations, various features—motion, depth, form, and color—occur together in a unified percept. This unity is achieved not by one hierarchical neural system but by multiple areas in the brain that are fed by at least two major interacting neural pathways. Because distributed processing is one of the main organizational principles in the neurobiology of vision, one must have a grasp of the anatomical pathways of the visual system to understand fully the physiological description of visual processing in later chapters.
In this chapter we lay the foundation for understanding the neural circuitry and organizational principles of the visual pathways. These principles apply quite broadly and are relevant not only for the multiple areas of the brain concerned with vision but also for other types of information processing by the brain.
Visual Perception Is Mediated by the Geniculostriate Pathway
Visual processing begins in the two retinae (see Chapter 26). The axons of the retinal ganglion cells, the projection neurons of the retina, form the optic nerve that extends to a midline crossing point, the optic chiasm. Beyond the chiasm, fibers from the temporal hemiretinas proceed to the ipsilateral hemisphere; fibers from the nasal hemiretinas cross to the contralateral hemisphere (Figure 25–5). Because the temporal hemiretina of one eye sees the same half of the visual field (hemifield) as the nasal hemiretina of the other, the partial decussation of fibers at the chiasm ensures that all the information about each hemifield is processed in the visual cortex of the contralateral hemisphere.
Figure 25-5 Representation of the visual field along the visual pathway. Each eye sees most of the visual field, with the exception of a portion of the peripheral visual field known as the monocular crescent. The axons of retinal neurons (ganglion cells) carry information from each visual hemifield along the optic nerve up to the optic chiasm, where fibers from the nasal hemiretina cross to the opposite hemisphere. Fibers from the temporal hemiretina stay on the same side, joining the fibers from the nasal hemiretina of the contralateral eye to form the optic tract. The optic tract carries information from the opposite visual hemifield originating in both eyes and projects into the lateral geniculate nucleus. Cells in this nucleus send their axons along the optic radiation to the primary visual cortex.
Lesions along the visual pathway produce specific visual field deficits, as shown on the right:
1. A lesion of an optic nerve causes a total loss of vision in one eye.
2. A lesion of the optic chiasm causes a loss of vision in the temporal half of each eye's visual field (bitemporal hemianopsia).
3. A lesion of the optic tract causes a loss of vision in the contralateral half of the visual field in both eyes (contralateral hemianopsia).
4. A lesion of the optic radiation fibers that curve into the temporal lobe (Meyer’s loop) causes loss of vision in the upper quadrant of the contralateral visual hemifield in both eyes (upper contralateral quadrantic anopsia).
5,6. Partial lesions of the visual cortex lead to deficits in portions of the contralateral visual hemifield. For example, a lesion in the upper bank of the calcarine sulcus (5) causes a partial deficit in the inferior quadrant, while a lesion in the lower bank (6) causes a partial deficit in the superior quadrant. The central area of the visual field tends to be unaffected by cortical lesions because of the extent of the representation of the fovea and the duplicate representation of the vertical meridian in the hemispheres.
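The routing of fibers at the chiasm, and the first three deficits in the list above, can be worked through in a few lines of Python. This is a simplified sketch rather than an anatomical model: it assumes each eye's visual field is split into just two hemifields, and that a chiasm lesion cuts only the crossing (nasal) fibers. The function names are hypothetical.

```python
from itertools import product

EYES = ("left", "right")
HEMIRETINAS = ("nasal", "temporal")


def opposite(side):
    return "right" if side == "left" else "left"


def hemifield_seen(eye, hemiretina):
    """Hemifield imaged on a given hemiretina.

    The nasal hemiretina views the hemifield on the same side as the eye
    (the temporal field); the temporal hemiretina views the other side.
    """
    return eye if hemiretina == "nasal" else opposite(eye)


def target_hemisphere(eye, hemiretina):
    """Nasal fibers cross at the chiasm; temporal fibers stay ipsilateral."""
    return opposite(eye) if hemiretina == "nasal" else eye


def field_loss(lesion, side=None):
    """Return the set of (eye, hemifield) pairs lost after a lesion.

    lesion: "optic_nerve", "optic_chiasm", or "optic_tract".
    side:   "left" or "right" (ignored for the midline chiasm lesion).
    """
    lost = set()
    for eye, hemiretina in product(EYES, HEMIRETINAS):
        cut = (
            (lesion == "optic_nerve" and eye == side)
            # Simplifying assumption: only crossing fibers are interrupted.
            or (lesion == "optic_chiasm" and hemiretina == "nasal")
            or (lesion == "optic_tract"
                and target_hemisphere(eye, hemiretina) == side)
        )
        if cut:
            lost.add((eye, hemifield_seen(eye, hemiretina)))
    return lost


if __name__ == "__main__":
    for eye, hemiretina in product(EYES, HEMIRETINAS):
        print(eye, hemiretina, "->", target_hemisphere(eye, hemiretina),
              "hemisphere, sees", hemifield_seen(eye, hemiretina), "hemifield")
    print(sorted(field_loss("optic_chiasm")))
    print(sorted(field_loss("optic_tract", "left")))
```

In this sketch every fiber ends in the hemisphere opposite the hemifield it carries, which is the consequence of partial decussation described in the text. Lesions of the optic radiation and cortex (items 4 to 6) are left out because they require a finer division of the field into quadrants.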
Beyond the optic chiasm the axons from nasal and temporal hemiretinas carrying input from one hemifield join in the optic tract, which extends to the lateral geniculate nucleus of the thalamus. The lateral geniculate nucleus in primates consists of six layers, each of which receives input from either the ipsilateral or the contralateral eye. Because each layer contains a map of the contralateral hemifield, six concordant maps are stacked atop one another. The thalamic neurons then relay retinal information to the primary visual cortex.
The primary visual pathway is also called the geniculostriate pathway because it passes through the lateral geniculate nucleus on its way to the primary visual cortex, also known as the striate cortex because of the myelin-rich stripe that runs through its middle layers. A second pathway from the retina runs to the superior colliculus and is important in controlling eye movements. This pathway continues to the pontine reticular formation in the brain stem and then to the extraocular motor nuclei. A third pathway extends from the retina to the pretectal area of the midbrain, where neurons mediate the pupillary reflexes that control the amount of light entering the eyes.
Each lateral geniculate nucleus projects to the primary visual cortex through a pathway known as the optic radiation (Figure 25–6A). These afferent fibers form a complete neural map of the contralateral visual field in the primary visual cortex. Beyond the striate cortex lie the extrastriate areas, a set of higher-order visual areas that are also organized as neural maps of the visual field. The preservation of the spatial arrangement of inputs from the retina is called retinotopy, and a neural map of the visual field is described as retinotopic or having a retinotopic frame of reference.
Figure 25-6 Pathways for visual processing, pupillary reflex and accommodation, and control of eye position.
A. Visual processing. The eye sends information first to thalamic nuclei, including the lateral geniculate nucleus and pulvinar, and from there to cortical areas. Cortical projections go forward from the primary visual cortex to areas in the parietal lobe (the dorsal pathway, which is concerned with visually guided movement) and areas in the temporal lobe (the ventral pathway, which is concerned with object recognition). The pulvinar also serves as a relay between cortical areas to supplement their direct connections.
B. Pupillary reflex and accommodation. Light signals are relayed through the midbrain pretectum, to preganglionic parasympathetic neurons in the Edinger-Westphal nucleus, and out through the parasympathetic outflow of the oculomotor nerve to the ciliary ganglion. Postganglionic neurons innervate the smooth muscle of the pupillary sphincter, as well as the muscles controlling the lens.
C. Eye movement. Information from the retina is sent to the superior colliculus (SC) directly along the optic nerve and indirectly through the geniculostriate pathway to cortical areas (primary visual cortex, posterior parietal cortex, and frontal eye fields) that project back to the superior colliculus. The colliculus projects to the pons (PPRF), which then sends control signals to oculomotor nuclei, including the abducens nucleus, which controls lateral movement of the eyes. (FEF, frontal eye field; LGN, lateral geniculate nucleus; PPRF, paramedian pontine reticular formation.)
The primary visual cortex constitutes the first level of cortical processing of visual information. From there information is transmitted over two major pathways. A ventral pathway into the temporal lobe carries information about what the stimulus is, and a dorsal pathway into the parietal lobe carries information about where the stimulus is, information that is critical for guiding movement.
A major fiber bundle called the corpus callosum connects the two hemispheres, transmitting information across the midline. The primary visual cortex in either hemisphere represents slightly more than half the visual field, with the two hemifield representations overlapping at the vertical meridian. One of the functions of the corpus callosum is to unify the perception of objects spanning the vertical meridian by linking the cortical areas that represent opposite hemifields.
Form, Color, Motion, and Depth Are Processed in Discrete Areas of the Cerebral Cortex
In the late 19th and early 20th centuries the cerebral cortex was differentiated by the anatomist Korbinian Brodmann and others using anatomical criteria. The criteria included the size, shape, and packing density of neurons in the cortical layers and the thickness and density of myelin. The functionally distinct cortical areas we have considered heretofore correspond only loosely to Brodmann’s classification. The primary visual cortex (V1) is identical to Brodmann’s area 17. In the extrastriate cortex the secondary visual area, V2, corresponds to area 18. Beyond that, however, area 19 contains several functionally distinct areas that generally cannot be defined by anatomical criteria.
The number of functionally discrete areas of visual cortex varies between species. Macaque monkeys have more than 30 areas. Although not all visual areas in humans have yet been identified, the number is likely to be at least as great as in the macaque. If one includes oculomotor areas and prefrontal areas contributing to visual memory, almost half of the cerebral cortex is involved with vision. Functional magnetic resonance imaging (fMRI) has made it possible to establish homologies between the visual areas of the macaque and human brains (Figure 25–7). Based on pathway tracing studies in monkeys, we now appreciate that these areas are organized in functional streams (Figure 25–7B).