High-Level Visual Processing: Cognitive Influences
High-Level Visual Processing Is Concerned with Object Identification
The Inferior Temporal Cortex Is the Primary Center for Object Perception
Clinical Evidence Identifies the Inferior Temporal Cortex as Essential for Object Recognition
Neurons in the Inferior Temporal Cortex Encode Complex Visual Stimuli
Neurons in the Inferior Temporal Cortex Are Functionally Organized in Columns
The Inferior Temporal Cortex Is Part of a Network of Cortical Areas Involved in Object Recognition
Object Recognition Relies on Perceptual Constancy
Categorical Perception of Objects Simplifies Behavior
Visual Memory Is a Component of High-Level Visual Processing
Implicit Visual Learning Leads to Changes in the Selectivity of Neuronal Responses
Explicit Visual Learning Depends on Linkage of the Visual System and Declarative Memory Formation
The images projected onto the retina are generally complex dynamic patterns of light of varying intensity and color. As we have seen, low-level visual processing is responsible for detection of various types of contrast in these images (see Chapters 25 and 26), whereas intermediate-level processing is involved in the identification of so-called visual primitives, such as contours and fields of motion, and the representation of surfaces (see Chapter 27). High-level visual processing integrates information from a variety of sources and is the final stage in the visual pathway leading to conscious visual experience.
In practice, high-level visual processing depends on top-down signals, such as those arising from short-term working memory, long-term memory, and behavioral goals, that imbue bottom-up (afferent) sensory representations with semantic significance. High-level visual processing thus selects behaviorally meaningful attributes of the visual environment (Figure 28–1).
Figure 28–1 The neuronal representation of entire objects is central to high-level visual processing. Object representation involves integration of visual features extracted at earlier stages in the visual pathways. Ideally the resulting representation is a generalization of the numerous retinal images generated by the same object and of different members of an object category. The representation also incorporates information from other sensory modalities, attaches emotional valence, and associates the object with the memory of other objects or events. Object representations can be stored in working memory and recalled in association with other memories.
High-Level Visual Processing Is Concerned with Object Identification
Our visual experience of the world is fundamentally object-centered. Objects are often visually complex, being composed of a large number of conjoined visual features. In addition, the features projected on the retina by an object vary greatly under different viewing conditions, such as lighting, angle, position, and distance.
Moreover, objects are commonly associated with specific experiences, other remembered objects, other sensations—such as the hum of the coffee grinder or the aroma of a lover’s perfume—and a variety of emotions. Animate beings, which are objects to the visual system, also direct intentions, desires, and actions at others and ourselves. In conjunction with our own behavioral goals, it is the behavioral saliency of individual objects, memories, and emotional valences as well as the real or implied actions of others that enables us to take action based on visual information. Object perception is thus the nexus between vision and cognition.
The Inferior Temporal Cortex Is the Primary Center for Object Perception
Primate studies implicate neocortical regions of the temporal lobe, principally the inferior temporal cortex, in object perception. Because the hierarchy of synaptic relays in the cortical visual system extends from the primary visual cortex to the temporal lobe, the temporal lobe is a site of convergence of many types of visual information.
As we shall later see, neuropsychological studies have found that damage to the inferior temporal cortex can produce specific failures of object recognition. Neurophysiological and brain-imaging studies have in turn yielded remarkable insights into the ways in which the activity of inferior temporal neurons represents objects, how these representations relate to perceptual and cognitive events, and how they are modified by experience.
Visual signals originating in the retina are processed in the lateral geniculate nucleus of the thalamus before reaching the primary visual cortex (V1). Thereafter ascending visual pathways follow two parallel and hierarchically organized streams: the ventral and dorsal streams (see Chapter 25). The ventral stream extends ventrally and anteriorly from V1 through V2, V4, and the temporal-occipital junction before reaching the inferior temporal cortex, which comprises the lower bank of the superior temporal sulcus and the ventrolateral convexity of the temporal lobe (Figure 28–2). This pathway makes the inferior temporal cortex the seat of the highest stage of cortical visual processing. Neurons at each synaptic relay in this ventral stream receive convergent input from the preceding stage. Inferior temporal neurons are thus in a position to integrate a large and diverse quantity of visual information over a vast region of visual space.
Figure 28–2 Cortical pathway for object recognition.
A. The pathway for object recognition (red) is identified in a lateral view of the brain showing the major pathways involved in visual processing. (AIP, anterior intraparietal cortex; FEF, frontal eye fields; IT, inferior temporal cortex; LIP, lateral intraparietal cortex; MIP, medial intraparietal cortex; MST, medial superior temporal cortex; MT, middle temporal cortex; PF, pre-frontal cortex; PMd, dorsal premotor cortex; PMv, ventral premotor cortex; TEO, temporo-occipital cortex; VIP, ventral intraparietal cortex.)
B. Cortical areas involved in object recognition are shown on lateral and ventral views of the monkey brain.
C. The inferior temporal cortex (IT) is the end stage of the ventral stream (red arrows), and is reciprocally connected with neighboring areas of the medial temporal lobe and pre-frontal cortex (gray arrows). This chart illustrates the main connections and predominant direction of information flow. (ER, entorhinal cortex; PF, pre-frontal cortex; PH, parahippocampal cortex; PR, perirhinal cortex; STP, superior temporal polysensory area; TEO, temporo-occipital cortex.)
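The convergence along the ventral stream, in which neurons at each relay pool input from the preceding stage, can be illustrated with a toy calculation. The sketch below is only an illustration under stated assumptions, not a model of real anatomy: the fan-in value and the non-overlapping pooling scheme are hypothetical, chosen solely to show how the region of visual space covered by a neuron grows with each relay.

```python
# Toy illustration: receptive field growth through convergent relays.
# FAN_IN and the non-overlapping pooling scheme are hypothetical assumptions,
# not measured anatomical values.

STAGES = ["V1", "V2", "V4", "TEO", "IT"]  # ventral stream relays named in the text
FAN_IN = 3  # assumed number of adjacent inputs pooled by each neuron at a relay


def receptive_field_widths(stages, fan_in):
    """Return the receptive field width at each stage, in units of one V1 field."""
    widths = {}
    width = 1
    for stage in stages:
        widths[stage] = width
        width *= fan_in  # the next relay pools fan_in non-overlapping inputs
    return widths


if __name__ == "__main__":
    for stage, width in receptive_field_widths(STAGES, FAN_IN).items():
        print(f"{stage}: {width} V1-field widths")
```

Under these assumptions the width grows geometrically with the number of relays, which is the qualitative point: a neuron at the end of the pathway can draw on a far larger region of visual space than a neuron in V1.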
The inferior temporal cortex is a large brain region. The patterns of anatomical connections to and from this area indicate that it comprises at least two main functional subdivisions: the posterior and anterior inferior temporal cortex. Anatomical evidence identifies the anterior subdivision as a higher processing stage than the posterior subdivision. As we shall see, this distinction is supported by both neuropsychological and neurophysiological evidence.
Clinical Evidence Identifies the Inferior Temporal Cortex as Essential for Object Recognition
The first clear insight into the neural pathways mediating object recognition was obtained in the late 19th century when the American neurologist Sanger Brown and the British physiologist Edward Albert Schäfer found that experimental lesions of the temporal lobe in primates resulted in loss of the ability to recognize objects. This impairment is distinct from the deficits that accompany lesions of occipital cortical areas in that sensitivity to basic visual attributes, such as color, motion, and distance, remains intact. Because of the unusual type of visual loss, the impairment was originally called psychic blindness, but this term was later replaced by visual agnosia (“without visual knowledge”), a term coined by Sigmund Freud.
In humans there are two basic categories of visual agnosia, apperceptive and associative, the description of which led to a two-stage model of object recognition in the visual system (Figure 28–3). With apperceptive agnosia the ability to match or copy complex visual shapes or objects is impaired. This impairment is perceptual in nature, resulting from disruption of the first stage of object recognition: integration of visual features into sensory representations of entire objects. In contrast, patients with associative agnosia can match or copy complex objects, but their ability to identify the objects is impaired. This impairment results from disruption of the second stage of object recognition: association of the sensory representation of an object with knowledge of the object’s meaning or function.
Figure 28–3 Neuropsychological evidence for the neuronal correlates of object recognition in the temporal lobe. Damage to inferior temporal cortex (IT) impairs the ability to recognize visual objects, a condition known as visual agnosia. There are two major categories of visual agnosia: apperceptive, a result of damage to the posterior region, and associative, resulting from damage of the anterior region. (Reproduced, with permission, from Farah 1990.)
Consistent with this functional hierarchy, apperceptive agnosia is most common following damage to the posterior inferior temporal cortex, whereas associative agnosia, a higher-order perceptual deficit, is more common following damage to the anterior inferior temporal cortex, a later stage in the functional hierarchy. Neurons in the anterior subdivision exhibit a variety of memory-related properties not seen in the posterior area.
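The logic of the two-stage model can be summarized as a simple decision rule. The sketch below is a deliberately simplified illustration, not a clinical diagnostic procedure: the names `Agnosia`, `classify`, and `TYPICAL_LESION_SITE` are hypothetical, and the rule assumes that sensitivity to basic visual attributes is intact, as in the cases described above.

```python
from enum import Enum


class Agnosia(Enum):
    NONE = "no agnosia indicated"
    APPERCEPTIVE = "apperceptive agnosia"
    ASSOCIATIVE = "associative agnosia"


# Lesion site most commonly associated with each category, per the text.
TYPICAL_LESION_SITE = {
    Agnosia.APPERCEPTIVE: "posterior inferior temporal cortex",
    Agnosia.ASSOCIATIVE: "anterior inferior temporal cortex",
}


def classify(can_match_or_copy: bool, can_identify: bool) -> Agnosia:
    """Apply the two-stage model as a decision rule (simplified assumption)."""
    if not can_match_or_copy:
        # Stage 1 fails: features are not integrated into whole-object percepts.
        return Agnosia.APPERCEPTIVE
    if not can_identify:
        # Stage 2 fails: the percept is not linked to meaning or function.
        return Agnosia.ASSOCIATIVE
    return Agnosia.NONE


if __name__ == "__main__":
    result = classify(can_match_or_copy=True, can_identify=False)
    print(result.value, "-", TYPICAL_LESION_SITE[result])
```

The ordering of the two tests mirrors the hierarchy: a failure at the first stage is classified as apperceptive regardless of identification ability, because identification depends on an intact sensory representation.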
Neurons in the Inferior Temporal Cortex Encode Complex Visual Stimuli
The coding of visual information in the temporal lobe has been studied extensively using electrophysiological techniques, beginning with the work of Charles Gross and colleagues in the 1970s. Neurons in this region have distinctive response properties.
They are relatively insensitive to simple stimulus features such as orientation and color. Instead, the vast majority possess large, centrally located receptive fields and encode complex stimulus features. These selectivities often appear somewhat arbitrary. An individual neuron might, for example, respond strongly to a crescent-shaped pattern of a particular color and texture. Cells with such unique selectivities likely provide inputs to yet higher-order neuronal representations of meaningful objects.
Indeed, several small subpopulations of neurons encode objects that convey to the observer highly meaningful information, such as faces and hands (Figure 28–4). For cells that respond to the sight of a hand, individual fingers are particularly critical. Among cells that respond to faces, the most effective stimulus for some cells is the frontal view of the face, whereas for others it is the side view. Whereas some of these neurons respond preferentially to faces as such, others respond preferentially to facial expressions. It seems likely that such cells contribute directly to face recognition.
Figure 28–4 Neurons in the inferior temporal cortex of the monkey are involved in face recognition. (Reproduced, with permission, from Desimone et al. 1984.)
A. The location of the inferior temporal cortex of the monkey is shown in a lateral view and coronal section. The colored area is the location of the recorded neurons.
B. Peristimulus histograms illustrate the frequency of action potentials in a single neuron in response to the different images illustrated below. This neuron responded selectively to faces. Masking of critical features, such as the mouth or eyes (4, 5), led to a substantial but not complete reduction in response. Scrambling the parts of the face (2) nearly eliminated the response.
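A peristimulus histogram of the kind shown in part B is built by aligning spike times to stimulus onset, counting spikes in successive time bins across repeated trials, and converting the counts to a firing rate. The sketch below shows one common way to do this; the function name, parameter names, and example spike times are hypothetical and are not taken from the study cited in the figure.

```python
def peristimulus_histogram(trials, window_start, window_end, bin_width):
    """Compute the mean firing rate (spikes/s) in each time bin.

    trials: list of trials, each a list of spike times in seconds,
            measured relative to stimulus onset (onset = 0).
    window_start, window_end: analysis window in seconds.
    bin_width: width of each bin in seconds.
    """
    if not trials:
        raise ValueError("at least one trial is required")
    n_bins = int(round((window_end - window_start) / bin_width))
    counts = [0] * n_bins
    for spikes in trials:
        for t in spikes:
            if window_start <= t < window_end:
                index = int((t - window_start) / bin_width)
                index = min(index, n_bins - 1)  # guard against float rounding
                counts[index] += 1
    # Divide by trial count and bin width to convert counts to spikes/s.
    return [c / (len(trials) * bin_width) for c in counts]


if __name__ == "__main__":
    # Hypothetical spike times (s) from two presentations of one image.
    example_trials = [[-0.05, 0.02, 0.07, 0.15], [0.03, 0.12]]
    rates = peristimulus_histogram(example_trials, -0.1, 0.2, 0.1)
    for i, rate in enumerate(rates):
        print(f"bin {i}: {rate:.1f} spikes/s")
```

Comparing such histograms across images, for example an intact face versus a scrambled face, is what reveals the selectivity of the neuron.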
Damage to a small region of the human temporal lobe results in an inability to recognize faces, a form of associative agnosia known as prosopagnosia. Patients with prosopagnosia can identify a face as a face, recognize its parts, and even detect specific emotions expressed by the face, but they are unable to identify a particular face as belonging to a specific person.
Prosopagnosia is one example of “category-specific” agnosia, in which patients with temporal-lobe damage fail to recognize items within a specific semantic category. There are reported cases of category-specific agnosias for living things, fruits, vegetables, tools, or animals. Owing to the pronounced behavioral significance of faces and the normal ability of people to recognize an extraordinarily large number of items from this category, prosopagnosia may simply be the most common variety of category-specific agnosia.
Neurons in the Inferior Temporal Cortex Are Functionally Organized in Columns
Early relays in the cortical visual system are organized in columns of neurons that represent the same stimulus features, such as orientation or direction of motion, in different parts of the visual field. Cells within the inferior temporal cortex are also organized in columns of neurons representing the same or similar stimulus properties (Figure 28–5). These columns commonly extend throughout the cortical thickness and over a range of approximately 400 μm. Columnar patches in the inferior temporal cortex are arranged such that different stimuli that possess some similar features are represented in partially overlapping columns (Figure 28–5). Thus one stimulus can activate multiple patches within the cortex. Long-range horizontal connections within the cortex may serve to connect patches into distributed networks for object representation.
Figure 28–5 Neurons in the anterior portion of the inferior temporal cortex that represent complex visual stimuli are organized into columns. (Reproduced, with permission, from Tanaka 2003.)
A. Optical images of the surface of the anterior inferior temporal cortex illustrate regions selectively activated by the objects shown at the right.
B. In this schematic depiction of the columnar structure of the inferior temporal cortex the vertical axis represents cortical depth. According to this model each column includes neurons that represent a distinct complex pattern. Columns of neurons that represent variations of a pattern, such as the different faces or the different fire extinguishers, constitute a hypercolumn.
Face-selective cells constitute a highly specialized class of neurons. Indeed, the fact that prosopagnosia often occurs in the absence of any other form of agnosia suggests that face-selective neurons of the temporal lobe may be located in exclusive clusters. While many early studies of neuronal response properties offered circumstantial evidence for such clustering, in 2006 Doris Tsao and Margaret Livingstone obtained dramatic support for this hypothesis. Functional magnetic resonance images of monkeys that were viewing faces revealed large active zones in a region of cortex in the lower bank of the superior temporal sulcus. Neurophysiological recordings of neurons in these zones confirmed that face-recognition cells formed large, dense clusters (Figure 28–6). Winrich Freiwald and Tsao later found that the five face-representation areas in monkeys interconnect with one another and form a processing system, with each node apparently concerned with a different aspect of face recognition.
Figure 28–6 The inferior temporal cortex contains dense clusters of face-selective neurons. (Reproduced, with permission, from Tsao et al. 2006.)
A. Functional magnetic resonance imaging (fMRI) identifies three regions of the inferior temporal cortex that are selectively activated by faces. The upper image, a sagittal section, shows the three active zones along the lower bank of the superior temporal sulcus in one monkey. The two lower images are coronal sections through the face-representation areas in two monkeys.
B. Neurophysiological recordings reveal a preponderance of face-selective neurons in the middle face area identified by fMRI. The histogram plots the mean normalized response rate (minus baseline) of 182 neurons in the middle face area of one monkey. The monkey was shown 96 visual stimuli in six categories. Only faces elicited consistently vigorous responses.
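A population measure such as the mean normalized response rate (minus baseline) can be computed in several ways, and the caption does not specify the exact procedure used. The sketch below shows one plausible approach under stated assumptions: each neuron's baseline rate is subtracted, the result is divided by that neuron's largest absolute evoked response, and the normalized values are averaged across neurons for each stimulus. The function name and the example rates are hypothetical.

```python
def mean_normalized_response(rates, baselines):
    """Average baseline-subtracted, per-neuron-normalized responses.

    rates[n][s]: firing rate (spikes/s) of neuron n to stimulus s.
    baselines[n]: baseline firing rate (spikes/s) of neuron n.
    Normalizing by each neuron's peak absolute evoked response is an
    assumption of this sketch, not a procedure stated in the text.
    """
    n_stimuli = len(rates[0])
    totals = [0.0] * n_stimuli
    for neuron_rates, baseline in zip(rates, baselines):
        evoked = [r - baseline for r in neuron_rates]
        peak = max(abs(e) for e in evoked)
        if peak == 0:
            continue  # unresponsive neuron contributes zero to every stimulus
        for s, e in enumerate(evoked):
            totals[s] += e / peak
    return [t / len(rates) for t in totals]


if __name__ == "__main__":
    # Hypothetical rates for two neurons and three stimuli (face, hand, fruit).
    example_rates = [[30, 10, 10], [50, 10, 20]]
    example_baselines = [10, 10]
    print(mean_normalized_response(example_rates, example_baselines))
```

Normalizing each neuron before averaging prevents a few neurons with very high firing rates from dominating the population mean, so a consistently high value for one stimulus category reflects a preference shared by many neurons.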