Neural Mechanisms of Communication
Julia Sliwa, Daniel Y. Takahashi, and Stephen V. Shepherd
16.1 We Are Not Alone
Antelopes such as the Thomson’s gazelle, observing predators, often engage in “stotting”: they flee in a series of erratic leaps, legs stiff and straight. This method of escape is inefficient, and is abandoned for the most dangerous predators, but is common in response to slower or less‐focused threats. What is happening here? It appears the antelope is sending a signal to its potential predator, who receives it.
All animals live in an environment that includes others. They can be detected by others and can influence the likelihood (and consequences) of this detection by sending signals. Signals are bodily features or behaviors of the signaler that trigger specific behaviors in the receiver. The receiver, signaler, signal, and medium are the four basic building blocks of any communication cycle. Each component can be considered separately, but in the service of communication they are interdependent and defined only in relation to one another. Cycles of reciprocal signal exchange mediate social interactions, but even “asocial” species coordinate reproduction, manage conflict over territory, and may anticipate and influence potential predators and prey.
Communication arose long before the evolution of animals and neurons, yet is a crucial aspect of animal behavior and nervous system evolution. In this chapter, we review general principles of signaling exchange, then detail how these exchanges take place among nonhuman primates and how neural systems act to mediate them. We end by outlining commonalities between primate and nonprimate communication systems, and close by discussing broader implications for the study of social neuroscience.
16.2 The Evolution of Communicative Signals
Signals can be sent through diverse media: Tactile signals are primarily short range, instantaneous, and reciprocal. By contrast, olfactory signals are perceived at spatiotemporal distance (depending on wind/current) and can be displaced from the body. Some media may not be evident to human observers, including infrasonic vibration and ultrasonic hearing, thermal emission of light, and electromagnetic fields generated by neuronal activity. Signals can be multimodal. Likewise, receivers can sense signals in diverse modalities and use them all to make inferences about their social environment. Depending on the environment, some signals can be detected and others cannot. The efficiency, range, and directedness of signals vary with their modality and environmental context (see Table 16.1).
Table 16.1 The Relative Spatial Range, Temporal Range, and Reciprocity of Communication Channels by Modality.
Modality | Spatial Range | Temporal Range | Reciprocity |
Touch | Local | Instant, transient | Strong |
Vision | Easily occluded | Instant, transient | Intermediate |
Vibration | Attenuates | Fast, transient | Weak |
Chemosensation | Plume (Displaced) | Slow, transient (Enduring) | Weak |
Electroception | Attenuates | Instant, transient | Strong |
Despite signals’ diversity, and the diversity of mechanisms required to produce and perceive them, all have common features which determine how they evolve and mediate behavior.
16.2.1 The Building Blocks of a Communication Cycle
Communication requires both a signaler and a receiver. The communicative component that is shared between the signaler and the receivers is the signal: a behavior or bodily transformation of the signaler that influences the receiver’s behavior in a given way. For example, consider again the Thomson’s gazelle’s stotting behavior (FitzGibbon & Fanshawe, 1988). Stotting functions as an honest signal: By indicating that the gazelle is aware of the predator and extremely fit, it deters further pursuit. Because the predator prefers not to waste time and energy chasing an alert, athletic animal, both parties benefit, despite the fact that one would like the other for lunch.
Not all signals are honest. Among cuttlefish, which can change their shape and their skin color and patterning, males sometimes display typical male courtship patterns to females with one side of the body while simultaneously displaying typical female patterns to a rival male with the other—thus fooling the rival into complacence (Brown, Garwood, & Williamson, 2012). In both examples, the signalers influence the receivers into not chasing or attacking them by displaying a signal that triggers a pacific behavior of the receiver.
In many communication situations, a signal triggers a less specific outcome in the receiver: It simply attracts its attention. Often signals feature high contrast from sensory background, behavioral stereotypy, or pattern repetition (Johnstone, 1997), which increase saliency of the performer’s state or of its immediate environment to an audience. Sometimes a conspicuous “alerting component” prefixes and attracts receivers’ attention to a less‐conspicuous signal (Johnstone, 1997). Through different combinations of these components, signals can advertise traits or states of the animal or its environment, and grade from relatively fixed and inflexible to complex and context‐dependent.
A signal can be defined as a derived feature or behavior (i.e., one that did not exist in ancestors) that adds little of direct adaptive value to the performer, but increases the influence on the receiver. Thus signaling can be described as “informing” to exert influence. On the other hand, receiving can be described as “interpreting” signals to reduce the receiver’s uncertainty about its environment. While signals first arise from incidental information exchanges between signalers and receivers, they become part of the species’ communicative repertoire through either phylogenetic (evolved) or ontogenetic (learned) processes if they benefit the individuals or their species. In this case, signaling features are made more noticeable through a process of “ritualization,” in which the signaling feature is isolated, cued, exaggerated, and/or iterated (Johnstone, 1997).
16.2.2 The Evolution of Communicative Repertoires
Like any organism’s features, communicative traits evolve through natural selection (Darwin, 1876). However, the evolution of signaling systems takes place simultaneously in both signalers and receivers, and their evolutionary trajectories are incomplete when considered from just one side.
The phylogeny of innate signals has historically been reconstructed by species observation. Darwin observed the expression of emotions across different taxa and found that certain human facial expressions have apparent counterparts in nonhuman primates and other mammals (Darwin, 1872). For example, fear is expressed in many primate species by wide opening of the mouth and eyes, with upraised eyebrows. Some sensory features of innate distress signals, such as aversively loud, irregularly patterned vocalizations, or overwhelming and persistent scents, also appear to be highly conserved across taxa, both in terms of signaler behavior and receiver response, including beating of the heart, trembling of the muscles, and cold perspiration (Rendall, Owren, & Ryan, 2009). Studies of behavior in animals raised in isolation, or cross‐fostered with other species, directly addressed which signals are determined relatively more by phylogenetic evolution (i.e., are “innate”) or relatively more by ontogenetic learning. For example, cross‐fostered galah cockatoos use their innate alarm calls but use contact calls from their adoptive species of pink cockatoos (Rowley & Chapman, 1986). Most commonly, “innate” signaling behaviors are involved in response to imminent danger, in courtship, and in parental care (Domb & Pagel, 2001; Ghazanfar & Santos, 2004).
The evolution of communication interacts with ontogenetic learning, whereby signal usage is refined during individuals’ development. For example young vervet monkeys refine their production of vocalizations, their use in appropriate circumstances, and the response to the vocalizations of others, by observation of adult vervets (Seyfarth & Cheney, 1986). In different species, ontogenetic learning can arise from habituation, Pavlovian association, trial‐and‐error learning, emulation, imitation, or even active teaching. Ontogenetic learning comprises at least two aspects: (1) developmental learning of species‐typical communication patterns and (2) signal production depending on context.
An example of species‐typical communication is the use of “semantic” or “referential” signals (Macedonia, 1990; Macedonia & Evans, 1993): Many vertebrates, ranging from monkeys to chickens, produce predator‐specific alarm calls (Gyger, Marler, & Pickert, 1987; Pereira & Macedonia, 1991; Seyfarth, Cheney, & Marler, 1980); rhesus monkeys produce specific calls for food discovery/ally recruitment (Gouzoules, Gouzoules, & Marler, 1984; Hauser & Marler, 1993); dolphins and parrots seem to “name” their peers (Wanker, Sugama, & Prinage, 2005; Janik, Sayigh, & Wells, 2006). In some cases animals combine these “semantic” signals in “syntax,” in which the meaning of an overall pattern of signals differs from that of their individual parts. Campbell’s (Ouattara, Lemasson, & Zuberbühler, 2009) and putty‐nosed (Arnold & Zuberbühler, 2006) monkeys exchange combinatorial alarm calls. Referential codes have been found to be probabilistic: A “leopard call” is used to signify the presence of a ground predator, but also on occasion in group conflicts, showing that referential calls are not truly indexical but rather are social constructs. However, learning of species‐typical signals is most evident on the receivers’ side. While alarm calls typically elicit an innate startle from most animals, in some species animals can learn to display specific responses to the type of predator‐specific alarm call they hear. For example, Belding’s ground squirrels learn to run to the trees at the sound of a “terrestrial predator call” and not to the bushes: They learn to display the same hiding behavior as when they notice the predator themselves (Mateo, 2010). When sight of a specific threat is reliably predicted by a heard and innately arousing signal, learning of an appropriate defensive response is greatly facilitated. In this case the initial, innate response to a communication signal is supplemented by a further, ontogenetically learned association.
The second major type of ontogenetic learning concerns context flexibility in communication. Flexibility in signal production may be evidenced by the generalization of a single signal to novel contexts, by selection among multiple signals within a context, or by the ability to forgo signaling in its typical context. Audience effects exemplify how a given signal can be modulated or even inhibited depending on context: Audience effects have been documented in species ranging from chickens (Marler, Dufty, & Pickert, 1986) to bats (Bohn, Smarsh, & Smotherman, 2013), monkeys (Cheney & Seyfarth, 1990; Gouzoules et al., 1984), and chimps (Crockford, Wittig, Roman, Mundry, & Zuberbühler, 2012), all of whom modulate intensity and type of signal production based on the presence of conspecifics. Flexibility is also evident in signal reception. It includes responding to the same signal differentially depending on who emitted it and in what context, as well as the ability to inhibit automatic responses and interpret semantic/syntactic signals. For example, alarm calls emitted by infant baboons will not only be detected and localized by adult baboons, but also interpreted. Adult baboons can infer from a vocalization’s acoustical properties the size, age, and family line of the caller (Cheney & Seyfarth, 1999). With this perceptual ability, baboons respond flexibly to the received call by inhibiting their rescue responses and gazing instead at the juvenile’s mother. Here, a motivational cue (the call type) and kinship information (the juvenile’s voice) are encoded in the same auditory signal; in other cases, receivers may seek contextual information from other modalities, for example by looking toward the call source (Kirchhof & Hammerschmidt, 2006).
Biological and cultural mechanisms can both propagate communication systems across generations (Levinson, 2006). Until recently, there was little evidence for within‐species cultural variation, in the sense that all members of a species were observed to learn and use the same referential calls (Owren, Dieter, Seyfarth, & Cheney, 1993; Wheeler & Fischer, 2012). However, the social practices of different groups within a given species have only recently been compared. For example, we now know the mating songs of humpback whales are shared by all males of a given population but differ between groups; moreover, songs can undergo cultural revolution when foreign singers join a local population (Noad, Cato, Bryden, Jenner, & Jenner, 2000). Cultural transmission can also apply to referential signals, as illustrated when a population of chimpanzees using a particular “food call” abandoned it upon merging with a new group, adopting the call of their hosts (Watson et al., 2015). Cultural transmission feeds back, in turn, upon the transmission of genes. For instance, female killer whales more readily accept sexual advances from males who sing a new dialect than from males who sing their pod dialect (Baird & Whitehead, 2000; Rendell & Whitehead, 2001). Conversely, Darwin’s finches mate preferentially with individuals singing their own songs, facilitating speciation along cultural lines within a common environment (Grant & Grant, 1996).
16.3 Primates
Most primates gather to sleep in shared or clustered shelters, and live in groups characterized by complex family and friendship bonds, in which shifting alliances are important in both within‐ and between‐group competition. These conditions are paralleled by routine exchange of signals across multiple sensory channels and by the use of flexible communication, which we will describe below. As such, primates exhibit neural mechanisms for a diverse range of communication systems.
16.3.1 Communicative Repertoire
Like most mammals, primates communicate through olfactory signals found in urine, saliva, scat, vaginal secretions, and sometimes in specialized scent glands. Olfactory signaling is prominent among primates with scent glands, and continues to influence behavior in all species. While odors can travel passively through the air to reach a receiver who is close to the signaler, species with scent glands also engage in “scent marking”— actively applying odorants to environmental substrates to indicate territorial borders (Heymann, 2006). Olfactory signaling also plays an important role in primate courtship and intrasexual competition. Because olfactory signals reliably encode individual phenotype (Smith, 2006) by association with the major histocompatibility system (Knapp, Robson, & Waterhouse, 2006), they advertise signalers’ gender, kinship, dominance, and reproductive status. This information is used by receivers, primarily to guide mating behavior (e.g., kin avoidance: Boulet, Charpentier, & Drea, 2009; Olsson, Barnard, & Turri, 2006; Porter, 1998; Weisfeld, Czilli, Phillips, Gall, & Lichtman, 2003).
Touch is the first sense to develop in primates, and touching behavior during nursing and mating is later elaborated to include hugging, grooming, sociosexual behavior, and rough‐and‐tumble play, each of which has an important role in primate social attachment and normal development (Suomi, Harlow, & Kimball, 1971). Interpersonal touch is crucial for emotional well‐being and prosocial behavior in large primate groups, and its prototypical form is social grooming. Social grooming evolved from a cleaning behavior, originally used to remove ticks and parasites, into a social bonding ritual which plays an important role in reducing stress and cementing alliances (Dunbar, 1991, 2010). Conversely, aggressive touch (kicking, biting, slapping, or clasping) not only directly injures or impedes the target but also establishes psychological dominance. Both types of touch pattern relationships within the group (De Waal & Yoshihara, 1983; Judge & De Waal, 1997). In certain human and ape cultures, additional forms of touch have been ontogenetically ritualized (e.g., the handshake used as a greeting). Touch is susceptible to audience effects: sex, rank, kin, and available time influence social grooming (Schino, 2001; Seyfarth, 1977).
Primates also produce postural, orofacial, and manual behaviors which can be seen and/or heard by receivers. Gestural communication includes expressions (e.g. Atkinson, Dittrich, Gemmell, & Young, 2004; de Gelder, 2009; de Meijer, 1989; van Hooff, 1967), attention getters, and pointing.
- Attention‐getting and solicitation signals are used flexibly, are sensitive to the audience, and illustrate ontogenetic ritualization (Call & Tomasello, 2007); they are reportedly used deceptively in mangabeys (Coussi‐Korbel, 1994). These signals attract attention and may imply communicative intent.
- Expressions comprise facial and bodily postures, primarily expressing behavioral states of avoidance, affiliation, or aggression (Partan, 2002). In macaques, facial expressions from each category include the silent‐bared teeth display or “fear grin,” the lipsmack, and the open mouth threat, respectively (Parr & Heintz, 2009).
- Pointing signals, defined broadly as ritualized orienting behaviors, include exaggerated stance, reach, or stare that coordinate attention (Meunier, Prieur, & Vauclair, 2013; Shepherd, 2010; Shepherd & Cappuccio, 2011).
Most primates have been shown to identify not only their peers’ displays but also the signalers’ status, including dominance, kinship, group membership, age, gender, and reproductive status, through their facial and body features (Deaner, Khera, & Platt, 2005; Gerald, Waitt, & Little, 2009; Mahajan et al., 2011; Pokorny & de Waal, 2009; Schell, Rieck, Schell, Hammerschmidt, & Fischer, 2011; Sliwa, Duhamel, Pascalis, & Wirth, 2011; Waitt et al., 2003; Waitt, Gerald, Little, & Kraiselburd, 2006).
Facial and bodily expressions sometimes include vocal displays, and attention getters often include an acoustic component. The three main classes of expression are usually accompanied by different types of vocalizations. Species‐specific vocalizations can be produced automatically as a function of an individual’s behavioral state; however, call production and response to heard calls can also be modulated by audience, as shown through playback experiments (Cheney & Seyfarth, 1997; Seyfarth et al., 1980). Many primates have been shown to recognize the species, identity, group membership, size, age, and kinship of the signalers (Adachi, Kuwahata, Fujita, Tomonaga, & Matsuzawa, 2006; Bachorowski & Owren, 1999; Ghazanfar et al., 2007; Rendall, Rodman, & Emond, 1996; Sliwa et al., 2011), through the formants present in their calls (Fant, 1960; Fitch & Fritz, 2006).
Humans also communicate through language. Language is amodal: It can be spoken and heard, signed and seen, written and read (by eye or, in Braille, by touch). In language, arbitrary symbols are sequenced to encode conceptual relations; this semantically compositional syntax has so far been found only in language (Petkov & Wilson, 2012; Wheeler & Fischer, 2012). It has been suggested that language ability was facilitated by increasing use of communicative gesture (Arbib, Liebal, & Pika, 2008; Shepherd & Cappuccio, 2011), and as an adaptive response to (or even a prerequisite for) ever‐increasing social complexity (Byrne, 1996; Dunbar, 1992). Language may also directly strengthen social alliances, since humans seem largely to have replaced social grooming with small talk (Dunbar, 2010). Attempts to teach language to human‐raised great apes showed that, while they fail to learn humanlike grammar (Terrace, Petitto, Sanders, & Bever, 1979; Yang, 2013), they possess the ability to learn to use arbitrary signs and symbols to communicate. Field studies have also described both functional reference and proto‐syntax in diverse nonhuman species.
Primates display and respond to a wide range of communicative signals, from fast automatic attention‐getters to complex, multimodal, and flexible referential signs—many are multisensory. How are these signals produced and perceived by primate brains? To what extent are the neural mechanisms of communication modular, and how are those modules organized? Only a few primate models have received intensive study (e.g., rhesus and long‐tailed macaques, marmosets, and humans), yet most primates likely possess specialized neural pathways dedicated to the production and perception of communicative signals.
16.3.2 Neural Mechanisms of Signal Production
In this section, we briefly discuss neural mechanisms by which two specific, relatively well‐studied primate signals are produced: facial expressions and vocalizations.
16.3.2.1 Producing facial expressions
One of the most distinctive features of primate communication, compared to that of other mammals, is the central importance of orofacial movements. Primates use a relatively conserved suite of facial muscles (Diogo, Wood, Aziz, & Burrows, 2009) to produce stereotyped species‐typical facial postures and movements that communicate behavioral state (Ekman, Sorenson, & Friesen, 1969; van Hooff, 1967). The facial muscles are controlled mainly by two cranial nerves: the trigeminal (V), which primarily innervates jaw muscles related to ingestion, and the facial (VII), which innervates the more superficial “mimetic” muscles. The final common output for facial motor control is the facial nucleus, which (uniquely in anthropoids) receives projections directly from motor cortex (Sherwood et al., 2005). In total, five interconnected facial motor representations exert cortical control over movement; they reside in the primary and supplemental, rostral and caudal cingulate, and ventrolateral prefrontal motor areas (Morecraft, Louie, Herrick, & Stilwell‐Morecraft, 2001).
Symptoms of human brain lesions suggest that neural governance of voluntary and automatic facial movements is dissociable (Hopf, Müller‐Forell, & Hopf, 1992; Morecraft, Stilwell‐Morecraft, & Rossing, 2004). In volitional facial paralysis, patients are impaired at voluntarily moving their facial muscles, for example, in order to speak, but are spared when responding reflexively to emotionally provocative stimuli. This condition is associated with lesions in the motor cortical area M1 and its underlying white matter. The opposite condition is emotional facial paralysis, in which a patient ceases to produce spontaneous facial expressions on one side of the face, though they can voluntarily approximate them with effort. This condition is usually related to lesions involving the anterior cingulate cortex (ACC), insula, thalamus, striatocapsular region, and pons. These clinical reports suggest the existence of separate emotional and voluntary facial movement centers, in which nonspeech orofacial signaling is governed reflexively by medial motor representations, while speech movements are governed by goal‐directed ventrolateral representations.
The neural circuits which govern the production of different communicative facial movements across different social contexts are less well understood. Although limbic system activity is involved in the generation of facial expressions (Weinstein & Bender, 1943), its specific mechanisms remain elusive. Among the limbic and paralimbic areas related to facial movements, the anterior cingulate cortex (ACC) is one of the most studied. Stimulation of ACC can induce both affective facial expressions and vocalization (Smith, 1945), while lesions of ACC are associated with decreased spontaneous vocalization and flattened facial affect (Devinsky et al., 1995); neural activity related to affective vocalization has been recorded both in the anterior ACC gyrus and in the ACC sulcus near facial motor areas (West & Larson, 1995). Because the ACC is also strongly implicated in the processing of reward contingencies, it may transform perceived social risks and rewards into appropriately responsive social behavior (Behrens, Hunt, & Rushworth, 2009; Paus, 2001; Rushworth, Behrens, Rudebeck, & Walton, 2007). In particular, information encoded in the ACC gyrus, where lesions disrupt normal social valuation signals (Rudebeck, Buckley, Walton, & Rushworth, 2006), may influence the adjacent motor representations in the anterior cingulate sulcus.
The ACC is part of a wider network involved in governing adaptive behavior, including orbitofrontal cortex (OFC), premotor cortex, somatosensory cortex and the amygdala. Interestingly, it has been shown that the amygdala also initiates adaptive production of facial signals via its connections to the motor systems (Livneh, Resnik, Shohat, & Paz, 2012).
16.3.2.2 Producing vocalizations
Primate vocalization is the result of coordinated action by several separate effectors, including the diaphragm, chest muscles, ribs, lungs, larynx, and upper vocal cavity. Each of these effectors is responsive to sensory feedback and governed by specific neural nuclei. Respiratory muscles are controlled by motor neurons in the ventral horn of the thoracic and upper lumbar spinal cord, while internal laryngeal muscles are controlled by motor neurons in the nucleus ambiguus. Cell groups below the hypoglossal nucleus control the external laryngeal muscles, and the articulatory orofacial muscles are controlled by motor neurons of the facial nucleus, trigeminal nucleus, ventral horn of the uppermost cervical cord, nucleus ambiguus, and hypoglossal nucleus (see Jürgens (2009) for details). There are very few direct projections between these different motor neuron pools, which suggests that outside brain regions must act to coordinate them. Consistent with this hypothesis, anatomical, microstimulation, recording, and lesioning studies have identified the lateral reticular formation and nucleus retroambiguus as coordinating areas (Hage & Jürgens, 2006; Hannig & Jürgens, 2006; Jürgens, 2000; Lüthe et al., 2000). The reticular formation has direct connections to all phonatory motor neurons, and contains neurons that show activity correlated with the duration and frequency modulation of vocalizations (Kirzinger & Jürgens, 1991). The nucleus retroambiguus is a relay to respiratory centers with direct connections to parabrachial regions; it and the lateral reticular formation are reciprocally connected (Mantyh, 1982; Vanderhorst et al., 2000).
Both of these brainstem coordinating areas receive input from the periaqueductal gray (PAG). Electrical stimulation of PAG produces vocalizations resembling naturally produced species‐specific calls, different from the more artificial‐sounding vocalizations produced when the lateral reticular formation alone is stimulated (Jürgens & Ploog, 1970). More than half of the neurons in PAG increase their activity before vocal onset. Moreover, most of their activity is not correlated with the specific acoustic features of vocalizations (Dusterhoft, 2004). These data suggest that PAG works as a vocal initiator rather than a vocal pattern generator (Hage & Jürgens, 2006). PAG also works as a relay station for sensory‐motor interaction, having strong reciprocal connections to the superior colliculus and paralemniscal area. Finally, PAG receives several inputs from limbic structures, including the ACC, amygdala, hypothalamus, hippocampus, midline thalamus, nucleus accumbens, nucleus striae terminalis, preoptic areas, septum, and subcallosal gyrus (Dujardin & Jürgens, 2005, 2006). Lesioning these limbic structures abolishes or significantly reduces spontaneous vocal production, which can be rescued by direct stimulation of PAG (Jürgens & Pratt, 1979). These structures are therefore likely to initiate and modulate vocal production and communication, and are in turn likely modulated by activity elsewhere in frontal cortex (see Figure 16.1). This feedforward and hierarchical view is no doubt an oversimplification, not least because sensory feedback plays a crucial role at every step of vocal production: For example, if both vagal nerves are cut, disrupting somatosensory feedback from the lungs, then vocalization cannot be elicited by PAG stimulation (Nakazawa et al., 1997).
Although humans and nonhuman primates share these same basic anatomical and neural structures for signaling, some interesting differences are apparent. Humans produce more distinct sound elements than other primates, including distinct consonants and vowels, using specialized movements of lips, tongue, and respiratory muscles (Maclarnon & Hewitt, 2004). One theory holds that the uniquely‐human “descended larynx,” which enlarged the human vocal tract relative to other primates, increased the variety of speech formants which could be produced by subtle tongue movement (Fitch, 2000). Moreover, major neural differences between nonhuman primates and humans may explain the latter’s relatively greater vocal flexibility. In particular, while extensive lesions to Broca’s area homologues do not significantly alter vocal production and communication in macaques (Kirzinger & Jürgens, 1982), they lead in humans to significant speech disorders, including mutism (Trupe et al., 2013). Electrical stimulation of ACC systematically induces vocalization in monkeys (Jürgens & Ploog, 1970; Robinson, 1967; Smith, 1945), but stimulation of ACC has not produced reliable vocalization in humans (Pool, 1954; Pool & Ransohoff, 1949; Talairach et al., 1973), although there are reports of momentary speech arrest (Lewin & Whitty, 1960) and production of mirth and laughter (Caruana et al., 2015). Electrical stimulation of orofacial regions of human primary motor cortex consistently induces vocalization, but repeated attempts to do the same in monkeys have failed (Jürgens, 1974; Jürgens & Ploog, 1970; Robinson, 1967). These data suggest that the neocortex exerts stronger control over human speech than over other communicative signals shared across species, but the exact nature of this difference remains elusive. Even in nonhuman primates, it is likely that cortical centers contribute to the ability to modulate otherwise‐reflexive communicative signals based on learned contexts and contingencies.
The primates’ ability to produce diverse audiovisual signals necessitates machinery for processing these signals. This suggests the following questions: How do brains read communication signals? How are they processed at the neuronal level in the receiver’s brain? What features and feature conjunctions must they encode to do so?
16.3.3 Neural Mechanisms of Signal Perception
Far more is known about mechanisms for signal perception in primates than for signal production. Signal processing in primate brains begins in several pathways representing distinct sensory modalities—though we will see that these modalities are ultimately integrated. While reflexive responses (including orienting toward attention‐getters) are processed mainly through subcortical structures, flexible responses rely heavily on cortical processing.
16.3.3.1 Processing signals that influence attention
Attention‐getters indicate where a signal is sent from. They are processed rapidly by generic pathways, as illustrated by fast orientating to the location of an abrupt sound or movement such as a cleared throat or waved hand. This reflexive orienting behavior is computed in structures that map between sensory and motor reference frames (see Figure 16.2); the earliest include the inferior and superior colliculi (Jay & Sparks, 1984). Similarly, touch gestures, such as the “poke at” or “throw chips” gestures of chimpanzees, can be detected by somatosensory pathways and transformed into orienting coordinates, again by the superior colliculus (Groh & Sparks, 1996). These pathways are used in social signal processing, but are not specific to it. Interestingly, however, socially dedicated subcortical systems contribute to orienting responses (e.g., to faces and eyes), typically through the superior colliculus in concert with the pulvinar nucleus of the thalamus and the amygdala (Johnson, 2005; Sewards & Sewards, 2002; Vuilleumier, Armony, Driver, & Dolan, 2003).
While attention‐getters elicit fast reflexive behavior, they and other signals are concomitantly processed through relatively slow cortical pathways. Body, head, eye, and hand direction can trigger sophisticated attention shifts in the receiver. These flexible orienting behaviors require geometric computation of line‐of‐sight, and involve neuronal computation occurring in cortical networks. For gaze, this network includes the lateral intraparietal cortex (LIP), dorsal posterior inferotemporal cortex (PITd) and the posterior superior temporal sulcus (STS) of macaques (Freiwald & Tsao, 2010; Marciniak, Atabaki, Dicke, & Thier, 2014; Roy, Shepherd, & Platt, 2012; Shepherd et al., 2009) and their putative homologues, the intraparietal sulcus and pSTS of humans (Hoffman & Haxby, 2000; Pelphrey, Morris, & McCarthy, 2005). These networks both read the orientation of the signaler and compute the orienting behavior to be undertaken by the receiver, a seeming “mirror mechanism” which will be discussed in more depth in §16.3.3.4.
16.3.3.2 Subcortical processing of communicative signals
Neural processing of phylogenetically conserved signals frequently involves subcortical networks of brain areas (see Figure 16.2) (Sewards & Sewards, 2002). A large body of work has examined the involvement of the amygdala in automatic processing of fear and other emotions in human facial displays (Breiter et al., 1996; Costafreda, Brammer, David, & Fu, 2008; Morris et al., 1996), voices (Dolan, Morris, & de Gelder, 2001; Fecteau, Belin, Joanette, & Armony, 2007; Sander & Scheich, 2001), and body postures and movements (De Gelder, Snyder, Greve, Gerard, & Hadjikhani, 2004). Studies in rhesus monkeys suggest this pathway is broadly conserved (Gil‐da‐Costa et al., 2004; Hadj‐Bouziane et al., 2012; Hoffman, Gothard, Schmid, & Logothetis, 2007; Petkov et al., 2008), and have examined the computations performed by individual amygdala neurons (Gothard, Battaglia, Erickson, Spitler, & Amaral, 2007; Kuraoka & Nakamura, 2006, 2007; Leonard, Rolls, Wilson, & Baylis, 1985).
One of the least understood social signaling systems in primate brains is chemosensory. Accessory olfaction is defined as the chemoreceptive system that employs the vomeronasal organ (VNO) and its distinct central projections to the accessory olfactory bulb (AOB) and limbic/cortical systems. Vomeronasal function is maintained within the Strepsirrhines and tarsiers, reduced in Platyrrhines, and mostly absent in the Catarrhines including humans (Evans, 2006). However, even in humans, in whom the VNO is entirely vestigial (Wyatt, 2014), some hormone‐like scents cause sex‐differentiated hypothalamic activation (Savic, Berglund, Gulyas, & Roland, 2001). Similarly, in marmosets, sexually arousing odors of ovulating monkeys enhanced activity in the preoptic area and anterior hypothalamus (Ferris et al., 2001), striatum, septum, periaqueductal gray, and cerebellum (Ferris et al., 2004) compared to the odors of ovariectomized monkeys.
16.3.3.3 Cortical networks processing communicative signals
Cortical networks enable extensive computation based on learned patterns and remembered experiences. In primates, this includes flexible analysis of what is expressed by a face, body or vocalization and integration with its social context (who is expressing it and why). The cortical processing pathways operate largely in parallel with subcortical pathways (see Figure 16.2) and advance hierarchically from primary sensory areas toward increasingly complex and integrative neural representations in the anterior temporal lobe and prefrontal cortex (Rosa & Tweedale, 2005).
Studies in the late 1960s and early 1970s identified neurons that were preferentially active when macaque subjects viewed a face (Gross, Bender, & Rocha‐Miranda, 1969; Gross, Rocha‐Miranda, & Bender, 1972). These neurons were seemingly scattered throughout the temporal lobe, but are now known to cluster into a specialized cortical network for face perception (Bell et al., 2011; Tsao, Freiwald, Tootell, & Livingstone, 2006). This network has been imaged in humans (Haxby et al., 1994; Kanwisher, McDermott, & Chun, 1997; Kriegeskorte, Formisano, Sorger, & Goebel, 2007; McCarthy, Puce, Gore, & Allison, 1997; Puce, Allison, Gore, & McCarthy, 1995; Sergent, Ohta, & MacDonald, 1992; Tsao, Moeller, & Freiwald, 2008), chimpanzees (Parr, Hecht, Barks, Preuss, & Votaw, 2009), macaques (Ku, Logothetis, & Goense, 2011; Logothetis, Guggenberger, Peled, & Pauls, 1999; Moeller, Freiwald, & Tsao, 2008; Tsao, Freiwald, Knutsen, Mandeville, & Tootell, 2003) and marmosets (Hung et al., 2015). In all of these species, it runs through similar regions of the temporal lobe, suggesting that cortical specialization for face processing evolved prior to the split between New and Old World monkeys. However, significant differences exist: For instance, face‐processing areas discovered in the orbital and ventrolateral frontal cortex of macaques (Tsao, Schweers, Moeller, & Freiwald, 2008) are less evident in humans. The cortical areas devoted to faces are abutted and partially overlapped by patches specialized for bodies, as shown in both humans (Downing et al., 2001) and macaques (imaging: Pinsk, DeSimone, Moore, Gross, & Kastner, 2005; Pinsk et al., 2009; electrophysiology: Bell et al., 2011; Perrett, Smith, Mistlin et al., 1985; Popivanov, Jastorff, Vanduffel, & Vogels, 2014).
The voice‐processing pathway parallels that for face processing, running dorsally through the temporal lobe in humans (Belin, Zatorre, & Ahad, 2002; Belin, Zatorre, Lafaille, Ahad, & Pike, 2000; Binder et al., 2000; DéMonet, Jiang, Shuman, & Kanwisher, 1992), chimpanzees (Taglialatela, Russell, Schaeffer, & Hopkins, 2009), macaques (Ghazanfar & Rendall, 2008; Gil‐da‐Costa et al., 2004, 2006; Petkov et al., 2008; Poremba et al., 2004) and marmosets (Sadagopan, Temiz‐Karayol, & Voss, 2015). Voice‐selective “call detector” neurons were first recorded in squirrel and marmoset monkeys (Wang, 2000; Wang & Kadia, 2001; Wang, Merzenich, Beitel, & Schreiner, 1995). Voice‐selective neurons were next described in the macaque monkey (Kikuchi, Horwitz, & Mishkin, 2010; Rauschecker, Tian, & Hauser, 1995; Recanzone, 2008; Russ, Ackelson, Baker, & Cohen, 2008; Tian, Reser, Durham, Kustov, & Rauschecker, 2001), principally clustered within fMRI‐identified voice‐selective areas (Perrodin, Kayser, Logothetis, & Petkov, 2011) but also in the insula (Remedios, Logothetis, & Kayser, 2009a) and in prefrontal and orbitofrontal cortex (Cohen, Russ, Gifford, Kiringoda, & MacLean, 2004; Rolls, Critchley, Browning, & Inoue, 2006; Romanski & Goldman‐Rakic, 2002). These areas also contribute to the perception of non‐vocal auditory social signals: Macaque drumming (an auditory attention‐getter) activates both the amygdala and cortical “voice” areas (Remedios, Logothetis, & Kayser, 2009b).
Chemosensory and tactile cortical pathways have received far less exploration than those for vision and audition, but likely follow similar patterns for flexible odor and touch processing. Cortical processing of chemosensory signals occurs in piriform and insular cortex, extending into the anterior temporal and ventral frontal lobes (Gottfried, 2010), where these signals may interact with the auditory and visual social processing streams for more nuanced processing of behavioral relevance (Ferris et al., 2004; Pause, 2012). Cortical processing of somatosensory signals likely arises in somatosensory cortex in the parietal lobe, adjacent to and interacting with frontal motor cortex, while affective aspects of somatosensation appear to additionally involve representations in the posterior insula along with the orbitofrontal cortex, anterior insula and anterior cingulate (Gazzola et al., 2012; Gordon et al., 2011; Keysers, Kaas, & Gazzola, 2010).
16.3.3.3.1 Perceiving traits: identity and status.
How do neurons in these networks encode the essential social information that faces, bodies and voices carry about who is sending a signal (including identity, gender, age, dominance status, species, kinship, familiarity, sexual state, attractiveness, health state, etc.; see Figure 16.2)?
Neuronal responses to faces become less dependent on specific viewpoint anteriorly in the temporal lobe (De Souza, Eifuku, Tamura, Nishijo, & Ono, 2005; Freiwald & Tsao, 2010; Perrett, Smith, Potter et al., 1985), suggesting that these anterior areas represent categorical or individual identity (Kriegeskorte et al., 2007; Rotshtein, Henson, Treves, Driver, & Dolan, 2005; Sergent et al., 1992). Within these areas, physical identity appears to be encoded into a “face‐map” in a norm‐based rather than template‐based manner (Freiwald, Tsao, & Livingstone, 2009; Leopold, Bondar, & Giese, 2006), likely built up based on faces perceived through experience (Sugita, 2008). Accordingly, person familiarity is processed by anterior temporal parts of face‐selective cortex (Eifuku, De Souza, Nakata, Ono, & Tamura, 2011; Eifuku, Nakata, Sugimori, Ono, & Tamura, 2010; Sliwa, Planté, Duhamel, & Wirth, 2016; Sugiura et al., 2001) and the adjacent mnemonic hippocampal complex (Denkova, Botzung, & Manning, 2006; Leveroni et al., 2000; Sliwa et al., 2016). Voice areas, like face areas, appear to process identity in the most anterior regions of the cortical voice‐dedicated network in both humans (Andics, McQueen, Petersson, Gál, Rudas, & Vidnyánszky, 2010; Imaizumi et al., 1997; Nakamura et al., 2001) and macaques (Kikuchi et al., 2010; Perrodin et al., 2011; Petkov et al., 2008). Similarly, familiar voices activate the para‐hippocampal complex (Nakamura et al., 2001; Sliwa et al., 2014; von Kriegstein & Giraud, 2004). Processing of identity from body features has been less investigated, but since cortical body areas are located at similar positions in the visual processing hierarchy to the face areas, anterior body regions are expected to extract identity. Accordingly, the human fusiform body area has been found to process static more than dynamic aspects of body appearance (Vangeneugden, Peelen, Tadin, & Battelli, 2014).
Another property of the signaler that is processed by the receiver is the former’s attractiveness. Face attractiveness is associated in humans with orbitofrontal cortex, close to regions devoted to processing reward (Ishai, 2007; Kim, Adolphs, O’Doherty, & Shimojo, 2007; Nakamura et al., 1998). Bodily attractiveness likewise engages orbitofrontal cortex in humans (Sescousse, Redouté, & Dreher, 2010) and may modulate frontal‐lobe responses to bodies in other primates as well.