Sharpening of the sensory representation by attention in primary auditory cortex.
Cognitive Neuroscience and Schizophrenia Program, Nathan Kline Institute, Orangeburg, UNITED STATES
A recent study (O’Connell et al., 2011) showed that neuronal oscillations in the primary auditory cortex (A1) of macaques can be ‘phase-reset’ by inputs related to pure tones, and that when this happens, the resultant (reset) phase depends on pitch: if the pure tone frequency corresponds to the best frequency (BF) of a given A1 region, ongoing oscillations are reset to their high excitability phase, while in the case of non-BF tones, oscillations are reset to their low excitability phase. The goal of our present study was to determine what role, if any, these effects play in enhancing the sensory representation of attended auditory stimulus streams.
We analyzed laminar profiles of synaptic activity and action potentials recorded via linear array multielectrodes positioned in A1 of 2 monkeys performing an intermodal selective attention task. Attending to the auditory modality required them to detect a frequency deviant within a rhythmic stream of standard pure tones, while ignoring stimuli in the visual modality. The frequency of the standard tones was parametrically varied across blocks in half-octave steps between 0.3 and 32 kHz. Attending to the visual modality required them to fixate on a rhythmically flashing LED and detect a change in the color of the LED while ignoring the same streams of tones.
Similar to the findings of a prior study in primary visual cortex, we found that low frequency oscillatory activity was modulated so that it could entrain to the timing of stimuli in the attended stimulus stream. In the condition where auditory stimuli were attended, the phase of entrainment depended not only on timing but also on pitch: if the frequency of pure tones in the attended stream matched the BF of a given A1 site, oscillations entrained at their high excitability phase, while if the attended tone frequency was a non-BF, oscillations entrained at their low excitability phase. Consequently, responses to attended BF tones were amplified, while responses to attended non-BF tones were suppressed, and the combination of these effects led to a sharpening of frequency tuning. The effects of attention on the amplitude and phase of higher frequency oscillations were more complex, apparently reflecting phase-amplitude coupling between entrained delta-theta and gamma oscillations. These mechanisms outline a multidimensional filter role for ongoing oscillatory activity in A1 that enhances the sensory representation of attended auditory stimulus streams along both temporal and spectral feature dimensions.
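The reset-phase effects above are conventionally quantified with circular statistics. As an illustrative sketch only (not the study's analysis code), the circular mean phase and inter-trial phase coherence of a set of single-trial phases can be computed as:

```python
import cmath

def circular_stats(phases):
    """Return (mean phase in radians, inter-trial coherence in [0, 1])
    for a list of single-trial phase angles."""
    # Average the unit phasors; the angle of the resultant gives the
    # mean phase and its length measures phase consistency over trials.
    resultant = sum(cmath.exp(1j * p) for p in phases) / len(phases)
    return cmath.phase(resultant), abs(resultant)

# Tightly clustered phases near +pi/2, as after a reset to the
# high-excitability phase (made-up values)
mean_phase, itc = circular_stats([1.5, 1.6, 1.55, 1.62, 1.58])
```

An inter-trial coherence near 1 indicates consistent phase-resetting across trials, while uniformly scattered phases give a value near 0.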
A network analysis of speech perception in normals and aphasic stroke patients using dynamic causal modeling.
Teki S1,2, Barnes GR2, Penny W2, Griffiths TD1,2, and Leff AP3.
1 Newcastle Auditory Group, Medical School, Newcastle University, Newcastle-upon-Tyne, UK.
2 Wellcome Trust Centre for Neuroimaging, Institute of Neurology, University College London, UK.
3 Institute of Cognitive Neuroscience, University College London, UK.
Objective: Perception of speech and language is an important biological function that is mediated by a distributed network of primary (A1) and secondary (STG) auditory cortices that can be probed using mismatch paradigms (Schofield et al., 2009). We wished to investigate how network connectivity differed between normal subjects (n=17) and patients with chronic auditory perceptual deficits of speech caused by stroke (n=25), using a vowel mismatch paradigm and magnetoencephalography (MEG).
Stimuli: We used MEG and a standard paradigm to elicit mismatch responses (Näätänen, 1993) to a series of four vowels. The standard was a consonant-vowel-consonant word /bart/. The three deviants were created by varying the frequencies of the first and second formants of the vowel to produce: (D1) an acoustically different but within-class deviant /baart/; and two other deviants which were perceived as being in a different vowel category from the standard and D1: (D2) /burt/, and (D3) /beat/. We predicted that the phonemic deviants, D2 and D3, would elicit greater mismatch responses than the acoustic deviant, D1.
Methods: Participants passively listened to the stimuli in an MEG scanner with 274 channels and third-order gradiometers (CTF Systems). The mismatch response typically occurs between 150 and 250 ms after stimulus onset and is considered an index of automatic change detection. The mismatch fields were fitted with a four dipole model with bilateral sources in A1 and STG using a variational-Bayesian equivalent current dipole algorithm (Kiebel et al., 2008).
MEG results: A single best source from each hemisphere was selected in a data-driven manner for the normal subjects. Mismatch responses from the fitted sources in both controls and aphasics showed significant main effects of deviancy and hemisphere, such that the amplitude of the response followed D3 > D2 > D1 and responses from the left hemisphere source were significantly greater than the corresponding responses from the right hemisphere source. Notably, the mismatch responses in aphasics were as robust as those in the control participants. The latency of the mismatch response was significantly greater for D1 than for D2 and D3 in both hemispheres in both participant groups.
DCM results: To investigate modulation of connections of the hierarchical speech network as a function of phonemic deviancy (i.e., D2 and D3 vs. D1), we constructed 255 dynamic causal models (Friston et al., 2003) for each participant based on an inter-connected network of the four sources in bilateral primary auditory cortex and posterior STG. The resultant average Bayesian models in controls revealed a significant positive modulation of the self-connections of left A1 and left STG, while the average Bayesian model in the aphasics showed a significant positive modulation of the self-connection of right A1, which may reflect adaptation and increased sensitivity to speech input in the right hemisphere at a lower level of the hierarchy. Furthermore, the aphasics also showed increased modulation of the forward connections from A1 to STG in both hemispheres.
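For readers unfamiliar with the method, the neuronal model underlying DCM (Friston et al., 2003) is a bilinear state equation, dx/dt = (A + u*B)x + C*u, where A holds the fixed connections, B the modulation of those connections by an experimental input u (here, phonemic deviancy), and C the driving input. The following minimal simulation uses illustrative parameters, not the fitted models reported here:

```python
import numpy as np

# Two-region toy network: region 0 = A1, region 1 = STG
A = np.array([[-1.0, 0.0],    # fixed connections: self-inhibition on the
              [ 0.5, -1.0]])  # diagonal, forward A1 -> STG of 0.5
B = np.array([[0.0, 0.0],     # the deviancy input strengthens the
              [0.4, 0.0]])    # forward connection by 0.4
C = np.array([1.0, 0.0])      # auditory input drives A1 only

def simulate(u_mod, t_max=5.0, dt=0.01):
    """Euler integration of dx/dt = (A + u_mod * B) x + C * u_drive,
    with a constant driving input u_drive = 1."""
    x = np.zeros(2)
    for _ in range(int(t_max / dt)):
        x = x + dt * ((A + u_mod * B) @ x + C)
    return x

x_standard = simulate(u_mod=0.0)  # modulatory input off
x_deviant = simulate(u_mod=1.0)   # modulatory input on
```

In this toy example, switching the modulatory input on increases the steady-state STG response, illustrating how a positive modulation of a forward connection amplifies the influence of the lower source on the higher one.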
Conclusions: The DCM results are consistent with a predictive coding framework (Friston, 2010) in which forward connections carry bottom-up prediction errors from lower to higher levels of a hierarchical perceptual network. The aphasics’ speech network is characterized by impaired left hemisphere processing and greater prediction error, which may underlie the neural bases of their impaired representation and perception of phonemes.
Friston KJ, Harrison L, Penny W (2003) Dynamic causal modelling. Neuroimage 19:1273-1302.
Friston K (2010) The free-energy principle: a unified brain theory? Nat Rev Neurosci 11:127-138.
Kiebel SJ, Daunizeau J, Phillips C, Friston KJ (2008) Variational Bayesian inversion of the equivalent current dipole model in EEG/MEG. Neuroimage 39:728-741.
Näätänen R, Jacobsen T, Winkler I (2005) Memory-based or afferent processes in mismatch negativity (MMN): a review of the evidence. Psychophysiology 42:25-32.
Schofield TM, Iverson P, Kiebel SJ, Stephan KE, Kilner JM, Friston KJ, Crinion JT, Price CJ, Leff AP (2009) Changing meaning causes coupling changes within higher levels of the cortical hierarchy. Proc Natl Acad Sci USA 106:11765-11770.
BOLD response patterns in right superior temporal sulcus predict category of musical stimuli
The Centre for Interdisciplinary Research in Music Media and Technology, McGill University, Montreal, CANADA
Categorical Perception (CP) is a phenomenon in which non-identical physical stimuli that have the same underlying perceptual meaning become invariantly represented by the brain. Psychophysical tasks have demonstrated CP to occur in the auditory domain, most famously in speech categorization, but also in the perception of musical categories such as intervals (e.g. minor, major). Various speech phoneme perception fMRI studies have implicated loci of activity in superior temporal cortex, most notably in the left superior temporal sulcus (STS), with the left bias arguably due to hemispheric dominance in language processing. An earlier fMRI study of ours (Klein & Zatorre, 2011) contrasted sounds along a minor/major continuum, as well as a category-free control spectrum, and highlighted regions of activity in the right STS, along with the left intraparietal sulcus (though with some bilateral activity).
This first analysis was unable to compare directly the neural responses underlying percepts of different categories (e.g. minor and major chords), as these sounds did not (and were not expected to) evoke differential activity levels in the brain. In the current study, to look more directly at the neural underpinnings of musical CP, we employed multivariate pattern analysis (MVPA). MVPA, which works by ‘decoding’ patterns of information in a signal, does not require a conventional control condition, instead comparing directly between the neural patterns evoked by two or more conditions of interest, even in the absence of local activity level differences. We used an implementation of MVPA called ‘searchlight,’ in which classification is performed on small, spatially-constrained spheres of voxels, repeatedly throughout the cortex.
Our fMRI study enrolled musically-trained subjects who scored highly in a behavioral pretest measuring the degree of musical CP they exhibited. We implemented a slow event-related sparse temporal sampling protocol combined with a passive listening paradigm. In each trial, subjects heard a specific melodic musical interval (minor, major or perfect intervals with base notes of middle C, C# or D). As only one interval type was presented per trial, each individual T2* volume could be assigned to a particular category label used to train a classifier. Searchlight analyses were performed in each subject’s native space using spheres of approximately 125 voxels, covering cortex and surrounding tissue. First-level analyses were performed on unsmoothed data, as spatial averaging, considered necessary for univariate analyses, removes fine patterns that are required for accurate classification. After first-level MVPA, each subject’s resulting “information map” was spatially normalized, smoothed, and input to SPM8, which was used to perform a group analysis. Major peaks were observed in the right STS and the bilateral IPS.
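The searchlight logic can be sketched with a toy leave-one-out classifier; the sphere size matches the ~125 voxels described above, but the data, classifier, and accuracy scoring here are illustrative stand-ins for the actual pipeline:

```python
import numpy as np

rng = np.random.default_rng(0)

def sphere_accuracy(patterns, labels):
    """Leave-one-out nearest-centroid accuracy for one searchlight
    sphere. patterns: (n_trials, n_voxels); labels: 0/1 per trial."""
    correct = 0
    for i in range(len(labels)):
        train = np.ones(len(labels), dtype=bool)
        train[i] = False  # hold out trial i
        c0 = patterns[train & (labels == 0)].mean(axis=0)
        c1 = patterns[train & (labels == 1)].mean(axis=0)
        pred = int(np.linalg.norm(patterns[i] - c1) <
                   np.linalg.norm(patterns[i] - c0))
        correct += int(pred == labels[i])
    return correct / len(labels)

# Synthetic sphere of ~125 voxels: 'minor' vs 'major' trials carrying a
# reliable pattern difference on top of noise (made-up data)
n_trials, n_voxels = 40, 125
labels = np.array([0, 1] * (n_trials // 2))
signal = rng.normal(0.0, 1.0, n_voxels)
patterns = rng.normal(0.0, 1.0, (n_trials, n_voxels)) + np.outer(labels, signal)
acc = sphere_accuracy(patterns, labels)
```

In a full searchlight analysis, the per-sphere score is written back to the sphere's center voxel, yielding the subject-level "information map" that is then normalized, smoothed, and taken to the group analysis.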
There was an obvious general correspondence between these MVPA results and the univariate peaks highlighted in our earlier study (in both the STS and IPS). Because the searchlights spanned the entire cortex, our analyses were not restricted to these pre-defined regions, which underscores their likely involvement above and beyond that of other cortical regions. Taken together with our earlier findings, we believe that the right STS and bilateral IPS are heavily involved in CP of musical sounds, as these regions are more activated for categorically-perceived musical stimuli than for acoustically-matched control sounds, and contain distinct information patterns for specific sound categories. While most auditory CP studies examine only speech (and have primarily found loci of activity in left superior temporal cortex), our research suggests the presence of a network that does not rely as critically upon left superior temporal circuitry and is recruited for music-specific categorical processing.
The STS is thought to be part of the auditory system’s “ventral stream” and may perform signal processing that is of a “higher-order” nature than that which occurs in the superior temporal plane (including primary auditory cortex). Thus, the right STS could, in theory, be the location of musical interval category maps that are represented by population coding of neurons in the region. Specific categorical outputs from this region could then be passed on to even higher-order areas, putatively involved in processing of melody, key, musical grammar, etc. Separately, the right (and to a lesser extent, the left) IPS has been implicated in certain relative pitch judgements, including melody transposition (see Foster & Zatorre, 2010). Thus, the IPS may play a role in determining the positioning/ordering of certain sounds (e.g. major intervals use notes exactly one semi-tone further along the 12-tone scale than minor intervals). Such processing jibes with literature linking the IPS to mathematical calculations and spatial judgements. The temporal specificity limits of fMRI do not currently allow us to determine if the STS and IPS are working serially, in parallel, or as distinct processing networks.
Adaptive reweighting of spectral cues to sound location following developmental hearing loss in one ear.
Peter Keating, Johannes C. Dahmen, and Andrew J. King
University of Oxford, Oxford, UNITED KINGDOM
Although sensory experience is critical for shaping and refining the way in which neural systems develop, the mechanisms that enable them to adapt to their environment are not yet fully understood. In the auditory system, valuable insights into these mechanisms have been gained by inducing a reversible unilateral hearing loss during development, a manipulation that can produce dramatic changes in the cues used for sound localization. Whereas studies in barn owls that have been raised with one ear occluded suggest that the auditory system learns to use abnormal auditory spatial cues in order to localize sound, analogous studies in mammals point toward ‘amblyaudio’, a condition in which inputs from the affected ear are ignored. Consistent with this view, recent studies of adult plasticity in mammals highlight the importance of spectral cues to sound location, suggesting that adaptation to a unilateral hearing loss may involve learning to rely more on the unaltered spectral cues available to the intact ear. It is unclear, however, how the neural processing of spectral cues is affected by unilateral hearing loss during development.
In this study, we investigated the behavioural and physiological consequences of rearing ferrets with an earplug in the left ear. By measuring the ability of the animals to localize broadband stimuli with randomized spectra, we found that their behavioural responses displayed a dependence on high-frequency features of the source spectra that mirrored the spectral cues available to the intact ear. A similar dependence was not observed in ferrets raised and tested with normal hearing.
To investigate the neural mechanisms involved, we made bilateral recordings from the primary auditory cortex (A1) of juvenile-plugged and normally-reared ferrets and presented two sets of virtual acoustic space (VAS) stimuli, one of which incorporated a normal set of sound localization cues, while the other recreated a virtual earplug in the left ear. Mutual information analyses were used to measure sensitivity to the spectral cues provided to the left and right ears, with these cues varied concurrently and independently. In each hemifield, we found that sensitivity to left-ear spectral cues was significantly degraded by a virtual earplug in all animals, with juvenile-plugged animals showing significantly worse sensitivity than normally-reared animals overall. In the case of the right-ear spectral cues, no significant effects of group or VAS condition were found for the right hemifield. In the left hemifield, however, the effect of a virtual earplug differed significantly between juvenile-plugged and normally-reared animals, and produced a much greater reduction in sensitivity to right-ear spectral cues in normally-reared animals. Overall, we therefore found that a unilateral virtual earplug produces an immediate degradation of spectral sensitivity in normally-reared animals, with juvenile-plugged animals showing a further degradation of sensitivity to left-ear spectral cues relative to the normally-reared controls whilst exhibiting representations of right-ear spectral cues that are robust with respect to a virtual unilateral hearing loss. These effects combine in juvenile-plugged ferrets to increase the relative weight given to the right-ear spectral cues by A1 neurons following the introduction of a virtual earplug in the left ear, thereby providing an explanation for the increased behavioural sensitivity of the animals to these cues.
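The mutual information analyses rest on the standard definition I(S;R) = sum over s,r of p(s,r) log2[p(s,r) / (p(s)p(r))]. A minimal sketch with made-up discrete stimulus and response labels (not the recorded data):

```python
import math
from collections import Counter

def mutual_information(stimuli, responses):
    """I(S;R) in bits from paired lists of discrete stimulus and
    response labels, estimated from the joint histogram."""
    n = len(stimuli)
    p_s = Counter(stimuli)                 # marginal stimulus counts
    p_r = Counter(responses)               # marginal response counts
    p_sr = Counter(zip(stimuli, responses))  # joint counts
    mi = 0.0
    for (s, r), c in p_sr.items():
        # p(s,r) * log2( p(s,r) / (p(s) p(r)) ), written with counts
        mi += (c / n) * math.log2(c * n / (p_s[s] * p_r[r]))
    return mi

# Perfectly informative response: 1 bit for two equiprobable stimuli
stimuli = [0, 1, 0, 1, 0, 1, 0, 1]
mi_intact = mutual_information(stimuli, [0, 1, 0, 1, 0, 1, 0, 1])

# Response unrelated to the stimulus, as with a degraded spectral cue
mi_degraded = mutual_information(stimuli, [0, 0, 1, 1, 0, 0, 1, 1])
```

A response that tracks the cue carries the full 1 bit available from two equiprobable stimuli, while a response that ignores it carries none, mirroring the degraded spectral sensitivity measured under the virtual earplug.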
Taken together, these results suggest that animals raised with a unilateral hearing loss learn to rely more on the unaltered spectral cues available to the intact ear, and to ignore the unreliable spectral cues provided by the affected ear, a finding consistent with theories of cue integration in which cue weights are determined by the relative reliability of each cue.
Organization of phonetic feature representation in the human superior temporal gyrus
University of California, San Francisco, San Francisco, UNITED STATES
Higher-order auditory cortical regions extract relevant features from auditory input to form abstract phonemic representations. The nature of the acoustic features represented in the posterior STG is incompletely understood. Here, we used electrocorticographic (ECoG) recordings from the human superior temporal gyrus (STG) during allophonic speech perception of consonant-vowel (CV) syllables to determine the organization of neural responses to distinct phonetic information. We found a clear spatiotemporal representation of phonetic information in the STG. Neural responses reveal a highly conserved, organized structure of phonetic features in the STG. Decoding of single-trial cortical activity correlates well with the ability of humans to discriminate the same phonemes, suggesting that human perceptual accuracy can be accounted for by neural response properties. These results offer new insights into higher-order speech processing at a local mechanistic scale.
A Cognitive Framework for Subcortical Auditory Processing Enhancements in Musicians
Dana Strait1, Nina Kraus1
Northwestern University, Evanston, UNITED STATES1
Auditory training in the form of music sharpens auditory perceptual acuity and strengthens speech-evoked auditory brainstem activity; cognitive contributions to these sensory enhancements have been proposed but are not yet established. Here, we asked whether musical training confers advantages in the subcortical differentiation of the closely-related stop consonants /ba/ and /ga/ using a cross-sectional developmental design in participants ages 3-5, 8-13, and 18-35. We hypothesized that musical training enhances the subcortical differentiation of speech by means of strengthened cognitive control over auditory processing. Specifically, we anticipated that auditory cognitive differences between musicians and nonmusicians would developmentally precede subcortical response distinctions. Accordingly, the youngest musician children would demonstrate auditory cognitive but not subcortical enhancements for sound processing. By employing cross-phase analyses to objectively measure the degree to which subcortical response timing differs to the closely-related speech syllables /ba/ and /ga/, we reveal that older child and adult musicians have enhanced subcortical differentiation of stop consonants. Musicians in all three age groups demonstrate strengthened auditory cognitive performance compared to nonmusicians. Correlations among auditory cognitive abilities and the extent of subcortical stop consonant discrimination are interpreted according to a corticofugal framework for auditory learning in musicians, in which sensory processing enhancements in the brainstem are driven in a top-down manner by strengthened cognitive control over basic auditory response properties. Outcomes are presented in the context of previous results from our group that support cognitive contributions to subcortical auditory function in humans with extensive auditory training backgrounds.
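Cross-phase analysis compares the phase spectra of the responses evoked by the two syllables, with larger phase differences indicating greater timing differentiation. The following naive DFT sketch with synthetic sinusoids is illustrative only and is not the published cross-phaseogram implementation:

```python
import cmath
import math

def phase_difference(resp_a, resp_b, fs):
    """Phase difference (radians) between two equal-length responses
    at each positive DFT frequency, computed with a naive DFT."""
    n = len(resp_a)
    diffs = []
    for k in range(1, n // 2):
        xa = sum(resp_a[t] * cmath.exp(-2j * math.pi * k * t / n)
                 for t in range(n))
        xb = sum(resp_b[t] * cmath.exp(-2j * math.pi * k * t / n)
                 for t in range(n))
        diffs.append((k * fs / n, cmath.phase(xb / xa)))
    return diffs

# Two toy 'responses': identical sinusoids, the second lagging by
# 0.5 rad (made-up signals, not brainstem recordings)
fs, n, f0 = 1000, 200, 100
a = [math.sin(2 * math.pi * f0 * t / fs) for t in range(n)]
b = [math.sin(2 * math.pi * f0 * t / fs - 0.5) for t in range(n)]
lag_at_f0 = dict(phase_difference(a, b, fs))[100.0]
```

Here the recovered phase lag at the stimulus frequency matches the 0.5 rad offset built into the second signal; applied to brainstem responses, such frequency-resolved phase differences index how sharply the responses to /ba/ and /ga/ are differentiated in time.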
Given that musical training relies on learning to associate slight acoustic discrepancies with behavioral significance and is dependent on memory, attention and emotional engagement, musical training may provide a remarkable avenue for inducing auditory plasticity in humans, especially during developmental years.
Funded by the National Institutes of Health grant F31DC011457-01 to D.S. and the National Science Foundation grant 0921275 to N.K.