Selective cortical representation of attended speaker in multi-talker speech perception

Cited by: 602
Authors
Mesgarani, Nima [1 ,2 ]
Chang, Edward F. [1 ,2 ]
Affiliations
[1] University of California, San Francisco, UCSF Center for Integrative Neuroscience, Department of Neurological Surgery, San Francisco, CA 94143, USA
[2] University of California, San Francisco, UCSF Center for Integrative Neuroscience, Department of Physiology, San Francisco, CA 94143, USA
Funding
U.S. National Institutes of Health
DOI
10.1038/nature11020
Chinese Library Classification
O (Mathematical Sciences and Chemistry); P (Astronomy and Earth Sciences); Q (Biological Sciences); N (General Natural Sciences)
Discipline classification codes
07; 0710; 09
Abstract
Humans possess a remarkable ability to attend to a single speaker's voice in a multi-talker background [1-3]. How the auditory system manages to extract intelligible speech under such acoustically complex and adverse listening conditions is not known, and, indeed, it is not clear how attended speech is internally represented [4,5]. Here, using multi-electrode surface recordings from the cortex of subjects engaged in a listening task with two simultaneous speakers, we demonstrate that population responses in non-primary human auditory cortex encode critical features of attended speech: speech spectrograms reconstructed based on cortical responses to the mixture of speakers reveal the salient spectral and temporal features of the attended speaker, as if subjects were listening to that speaker alone. A simple classifier trained solely on examples of single speakers can decode both attended words and speaker identity. We find that task performance is well predicted by a rapid increase in attention-modulated neural selectivity across both single-electrode and population-level cortical responses. These findings demonstrate that the cortical representation of speech does not merely reflect the external acoustic environment, but instead gives rise to the perceptual aspects relevant for the listener's intended goal.
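The decoding approach the abstract describes can be illustrated concretely. The following is a minimal Python sketch, not the authors' code: it assumes ridge-regression stimulus reconstruction from time-lagged multi-electrode responses, with attention then decoded by correlating the reconstruction against each speaker's clean spectrogram. All names, array shapes, and the synthetic data are illustrative assumptions.

# Minimal sketch (illustrative only): linear stimulus reconstruction from
# cortical population responses, plus a correlation-based attention decoder.
import numpy as np

rng = np.random.default_rng(0)

# --- synthetic stand-ins for real data ---------------------------------
n_time, n_elec, n_freq = 2000, 64, 32      # samples, electrodes, spectrogram bands
spec_a = rng.random((n_time, n_freq))      # clean spectrogram, speaker A (attended)
spec_b = rng.random((n_time, n_freq))      # clean spectrogram, speaker B (ignored)
mix_resp = rng.random((n_time, n_elec))    # cortical responses to the speech mixture

def lagged(X, lags):
    """Stack time-lagged copies of X so the decoder sees temporal context
    (circular wrap at the edges is ignored for brevity)."""
    return np.hstack([np.roll(X, lag, axis=0) for lag in lags])

lags = range(10)                           # ~100 ms of context at a 100 Hz frame rate
R = lagged(mix_resp, lags)                 # shape: (n_time, n_elec * n_lags)

def ridge_fit(R, S, alpha=1.0):
    """Closed-form ridge regression: W = (R^T R + alpha*I)^-1 R^T S."""
    G = R.T @ R + alpha * np.eye(R.shape[1])
    return np.linalg.solve(G, R.T @ S)

# Fit the decoder with the attended speaker's spectrogram as the target.
W = ridge_fit(R, spec_a)
recon = R @ W                              # reconstructed spectrogram, (n_time, n_freq)

def corr(x, y):
    """Pearson correlation between two flattened spectrograms."""
    return np.corrcoef(x.ravel(), y.ravel())[0, 1]

# Decode attention: the reconstruction should correlate more strongly with
# the attended speaker's clean spectrogram than with the ignored one.
r_a, r_b = corr(recon, spec_a), corr(recon, spec_b)
attended = "A" if r_a > r_b else "B"
print(f"r(A)={r_a:.3f}  r(B)={r_b:.3f}  -> decoded attended speaker: {attended}")

In this scheme, a reconstruction that tracks the attended speaker's spectro-temporal features, as the paper reports for non-primary auditory cortex, yields the higher correlation with that speaker's clean spectrogram and thus identifies the attentional target.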
Pages: 233-236
Number of pages: 5
Related papers (50 records in total; first 10 shown below)
  • [1] Selective cortical representation of attended speaker in multi-talker speech perception
    Mesgarani, Nima; Chang, Edward F.
    Nature, 2012, 485: 233-236
  • [2] Hierarchical Encoding of Attended Auditory Objects in Multi-talker Speech Perception
    O'Sullivan, James; Herrero, Jose; Smith, Elliot; Schevon, Catherine; McKhann, Guy M.; Sheth, Sameer A.; Mehta, Ashesh D.; Mesgarani, Nima
    Neuron, 2019, 104(6): 1195-+
  • [3] Target Speaker Verification With Selective Auditory Attention for Single and Multi-Talker Speech
    Xu, Chenglin; Rao, Wei; Wu, Jibin; Li, Haizhou
    IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2021, 29: 2696-2709
  • [4] Streaming Multi-talker Speech Recognition with Joint Speaker Identification
    Lu, Liang; Kanda, Naoyuki; Li, Jinyu; Gong, Yifan
    Interspeech 2021: 1782-1786
  • [5] Multi-Channel Speaker Verification for Single and Multi-talker Speech
    Kataria, Saurabh; Zhang, Shi-Xiong; Yu, Dong
    Interspeech 2021: 4608-4612
  • [6] Speaker Identification in Multi-Talker Overlapping Speech Using Neural Networks
    Tran, Van-Thuan; Tsai, Wei-Ho
    IEEE Access, 2020, 8: 134868-134879
  • [7] Target Speaker Extraction for Multi-Talker Speaker Verification
    Rao, Wei; Xu, Chenglin; Chng, Eng Siong; Li, Haizhou
    Interspeech 2019: 1273-1277
  • [8] Selective spatial attention in lateralized multi-talker speech perception: EEG correlates and the role of age
    Getzmann, Stephan; Schneider, Daniel; Wascher, Edmund
    Neurobiology of Aging, 2023, 126: 1-13
  • [9] Auditory spatial cuing for speech perception in a dynamic multi-talker environment
    Tomoriova, Beata; Kopco, Norbert
    2008 6th International Symposium on Applied Machine Intelligence and Informatics: 230-233
  • [10] Which Ones Are Speaking? Speaker-inferred Model for Multi-talker Speech Separation
    Shi, Jing; Xu, Jiaming; Xu, Bo
    Interspeech 2019: 4609-4613