Hierarchical Encoding of Attended Auditory Objects in Multi-talker Speech Perception

Cited by: 61

Authors:

O'Sullivan, James [1]

Herrero, Jose [3,4]

Smith, Elliot [2,5]

Schevon, Catherine [2]

McKhann, Guy M. [2]

Sheth, Sameer A. [2,6]

Mehta, Ashesh D. [3,4]

Mesgarani, Nima [1]

Affiliations:

[1] Columbia Univ, Dept Elect Engn, New York, NY 10027 USA

[2] Neurol Inst, Dept Neurol Surg, 710 W 168Th St, New York, NY 10032 USA

[3] Hofstra Northwell Sch Med, Dept Neurosurg, Manhasset, NY USA

[4] Feinstein Inst Med Res, Manhasset, NY USA

[5] Univ Utah, Dept Neurosurg, Salt Lake City, UT USA

[6] Baylor Coll Med, Dept Neurosurg, Houston, TX 77030 USA

Source:

NEURON | 2019 / Volume 104 / Issue 06

Keywords:

SPECTROTEMPORAL RECEPTIVE-FIELDS; TASK-RELATED PLASTICITY; CORTICAL REPRESENTATION; COCKTAIL PARTY; COMPLEX SOUNDS; HUMAN CORE; CORTEX; ORGANIZATION; EMERGENCE; FEATURES;

DOI:

10.1016/j.neuron.2019.09.007

Chinese Library Classification number:

Q189 [Neuroscience];

Discipline classification code:

071006;

Abstract:

Humans can easily focus on one speaker in a multi-talker acoustic environment, but how different areas of the human auditory cortex (AC) represent the acoustic components of mixed speech is unknown. We obtained invasive recordings from the primary and nonprimary AC in neurosurgical patients as they listened to multi-talker speech. We found that neural sites in the primary AC responded to individual speakers in the mixture and were relatively unchanged by attention. In contrast, neural sites in the nonprimary AC were less discerning of individual speakers but selectively represented the attended speaker. Moreover, the encoding of the attended speaker in the nonprimary AC was invariant to the degree of acoustic overlap with the unattended speaker. Finally, this emergent representation of attended speech in the nonprimary AC was linearly predictable from the primary AC responses. Our results reveal the neural computations underlying the hierarchical formation of auditory objects in human AC during multi-talker speech perception.


Pages: 1195 / +

Number of pages: 18

