Hierarchical Encoding of Attended Auditory Objects in Multi-talker Speech Perception

Cited by: 61
Authors
O'Sullivan, James [1 ]
Herrero, Jose [3 ,4 ]
Smith, Elliot [2 ,5 ]
Schevon, Catherine [2 ]
McKhann, Guy M. [2 ]
Sheth, Sameer A. [2 ,6 ]
Mehta, Ashesh D. [3 ,4 ]
Mesgarani, Nima [1 ]
Affiliations
[1] Columbia Univ, Dept Elect Engn, New York, NY 10027 USA
[2] Neurol Inst, Dept Neurol Surg, 710 W 168Th St, New York, NY 10032 USA
[3] Hofstra Northwell Sch Med, Dept Neurosurg, Manhasset, NY USA
[4] Feinstein Inst Med Res, Manhasset, NY USA
[5] Univ Utah, Dept Neurosurg, Salt Lake City, UT USA
[6] Baylor Coll Med, Dept Neurosurg, Houston, TX 77030 USA
Keywords
Spectrotemporal receptive fields; task-related plasticity; cortical representation; cocktail party; complex sounds; human core; cortex; organization; emergence; features
DOI
10.1016/j.neuron.2019.09.007
Chinese Library Classification (CLC)
Q189 [Neuroscience]
Discipline code
071006
Abstract
Humans can easily focus on one speaker in a multi-talker acoustic environment, but how different areas of the human auditory cortex (AC) represent the acoustic components of mixed speech is unknown. We obtained invasive recordings from the primary and nonprimary AC in neurosurgical patients as they listened to multi-talker speech. We found that neural sites in the primary AC responded to individual speakers in the mixture and were relatively unchanged by attention. In contrast, neural sites in the nonprimary AC were less discerning of individual speakers but selectively represented the attended speaker. Moreover, the encoding of the attended speaker in the nonprimary AC was invariant to the degree of acoustic overlap with the unattended speaker. Finally, this emergent representation of attended speech in the nonprimary AC was linearly predictable from the primary AC responses. Our results reveal the neural computations underlying the hierarchical formation of auditory objects in human AC during multi-talker speech perception.
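
The abstract's final analysis, predicting nonprimary AC responses linearly from primary AC responses, can be illustrated as a ridge regression from one set of electrode time series to another, scored on held-out data. The sketch below is a minimal, hypothetical Python illustration on simulated data, not the authors' code or data; the electrode counts, the 0-7-sample lag window, the ridge penalty, and the simulated responses are all assumptions made for demonstration.

    import numpy as np

    rng = np.random.default_rng(0)

    def lagged(X, lags):
        # Stack time-lagged copies of X (time x channels) to use as features.
        T, C = X.shape
        out = np.zeros((T, C * len(lags)))
        for i, L in enumerate(lags):
            s = np.roll(X, L, axis=0)
            s[:L] = 0.0  # zero out samples that wrapped around the start
            out[:, i * C:(i + 1) * C] = s
        return out

    # Simulated "primary AC" responses (time x electrodes); purely illustrative.
    T, n_prim, n_nonprim = 4000, 20, 8
    primary = rng.standard_normal((T, n_prim))

    # Build "nonprimary AC" responses as a delayed linear readout of primary
    # activity plus noise, so a linear mapping exists by construction.
    W_true = rng.standard_normal((n_prim, n_nonprim)) / np.sqrt(n_prim)
    nonprimary = np.roll(primary, 3, axis=0) @ W_true
    nonprimary += 0.5 * rng.standard_normal((T, n_nonprim))

    # Ridge regression from lagged primary responses to nonprimary responses,
    # fit on the first half of the recording, tested on the second half.
    X = lagged(primary, lags=range(8))  # assumed 0-7 sample lag window
    Xtr, Xte = X[: T // 2], X[T // 2:]
    Ytr, Yte = nonprimary[: T // 2], nonprimary[T // 2:]
    lam = 1.0  # assumed ridge penalty
    W = np.linalg.solve(Xtr.T @ Xtr + lam * np.eye(X.shape[1]), Xtr.T @ Ytr)

    # Held-out prediction accuracy per simulated nonprimary site.
    pred = Xte @ W
    r = [np.corrcoef(pred[:, j], Yte[:, j])[0, 1] for j in range(n_nonprim)]
    print(f"mean held-out correlation: {np.mean(r):.2f}")

A high held-out correlation is guaranteed here by construction, since the simulated nonprimary signal is a delayed linear mixture of the primary signal; on real recordings, the same statistic quantifies how much of the nonprimary representation a linear readout of primary activity can explain.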
Pages: 1195+
Page count: 18
Related papers (50 in total)
  • [31] Surr, R.K.; Schwartz, D.M. Effects of multi-talker competing speech on the variability of the California Consonant Test. Ear and Hearing, 1980, 1(6): 319-323.
  • [32] Peng, J.X.; Zhang, H.H.; Wang, Z.Y. Chinese speech identification in multi-talker babble with diotic and dichotic listening. Chinese Science Bulletin, 2012, 57(20): 2548-2553.
  • [33] Răutu, I.S.; De Tiège, X.; Jousmäki, V.; Bourguignon, M.; Bertels, J. Speech-derived haptic stimulation enhances speech recognition in a multi-talker background. Scientific Reports, 2023, 13(1).
  • [35] Patel, P.; van der Heijden, K.; Bickel, S.; Herrero, J.L.; Mehta, A.D.; Mesgarani, N. Interaction of bottom-up and top-down neural mechanisms in spatial multi-talker speech perception. Current Biology, 2022, 32(18): 3971+.
  • [36] Qian, Y.; Chang, X.; Yu, D. Single-channel multi-talker speech recognition with permutation invariant training. Speech Communication, 2018, 104: 1-11.
  • [37] Yoshioka, T.; Erdogan, H.; Chen, Z.; Alleva, F. Multi-microphone neural speech separation for far-field multi-talker speech recognition. 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018: 5739-5743.
  • [38] Weng, C.; Yu, D.; Seltzer, M.L.; Droppo, J. Deep neural networks for single-channel multi-talker speech recognition. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2015, 23(10): 1670-1679.
  • [39] Rimmele, J.M.; Zion Golumbic, E.; Schröger, E.; Poeppel, D. The effects of selective attention and speech acoustics on neural speech-tracking in a multi-talker scene. Cortex, 2015, 68: 144-154.
  • [40] Raj, D.; Povey, D.; Khudanpur, S. SURT 2.0: Advances in transducer-based multi-talker speech recognition. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2023, 31: 3800-3813.