The integration of continuous audio and visual speech in a cocktail-party environment depends on attention

Cited by: 4
Authors
Ahmed, Farhin [1 ,2 ]
Nidiffer, Aaron R. [1 ,2 ]
O'Sullivan, Aisling E. [1 ,2 ,3 ,4 ]
Zuk, Nathaniel J. [5 ]
Lalor, Edmund C. [1 ,2 ,3 ,4 ]
Affiliations
[1] Univ Rochester, Dept Biomed Engn, Dept Neurosci, Rochester, NY 14627 USA
[2] Univ Rochester, Del Monte Inst Neurosci, Rochester, NY 14627 USA
[3] Trinity Coll Dublin, Trinity Ctr Biomed Engn, Sch Engn, Dublin 2, Ireland
[4] Trinity Coll Dublin, Trinity Coll Inst Neurosci, Dublin 2, Ireland
[5] Hebrew Univ Jerusalem, Edmond & Lily Safra Ctr Brain Sci, Jerusalem, Israel
Funding
Science Foundation Ireland;
Keywords
Multisensory integration; Speech; Cocktail party; Hierarchical processing; AUDIOVISUAL SPEECH; MULTISENSORY INTEGRATION; SELECTIVE ATTENTION; AUDITORY-CORTEX; INFORMATION; DIRECTION; TRACKING; OBJECTS; HEAR;
DOI
10.1016/j.neuroimage.2023.120143
CLC Classification
Q189 [Neuroscience];
Subject Classification Code
071006;
Abstract
In noisy environments, our ability to understand speech benefits greatly from seeing the speaker's face. This is attributed to the brain's ability to integrate audio and visual information, a process known as multisensory integration. In addition, selective attention plays an enormous role in what we understand, the so-called cocktail-party phenomenon. But how attention and multisensory integration interact remains incompletely understood, particularly in the case of natural, continuous speech. Here, we addressed this issue by analyzing EEG data recorded from participants who undertook a multisensory cocktail-party task using natural speech. To assess multisensory integration, we modeled the EEG responses to the speech in two ways. The first assumed that audiovisual speech processing is simply a linear combination of audio speech processing and visual speech processing (i.e., an A + V model), while the second allowed for the possibility of audiovisual interactions (i.e., an AV model). Applying these models to the data revealed that EEG responses to attended audiovisual speech were better explained by an AV model, providing evidence for multisensory integration. In contrast, unattended audiovisual speech responses were best captured using an A + V model, suggesting that multisensory integration is suppressed for unattended speech. Follow-up analyses revealed some limited evidence for early multisensory integration of unattended AV speech, with no integration occurring at later levels of processing. We take these findings as evidence that the integration of natural audio and visual speech occurs at multiple levels of processing in the brain, each of which can be differentially affected by attention.
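To make the A + V versus AV comparison concrete, the sketch below shows one common way such a model comparison can be run: ridge regression over time-lagged stimulus features, scoring each model by how well it predicts held-out EEG. This is an illustrative sketch only, not the authors' pipeline; the synthetic single-channel data, the feature names (acoustic envelope, lip motion), and the regularization value are assumptions, and the published A + V framework typically combines unisensory response models rather than fitting both on the same recording, as done here for brevity.

```python
# Illustrative sketch: additive (A + V) vs. joint (AV) stimulus-response models.
# All data below are synthetic placeholders; variable names are assumptions.
import numpy as np
from sklearn.linear_model import Ridge
from scipy.stats import pearsonr

fs = 64                              # assumed EEG sampling rate (Hz)
n_samples = fs * 60                  # one minute of data
lags = np.arange(0, int(0.4 * fs))   # 0-400 ms stimulus-response lags

def lagged(x, lags):
    """Build a [time x lags] design matrix from a 1-D stimulus feature."""
    X = np.zeros((len(x), len(lags)))
    for i, lag in enumerate(lags):
        X[lag:, i] = x[:len(x) - lag]
    return X

rng = np.random.default_rng(0)
audio_env = rng.standard_normal(n_samples)       # e.g., acoustic envelope
visual_motion = rng.standard_normal(n_samples)   # e.g., lip-movement signal
eeg = rng.standard_normal(n_samples)             # single EEG channel (synthetic)

X_a, X_v = lagged(audio_env, lags), lagged(visual_motion, lags)
half = n_samples // 2
train, test = slice(0, half), slice(half, None)

# A + V model: audio-only and visual-only models fit separately, predictions summed.
pred_additive = (Ridge(alpha=1.0).fit(X_a[train], eeg[train]).predict(X_a[test]) +
                 Ridge(alpha=1.0).fit(X_v[train], eeg[train]).predict(X_v[test]))

# AV model: one model fit on the combined audiovisual features, which can
# capture audiovisual interactions that the additive model cannot.
X_av = np.hstack([X_a, X_v])
pred_av = Ridge(alpha=1.0).fit(X_av[train], eeg[train]).predict(X_av[test])

# If the AV model reliably predicts held-out EEG better than the A + V model,
# that is taken as evidence of multisensory integration.
r_additive = pearsonr(eeg[test], pred_additive)[0]
r_av = pearsonr(eeg[test], pred_av)[0]
print(f"A + V model r = {r_additive:.3f}, AV model r = {r_av:.3f}")
```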
Pages: 13
Related Papers
50 records in total
  • [1] Speaker-Targeted Audio-Visual Models for Speech Recognition in Cocktail-Party Environments
    Chao, Guan-Lin
    Chan, William
    Lane, Ian
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 2120 - 2124
  • [2] Electrophysiological attention effects in a virtual cocktail-party setting
    Muente, Thomas F.
    Spring, Doerte K.
    Szycik, Gregor R.
    Noesselt, Toemme
    BRAIN RESEARCH, 2010, 1307 : 78 - 88
  • [3] Using visual speech at the cocktail-party: CNV evidence for early speech extraction in younger and older adults
    Begau, Alexandra
    Arnau, Stefan
    Klatt, Laura-Isabelle
    Wascher, Edmund
    Getzmann, Stephan
    HEARING RESEARCH, 2022, 426
  • [4] Automatic speech recognition in cocktail-party situations: A specific training for separated speech
    Marti, Amparo
    Cobos, Maximo
    Lopez, Jose J.
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2012, 131 (02): : 1529 - 1535
  • [5] Audio-Visual Multi-Talker Speech Recognition in A Cocktail Party
    Wu, Yifei
Li, Chenda
    Yang, Song
    Wu, Zhongqin
    Qian, Yanmin
    INTERSPEECH 2021, 2021, : 3021 - 3025
  • [6] Speaking rhythmically improves speech recognition under "cocktail-party" conditions
    Wang, Mengyuan
    Kong, Lingzhi
    Zhang, Changxin
    Wu, Xihong
    Li, Liang
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2018, 143 (04): : EL255 - EL259
  • [7] Speech recognition by bilateral cochlear implant users in a cocktail-party setting
    Loizou, Philipos C.
    Hu, Yi
    Litovsky, Ruth
    Yu, Gongqiang
    Peters, Robert
    Lake, Jennifer
    Roland, Peter
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2009, 125 (01): : 372 - 383
  • [8] Selective Attention Enhances Beta-Band Cortical Oscillation to Speech under "Cocktail-Party" Listening Conditions
    Gao, Yayue
    Wang, Qian
    Ding, Yu
    Wang, Changming
    Li, Haifeng
    Wu, Xihong
    Qu, Tianshu
    Li, Liang
    FRONTIERS IN HUMAN NEUROSCIENCE, 2017, 11
  • [9] Listen, Watch and Understand at the Cocktail Party: Audio-Visual-Contextual Speech Separation
    Li, Chenda
    Qian, Yanmin
    INTERSPEECH 2020, 2020, : 1426 - 1430
  • [10] Effects of age on electrophysiological correlates of speech processing in a dynamic "cocktail-party" situation
    Getzmann, Stephan
    Hanenberg, Christina
    Lewald, Joerg
Falkenstein, Michael
    Wascher, Edmund
    FRONTIERS IN NEUROSCIENCE, 2015, 9