The integration of continuous audio and visual speech in a cocktail-party environment depends on attention

Cited by: 4
Authors
Ahmed, Farhin [1 ,2 ]
Nidiffer, Aaron R. [1 ,2 ]
O'Sullivan, Aisling E. [1 ,2 ,3 ,4 ]
Zuk, Nathaniel J. [5 ]
Lalor, Edmund C. [1 ,2 ,3 ,4 ]
Affiliations
[1] Univ Rochester, Dept Biomed Engn, Dept Neurosci, Rochester, NY 14627 USA
[2] Univ Rochester, Del Monte Inst Neurosci, Rochester, NY 14627 USA
[3] Trinity Coll Dublin, Trinity Ctr Biomed Engn, Sch Engn, Dublin 2, Ireland
[4] Trinity Coll Dublin, Trinity Coll Inst Neurosci, Dublin 2, Ireland
[5] Hebrew Univ Jerusalem, Edmond & Lily Safra Ctr Brain Sci, Jerusalem, Israel
Funding
Science Foundation Ireland;
Keywords
Multisensory integration; Speech; Cocktail party; Hierarchical processing; AUDIOVISUAL SPEECH; MULTISENSORY INTEGRATION; SELECTIVE ATTENTION; AUDITORY-CORTEX; INFORMATION; DIRECTION; TRACKING; OBJECTS; HEAR;
DOI
10.1016/j.neuroimage.2023.120143
CLC Classification
Q189 [Neuroscience];
Subject Classification Code
071006;
Abstract
In noisy environments, our ability to understand speech benefits greatly from seeing the speaker's face. This is attributed to the brain's ability to integrate audio and visual information, a process known as multisensory integration. In addition, selective attention plays an enormous role in what we understand, the so-called cocktail-party phenomenon. But how attention and multisensory integration interact remains incompletely understood, particularly in the case of natural, continuous speech. Here, we addressed this issue by analyzing EEG data recorded from participants who undertook a multisensory cocktail-party task using natural speech. To assess multisensory integration, we modeled the EEG responses to the speech in two ways. The first assumed that audiovisual speech processing is simply a linear combination of audio speech processing and visual speech processing (i.e., an A + V model), while the second allowed for the possibility of audiovisual interactions (i.e., an AV model). Applying these models to the data revealed that EEG responses to attended audiovisual speech were better explained by an AV model, providing evidence for multisensory integration. In contrast, unattended audiovisual speech responses were best captured using an A + V model, suggesting that multisensory integration is suppressed for unattended speech. Follow-up analyses revealed some limited evidence for early multisensory integration of unattended AV speech, with no integration occurring at later levels of processing. We take these findings as evidence that the integration of natural audio and visual speech occurs at multiple levels of processing in the brain, each of which can be differentially affected by attention.
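To make the A + V versus AV comparison concrete, the sketch below shows one common way such a model comparison can be run: ridge regression over time-lagged stimulus features, scoring each model by how well it predicts held-out EEG. This is an illustrative sketch only, not the authors' pipeline; the synthetic single-channel data, the feature names (acoustic envelope, lip motion), and the regularization value are assumptions, and the published A + V framework typically combines unisensory response models rather than fitting both on the same recording, as done here for brevity.

```python
# Illustrative sketch: additive (A + V) vs. joint (AV) stimulus-response models.
# All data below are synthetic placeholders; variable names are assumptions.
import numpy as np
from sklearn.linear_model import Ridge
from scipy.stats import pearsonr

fs = 64                              # assumed EEG sampling rate (Hz)
n_samples = fs * 60                  # one minute of data
lags = np.arange(0, int(0.4 * fs))   # 0-400 ms stimulus-response lags

def lagged(x, lags):
    """Build a [time x lags] design matrix from a 1-D stimulus feature."""
    X = np.zeros((len(x), len(lags)))
    for i, lag in enumerate(lags):
        X[lag:, i] = x[:len(x) - lag]
    return X

rng = np.random.default_rng(0)
audio_env = rng.standard_normal(n_samples)       # e.g., acoustic envelope
visual_motion = rng.standard_normal(n_samples)   # e.g., lip-movement signal
eeg = rng.standard_normal(n_samples)             # single EEG channel (synthetic)

X_a, X_v = lagged(audio_env, lags), lagged(visual_motion, lags)
half = n_samples // 2
train, test = slice(0, half), slice(half, None)

# A + V model: audio-only and visual-only models fit separately, predictions summed.
pred_additive = (Ridge(alpha=1.0).fit(X_a[train], eeg[train]).predict(X_a[test]) +
                 Ridge(alpha=1.0).fit(X_v[train], eeg[train]).predict(X_v[test]))

# AV model: one model fit on the combined audiovisual features, which can
# capture audiovisual interactions that the additive model cannot.
X_av = np.hstack([X_a, X_v])
pred_av = Ridge(alpha=1.0).fit(X_av[train], eeg[train]).predict(X_av[test])

# If the AV model reliably predicts held-out EEG better than the A + V model,
# that is taken as evidence of multisensory integration.
r_additive = pearsonr(eeg[test], pred_additive)[0]
r_av = pearsonr(eeg[test], pred_av)[0]
print(f"A + V model r = {r_additive:.3f}, AV model r = {r_av:.3f}")
```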
Pages: 13
Related Papers
50 records in total
  • [1] Speaker-Targeted Audio-Visual Models for Speech Recognition in Cocktail-Party Environments
    Chao, Guan-Lin
    Chan, William
    Lane, Ian
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 2120 - 2124
  • [2] Electrophysiological attention effects in a virtual cocktail-party setting
    Muente, Thomas F.
    Spring, Doerte K.
    Szycik, Gregor R.
    Noesselt, Toemme
    BRAIN RESEARCH, 2010, 1307 : 78 - 88
  • [3] Using visual speech at the cocktail-party: CNV evidence for early speech extraction in younger and older adults
    Begau, Alexandra
    Arnau, Stefan
    Klatt, Laura-Isabelle
    Wascher, Edmund
    Getzmann, Stephan
    HEARING RESEARCH, 2022, 426
  • [4] Automatic speech recognition in cocktail-party situations: A specific training for separated speech
    Marti, Amparo
    Cobos, Maximo
    Lopez, Jose J.
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2012, 131 (02): : 1529 - 1535
  • [5] Audio-Visual Multi-Talker Speech Recognition in A Cocktail Party
    Wu, Yifei
Li, Chenda
    Yang, Song
    Wu, Zhongqin
    Qian, Yanmin
    INTERSPEECH 2021, 2021, : 3021 - 3025
  • [6] Speaking rhythmically improves speech recognition under "cocktail-party" conditions
    Wang, Mengyuan
    Kong, Lingzhi
    Zhang, Changxin
    Wu, Xihong
    Li, Liang
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2018, 143 (04): : EL255 - EL259
  • [7] Speech recognition by bilateral cochlear implant users in a cocktail-party setting
    Loizou, Philipos C.
    Hu, Yi
    Litovsky, Ruth
    Yu, Gongqiang
    Peters, Robert
    Lake, Jennifer
    Roland, Peter
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2009, 125 (01): : 372 - 383
  • [8] Selective Attention Enhances Beta-Band Cortical Oscillation to Speech under "Cocktail-Party" Listening Conditions
    Gao, Yayue
    Wang, Qian
    Ding, Yu
    Wang, Changming
    Li, Haifeng
    Wu, Xihong
    Qu, Tianshu
    Li, Liang
    FRONTIERS IN HUMAN NEUROSCIENCE, 2017, 11
  • [9] Listen, Watch and Understand at the Cocktail Party: Audio-Visual-Contextual Speech Separation
    Li, Chenda
    Qian, Yanmin
    INTERSPEECH 2020, 2020, : 1426 - 1430
  • [10] Effects of age on electrophysiological correlates of speech processing in a dynamic "cocktail-party" situation
    Getzmann, Stephan
    Hanenberg, Christina
    Lewald, Joerg
Falkenstein, Michael
    Wascher, Edmund
    FRONTIERS IN NEUROSCIENCE, 2015, 9