The cocktail-party problem revisited: early processing and selection of multi-talker speech

Cited by: 265
Author
Bronkhorst, Adelbert W. [1, 2]
Affiliations
[1] TNO Human Factors, NL-3769 ZG Soesterberg, Netherlands
[2] Vrije Univ Amsterdam, Dept Cognit Psychol, NL-1081 BT Amsterdam, Netherlands
Keywords
Attention; Auditory scene analysis; Cocktail-party problem; Informational masking; Speech perception; HUMAN AUDITORY-CORTEX; INTERAURAL TIME DIFFERENCES; RECEPTION THRESHOLD; FUNDAMENTAL-FREQUENCY; ENERGETIC MASKING; INFORMATIONAL MASKING; PERCEPTUAL SEPARATION; INTELLIGIBILITY INDEX; MISMATCH NEGATIVITY; ATTENTIONAL CAPTURE;
DOI
10.3758/s13414-015-0882-9
Chinese Library Classification number
B84 [Psychology]
Discipline classification code
04; 0402
Abstract
How do we recognize what one person is saying when others are speaking at the same time? This review summarizes widespread research in psychoacoustics, auditory scene analysis, and attention, all dealing with early processing and selection of speech, which has been stimulated by this question. Important effects occurring at the peripheral and brainstem levels are mutual masking of sounds and "unmasking" resulting from binaural listening. Psychoacoustic models have been developed that can predict these effects accurately, albeit using computational approaches rather than approximations of neural processing. Grouping (the segregation and streaming of sounds) represents a subsequent processing stage that interacts closely with attention. Sounds can be easily grouped, and subsequently selected, using primitive features such as spatial location and fundamental frequency. More complex processing is required when lexical, syntactic, or semantic information is used. Whereas it is now clear that such processing can take place preattentively, there also is evidence that the processing depth depends on the task-relevancy of the sound. This is consistent with the presence of a feedback loop in attentional control, triggering enhancement of to-be-selected input. Despite recent progress, there are still many unresolved issues: there is a need for integrative models that are neurophysiologically plausible, for research into grouping based on other than spatial or voice-related cues, for studies explicitly addressing endogenous and exogenous attention, for an explanation of the remarkable sluggishness of attention focused on dynamically changing sounds, and for research elucidating the distinction between binaural speech perception and sound localization.
Pages: 1465 - 1487
Page count: 23
Related papers
50 in total
  • [1] The cocktail-party problem revisited: early processing and selection of multi-talker speech
    Adelbert W. Bronkhorst
    Attention, Perception, & Psychophysics, 2015, 77 : 1465 - 1487
  • [2] Audio-Visual Multi-Talker Speech Recognition in A Cocktail Party
    Wu, Yifei
    Li, Chenda
    Yang, Song
    Wu, Zhongqin
    Qian, Yanmin
    INTERSPEECH 2021, 2021, : 3021 - 3025
  • [3] Musicians Show Improved Speech Segregation in Competitive, Multi-Talker Cocktail Party Scenarios
    Bidelman, Gavin M.
    Yoo, Jessica
    FRONTIERS IN PSYCHOLOGY, 2020, 11
  • [4] The effects of speech processing units on auditory stream segregation and selective attention in a multi-talker (cocktail party) situation
    Toth, Brigitta
    Honbolygo, Ferenc
    Szalardy, Orsolya
    Orosz, Gabor
    Farkas, David
    Winkler, Istvan
    CORTEX, 2020, 130 : 387 - 400
  • [5] Molecular analysis of individual differences in talker search at the cocktail-party
    Lutfi, Robert A.
    Pastore, Torben
    Rodriguez, Briana
    Yost, William A.
    Lee, Jungmee
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2022, 152 (03) : 1804 - 1813
  • [6] Monaural multi-talker speech recognition using factorial speech processing models
    Khademian, Mahdi
    Homayounpour, Mohammad Mehdi
    SPEECH COMMUNICATION, 2018, 98 : 1 - 16
  • [7] Effects of age on electrophysiological correlates of speech processing in a dynamic "cocktail-party" situation
    Getzmann, Stephan
    Hanenberg, Christina
    Lewald, Joerg
    Falkenstein, Michael
    Wascher, Edmund
    FRONTIERS IN NEUROSCIENCE, 2015, 9
  • [8] The genetic contribution to solving the cocktail-party problem
    Mathias, Samuel R.
    Knowles, Emma E. M.
    Mollon, Josephine
    Rodrigue, Amanda L.
    Woolsey, Mary K.
    Hernandez, Alyssa M.
    Garrett, Amy S.
    Fox, Peter T.
    Olvera, Rene L.
    Peralta, Juan M.
    Kumar, Satish
    Goring, Harald H. H.
    Duggirala, Ravi
    Curran, Joanne E.
    Blangero, John
    Glahn, David C.
    ISCIENCE, 2022, 25 (09)
  • [9] USING BINAURAL PROCESSING FOR AUTOMATIC SPEECH RECOGNITION IN MULTI-TALKER SCENES
    Spille, Constantin
    Dietz, Mathias
    Hohmann, Volker
    Meyer, Bernd T.
    2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 7805 - 7809
  • [10] Using visual speech at the cocktail-party: CNV evidence for early speech extraction in younger and older adults
    Begau, Alexandra
    Arnau, Stefan
    Klatt, Laura-Isabelle
    Wascher, Edmund
    Getzmann, Stephan
    HEARING RESEARCH, 2022, 426