Automatic audiovisual integration in speech perception

Cited by: 0
Authors: Maurizio Gentilucci, Luigi Cattaneo
Affiliation: [1] Università di Parma, Dipartimento di Neuroscienze
Keywords: McGurk effect; Audiovisual integration; Voice spectrum analysis; Lip kinematics; Imitation
DOI: not available
Abstract
Two experiments aimed to determine whether features of both the visual and acoustical inputs are always merged into the perceived representation of speech, and whether this audiovisual integration is based on cross-modal binding functions or on imitation. In a McGurk paradigm, observers were required to repeat aloud a string of phonemes uttered by an actor (acoustical presentation) whose mouth, in contrast, mimicked the pronunciation of a different string (visual presentation). In a control experiment, participants read the same strings of letters in printed form; this condition served to analyze voice patterns and lip kinematics while controlling for imitation. In the control experiment and in the congruent audiovisual presentation, i.e. when the mouth articulation gestures matched the emitted string of phonemes, the voice spectrum and the lip kinematics varied according to the pronounced string. In the McGurk paradigm the participants were unaware of the incongruence between the visual and acoustical stimuli. Acoustical analysis of their spoken responses showed three distinct patterns: fusion of the two stimuli (the McGurk effect), repetition of the acoustically presented string of phonemes, and, less frequently, repetition of the string of phonemes corresponding to the mouth gestures mimicked by the actor. However, in the latter two response types the second formant (F2) of the participants' voice spectra always differed from the value recorded in the congruent audiovisual presentation: it shifted toward the F2 of the string of phonemes presented in the other, apparently ignored, modality. The lip kinematics of participants repeating the acoustically presented string were influenced by observation of the actor's mimicked lip movements, but only when a labial consonant was pronounced. The data are discussed in favor of the hypothesis that features of both the visual and acoustical inputs always contribute to the representation of a string of phonemes, and that cross-modal integration occurs by extracting the mouth articulation features peculiar to the pronunciation of that string.
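The central acoustic measure above is the second formant (F2) of the participants' spoken responses. As an illustration of how such a measure can be obtained, below is a minimal Python sketch of standard LPC-based formant estimation. This is not the authors' actual analysis pipeline: the estimate_formants helper, the librosa/scipy toolchain, and the parameter choices (10 kHz resampling, LPC order 12) are assumptions made for illustration only.

    import numpy as np
    import scipy.signal
    import librosa

    def estimate_formants(wav_path, lpc_order=12):
        """Estimate formant frequencies (Hz) from a steady vowel segment.

        Hypothetical helper for illustration; not the paper's method.
        """
        # Resample to 10 kHz: the formants of interest (F1-F3) lie below 5 kHz
        y, sr = librosa.load(wav_path, sr=10000)
        # Pre-emphasis flattens the spectral tilt before LPC fitting
        y = scipy.signal.lfilter([1.0, -0.97], [1.0], y)
        # Window the segment to reduce edge effects
        y = y * np.hamming(len(y))
        # Fit a linear-prediction model; its poles mark vocal-tract resonances
        a = librosa.lpc(y, order=lpc_order)
        roots = np.roots(a)
        roots = roots[np.imag(roots) > 0]            # one root per conjugate pair
        freqs = np.angle(roots) * sr / (2 * np.pi)   # pole angle -> frequency (Hz)
        return sorted(f for f in freqs if f > 90)    # discard near-DC poles

    # The first two returned values approximate F1 and F2.

In practice F2 would be tracked frame by frame over the vowel portion of each response and then compared across the congruent, acoustically driven, and visually driven response categories, but that windowing logic is omitted here.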
Pages: 66-75 (9 pages)