Integrative interaction of emotional speech in audio-visual modality

Cited by: 2
Authors
Dong, Haibin [1 ]
Li, Na [1 ]
Fan, Lingzhong [2 ]
Wei, Jianguo [1 ]
Xu, Junhai [1 ]
Affiliations
[1] Tianjin Univ, Coll Intelligence & Comp, Tianjin Key Lab Cognit Comp & Applicat, Tianjin, Peoples R China
[2] Chinese Acad Sci, Inst Automat, Brainnetome Ctr, Beijing, Peoples R China
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China;
Keywords
audio-visual integration; emotional speech; fMRI; left insula; weighted RSA; SUPERIOR TEMPORAL SULCUS; HUMAN BRAIN; PERCEPTION; FACE; INFORMATION; EXPRESSIONS; ACTIVATION; PRECUNEUS; INSULA; VOICE;
DOI
10.3389/fnins.2022.797277
CLC Number (Chinese Library Classification)
Q189 [Neuroscience];
Subject Classification Code
071006;
Abstract
Emotional cues are expressed in many ways in daily life, and the emotional information we receive is often conveyed through multiple modalities. Successful social interaction requires combining multisensory cues to accurately judge the emotions of others. The integration of multimodal emotional information has been widely investigated: studies using various measures of brain activity have localized the regions involved in the audio-visual integration of emotional information, mainly to the bilateral superior temporal regions. However, the analysis methods adopted in these studies are relatively simple, and their stimuli rarely contain speech, so the integration of emotional speech in the human brain requires further examination. In this paper, an event-related functional magnetic resonance imaging (fMRI) study was conducted to explore the audio-visual integration of emotional speech in the human brain, using dynamic facial expressions and emotional speech to convey emotions of different valences. Representational similarity analysis (RSA) based on regions of interest (ROIs), whole-brain searchlight analysis, modality conjunction analysis, and supra-additive analysis were used to identify and verify the brain regions involved. In addition, a weighted RSA method was used to evaluate the contribution of each candidate model to the best-fitting model for each ROI. Only the left insula was detected by all methods, suggesting that the left insula plays an important role in the audio-visual integration of emotional speech. The whole-brain searchlight, modality conjunction, and supra-additive analyses together further indicated that the bilateral middle temporal gyrus (MTG), right inferior parietal lobule, and bilateral precuneus may also be involved in the audio-visual integration of emotional speech.
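To make the analysis pipeline more concrete, the short Python sketch below illustrates the general idea of ROI-based RSA and a weighted variant. It is not the authors' code: the array names (patterns for condition-by-voxel response estimates extracted from one ROI, model_rdms for candidate model dissimilarity matrices) and the use of non-negative least squares to fit the model weights are illustrative assumptions, not details taken from the paper.

    # Minimal sketch of ROI-based RSA and weighted RSA (illustrative only).
    import numpy as np
    from scipy.stats import spearmanr
    from scipy.optimize import nnls

    def neural_rdm(patterns):
        # Representational dissimilarity matrix: 1 - Pearson correlation
        # between the multivoxel patterns of every pair of conditions.
        # patterns: (n_conditions, n_voxels) array of response estimates.
        return 1.0 - np.corrcoef(patterns)

    def rsa_score(patterns, model_rdm):
        # Compare the neural RDM with one candidate model RDM using
        # Spearman rank correlation over the lower triangles.
        idx = np.tril_indices_from(model_rdm, k=-1)
        rdm = neural_rdm(patterns)
        rho, _ = spearmanr(rdm[idx], model_rdm[idx])
        return rho

    def weighted_rsa(patterns, model_rdms):
        # Weighted RSA: fit a non-negative weighted combination of candidate
        # model RDMs to the neural RDM; the resulting weights indicate each
        # model's contribution to the best-fitting combination.
        idx = np.tril_indices_from(model_rdms[0], k=-1)
        y = neural_rdm(patterns)[idx]
        X = np.column_stack([m[idx] for m in model_rdms])
        weights, _ = nnls(X, y)
        return weights

In a searchlight version of the same idea, rsa_score would simply be evaluated on the pattern extracted from a small sphere around every voxel rather than on predefined ROIs.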
Pages: 13