Integrative interaction of emotional speech in audio-visual modality

Cited by: 2
Authors
Dong, Haibin [1 ]
Li, Na [1 ]
Fan, Lingzhong [2 ]
Wei, Jianguo [1 ]
Xu, Junhai [1 ]
Affiliations
[1] Tianjin Univ, Coll Intelligence & Comp, Tianjin Key Lab Cognit Comp & Applicat, Tianjin, Peoples R China
[2] Chinese Acad Sci, Inst Automat, Brainnetome Ctr, Beijing, Peoples R China
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China;
Keywords
audio-visual integration; emotional speech; fMRI; left insula; weighted RSA; SUPERIOR TEMPORAL SULCUS; HUMAN BRAIN; PERCEPTION; FACE; INFORMATION; EXPRESSIONS; ACTIVATION; PRECUNEUS; INSULA; VOICE;
DOI
10.3389/fnins.2022.797277
CLC number
Q189 [Neuroscience];
Discipline code
071006;
Abstract
Emotional cues are expressed in many ways in daily life, and the emotional information we receive is often conveyed through multiple modalities. Successful social interaction requires combining multisensory cues to accurately determine the emotions of others. The integration of multimodal emotional information has been widely investigated: studies using various brain activity measurement methods have localized the brain regions involved in the audio-visual integration of emotional information, mainly to the bilateral superior temporal regions. However, the methods adopted in these studies are relatively simple, and their stimulus materials rarely contain speech, so the integration mechanism of emotional speech in the human brain still requires further examination. In this paper, an event-related functional magnetic resonance imaging (fMRI) study was conducted to explore the audio-visual integration of emotional speech in the human brain, using dynamic facial expressions and emotional speech to convey emotions of different valences. Representational similarity analysis (RSA) based on regions of interest (ROIs), whole-brain searchlight analysis, modality conjunction analysis, and supra-additive analysis were used to identify and verify the roles of the relevant brain regions. In addition, a weighted RSA method was used to evaluate the contribution of each candidate model to the best-fitting model in each ROI. The results showed that only the left insula was detected by all methods, suggesting that it plays an important role in the audio-visual integration of emotional speech. Whole-brain searchlight, modality conjunction, and supra-additive analyses further indicated that the bilateral middle temporal gyrus (MTG), right inferior parietal lobule, and bilateral precuneus may also be involved in the audio-visual integration of emotional speech.
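As a rough illustration of the analyses named above, the following Python sketch runs a classical ROI-based RSA and a weighted RSA on synthetic data. The condition count, ROI size, candidate models, and the use of non-negative least squares to fit the model weights are illustrative assumptions, not the authors' actual pipeline.

# Minimal RSA / weighted RSA sketch on synthetic data (illustrative only).
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr
from scipy.optimize import nnls

def rdm(patterns):
    # Representational dissimilarity matrix (condensed vector) from an
    # (n_conditions x n_voxels) pattern matrix, using correlation distance.
    return pdist(patterns, metric="correlation")

rng = np.random.default_rng(0)

# Hypothetical ROI data: 8 emotional-speech conditions x 200 voxels.
neural_rdm = rdm(rng.standard_normal((8, 200)))

# Hypothetical candidate model RDMs (e.g. auditory-only, visual-only,
# audio-visual integration models), here random placeholders.
model_rdms = [rdm(rng.standard_normal((8, 20))) for _ in range(3)]

# Classical RSA: rank-correlate each candidate model RDM with the neural RDM.
for i, m in enumerate(model_rdms):
    rho, p = spearmanr(m, neural_rdm)
    print(f"model {i}: Spearman rho = {rho:.3f}, p = {p:.3f}")

# Weighted RSA: fit a non-negative weighted combination of model RDMs to the
# neural RDM; the fitted weights index each model's contribution to the fit.
design = np.column_stack(model_rdms)
weights, _ = nnls(design, neural_rdm)
print("model weights:", weights)

In practice the neural RDM would be computed from preprocessed fMRI activity patterns within each ROI, and the weighted fit would be evaluated with cross-validation; the random arrays here only stand in for those inputs.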
Pages: 13