Integrative interaction of emotional speech in audio-visual modality

Cited by: 2
Authors
Dong, Haibin [1 ]
Li, Na [1 ]
Fan, Lingzhong [2 ]
Wei, Jianguo [1 ]
Xu, Junhai [1 ]
Affiliations
[1] Tianjin Univ, Coll Intelligence & Comp, Tianjin Key Lab Cognit Comp & Applicat, Tianjin, Peoples R China
[2] Chinese Acad Sci, Inst Automat, Brainnetome Ctr, Beijing, Peoples R China
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China;
Keywords
audio-visual integration; emotional speech; fMRI; left insula; weighted RSA; SUPERIOR TEMPORAL SULCUS; HUMAN BRAIN; PERCEPTION; FACE; INFORMATION; EXPRESSIONS; ACTIVATION; PRECUNEUS; INSULA; VOICE;
DOI
10.3389/fnins.2022.797277
CLC number
Q189 [Neuroscience];
Discipline code
071006;
Abstract
Emotional cues are expressed in many ways in daily life, and the emotional information we receive is often conveyed through multiple modalities. Successful social interaction requires combining multisensory cues to accurately determine the emotions of others. The integration of multimodal emotional information has been widely investigated: studies using various brain activity measurement methods have localized the brain regions involved in the audio-visual integration of emotional information, mainly to the bilateral superior temporal regions. However, the methods adopted in these studies are relatively simple, and their stimulus materials rarely contain speech, so the integration mechanism of emotional speech in the human brain still requires further examination. In this paper, an event-related functional magnetic resonance imaging (fMRI) study was conducted to explore the audio-visual integration of emotional speech in the human brain, using dynamic facial expressions and emotional speech to convey emotions of different valences. Representational similarity analysis (RSA) based on regions of interest (ROIs), whole-brain searchlight analysis, modality conjunction analysis, and supra-additive analysis were used to identify and verify the roles of the relevant brain regions. In addition, a weighted RSA method was used to evaluate the contribution of each candidate model to the best-fitting model in each ROI. The results showed that only the left insula was detected by all methods, suggesting that it plays an important role in the audio-visual integration of emotional speech. Whole-brain searchlight, modality conjunction, and supra-additive analyses further indicated that the bilateral middle temporal gyrus (MTG), right inferior parietal lobule, and bilateral precuneus may also be involved in the audio-visual integration of emotional speech.
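As a rough illustration of the analyses named above, the following Python sketch runs a classical ROI-based RSA and a weighted RSA on synthetic data. The condition count, ROI size, candidate models, and the use of non-negative least squares to fit the model weights are illustrative assumptions, not the authors' actual pipeline.

# Minimal RSA / weighted RSA sketch on synthetic data (illustrative only).
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr
from scipy.optimize import nnls

def rdm(patterns):
    # Representational dissimilarity matrix (condensed vector) from an
    # (n_conditions x n_voxels) pattern matrix, using correlation distance.
    return pdist(patterns, metric="correlation")

rng = np.random.default_rng(0)

# Hypothetical ROI data: 8 emotional-speech conditions x 200 voxels.
neural_rdm = rdm(rng.standard_normal((8, 200)))

# Hypothetical candidate model RDMs (e.g. auditory-only, visual-only,
# audio-visual integration models), here random placeholders.
model_rdms = [rdm(rng.standard_normal((8, 20))) for _ in range(3)]

# Classical RSA: rank-correlate each candidate model RDM with the neural RDM.
for i, m in enumerate(model_rdms):
    rho, p = spearmanr(m, neural_rdm)
    print(f"model {i}: Spearman rho = {rho:.3f}, p = {p:.3f}")

# Weighted RSA: fit a non-negative weighted combination of model RDMs to the
# neural RDM; the fitted weights index each model's contribution to the fit.
design = np.column_stack(model_rdms)
weights, _ = nnls(design, neural_rdm)
print("model weights:", weights)

In practice the neural RDM would be computed from preprocessed fMRI activity patterns within each ROI, and the weighted fit would be evaluated with cross-validation; the random arrays here only stand in for those inputs.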
Pages: 13