Multimodal Embeddings From Language Models for Emotion Recognition in the Wild

Cited by: 10
Authors
Tseng, Shao-Yen [1]
Narayanan, Shrikanth [1]
Georgiou, Panayiotis [2]
Affiliations
[1] Univ Southern Calif, Dept Elect & Comp Engn, Los Angeles, CA 90089 USA
[2] Apple Inc, Siri Understanding, Culver City, CA 90016 USA
Keywords
Acoustics; Task analysis; Feature extraction; Convolution; Emotion recognition; Context modeling; Bit error rate; Machine learning; unsupervised learning; natural language processing; speech processing; SPEECH
DOI
10.1109/LSP.2021.3065598
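Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract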
Word embeddings such as ELMo and BERT have been shown to model word usage in language with greater efficacy through contextualized learning on large-scale language corpora, resulting in significant performance improvement across many natural language processing tasks. In this work we integrate acoustic information into contextualized lexical embeddings through the addition of a parallel stream to the bidirectional language model. This multimodal language model is trained on spoken language data that includes both text and audio modalities. We show that embeddings extracted from this model integrate paralinguistic cues into word meanings and can provide vital affective information by applying these multimodal embeddings to the task of speaker emotion recognition.
Pages: 608-612
Page count: 5