Probing Speech Emotion Recognition Transformers for Linguistic Knowledge

Cited by: 4
Authors
Triantafyllopoulos, Andreas [1]
Wagner, Johannes [2]
Wierstorf, Hagen [2]
Schmitt, Maximilian [2]
Reichel, Uwe [2]
Eyben, Florian [2]
Burkhardt, Felix [2]
Schuller, Bjoern W. [1, 2, 3]
Affiliations
[1] Univ Augsburg, Chair Embedded Intelligence Hlth Care & Wellbeing, Augsburg, Germany
[2] audEERING GmbH, Gilching, Germany
[3] Imperial Coll, GLAM Grp Language Audio & Mus, London, England
Source
Keywords
speech emotion recognition; transformers
DOI
10.21437/Interspeech.2022-10371
Chinese Library Classification number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
Large, pre-trained neural networks consisting of self-attention layers (transformers) have recently achieved state-of-the-art results on several speech emotion recognition (SER) datasets. These models are typically pre-trained in a self-supervised manner with the goal of improving automatic speech recognition performance, and thus of understanding linguistic information. In this work, we investigate the extent to which this information is exploited during SER fine-tuning. Using a reproducible methodology based on open-source tools, we synthesise prosodically neutral speech utterances while varying the sentiment of the text. Valence predictions of the transformer model are very reactive to positive and negative sentiment content, as well as negations, but not to intensifiers or reducers, while none of those linguistic features impact arousal or dominance. These findings show that transformers can successfully leverage linguistic information to improve their valence predictions, and that linguistic analysis should be included in their testing.
Pages: 146-150
Number of pages: 5