Probing Speech Emotion Recognition Transformers for Linguistic Knowledge

Times cited: 4
Authors
Triantafyllopoulos, Andreas [1]
Wagner, Johannes [2]
Wierstorf, Hagen [2]
Schmitt, Maximilian [2]
Reichel, Uwe [2]
Eyben, Florian [2]
Burkhardt, Felix [2]
Schuller, Bjoern W. [1, 2, 3]
Affiliations
[1] Univ Augsburg, Chair Embedded Intelligence Hlth Care & Wellbeing, Augsburg, Germany
[2] audEERING GmbH, Gilching, Germany
[3] Imperial Coll, GLAM Grp Language Audio & Mus, London, England
Source
Keywords
speech emotion recognition; transformers;
DOI
10.21437/Interspeech.2022-10371
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206; 082403;
Abstract
Large, pre-trained neural networks consisting of self-attention layers (transformers) have recently achieved state-of-the-art results on several speech emotion recognition (SER) datasets. These models are typically pre-trained in a self-supervised manner with the goal of improving automatic speech recognition performance, and thus of understanding linguistic information. In this work, we investigate the extent to which this information is exploited during SER fine-tuning. Using a reproducible methodology based on open-source tools, we synthesise prosodically neutral speech utterances while varying the sentiment of the text. Valence predictions of the transformer model are very reactive to positive and negative sentiment content, as well as negations, but not to intensifiers or reducers, while none of those linguistic features impact arousal or dominance. These findings show that transformers can successfully leverage linguistic information to improve their valence predictions, and that linguistic analysis should be included in their testing.
Pages: 146-150
Number of pages: 5