A Cross-Corpus Speech-Based Analysis of Escalating Negative Interactions

Cited by: 3
Authors
Lefter, Iulia [1]
Baird, Alice [2]
Stappen, Lukas [2]
Schuller, Bjorn W. [2, 3]
Affiliations
[1] Delft Univ Technol, Dept Multiactor Syst, Delft, Netherlands
[2] Univ Augsburg, Chair Embedded Intelligence Hlth Care & Wellbeing, Augsburg, Germany
[3] Imperial Coll London, Grp Language Audio & Mus, London, England
Keywords
affective computing; negative interactions; cross-corpora analysis; conflict escalation; speech paralinguistics; emotion recognition; acoustic emotion recognition; conflict
DOI
10.3389/fcomp.2022.749804
Chinese Library Classification
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Monitoring escalating negative interactions has several benefits, particularly in security, (mental) health, and group management. The speech signal is particularly suited to this, as aspects of escalation, including emotional arousal, have been shown to be readily captured by the audio signal. A challenge in applying trained systems to real-life applications is their strong dependence on the training material and their limited ability to generalize. For this reason, in this contribution, we perform an extensive analysis of three Dutch-language corpora. All three corpora are rich in escalation behavior and are annotated on alternative dimensions related to escalation. A process of label mapping resulted in two possible ground truth estimations for the three datasets, expressed as low, medium, and high escalation levels. To observe class behavior and inter-corpus differences more closely, we perform an acoustic analysis of the audio samples, finding that the derived labels behave similarly across the corpora, with pitch (F0) and intensity (dB) increasing as interactions escalate. Through our experiments, we explore the suitability of different speech features, data augmentation, merging corpora for training, and testing on actor and non-actor speech. We find that the extent to which merging corpora is successful depends greatly on the similarities between label definitions before label mapping. Finally, we see that the escalation recognition task can be performed in a cross-corpus setup with hand-crafted speech features, reaching up to 63.8% unweighted average recall (UAR) in the cross-corpus analysis, an increase over the inter-corpus result of 59.4% UAR.
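The abstract reports results as unweighted average recall (UAR), which is the mean of the per-class recalls, so that each escalation level counts equally regardless of how many samples it has. The following is a minimal sketch of that metric, not the authors' evaluation code; the function name `unweighted_average_recall` and the toy labels are assumptions made purely for illustration.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls over the classes present in y_true."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("y_true must not be empty")

    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1

    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical, imbalanced toy data (not taken from the paper).
y_true = ["low"] * 6 + ["medium"] * 3 + ["high"] * 1
y_pred = ["low"] * 6 + ["medium", "low", "low"] + ["low"]

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

print(f"UAR:      {uar:.3f}")
print(f"Accuracy: {accuracy:.3f}")
```

On this toy data a classifier that favors the majority class scores well on plain accuracy but poorly on UAR, which is why UAR is a common choice when class sizes are uneven.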
Pages: 13