Hierarchical Transformer Network for Utterance-Level Emotion Recognition

Cited by: 12
Authors
Li, Qingbiao [1]
Wu, Chunhua [1]
Wang, Zhe [1]
Zheng, Kangfeng [1]
Affiliation
[1] Beijing Univ Posts & Telecommun, Sch Cyberspace Secur, Beijing 100876, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2020 / Vol. 10 / Issue 13
Keywords
emotion recognition; text classification; dialog; transformer; pretrained model;
DOI
10.3390/app10134447
Chinese Library Classification
O6 [Chemistry];
Discipline classification code
0703;
Abstract
While there have been significant advances in detecting emotions in text, many problems remain unsolved in utterance-level emotion recognition (ULER). In this paper, we address four challenges of ULER in dialog systems. (1) The same utterance can convey different emotions in different contexts. (2) Long-range contextual information is hard to capture effectively. (3) Unlike traditional text classification, most datasets for this task contain inadequate conversations or speech. (4) Speaker information is necessary to better model the emotional interaction between speakers. To address problems (1) and (2), we propose a hierarchical transformer framework (apart from the description of other studies, the "transformer" in this paper usually refers to the encoder part of the transformer) with a lower-level transformer to model the word-level input and an upper-level transformer to capture the context of utterance-level embeddings. For problem (3), we use bidirectional encoder representations from transformers (BERT), a pretrained language model, as the lower-level transformer, which is equivalent to introducing external data into the model and alleviates the data shortage to some extent. For problem (4), we add speaker embeddings to the model for the first time, which enables our model to capture the interaction between speakers. Experiments on three dialog emotion datasets, Friends, EmotionPush, and EmoryNLP, demonstrate that our proposed hierarchical transformer network models obtain competitive results compared with state-of-the-art methods in terms of the macro-averaged F1-score (macro-F1).
Pages: 13
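The abstract reports results as the macro-averaged F1-score (macro-F1), which is the unweighted mean of the per-class F1 scores, so rare emotion classes count as much as frequent ones. The following is a minimal sketch of that metric in plain Python. The function name `macro_f1` and the emotion labels in the example are illustrative assumptions, not taken from the paper, and the exact evaluation protocol (for instance, which emotion classes are scored) is not stated in the abstract.

```python
from collections import Counter


def macro_f1(y_true, y_pred, labels=None):
    """Unweighted mean of per-class F1 scores (hypothetical helper)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if labels is None:
        # Assumption: score every class seen in either list.
        labels = sorted(set(y_true) | set(y_pred))

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gets a false positive
            fn[truth] += 1  # true class gets a false negative

    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Illustrative utterance-level labels (made up for this example).
y_true = ["joy", "joy", "anger", "neutral", "neutral", "neutral"]
y_pred = ["joy", "neutral", "anger", "neutral", "neutral", "joy"]
print(round(macro_f1(y_true, y_pred), 4))
```

In this example the per-class scores are 1.0 for "anger", 0.5 for "joy", and 2/3 for "neutral", so the single well-predicted minority class raises the average as much as a majority class would.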