Deep Dialog Act Recognition using Multiple Token, Segment, and Context Information Representations

Authors
Ribeiro, Eugenio [1]
Ribeiro, Ricardo [2]
de Matos, David Martins [1]
Affiliations
[1] Univ Lisbon, Inst Super Tecn, INESC ID Lisboa, L2F, Spoken Language Syst Lab, Lisbon, Portugal
[2] IUL, ISCTE, INESC ID Lisboa, L2F, Spoken Language Syst Lab, Lisbon, Portugal
Keywords
CLASSIFICATION;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Automatic dialog act recognition is a task that has been widely explored over the years. In recent work, most approaches to the task have explored different deep neural network architectures to combine the representations of the words in a segment and generate a segment representation that provides cues for intention. In this study, we explore means to generate more informative segment representations, not only by exploring different network architectures, but also by considering different token representations at the word, character, and functional levels. At the word level, in addition to the commonly used uncontextualized embeddings, we explore the use of contextualized representations, which are able to provide information concerning word sense and segment structure. Character-level tokenization is important to capture intention-related morphological aspects that cannot be captured at the word level. Finally, the functional level provides an abstraction from words, which shifts the focus to the structure of the segment. Additionally, we explore approaches to enrich the segment representation with context information from the history of the dialog, both in terms of the classifications of the surrounding segments and the turn-taking history. This kind of information has already been shown to be important for the disambiguation of dialog acts in previous studies. Nevertheless, we are able to capture additional information by considering a summary of the dialog history and a wider turn-taking context. By combining the best approaches at each step, we achieve results that surpass the previous state of the art on generic dialog act recognition on both the Switchboard Dialog Act Corpus (SwDA) and the ICSI Meeting Recorder Dialog Act Corpus (MRDA), which are two of the most widely explored corpora for the task. Furthermore, by considering both past and future context, similarly to what happens in an annotation scenario, our approach achieves a performance similar to that of a human annotator on SwDA and surpasses it on MRDA.
Pages: 861-899
Number of pages: 39