ON THE USE OF N-GRAM TRANSDUCERS FOR DIALOGUE ANNOTATION

Cited by: 2
Authors
Tamarit, Vicent [1]
Martinez-Hinarejos, Carlos-D. [1]
Benedi, Jose-Miguel [1]
Affiliation
[1] Univ Politecn Valencia, Inst Tecnol Informat, Valencia, Spain
Keywords
Statistical models; Dialogue annotation
DOI
10.1007/978-1-4419-7934-6_11
Chinese Library Classification number
TP301 [Theory, Methods]
Subject classification code
081202
Abstract
The implementation of dialogue systems is one of the most interesting applications of language technologies. Statistical models can be used in this implementation, allowing for a more flexible approach than rules defined by a human expert. However, statistical models require large numbers of dialogues annotated with dialogue-function labels (usually Dialogue Acts), and the annotation process is hard and time-consuming. Consequently, using other statistical models to obtain faster annotations is of great interest for the development of dialogue systems. In this work we compare two statistical models for dialogue annotation: a more classical model based on Hidden Markov Models (HMM) and the new N-gram Transducers (NGT) model. The comparison is performed on two corpora of a different nature, the well-known SwitchBoard corpus and the DIHANA corpus. The results show that the NGT model produces a much more accurate annotation than the HMM-based model (as much as 11% less error on the SwitchBoard corpus).
Pages: 255-276
Page count: 22