ON THE USE OF N-GRAM TRANSDUCERS FOR DIALOGUE ANNOTATION

Cited by: 2
Authors
Tamarit, Vicent [1]
Martinez-Hinarejos, Carlos-D. [1]
Benedi, Jose-Miguel [1]
Affiliation
[1] Univ Politecn Valencia, Inst Tecnol Informat, Valencia, Spain
Source
SPOKEN DIALOGUE SYSTEMS: TECHNOLOGY AND DESIGN | 2011
Keywords
Statistical models; Dialogue annotation;
DOI
10.1007/978-1-4419-7934-6_11
Chinese Library Classification number
TP301 [Theory, Methods];
Subject classification code
081202;
Abstract
The implementation of dialogue systems is one of the most interesting applications of language technologies. Statistical models can be used in this implementation, allowing for a more flexible approach than rules defined by a human expert. However, statistical models require large numbers of dialogues annotated with dialogue-function labels (usually Dialogue Acts), and the annotation process is hard and time-consuming. Consequently, using other statistical models to obtain faster annotations is of considerable interest for the development of dialogue systems. In this work we compare two statistical models for dialogue annotation: a more classical model based on Hidden Markov Models (HMM) and the new N-gram Transducers (NGT) model. The comparison is performed on two corpora of different nature, the well-known SwitchBoard corpus and the DIHANA corpus. The results show that the NGT model produces a much more accurate annotation than the HMM-based model (up to 11% less error on the SwitchBoard corpus).
Pages: 255-276
Number of pages: 22