Part-of-Speech Tagging by Latent Analogy

Cited by: 4
Author
Bellegarda, Jerome R. [1]
Affiliation
[1] Apple Inc, Speech & Language Technol, Cupertino, CA 95014 USA
Keywords
Latent semantic mapping (LSM); natural language processing (NLP); part-of-speech (POS) disambiguation; sequence labeling; statistical modeling
DOI
10.1109/JSTSP.2010.2075970
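Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Subject classification codes
0808; 0809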
Abstract
Part-of-speech tagging is often a critical first step in various speech and language processing tasks. High-accuracy taggers (e.g., based on conditional random fields) rely on well-chosen feature functions to ensure that important characteristics of the empirical training distribution are reflected in the trained model. This makes them vulnerable to any discrepancy between training and tagging corpora, and, in particular, accuracy is adversely affected by the presence of out-of-vocabulary words. This paper explores an alternative tagging strategy based on the principle of latent analogy, which was originally introduced in the context of a speech synthesis application. In this approach, locally optimal tag subsequences emerge automatically from an appropriate representation of global sentence-level information. This solution eliminates the need for feature engineering, while exploiting a broader context more conducive to word sense disambiguation. Empirical evidence suggests that, in practice, tagging by latent analogy is essentially competitive with conventional Markovian techniques, while benefiting from substantially less onerous training costs. This opens up the possibility that integration with such techniques may lead to further improvements in tagging accuracy.
Pages: 985-993
Page count: 9