Survey: Finite-state technology in natural language processing

Cited by: 5
Author
Maletti, Andreas [1]
Affiliation
[1] Univ Stuttgart, Inst Nat Language Proc, Pfaffenwaldring 5b, D-70569 Stuttgart, Germany
Keywords
Finite-state automaton; Tree automaton; Context-free grammar; Natural language processing; Tokenization; Part-of-speech tagging; Parsing; Machine translation; Maximum likelihood; Probabilistic functions
DOI
10.1016/j.tcs.2016.05.030
Chinese Library Classification number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
In this survey, we discuss current uses of finite-state information in several statistical natural language processing tasks. To this end, we review standard approaches to tokenization, part-of-speech tagging, and parsing, and illustrate the utility of finite-state information and technology in these areas. These problems were chosen to allow a natural progression from simple prediction to structured prediction. We aim for a sufficiently formal presentation, suitable for readers with a background in automata theory, that allows them to appreciate the contribution of finite-state approaches, but we do not discuss practical issues outside the core ideas. We provide instructive examples and pointers into the relevant literature for all constructions. We close with an outlook on finite-state technology in statistical machine translation. © 2016 Elsevier B.V. All rights reserved.
Pages: 2-17
Page count: 16