A maximum entropy approach to adaptive statistical language modelling

Cited by: 242
Author
Rosenfeld, R
Affiliation
[1] Computer Science Department, Carnegie Mellon University, Pittsburgh
Source
COMPUTER SPEECH AND LANGUAGE | 1996 / Vol. 10 / Issue 03
DOI
10.1006/csla.1996.0011
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
An adaptive statistical language model is described, which successfully integrates long distance linguistic information with other knowledge sources. Most existing statistical language models exploit only the immediate history of a text. To extract information from further back in the document's history, we propose and use trigger pairs as the basic information bearing elements. This allows the model to adapt its expectations to the topic of discourse. Next, statistical evidence from multiple sources must be combined. Traditionally, linear interpolation and its variants have been used, but these are shown here to be seriously deficient. Instead, we apply the principle of Maximum Entropy (ME). Each information source gives rise to a set of constraints, to be imposed on the combined estimate. The intersection of these constraints is the set of probability functions which are consistent with all the information sources. The function with the highest entropy within that set is the ME solution. Given consistent statistical evidence, a unique ME solution is guaranteed to exist, and an iterative algorithm exists which is guaranteed to converge to it. The ME framework is extremely general: any phenomenon that can be described in terms of statistics of the text can be readily incorporated. An adaptive language model based on the ME approach was trained on the Wall Street Journal corpus, and showed a 32-39% perplexity reduction over the baseline. When interfaced to SPHINX-II, Carnegie Mellon's speech recognizer, it reduced its error rate by 10-14%. This thus illustrates the feasibility of incorporating many diverse knowledge sources in a single, unified statistical framework. (C) 1996 Academic Press Limited
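The constraint-based combination described in the abstract can be illustrated on a toy problem. The sketch below is not the paper's model: the four-outcome space, the two binary features, and their target expectations (0.7 and 0.4) are made-up assumptions, and the names `distribution` and `gis` are hypothetical. It uses Generalized Iterative Scaling as one example of an iterative algorithm that converges to the maximum entropy solution when the constraints are consistent.

```python
import math
from itertools import product


def distribution(lambdas, rows):
    """Return p(x) proportional to exp(sum_i lambda_i * f_i(x)), one value per row."""
    weights = [math.exp(sum(l * v for l, v in zip(lambdas, row))) for row in rows]
    total = sum(weights)
    return [w / total for w in weights]


def gis(outcomes, features, targets, iterations=2000):
    """Generalized Iterative Scaling on a small, fully enumerated outcome space."""
    raw = [[f(x) for f in features] for x in outcomes]
    # GIS needs the feature values of every outcome to sum to the same constant,
    # so a slack feature is appended to pad each row up to c.
    c = max(sum(row) for row in raw)
    rows = [row + [c - sum(row)] for row in raw]
    goals = list(targets) + [c - sum(targets)]
    lambdas = [0.0] * len(goals)
    for _ in range(iterations):
        probs = distribution(lambdas, rows)
        # Expected value of each feature under the current model.
        expected = [
            sum(p * row[i] for p, row in zip(probs, rows))
            for i in range(len(goals))
        ]
        # Move each weight toward matching its target expectation.
        # Assumes every target and every expectation is strictly positive.
        lambdas = [
            l + math.log(g / e) / c
            for l, g, e in zip(lambdas, goals, expected)
        ]
    return dict(zip(outcomes, distribution(lambdas, rows)))


# Toy setup (assumed for illustration): outcomes are pairs of bits, and each
# "information source" constrains the expected value of one bit.
outcomes = list(product([0, 1], repeat=2))
features = [lambda x: x[0], lambda x: x[1]]
targets = [0.7, 0.4]
model = gis(outcomes, features, targets)
```

With only these two marginal constraints, the maximum entropy distribution treats the two bits as independent, so `model[(1, 1)]` should approach 0.7 × 0.4 = 0.28.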
Pages: 187-228
Page count: 42
Related papers
50 in total
  • [31] An effective approach for probabilistic lifetime modelling based on the principle of maximum entropy with fractional moments
    Zhang, Xufang
    He, Wei
    Zhang, Yimin
    Pandey, Mahesh D.
    APPLIED MATHEMATICAL MODELLING, 2017, 51 : 626 - 642
  • [32] Using the maximum entropy production approach to integrate energy budget modelling in a hydrological model
    Maheu, Audrey
    Hajji, Islem
    Anctil, Francois
    Nadeau, Daniel F.
    Therrien, Rene
    HYDROLOGY AND EARTH SYSTEM SCIENCES, 2019, 23 (09) : 3843 - 3863
  • [33] Maximum entropy modelling of acoustic and linguistic features
    Chueh, Chuang-Hua
    Chien, Jen-Tzung
    2006 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-13, 2006 : 1061 - 1064
  • [34] Prosodic Features for a Maximum Entropy Language Model
    Chan, Oscar
    Togneri, Roberto
    INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006 : 1858 - 1861
  • [35] Maximum entropy language modeling and the smoothing problem
    Martin, SC
    Ney, H
    Hamacher, C
    IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2000, 8 (05) : 626 - 632
  • [36] A whole sentence maximum entropy language model
    Rosenfeld, R
    1997 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, PROCEEDINGS, 1997 : 230 - 237
  • [37] Smoothing methods in maximum entropy language modeling
    Martin, SC
    Ney, H
    Zaplo, J
    ICASSP '99: 1999 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS VOLS I-VI, 1999 : 545 - 548
  • [38] TOWARDS AN APPROACH TO STOCHASTIC ADAPTIVE-CONTROL USING THE MAXIMUM-ENTROPY PRINCIPLE
    JUMARIE, G
    INTERNATIONAL JOURNAL OF SYSTEMS SCIENCE, 1990, 21 (12) : 2621 - 2636
  • [39] Adaptive estimation, truncation, and maximum entropy.
    Marsh, TL
    Mittelhammer, RC
    JOURNAL OF AGRICULTURAL AND RESOURCE ECONOMICS, 2001, 26 (02) : 558 - 558
  • [40] Maximum entropy adaptive control of chaotic systems
    Lin, JH
    Isik, C
    JOINT CONFERENCE ON THE SCIENCE AND TECHNOLOGY OF INTELLIGENT SYSTEMS, 1998 : 243 - 246