A maximum entropy approach to adaptive statistical language modelling

Cited by: 242

Author:

Rosenfeld, R

Affiliation:

[1] Computer Science Department, Carnegie Mellon University, Pittsburgh

Source:

COMPUTER SPEECH AND LANGUAGE | 1996 / Vol. 10 / Issue 03

DOI:

10.1006/csla.1996.0011

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

An adaptive statistical language model is described, which successfully integrates long-distance linguistic information with other knowledge sources. Most existing statistical language models exploit only the immediate history of a text. To extract information from further back in the document's history, we propose and use trigger pairs as the basic information-bearing elements. This allows the model to adapt its expectations to the topic of discourse. Next, statistical evidence from multiple sources must be combined. Traditionally, linear interpolation and its variants have been used, but these are shown here to be seriously deficient. Instead, we apply the principle of Maximum Entropy (ME). Each information source gives rise to a set of constraints, to be imposed on the combined estimate. The intersection of these constraints is the set of probability functions which are consistent with all the information sources. The function with the highest entropy within that set is the ME solution. Given consistent statistical evidence, a unique ME solution is guaranteed to exist, and an iterative algorithm exists which is guaranteed to converge to it. The ME framework is extremely general: any phenomenon that can be described in terms of statistics of the text can be readily incorporated. An adaptive language model based on the ME approach was trained on the Wall Street Journal corpus, and showed a 32-39% perplexity reduction over the baseline. When interfaced to SPHINX-II, Carnegie Mellon's speech recognizer, it reduced its error rate by 10-14%. This illustrates the feasibility of incorporating many diverse knowledge sources in a single, unified statistical framework. © 1996 Academic Press Limited
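
The constraint-and-iterate idea in the abstract can be illustrated on a toy problem. The sketch below is not taken from the paper: the abstract does not name its iterative algorithm, so the use of Generalized Iterative Scaling here is an assumption, and the function names `distribution` and `gis`, the four-outcome space, and the two binary features are all hypothetical. Each feature's target expectation plays the role of one constraint, and the loop adjusts one weight per feature until the model's expectations match the targets.

```python
import math


def distribution(feats, lambdas):
    """Return p(x) proportional to exp(sum_i lambda_i * f_i(x))."""
    n = len(feats[0])
    weights = [
        math.exp(sum(lam * f[x] for lam, f in zip(lambdas, feats)))
        for x in range(n)
    ]
    z = sum(weights)
    return [w / z for w in weights]


def gis(features, targets, iterations=500):
    """Toy Generalized Iterative Scaling (illustrative, hypothetical helper).

    features[i][x] is the non-negative value of feature i on outcome x;
    targets[i] is the expectation that feature i must have under the model.
    """
    n = len(features[0])
    # GIS assumes the features sum to the same constant C on every outcome,
    # so a slack feature is appended to make up the difference.
    sums = [sum(f[x] for f in features) for x in range(n)]
    c = max(sums)
    feats = [list(f) for f in features] + [[c - s for s in sums]]
    targs = list(targets) + [c - sum(targets)]

    lambdas = [0.0] * len(feats)  # all-zero weights give the uniform distribution
    for _ in range(iterations):
        p = distribution(feats, lambdas)
        for i, f in enumerate(feats):
            expected = sum(p[x] * f[x] for x in range(n))
            if expected > 0 and targs[i] > 0:
                # Move each weight toward agreement with its constraint.
                lambdas[i] += math.log(targs[i] / expected) / c
    return distribution(feats, lambdas)


# Two assumed constraints over four outcomes:
# feature 0 fires on outcomes 0 and 1 and must have expectation 0.6;
# feature 1 fires on outcomes 0 and 2 and must have expectation 0.5.
p = gis([[1, 1, 0, 0], [1, 0, 1, 0]], [0.6, 0.5])
print([round(v, 4) for v in p])
```

With only these two marginal constraints, the maximum entropy distribution treats the two features as independent, so the result should approach 0.3, 0.3, 0.2, 0.2. A real language model would condition on the history and use far more features; this sketch only shows the mechanics of fitting constraints.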

Pages: 187-228

Page count: 42
