Upper and Lower Tight Error Bounds for Feature Omission with an Extension to Context Reduction

Cited by: 0
Authors
Schlueter, Ralf [1]
Beck, Eugen [1]
Ney, Hermann [1]
Affiliation
[1] Rhein Westfal TH Aachen, Dept Comp Sci, Human Language Technol & Pattern Recognit, Ahornstr 55, D-52056 Aachen, Germany
Funding
European Research Council;
Keywords
Error bound; Bayes error; feature selection; language model; perplexity; context reduction; pattern classification; sequence classification; LANGUAGE; RECOGNITION;
DOI
10.1109/TPAMI.2017.2788434
CLC classification
TP18 [Artificial intelligence theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this work, fundamental analytic results in the form of error bounds are presented that quantify the effect of feature omission and selection for pattern classification in general, as well as the effect of context reduction in string classification, such as automatic speech recognition, printed/handwritten character recognition, or statistical machine translation. A general simulation framework is introduced that supports the discovery and proof of error bounds, and which led to the error bounds presented here. Initially derived tight lower and upper bounds for feature omission are generalized to feature selection, followed by a further extension to context reduction of string class priors (also known as language models) in string classification. For string classification, the quantitative effect of string class prior context reduction on the symbol-level Bayes error is presented. As further simulations indicate, the tightness of the original feature omission bounds appears to be lost in this case. However, when feature omission and context reduction are combined, the tightness of the bounds is retained. A central result of this work is the proof of the existence, and a quantification, of a statistical threshold with respect to the introduction of additional features in general pattern classification, or the increase of context in string classification, beyond which a decrease in Bayes error is guaranteed.
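As background for the quantities the abstract refers to, the sketch below is an illustrative toy computation under assumed alphabet sizes and a random joint distribution; it is not the paper's simulation framework or its bounds. It computes the Bayes error of a discrete class/feature distribution once with both features and once after omitting one feature, reflecting the well-known fact that feature omission can never decrease the Bayes error, whose quantitative effect the paper's bounds address.

```python
# Minimal illustration only (assumed toy setup, not the authors' simulation
# framework): compute the Bayes error of a small discrete joint distribution
# p(c, x1, x2) exactly, with both features and after omitting x2, to show that
# feature omission can never decrease the Bayes error.
import numpy as np

rng = np.random.default_rng(0)

n_classes, n_x1, n_x2 = 3, 4, 4   # class and feature alphabet sizes (arbitrary)
p = rng.random((n_classes, n_x1, n_x2))
p /= p.sum()                      # normalize to a proper joint distribution

# Bayes error with both features: 1 - sum_{x1,x2} max_c p(c, x1, x2)
bayes_full = 1.0 - p.max(axis=0).sum()

# Omit x2: classify on the marginal p(c, x1) = sum_{x2} p(c, x1, x2)
p_marginal = p.sum(axis=2)
bayes_omitted = 1.0 - p_marginal.max(axis=0).sum()

print(f"Bayes error, both features: {bayes_full:.4f}")
print(f"Bayes error, x2 omitted   : {bayes_omitted:.4f}")
assert bayes_omitted >= bayes_full - 1e-12  # omission never lowers the Bayes error
```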
Pages: 502-514
Number of pages: 13
Related papers
50 records in total
  • [1] Beck, Eugen; Schlueter, Ralf; Ney, Hermann. Error Bounds for Context Reduction and Feature Omission. 16th Annual Conference of the International Speech Communication Association (INTERSPEECH 2015), Vols 1-5, 2015: 1280-1284.
  • [2] AviItzhak, H; Diep, T. Arbitrarily tight upper and lower bounds on the Bayesian probability of error. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1996, 18(1): 89-91.
  • [3] Kosolobov, Dmitry. Tight lower bounds for the longest common extension problem. Information Processing Letters, 2017, 125: 26-29.
  • [4] Yang, Xiaocheng; Yang, Zhenyi; Yan, Jingye; Wu, Lin; Jiang, Mingfeng; Lyu, Wentao. Reduction of the Reconstruction Error With Lower and Upper Bounds in Synthetic Aperture Imaging Radiometers. IEEE Access, 2020, 8: 156964-156971.
  • [5] Badkobeh, Golnaz; Gawrychowski, Pawel; Kaerkkaeinen, Juha; Puglisi, Simon J.; Zhukova, Bella. Tight upper and lower bounds on suffix tree breadth. Theoretical Computer Science, 2021, 854: 63-67.
  • [6] Xiang, Yingchang; Zhang, Jiguang; Chen, Dechang; Fries, Michael A. Convergence of Upper and Lower Bounds on the Bayes Error. Theoretical and Mathematical Foundations of Computer Science, 2011, 164: 534+.
  • [8] Helmersson, Anders. LTV Model Reduction With Upper Error Bounds. IEEE Transactions on Automatic Control, 2009, 54(7): 1450-1462.
  • [9] Borsos, Bertalan; Kovács, Attila; Tihanyi, Norbert. Tight upper and lower bounds for the reciprocal sum of Proth primes. The Ramanujan Journal, 2022, 59: 181-198.
  • [10] Hashemi, Yoones; Banihashemi, Amir H. Tight Lower and Upper Bounds on the Minimum Distance of LDPC Codes. IEEE Communications Letters, 2018, 22(1): 33-36.