Upper and Lower Tight Error Bounds for Feature Omission with an Extension to Context Reduction

Cited by: 0
Authors
Schlueter, Ralf [1]
Beck, Eugen [1]
Ney, Hermann [1]
Affiliation
[1] Rhein Westfal TH Aachen, Dept Comp Sci, Human Language Technol & Pattern Recognit, Ahornstr 55, D-52056 Aachen, Germany
Funding
European Research Council;
Keywords
Error bound; Bayes error; feature selection; language model; perplexity; context reduction; pattern classification; sequence classification; LANGUAGE; RECOGNITION;
DOI
10.1109/TPAMI.2017.2788434
CLC classification
TP18 [Artificial intelligence theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this work, fundamental analytic results in the form of error bounds are presented that quantify the effect of feature omission and selection for pattern classification in general, as well as the effect of context reduction in string classification, such as automatic speech recognition, printed/handwritten character recognition, or statistical machine translation. A general simulation framework is introduced that supports the discovery and proof of error bounds, and which led to the error bounds presented here. Initially derived tight lower and upper bounds for feature omission are generalized to feature selection, followed by a further extension to context reduction of string class priors (also known as language models) in string classification. For string classification, the quantitative effect of string class prior context reduction on the symbol-level Bayes error is presented. As further simulations indicate, the tightness of the original feature omission bounds appears to be lost in this case. However, when feature omission and context reduction are combined, the tightness of the bounds is retained. A central result of this work is the proof of the existence, and a quantification, of a statistical threshold with respect to the introduction of additional features in general pattern classification, or the increase of context in string classification, beyond which a decrease in Bayes error is guaranteed.
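As background for the quantities the abstract refers to, the sketch below is an illustrative toy computation under assumed alphabet sizes and a random joint distribution; it is not the paper's simulation framework or its bounds. It computes the Bayes error of a discrete class/feature distribution once with both features and once after omitting one feature, reflecting the well-known fact that feature omission can never decrease the Bayes error, whose quantitative effect the paper's bounds address.

```python
# Minimal illustration only (assumed toy setup, not the authors' simulation
# framework): compute the Bayes error of a small discrete joint distribution
# p(c, x1, x2) exactly, with both features and after omitting x2, to show that
# feature omission can never decrease the Bayes error.
import numpy as np

rng = np.random.default_rng(0)

n_classes, n_x1, n_x2 = 3, 4, 4   # class and feature alphabet sizes (arbitrary)
p = rng.random((n_classes, n_x1, n_x2))
p /= p.sum()                      # normalize to a proper joint distribution

# Bayes error with both features: 1 - sum_{x1,x2} max_c p(c, x1, x2)
bayes_full = 1.0 - p.max(axis=0).sum()

# Omit x2: classify on the marginal p(c, x1) = sum_{x2} p(c, x1, x2)
p_marginal = p.sum(axis=2)
bayes_omitted = 1.0 - p_marginal.max(axis=0).sum()

print(f"Bayes error, both features: {bayes_full:.4f}")
print(f"Bayes error, x2 omitted   : {bayes_omitted:.4f}")
assert bayes_omitted >= bayes_full - 1e-12  # omission never lowers the Bayes error
```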
Pages: 502-514
Number of pages: 13
Related papers
50 records in total
  • [1] Beck, Eugen; Schlueter, Ralf; Ney, Hermann. Error Bounds for Context Reduction and Feature Omission. 16th Annual Conference of the International Speech Communication Association (INTERSPEECH 2015), Vols 1-5, 2015: 1280-1284.
  • [2] AviItzhak, H; Diep, T. Arbitrarily tight upper and lower bounds on the Bayesian probability of error. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1996, 18(1): 89-91.
  • [3] Kosolobov, Dmitry. Tight lower bounds for the longest common extension problem. Information Processing Letters, 2017, 125: 26-29.
  • [4] Yang, Xiaocheng; Yang, Zhenyi; Yan, Jingye; Wu, Lin; Jiang, Mingfeng; Lyu, Wentao. Reduction of the Reconstruction Error With Lower and Upper Bounds in Synthetic Aperture Imaging Radiometers. IEEE Access, 2020, 8: 156964-156971.
  • [5] Badkobeh, Golnaz; Gawrychowski, Pawel; Kaerkkaeinen, Juha; Puglisi, Simon J.; Zhukova, Bella. Tight upper and lower bounds on suffix tree breadth. Theoretical Computer Science, 2021, 854: 63-67.
  • [6] Xiang, Yingchang; Zhang, Jiguang; Chen, Dechang; Fries, Michael A. Convergence of Upper and Lower Bounds on the Bayes Error. Theoretical and Mathematical Foundations of Computer Science, 2011, 164: 534+.
  • [8] Helmersson, Anders. LTV Model Reduction With Upper Error Bounds. IEEE Transactions on Automatic Control, 2009, 54(7): 1450-1462.
  • [9] Borsos, Bertalan; Kovács, Attila; Tihanyi, Norbert. Tight upper and lower bounds for the reciprocal sum of Proth primes. The Ramanujan Journal, 2022, 59: 181-198.
  • [10] Hashemi, Yoones; Banihashemi, Amir H. Tight Lower and Upper Bounds on the Minimum Distance of LDPC Codes. IEEE Communications Letters, 2018, 22(1): 33-36.