Classification using Dirichlet priors when the training data are mislabeled

Cited by: 0
Authors:
Lynch, RS [1]
Willett, PK [1]
Affiliations:
[1] Naval Undersea Warfare Ctr, Newport, RI 02841 USA
DOI: 10.1109/ICASSP.1999.761387
Chinese Library Classification: O42 [Acoustics]
Discipline codes: 070206; 082403
Abstract:
The average probability of error is used to demonstrate the performance of a Bayesian classification test (referred to as the Combined Bayes Test (CBT)) when the training data of each class are mislabeled. The CBT combines the information in discrete training and test data to infer symbol probabilities, where a uniform Dirichlet prior (i.e., a noninformative prior of complete ignorance) is assumed for all classes. Using this prior, it is shown how classification performance degrades when mislabeling exists in the training data, with a severity that depends on the values of the mislabeling probabilities. However, an increase in the mislabeling probabilities is also shown to cause an increase in M* (i.e., the best quantization fineness). Further, even when the actual mislabeling probabilities are known by the CBT, it is not possible to achieve the classification performance obtainable without mislabeling.
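As a rough illustration of the kind of combined Dirichlet-multinomial scoring the abstract describes, the sketch below classifies discrete (quantized) test data by combining it with each class's training counts under a uniform Dirichlet(1, ..., 1) prior over the M symbols. It is a minimal sketch only: the function names, the two-class example counts, and the optional class priors are assumptions for illustration, not the authors' exact CBT formulation.

import math
from typing import Dict, List, Optional

def log_dirichlet_multinomial(train_counts: List[int], test_counts: List[int]) -> float:
    """Log predictive probability of the test symbol counts for one class,
    given that class's training counts and a uniform Dirichlet(1, ..., 1)
    prior on the M symbol probabilities. The multinomial coefficient of the
    test counts is omitted because it is the same for every class and
    cancels in the comparison."""
    M = len(train_counts)
    N = sum(train_counts)   # total training observations for this class
    T = sum(test_counts)    # total test observations
    log_p = math.lgamma(M + N) - math.lgamma(M + N + T)
    for n_k, t_k in zip(train_counts, test_counts):
        log_p += math.lgamma(n_k + t_k + 1) - math.lgamma(n_k + 1)
    return log_p

def classify(test_counts: List[int],
             train_counts_by_class: Dict[str, List[int]],
             priors: Optional[Dict[str, float]] = None) -> str:
    """Pick the class that maximizes (class prior) * (predictive probability
    of the test counts combined with that class's training counts)."""
    scores = {}
    for label, train_counts in train_counts_by_class.items():
        score = log_dirichlet_multinomial(train_counts, test_counts)
        if priors is not None:
            score += math.log(priors[label])
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical example: M = 3 quantization cells, two classes.
train = {"H0": [8, 1, 1], "H1": [2, 3, 5]}
test = [0, 1, 3]
print(classify(test, train))  # -> "H1"

Mislabeling of the training data, the subject of the paper, would corrupt the per-class counts fed to such a test; the abstract's observations about degraded performance and the shift in the best quantization fineness M* are with respect to that effect.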
Pages: 2973-2976
Page count: 4