Classification using Dirichlet priors when the training data are mislabeled

Cited by: 0
Authors:
Lynch, RS [1]
Willett, PK [1]
Affiliations:
[1] Naval Undersea Warfare Ctr, Newport, RI 02841 USA
DOI: 10.1109/ICASSP.1999.761387
Chinese Library Classification: O42 [Acoustics]
Discipline codes: 070206; 082403
Abstract:
The average probability of error is used to demonstrate the performance of a Bayesian classification test (referred to as the Combined Bayes Test (CBT)) when the training data of each class are mislabeled. The CBT combines the information in discrete training and test data to infer symbol probabilities, where a uniform Dirichlet prior (i.e., a noninformative prior of complete ignorance) is assumed for all classes. Using this prior, it is shown how classification performance degrades when mislabeling exists in the training data, with a severity that depends on the values of the mislabeling probabilities. However, an increase in the mislabeling probabilities is also shown to cause an increase in M* (i.e., the best quantization fineness). Further, even when the actual mislabeling probabilities are known by the CBT, it is not possible to achieve the classification performance obtainable without mislabeling.
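As a rough illustration of the kind of combined Dirichlet-multinomial scoring the abstract describes, the sketch below classifies discrete (quantized) test data by combining it with each class's training counts under a uniform Dirichlet(1, ..., 1) prior over the M symbols. It is a minimal sketch only: the function names, the two-class example counts, and the optional class priors are assumptions for illustration, not the authors' exact CBT formulation.

import math
from typing import Dict, List, Optional

def log_dirichlet_multinomial(train_counts: List[int], test_counts: List[int]) -> float:
    """Log predictive probability of the test symbol counts for one class,
    given that class's training counts and a uniform Dirichlet(1, ..., 1)
    prior on the M symbol probabilities. The multinomial coefficient of the
    test counts is omitted because it is the same for every class and
    cancels in the comparison."""
    M = len(train_counts)
    N = sum(train_counts)   # total training observations for this class
    T = sum(test_counts)    # total test observations
    log_p = math.lgamma(M + N) - math.lgamma(M + N + T)
    for n_k, t_k in zip(train_counts, test_counts):
        log_p += math.lgamma(n_k + t_k + 1) - math.lgamma(n_k + 1)
    return log_p

def classify(test_counts: List[int],
             train_counts_by_class: Dict[str, List[int]],
             priors: Optional[Dict[str, float]] = None) -> str:
    """Pick the class that maximizes (class prior) * (predictive probability
    of the test counts combined with that class's training counts)."""
    scores = {}
    for label, train_counts in train_counts_by_class.items():
        score = log_dirichlet_multinomial(train_counts, test_counts)
        if priors is not None:
            score += math.log(priors[label])
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical example: M = 3 quantization cells, two classes.
train = {"H0": [8, 1, 1], "H1": [2, 3, 5]}
test = [0, 1, 3]
print(classify(test, train))  # -> "H1"

Mislabeling of the training data, the subject of the paper, would corrupt the per-class counts fed to such a test; the abstract's observations about degraded performance and the shift in the best quantization fineness M* are with respect to that effect.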
Pages: 2973-2976
Page count: 4