A source coding approach to classification by vector quantization and the principle of minimum description length

Cited by: 3
Authors:
Li, J [1]
Affiliations:
[1] Penn State Univ, Dept Stat, University Pk, PA 16802 USA
Keywords:
DOI:
10.1109/DCC.2002.999978
Chinese Library Classification (CLC):
TP [Automation technology; computer technology];
Discipline code:
0812 ;
Abstract:
An algorithm for supervised classification using vector quantization and entropy coding is presented. The classification rule is formed from a set of training data {(X_i, Y_i)}, i = 1, ..., n, which are independent samples from a joint distribution P_XY. Based on the principle of Minimum Description Length (MDL), a statistical model that approximates the distribution P_XY ought to enable efficient coding of X and Y. On the other hand, we expect a system that encodes (X, Y) efficiently to provide ample information on the distribution P_XY. This information can then be used to classify X, i.e., to predict the corresponding Y based on X. To encode both X and Y, a two-stage vector quantizer is applied to X and a Huffman code is formed for Y conditioned on each quantized value of X. The optimization of the encoder is equivalent to the design of a vector quantizer with an objective function reflecting the joint penalty of quantization error and misclassification rate. This vector quantizer provides an estimate of the conditional distribution of Y given X, which in turn yields an approximation to the Bayes classification rule. This algorithm, namely Discriminant Vector Quantization (DVQ), is compared with Learning Vector Quantization (LVQ) and CART® on a number of data sets. DVQ outperforms the other two on several data sets. The relation between DVQ, density estimation, and regression is also discussed.
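The classification idea in the abstract — partition the X-space with a quantizer, estimate the conditional distribution of Y within each cell, and classify by the most likely label — can be illustrated with a short sketch. This is not the paper's algorithm: the names `train_dvq` and `classify` are hypothetical, the codebook is fit with plain k-means on X alone, and the joint MDL objective (quantization error plus the code length of Y under a Huffman code per cell) is omitted.

```python
from collections import Counter

def train_dvq(X, Y, k=2, iters=20):
    """Quantize X with a k-means-style codebook, then tabulate the
    empirical label distribution within each quantizer cell."""
    centroids = list(X[:k])  # naive init: first k training points
    cells = [[] for _ in range(k)]
    for _ in range(iters):
        cells = [[] for _ in range(k)]
        # assign each training pair to its nearest centroid's cell
        for x, y in zip(X, Y):
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2
                                      for a, b in zip(x, centroids[c])))
            cells[j].append((x, y))
        # move each centroid to the mean of its cell
        for j, cell in enumerate(cells):
            if cell:
                d = len(cell[0][0])
                centroids[j] = tuple(sum(x[t] for x, _ in cell) / len(cell)
                                     for t in range(d))
    # empirical P(Y | cell): the plug-in approximation to the Bayes rule
    cond = [Counter(y for _, y in cell) for cell in cells]
    return centroids, cond

def classify(x, centroids, cond):
    """Predict Y for a new x: find its cell, return the majority label."""
    j = min(range(len(centroids)),
            key=lambda c: sum((a - b) ** 2
                              for a, b in zip(x, centroids[c])))
    return cond[j].most_common(1)[0][0] if cond[j] else None
```

On two well-separated clusters with labels 0 and 1, `classify` reproduces the cluster labels for nearby query points; the paper's contribution is to replace the pure-distortion codebook design above with one whose objective also charges for the bits needed to encode Y, tying the quantizer to the classification task.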
Pages: 382 - 391
Page count: 10