A source coding approach to classification by vector quantization and the principle of minimum description length

Cited by: 3
Authors:
Li, J [1]
Affiliations:
[1] Penn State Univ, Dept Stat, University Pk, PA 16802 USA
Keywords:
DOI:
10.1109/DCC.2002.999978
Chinese Library Classification (CLC):
TP [Automation technology; computer technology];
Discipline code:
0812 ;
Abstract:
An algorithm for supervised classification using vector quantization and entropy coding is presented. The classification rule is formed from a set of training data {(X_i, Y_i)}, i = 1, ..., n, which are independent samples from a joint distribution P_XY. Based on the principle of Minimum Description Length (MDL), a statistical model that approximates the distribution P_XY ought to enable efficient coding of X and Y. On the other hand, we expect a system that encodes (X, Y) efficiently to provide ample information on the distribution P_XY. This information can then be used to classify X, i.e., to predict the corresponding Y based on X. To encode both X and Y, a two-stage vector quantizer is applied to X and a Huffman code is formed for Y conditioned on each quantized value of X. The optimization of the encoder is equivalent to the design of a vector quantizer with an objective function reflecting the joint penalty of quantization error and misclassification rate. This vector quantizer provides an estimate of the conditional distribution of Y given X, which in turn yields an approximation to the Bayes classification rule. This algorithm, namely Discriminant Vector Quantization (DVQ), is compared with Learning Vector Quantization (LVQ) and CART® on a number of data sets. DVQ outperforms the other two on several data sets. The relation between DVQ, density estimation, and regression is also discussed.
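The classification idea in the abstract — partition the X-space with a quantizer, estimate the conditional distribution of Y within each cell, and classify by the most likely label — can be illustrated with a short sketch. This is not the paper's algorithm: the names `train_dvq` and `classify` are hypothetical, the codebook is fit with plain k-means on X alone, and the joint MDL objective (quantization error plus the code length of Y under a Huffman code per cell) is omitted.

```python
from collections import Counter

def train_dvq(X, Y, k=2, iters=20):
    """Quantize X with a k-means-style codebook, then tabulate the
    empirical label distribution within each quantizer cell."""
    centroids = list(X[:k])  # naive init: first k training points
    cells = [[] for _ in range(k)]
    for _ in range(iters):
        cells = [[] for _ in range(k)]
        # assign each training pair to its nearest centroid's cell
        for x, y in zip(X, Y):
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2
                                      for a, b in zip(x, centroids[c])))
            cells[j].append((x, y))
        # move each centroid to the mean of its cell
        for j, cell in enumerate(cells):
            if cell:
                d = len(cell[0][0])
                centroids[j] = tuple(sum(x[t] for x, _ in cell) / len(cell)
                                     for t in range(d))
    # empirical P(Y | cell): the plug-in approximation to the Bayes rule
    cond = [Counter(y for _, y in cell) for cell in cells]
    return centroids, cond

def classify(x, centroids, cond):
    """Predict Y for a new x: find its cell, return the majority label."""
    j = min(range(len(centroids)),
            key=lambda c: sum((a - b) ** 2
                              for a, b in zip(x, centroids[c])))
    return cond[j].most_common(1)[0][0] if cond[j] else None
```

On two well-separated clusters with labels 0 and 1, `classify` reproduces the cluster labels for nearby query points; the paper's contribution is to replace the pure-distortion codebook design above with one whose objective also charges for the bits needed to encode Y, tying the quantizer to the classification task.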
Pages: 382 - 391
Page count: 10