On the selection of the globally optimal prototype subset for nearest-neighbor classification

Cited by: 8
Authors
Carrizosa, Emilio [1]
Martin-Barragan, Belen
Plastria, Frank
Morales, Dolores Romero
Affiliations
[1] Univ Seville, Fac Matemat, Seville 41012, Spain
[2] Univ Carlos III Madrid, Dept Estadist, Madrid 28903, Spain
[3] Vrije Univ Brussels, Dept Math Operat Res Stat & Informat Syst Managem, MOSI, B-1050 Brussels, Belgium
[4] Univ Oxford, Said Sch Business, Oxford OX1 1HP, England
Keywords
classification; optimal prototype subset; nearest neighbor; dissimilarities; integer programming; variable neighborhood search; missing values
DOI
10.1287/ijoc.1060.0183
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
The nearest-neighbor classifier has been shown to be a powerful tool for multiclass classification. We explore both theoretical properties and empirical behavior of a variant method, in which the nearest-neighbor rule is applied to a reduced set of prototypes. This set is selected a priori by fixing its cardinality and minimizing the empirical misclassification cost. In this way we alleviate the two serious drawbacks of the nearest-neighbor method: high storage requirements and time-consuming queries. Finding this reduced set is shown to be NP-hard. We provide mixed integer programming (MIP) formulations, which are theoretically compared and solved by a standard MIP solver for small problem instances. We show that the classifiers derived from these formulations are comparable to benchmark procedures. We solve large problem instances by a metaheuristic that yields good classification rules in reasonable time. Additional experiments indicate that prototype-based nearest-neighbor classifiers remain quite stable in the presence of missing values.
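The selection problem described in the abstract can be illustrated on a toy instance. The following is a minimal sketch, not the authors' MIP formulations or their metaheuristic: it enumerates every subset of a fixed cardinality and keeps the one with the lowest empirical misclassification cost. The function names, the squared Euclidean dissimilarity, the unit cost per misclassified point, and the sample data are all assumptions made for illustration. Exhaustive enumeration is only viable for very small data sets, which is consistent with the abstract's remark that the problem is NP-hard.

```python
from itertools import combinations

def nn_label(x, prototypes):
    """Return the label of the prototype closest to x.

    Assumption: squared Euclidean distance is used as the dissimilarity.
    """
    nearest = min(
        prototypes,
        key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p[0])),
    )
    return nearest[1]

def misclassification_cost(data, prototypes):
    """Count training points whose nearest prototype has a different label.

    Assumption: every misclassification has unit cost.
    """
    return sum(1 for x, y in data if nn_label(x, prototypes) != y)

def best_prototype_subset(data, k):
    """Exhaustively search all prototype subsets of cardinality k."""
    best_cost, best_subset = None, None
    for subset in combinations(data, k):
        cost = misclassification_cost(data, subset)
        if best_cost is None or cost < best_cost:
            best_cost, best_subset = cost, subset
    return best_subset, best_cost

# Hypothetical two-class data: (feature vector, label)
data = [
    ((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((0.2, 0.1), "a"),
    ((1.0, 1.0), "b"), ((0.9, 1.1), "b"), ((1.1, 0.9), "b"),
]
subset, cost = best_prototype_subset(data, 2)
print(subset, cost)
```

On this toy data, two prototypes (one per class) are enough to classify every training point correctly, whereas a single prototype necessarily misclassifies one whole class. Queries against the reduced set then compare against 2 stored points instead of 6.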
Pages: 470-479
Number of pages: 10