On the selection of the globally optimal prototype subset for nearest-neighbor classification

Cited by: 8
Authors
Carrizosa, Emilio [1]
Martin-Barragan, Belen
Plastria, Frank
Morales, Dolores Romero
Affiliations
[1] Univ Seville, Fac Matemat, Seville 41012, Spain
[2] Univ Carlos III Madrid, Dept Estadist, Madrid 28903, Spain
[3] Vrije Univ Brussels, Dept Math Operat Res Stat & Informat Syst Managem, MOSI, B-1050 Brussels, Belgium
[4] Univ Oxford, Said Sch Business, Oxford OX1 1HP, England
Keywords
classification; optimal prototype subset; nearest neighbor; dissimilarities; integer programming; variable neighborhood search; missing values
DOI
10.1287/ijoc.1060.0183
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
The nearest-neighbor classifier has been shown to be a powerful tool for multiclass classification. We explore both theoretical properties and empirical behavior of a variant method, in which the nearest-neighbor rule is applied to a reduced set of prototypes. This set is selected a priori by fixing its cardinality and minimizing the empirical misclassification cost. In this way we alleviate the two serious drawbacks of the nearest-neighbor method: high storage requirements and time-consuming queries. Finding this reduced set is shown to be NP-hard. We provide mixed integer programming (MIP) formulations, which are theoretically compared and solved by a standard MIP solver for small problem instances. We show that the classifiers derived from these formulations are comparable to benchmark procedures. We solve large problem instances by a metaheuristic that yields good classification rules in reasonable time. Additional experiments indicate that prototype-based nearest-neighbor classifiers remain quite stable in the presence of missing values.
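The selection problem described in the abstract can be illustrated with a minimal sketch. It is not the paper's MIP formulation or its metaheuristic: it simply enumerates every subset of a fixed cardinality and keeps the one with the lowest empirical misclassification cost, which is only feasible for tiny data sets. The function names, the Euclidean distance, the unit misclassification cost, and the toy data are all assumptions made for illustration.

```python
from itertools import combinations
import math

# Toy training set (hypothetical): pairs of (feature vector, class label).
data = [
    ((0.0, 0.0), "a"), ((0.2, 0.1), "a"), ((0.1, 0.3), "a"),
    ((2.0, 2.0), "b"), ((2.1, 1.8), "b"), ((1.9, 2.2), "b"),
]

def nearest_prototype_label(x, prototypes):
    """Label of the prototype closest to x (Euclidean distance is an assumption)."""
    closest = min(prototypes, key=lambda p: math.dist(x, p[0]))
    return closest[1]

def misclassification_cost(points, prototypes):
    """Number of points whose nearest prototype has a different label (unit cost assumed)."""
    return sum(1 for x, y in points if nearest_prototype_label(x, prototypes) != y)

def best_prototype_subset(points, k):
    """Exhaustively search all subsets of size k; exponential, for illustration only."""
    best_subset, best_cost = None, math.inf
    for subset in combinations(points, k):
        cost = misclassification_cost(points, subset)
        if cost < best_cost:
            best_subset, best_cost = subset, cost
    return list(best_subset), best_cost

prototypes, cost = best_prototype_subset(data, k=2)
print(prototypes, cost)
```

Because the number of candidate subsets grows combinatorially with the training set size, this brute-force search only makes the objective concrete; the abstract states that the problem is NP-hard and is addressed with MIP formulations for small instances and a metaheuristic for large ones.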
Pages: 470-479
Number of pages: 10
Related papers
50 in total
  • [21] Asymptotically-Optimal Topological Nearest-Neighbor Filtering
    Sandstrom, Read
    Denny, Jory
    Amato, Nancy M.
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2020, 5 (04) : 6916 - 6923
  • [22] Fuzzy-rough nearest-neighbor classification approach
    Bian, HY
    Mazlack, L
    NAFIPS'2003: 22ND INTERNATIONAL CONFERENCE OF THE NORTH AMERICAN FUZZY INFORMATION PROCESSING SOCIETY - NAFIPS PROCEEDINGS, 2003, : 500 - 505
  • [23] Integrating background knowledge into nearest-neighbor text classification
    Zelikovitz, S
    Hirsh, H
    ADVANCES IN CASE-BASED REASONING, 2002, 2416 : 1 - 5
  • [24] Adaptive k-nearest-neighbor classification using a dynamic number of nearest neighbors
    Ougiaroglou, Stefanos
    Nanopoulos, Alexandros
    Papadopoulos, Apostolos N.
    Manolopoulos, Yannis
    Welzer-Druzovec, Tatjana
    ADVANCES IN DATABASES AND INFORMATION SYSTEMS, PROCEEDINGS, 2007, 4690 : 66 - +
  • [25] A new nearest-neighbor rule in the pattern classification problem
    Hattori, K
    Takahashi, M
    PATTERN RECOGNITION, 1999, 32 (03) : 425 - 432
  • [26] COMPUTING NEAREST-NEIGHBOR PATTERN-CLASSIFICATION PERCEPTRONS
    MURPHY, O
    BROOKS, B
    KITE, T
    INFORMATION SCIENCES, 1995, 83 (3-4) : 133 - 142
  • [27] A MULTISTAGE GENERALIZATION OF THE RANK NEAREST-NEIGHBOR CLASSIFICATION RULE
    BAGUI, SC
    PAL, NR
    PATTERN RECOGNITION LETTERS, 1995, 16 (06) : 601 - 614
  • [28] ON NEAREST-NEIGHBOR GRAPHS
    PATERSON, MS
    YAO, FF
    LECTURE NOTES IN COMPUTER SCIENCE, 1992, 623 : 416 - 426
  • [29] On Nearest-Neighbor Graphs
    D. Eppstein
    M. S. Paterson
    F. F. Yao
    Discrete & Computational Geometry, 1997, 17 : 263 - 282
  • [30] Selection of the optimal prototype subset for 1-NN classification
    Lipowezky, U
    PATTERN RECOGNITION LETTERS, 1998, 19 (10) : 907 - 918