Efficient Data Shapley for Weighted Nearest Neighbor Algorithms

Cited by: 0
Authors
Wang, Jiachen T. [1]
Mittal, Prateek [1]
Jia, Ruoxi [2]
Affiliations
[1] Princeton Univ, Princeton, NJ 08544 USA
[2] Virginia Tech, Blacksburg, VA 24061 USA
Funding
U.S. National Science Foundation
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This work addresses an open problem in the data valuation literature: the efficient computation of Data Shapley for the weighted K nearest neighbor algorithm (WKNN-Shapley). By taking the accuracy of hard-label KNN with discretized weights as the utility function, we reframe the computation of WKNN-Shapley as a counting problem and introduce a quadratic-time algorithm, a notable improvement over O(N^K), the best result in the existing literature. We also develop a deterministic approximation algorithm that further improves computational efficiency while maintaining the key fairness properties of the Shapley value. Through extensive experiments, we demonstrate WKNN-Shapley's computational efficiency and its superior performance in discerning data quality compared to its unweighted counterpart.
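To make the quantity in the abstract concrete, the following is a minimal sketch of Data Shapley computed directly from its definition, with the accuracy of a hard-label weighted KNN classifier on a single test point as the utility. It is not the quadratic-time algorithm from the paper; it is the brute-force enumeration over all subsets, whose exponential cost is what efficient methods avoid. The function names, the one-dimensional toy data, the continuous weight function (the paper uses discretized weights), the tie-breaking rule, and the choice of utility 0 for the empty set are all assumptions made for illustration.

```python
from itertools import combinations
from math import comb


def wknn_utility(subset, train, test_x, test_y, k, weight):
    """Accuracy (0 or 1) of hard-label weighted KNN on one test point.

    `subset` holds indices into `train`; `train[i]` is a (feature, label) pair.
    Assumption: the empty training set has utility 0.
    """
    if not subset:
        return 0.0
    # Keep the k training points closest to the test point.
    nearest = sorted(subset, key=lambda i: abs(train[i][0] - test_x))[:k]
    votes = {}
    for i in nearest:
        x, y = train[i]
        votes[y] = votes.get(y, 0.0) + weight(abs(x - test_x))
    # Assumption: ties go to the smallest label (first in sorted order).
    predicted = max(sorted(votes), key=lambda label: votes[label])
    return 1.0 if predicted == test_y else 0.0


def exact_shapley(train, test_x, test_y, k, weight):
    """Shapley value of every training point by enumerating all subsets."""
    n = len(train)
    values = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for subsets of this size that exclude point i.
            coeff = 1.0 / (n * comb(n - 1, size))
            for s in combinations(others, size):
                with_i = wknn_utility(s + (i,), train, test_x, test_y, k, weight)
                without_i = wknn_utility(s, train, test_x, test_y, k, weight)
                values[i] += coeff * (with_i - without_i)
    return values


# Hypothetical toy data: (feature, label) pairs and one test point with label 1.
train = [(0.1, 1), (0.2, 1), (0.9, 0), (1.5, 0)]


def inverse_distance(d):
    # Illustrative weight function; closer neighbors get larger votes.
    return 1.0 / (1.0 + d)


values = exact_shapley(train, test_x=0.0, test_y=1, k=3, weight=inverse_distance)
full_utility = wknn_utility(tuple(range(len(train))), train, 0.0, 1, 3, inverse_distance)
print(values, full_utility)
```

In this sketch the values sum to the utility of the full training set minus that of the empty set, which is the efficiency property of the Shapley value. The loop visits every subset of the other N-1 points for each training point, so it is only usable for very small N.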
Pages: 39
Related papers
50 in total
  • [1] Efficient Task-Specific Data Valuation for Nearest Neighbor Algorithms
    Jia, Ruoxi
    Dao, David
    Wang, Boxin
    Hubis, Frances Ann
    Gurel, Nezihe Merve
    Li, Bo
    Zhang, Ce
    Spanos, Costas J.
    Song, Dawn
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2019, 12 (11) : 1610 - 1623
  • [2] Efficient nearest neighbor classification with data reduction and fast search algorithms
    Sánchez, JS
    Sotoca, JM
    Pla, F
    2004 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN & CYBERNETICS, VOLS 1-7, 2004, : 4757 - +
  • [3] Weighted nearest neighbor algorithms for the graph exploration problem on cycles
    Asahiro, Yuichi
    Miyano, Eiji
    Miyazaki, Shuichi
    Yoshimuta, Takuro
    INFORMATION PROCESSING LETTERS, 2010, 110 (03) : 93 - 98
  • [4] Weighted nearest neighbor algorithms for the graph exploration problem on cycles
    Asahiro, Yuichi
    Miyano, Eiji
    Miyazaki, Shuichi
    Yoshimuta, Takuro
    SOFSEM 2007: THEORY AND PRACTICE OF COMPUTER SCIENCE, PROCEEDINGS, 2007, 4362 : 164 - +
  • [5] Efficient Algorithms for Bayesian Nearest Neighbor Gaussian Processes
    Finley, Andrew O.
    Datta, Abhirup
    Cook, Bruce D.
    Morton, Douglas C.
    Andersen, Hans E.
    Banerjee, Sudipto
    JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 2019, 28 (02) : 401 - 414
  • [6] AN EFFICIENT PATTERN CLASSIFICATION APPROACH: COMBINATION OF WEIGHTED LDA WITH WEIGHTED NEAREST NEIGHBOR
    Boostani, Reza
    Dehzangi, Omid
    Jarchi, Delaram
    Zolghadri, Mansoor J.
    NEURAL NETWORK WORLD, 2010, 20 (05) : 621 - 635
  • [7] Scalable Nearest Neighbor Algorithms for High Dimensional Data
    Muja, Marius
    Lowe, David G.
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2014, 36 (11) : 2227 - 2240
  • [8] Using weighted nearest neighbor to benefit from unlabeled data
    Driessens, Kurt
    Reutemann, Peter
    Pfahringer, Bernhard
    Leschi, Claire
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS, 2006, 3918 : 60 - 69
  • [9] Space Efficient Data Structures for Nearest Larger Neighbor
    Jayapaul, Varunkumar
    Jo, Seungbum
    Raman, Venkatesh
    Satti, Srinivasa Rao
    COMBINATORIAL ALGORITHMS, IWOCA 2014, 2015, 8986 : 176 - 187
  • [10] Space efficient data structures for nearest larger neighbor
    Jayapaul, Varunkumar
    Jo, Seungbum
    Raman, Rajeev
    Raman, Venkatesh
    Satti, Srinivasa Rao
    JOURNAL OF DISCRETE ALGORITHMS, 2016, 36 : 63 - 75