Efficient Data Shapley for Weighted Nearest Neighbor Algorithms

Authors
Wang, Jiachen T. [1]
Mittal, Prateek [1]
Jia, Ruoxi [2]
Affiliations
[1] Princeton Univ, Princeton, NJ 08544 USA
[2] Virginia Tech, Blacksburg, VA 24061 USA
Funding
U.S. National Science Foundation
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This work aims to address an open problem in the data valuation literature concerning the efficient computation of Data Shapley for the weighted K nearest neighbor algorithm (WKNN-Shapley). By considering the accuracy of hard-label KNN with discretized weights as the utility function, we reframe the computation of WKNN-Shapley into a counting problem and introduce a quadratic-time algorithm, presenting a notable improvement from O(N^K), the best result from existing literature. We develop a deterministic approximation algorithm that further improves computational efficiency while maintaining the key fairness properties of the Shapley value. Through extensive experiments, we demonstrate WKNN-Shapley's computational efficiency and its superior performance in discerning data quality compared to its unweighted counterpart.
Pages: 39
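To make the quantity described in the abstract concrete, the following is a minimal brute-force sketch of Data Shapley with a hard-label weighted KNN utility on a single test point. It is not the paper's quadratic-time algorithm: it enumerates every subset and is therefore exponential in the number of training points. The function names, the one-dimensional toy data, the inverse-distance weight, the tie-breaking rule, and the choice of utility 0 for the empty set are all assumptions made for illustration, and the weights here are not discretized.

```python
from itertools import combinations
from math import comb


def wknn_utility(subset, train, test_point, k):
    """Return 1.0 if weighted KNN on `subset` classifies the test point correctly, else 0.0.

    Assumptions (illustrative only): 1-D features, weight 1 / (1 + distance),
    ties broken toward the smallest label, and utility 0.0 for the empty set.
    """
    if not subset:
        return 0.0
    x_test, y_test = test_point
    # Keep the k selected training points closest to the test point.
    nearest = sorted(subset, key=lambda i: abs(train[i][0] - x_test))[:k]
    votes = {}
    for i in nearest:
        x, y = train[i]
        votes[y] = votes.get(y, 0.0) + 1.0 / (1.0 + abs(x - x_test))
    # Hard-label prediction: the label with the largest total weight.
    pred = max(sorted(votes), key=lambda label: votes[label])
    return 1.0 if pred == y_test else 0.0


def exact_shapley(train, test_point, k):
    """Exact Shapley values by enumerating all subsets (exponential time)."""
    n = len(train)
    values = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for subsets of this size that exclude point i.
            coeff = 1.0 / (n * comb(n - 1, size))
            for s in combinations(others, size):
                gain = (wknn_utility(s + (i,), train, test_point, k)
                        - wknn_utility(s, train, test_point, k))
                values[i] += coeff * gain
    return values


# Toy data: (feature, label) pairs and one labeled test point.
train = [(0.1, 1), (0.4, 0), (0.5, 1), (0.9, 0)]
test_point = (0.2, 1)
values = exact_shapley(train, test_point, k=3)
full_utility = wknn_utility(tuple(range(len(train))), train, test_point, 3)
print(values, full_utility)
```

By the efficiency property of the Shapley value, the printed values sum to the utility of the full training set (minus the empty-set utility, which is 0 under the assumption above). This is one of the fairness properties the abstract refers to.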