Efficient Data Shapley for Weighted Nearest Neighbor Algorithms

Authors
Wang, Jiachen T. [1]
Mittal, Prateek [1]
Jia, Ruoxi [2]
Affiliations
[1] Princeton Univ, Princeton, NJ 08544 USA
[2] Virginia Tech, Blacksburg, VA 24061 USA
Funding
National Science Foundation (USA)
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This work aims to address an open problem in the data valuation literature concerning the efficient computation of Data Shapley for the weighted K nearest neighbor algorithm (WKNN-Shapley). By considering the accuracy of hard-label KNN with discretized weights as the utility function, we reframe the computation of WKNN-Shapley into a counting problem and introduce a quadratic-time algorithm, presenting a notable improvement from O(N^K), the best result from existing literature. We develop a deterministic approximation algorithm that further improves computational efficiency while maintaining the key fairness properties of the Shapley value. Through extensive experiments, we demonstrate WKNN-Shapley's computational efficiency and its superior performance in discerning data quality compared to its unweighted counterpart.
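To make the quantity in the abstract concrete, the following is a minimal sketch of Data Shapley under a hard-label weighted KNN utility, computed by brute-force subset enumeration on a toy 1-D dataset. It is not the paper's quadratic-time algorithm; it only illustrates the value that such an algorithm would compute. The function names, the integer weight rule `round(10 / (1 + distance))`, the tie-breaking rule, and the toy data are all assumptions made for illustration.

```python
from itertools import combinations
from math import comb


def wknn_utility(subset, train, query, k):
    """Return 1.0 if hard-label weighted KNN on `subset` classifies `query` correctly, else 0.0."""
    if not subset:
        return 0.0  # assumption: the empty training set has zero utility
    x_q, y_q = query
    # Keep the k points of the subset that are nearest to the query.
    nearest = sorted(subset, key=lambda i: abs(train[i][0] - x_q))[:k]
    votes = {}
    for i in nearest:
        x, y = train[i]
        # Hypothetical discretized weight: an integer level that decays with distance.
        weight = round(10 / (1 + abs(x - x_q)))
        votes[y] = votes.get(y, 0) + weight
    # Assumed tie-breaking rule: the smallest label wins a tie.
    predicted = max(sorted(votes), key=lambda label: votes[label])
    return 1.0 if predicted == y_q else 0.0


def exact_shapley(train, query, k):
    """Exact Shapley value of every training point by enumerating all subsets (exponential time)."""
    n = len(train)
    values = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for subsets of this size that exclude point i.
            coef = 1.0 / (n * comb(n - 1, size))
            for s in combinations(others, size):
                gain = wknn_utility(s + (i,), train, query, k) - wknn_utility(s, train, query, k)
                values[i] += coef * gain
    return values


# Toy data: (feature, label) pairs and a single query point.
train = [(0.1, 1), (0.4, 0), (0.5, 1), (2.0, 0)]
query = (0.0, 1)
print(exact_shapley(train, query, k=2))
```

The enumeration visits every subset of the training set, so it is only usable for a handful of points; that cost is the motivation for the counting-based reformulation the abstract describes.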
Pages: 39