A New Samples Selecting Method based on K Nearest Neighbors

Cited by: 0
Authors
Yang, Kai [1]
Cai, Yi [1]
Cai, Zhiwei [1]
Tan, Xingwei [1]
Xie, Haoran [2]
Wong, Tak Lam [2]
Chan, Wai Hong [2]
Affiliations
[1] South China Univ Technol, Sch Software, Guangzhou, Guangdong, Peoples R China
[2] Educ Univ Hong Kong, Dept Math & Informat Technol, Hong Kong, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Short text classification is a supervised learning task that requires a large amount of labeled training data, and labeling that data consumes considerable human effort. In traditional supervised learning problems, active learning can reduce the number of samples that must be labeled manually by selecting the most representative samples to stand in for the whole training set. Uncertainty sampling is the most popular active learning strategy, but it performs poorly when outliers are present. In this paper, we propose a new sampling method for training sets of short text, denoted Top-K Representative (TKR). Optimizing the TKR objective is an NP-hard problem, so we propose a new greedy algorithm that obtains approximate results. Experiments show that the proposed sampling method performs better than state-of-the-art methods.
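The abstract does not define the TKR objective or its greedy algorithm, so the sketch below is not the authors' method. It only illustrates the general idea of greedy, neighbor-based representative selection under an assumed objective: each sample "covers" itself and its k nearest neighbors, and each greedy step picks the sample that covers the most not-yet-covered points. The function names `knn_sets` and `greedy_representatives`, the Euclidean distance, and the toy 2-D points are all hypothetical choices made for illustration.

```python
import math


def knn_sets(points, k):
    """For each point, return the set holding its own index and its k nearest neighbors."""
    neighborhoods = []
    for i, p in enumerate(points):
        # Rank all other points by Euclidean distance to p (ties broken by index).
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: (math.dist(p, points[j]), j),
        )
        neighborhoods.append({i, *others[:k]})
    return neighborhoods


def greedy_representatives(points, k, budget):
    """Greedily choose up to `budget` indices, each adding the most newly covered points.

    Assumed objective (not taken from the paper): maximize the number of points
    that fall in the k-NN neighborhood of at least one chosen sample.
    """
    neighborhoods = knn_sets(points, k)
    covered, chosen = set(), []
    for _ in range(budget):
        # Pick the unchosen sample with the largest marginal coverage;
        # the lowest index wins ties.
        best = max(
            (i for i in range(len(points)) if i not in chosen),
            key=lambda i: (len(neighborhoods[i] - covered), -i),
            default=None,
        )
        if best is None or not (neighborhoods[best] - covered):
            break  # nothing left to gain
        chosen.append(best)
        covered |= neighborhoods[best]
    return chosen


# Toy data: two tight clusters and one isolated point (index 6).
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
    (20.0, 20.0),
]
selected = greedy_representatives(points, k=2, budget=2)
print(selected)
```

With a budget of two, this sketch selects one sample from each cluster and leaves the isolated point for last, which hints at why a representativeness criterion may be less sensitive to outliers than uncertainty sampling. In a short-text setting, the points would be replaced by some text representation and a suitable distance; both are left open here.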
Pages: 457-462
Page count: 6