A New Samples Selecting Method based on K Nearest Neighbors

Cited by: 0
Authors
Yang, Kai [1 ]
Cai, Yi [1 ]
Cai, Zhiwei [1 ]
Tan, Xingwei [1 ]
Xie, Haoran [2 ]
Wong, Tak Lam [2 ]
Chan, Wai Hong [2 ]
Affiliations
[1] South China Univ Technol, Sch Software, Guangzhou, Guangdong, Peoples R China
[2] Educ Univ Hong Kong, Dept Math & Informat Technol, Hong Kong, Hong Kong, Peoples R China
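Funding
National Natural Science Foundation of China;
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract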
Short text classification is a supervised learning task that requires a large amount of labeled training data, and producing those labels consumes considerable human effort. In traditional supervised learning problems, active learning can reduce the number of samples that must be labeled manually by selecting the most representative samples to stand in for the whole training set. Uncertainty sampling is the most popular approach in active learning, but it performs poorly when affected by outliers. In this paper, we propose a new sampling method for training sets containing short text, denoted Top-K Representative (TKR). However, optimizing TKR is an NP-hard problem. To address this, we propose a new algorithm, based on the greedy algorithm, that obtains approximate results. Experiments show that the proposed sampling method performs better than state-of-the-art methods.
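The abstract does not state the TKR objective or the details of the greedy procedure, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes that a sample "represents" itself and its k nearest neighbors, and that the goal is to cover as many samples as possible with a fixed labeling budget, which is a maximum-coverage style problem commonly approximated greedily. The names `knn_sets`, `greedy_representatives`, and `budget` are hypothetical, and Euclidean distance on toy 2-D points stands in for whatever text similarity the paper uses.

```python
from math import dist


def knn_sets(points, k):
    """For each point, return the index set of its k nearest neighbors (itself excluded)."""
    neighbors = []
    for i, p in enumerate(points):
        others = (j for j in range(len(points)) if j != i)
        # Sort by distance, breaking ties by index so the result is deterministic.
        order = sorted(others, key=lambda j: (dist(p, points[j]), j))
        neighbors.append(set(order[:k]))
    return neighbors


def greedy_representatives(points, k, budget):
    """Greedily pick up to `budget` samples to label.

    Assumption (not from the paper): a chosen sample covers itself and its
    k nearest neighbors. Each step picks the sample that covers the most
    samples not yet covered, and stops early once nothing new can be covered.
    """
    neighbors = knn_sets(points, k)
    covered = set()
    chosen = []
    for _ in range(budget):
        best, best_gain = None, 0
        for i in range(len(points)):
            if i in chosen:
                continue
            gain = len((neighbors[i] | {i}) - covered)
            if gain > best_gain:  # strict ">" keeps the lowest index on ties
                best, best_gain = i, gain
        if best is None:
            break
        chosen.append(best)
        covered |= neighbors[best] | {best}
    return chosen


# Toy data: two well-separated clusters of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
selected = greedy_representatives(points, k=2, budget=2)
print(selected)
```

With this toy input the sketch selects one sample from each cluster. Because a candidate is scored by how many neighbors it covers, an isolated outlier tends to be picked late, which is one plausible reading of why a representativeness criterion could be less sensitive to outliers than uncertainty sampling.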
Pages: 457-462
Number of pages: 6