Spectral Clustering based Active Learning with Applications to Text Classification

Cited by: 1
Authors
Guo, Wenbo [1]
Zhong, Chun [1]
Yang, Yupu [1]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Automat, Shanghai, Peoples R China
DOI
10.1051/matecconf/20165601003
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Active learning is a class of machine learning algorithms that choose for themselves the data samples from which they learn. It is widely used in data mining fields such as text classification, where large amounts of unlabelled data are available but labels are hard to obtain. This paper proposes an improved active learning algorithm that takes advantage of the distribution of the dataset to reduce the labelling cost and increase accuracy. Before the active learning process begins, a spectral clustering algorithm divides the dataset into two categories, and instances located at the boundary between the two categories are labelled to train the initial classifier. To reduce the computational cost, an incremental method is added to the algorithm. The algorithm is applied to several text classification problems, and the results show that it is more effective and more accurate than the traditional active learning algorithm.
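The pipeline outlined in the abstract (two-way spectral clustering, labelling boundary instances as seeds, then active learning) can be illustrated with a minimal sketch. This is not the authors' implementation: the record gives no algorithmic details, so everything concrete here is an assumption. In particular, the RBF affinity with parameter `gamma`, the use of the Fiedler vector's magnitude as a "closeness to the boundary" score, the nearest-centroid classifier, the margin-based query rule, and all function names are hypothetical choices made for illustration. The incremental update mentioned in the abstract is not covered, and synthetic 2-D points stand in for text features.

```python
import numpy as np

def spectral_bipartition(X, gamma=0.5):
    """Two-way spectral clustering via the second eigenvector of the
    normalized graph Laplacian (assumed RBF affinity)."""
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-gamma * sq_dists)
    np.fill_diagonal(W, 0.0)
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L_sym = np.eye(len(X)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(L_sym)        # eigenvalues in ascending order
    fiedler = eigvecs[:, 1] * d_inv_sqrt      # second-smallest eigenvector, rescaled
    return (fiedler > 0).astype(int), fiedler

def boundary_seeds(clusters, fiedler, per_cluster):
    """Assumption: points whose Fiedler value is closest to zero lie nearest
    the cut between the two clusters. Take a few from each side."""
    seeds = []
    for c in (0, 1):
        members = np.flatnonzero(clusters == c)
        order = np.argsort(np.abs(fiedler[members]))
        seeds.extend(members[order[:per_cluster]].tolist())
    return seeds

def centroid_predict(X_lab, y_lab, X_query):
    """Nearest-centroid classifier; the margin is small for uncertain points.
    Assumes both classes are present among the labelled points."""
    c0 = X_lab[y_lab == 0].mean(axis=0)
    c1 = X_lab[y_lab == 1].mean(axis=0)
    d0 = np.linalg.norm(X_query - c0, axis=1)
    d1 = np.linalg.norm(X_query - c1, axis=1)
    return (d1 < d0).astype(int), np.abs(d0 - d1)

def active_learning(X, oracle, seeds, n_queries=10):
    """Label the seeds, then repeatedly query the least certain pool point."""
    labels = {i: oracle(i) for i in seeds}
    for _ in range(n_queries):
        lab = np.array(list(labels))
        y_lab = np.array([labels[i] for i in lab])
        pool = np.array([i for i in range(len(X)) if i not in labels])
        _, margin = centroid_predict(X[lab], y_lab, X[pool])
        query = int(pool[np.argmin(margin)])
        labels[query] = oracle(query)
    lab = np.array(list(labels))
    return lab, np.array([labels[i] for i in lab])

# Synthetic stand-in data: two well-separated Gaussian blobs.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([-2.0, 0.0], 0.5, size=(40, 2)),
               rng.normal([2.0, 0.0], 0.5, size=(40, 2))])
y_true = np.repeat([0, 1], 40)

clusters, fiedler = spectral_bipartition(X)
seeds = boundary_seeds(clusters, fiedler, per_cluster=3)
lab_idx, lab_y = active_learning(X, oracle=lambda i: int(y_true[i]), seeds=seeds)
pred, _ = centroid_predict(X[lab_idx], lab_y, X)
accuracy = float(np.mean(pred == y_true))
```

For real text classification, `X` would typically be a document-feature matrix (for example TF-IDF vectors) and `oracle` would be a human annotator; both substitutions are outside what the abstract specifies.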
Pages: 5
Related papers
50 in total
  • [21] LSCALE: Latent Space Clustering-Based Active Learning for Node Classification
    Liu, Juncheng
    Wang, Yiwei
    Hooi, Bryan
    Yang, Renchi
    Xiao, Xiaokui
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2022, PT I, 2023, 13713 : 55 - 70
  • [22] Deep active learning for multi label text classification
    Wang, Qunbo
    Zhang, Hangu
    Zhang, Wentao
    Dai, Lin
    Liang, Yu
    Shi, Haobin
    SCIENTIFIC REPORTS, 2024, 14 (01):
  • [23] Active Learning of Constraints for Semi-supervised Text Clustering
    Huang, Ruizhang
    Lam, Wai
    Zhang, Zhigang
    PROCEEDINGS OF THE SEVENTH SIAM INTERNATIONAL CONFERENCE ON DATA MINING, 2007, : 113 - 124
  • [24] Deep Active Learning for Text Classification with Diverse Interpretations
    Liu, Qiang
    Zhu, Yanqiao
    Liu, Zhaocheng
    Zhang, Yufeng
    Wu, Shu
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT, CIKM 2021, 2021, : 3263 - 3267
  • [25] Using Active Learning in Text Classification of Quranic Sciences
    Goudjil, Mohamed
    Bedda, Mouldi
    Koudil, Mouloud
    Ghoggali, Noureddine
    2013 TAIBAH UNIVERSITY INTERNATIONAL CONFERENCE ON ADVANCES IN INFORMATION TECHNOLOGY FOR THE HOLY QURAN AND ITS SCIENCES, 2013, : 209 - 213
  • [26] Active Learning for Text Classification and Fake News Detection
    Sahan, Marko
    Smidl, Vaclav
    Marik, Radek
    2021 INTERNATIONAL SYMPOSIUM ON COMPUTER SCIENCE AND INTELLIGENT CONTROLS (ISCSIC 2021), 2021, : 87 - 94
  • [27] Barrage Text Classification with Improved Active Learning and CNN
    Qiu, Ningjia
    Cong, Lin
    Zhou, Sicheng
    Wang, Peng
    JOURNAL OF ADVANCED COMPUTATIONAL INTELLIGENCE AND INTELLIGENT INFORMATICS, 2019, 23 (06) : 980 - 989
  • [28] An Improved KNN Text Classification Algorithm Based on Clustering
    Zhou Yong
    Li Youwen
    Xia Shixiong
    JOURNAL OF COMPUTERS, 2009, 4 (03) : 230 - 237
  • [29] A Novel Fuzzy Based Clustering Algorithm for Text Classification
    Mohan, A. Krishna
    Rao, V. V. Narasimha
    Prasad, M. H. M. Krishna
    INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2013, 13 (05) : 100 - 107
  • [30] A Text Classification Algorithm Based on Rocchio and Hierarchical Clustering
    Zeng, Anping
    Huang, Yongping
    ADVANCED INTELLIGENT COMPUTING, 2011, 6838 : 432 - +