Spectral Clustering based Active Learning with Applications to Text Classification

Cited by: 1
Authors
Guo, Wenbo [1]
Zhong, Chun [1]
Yang, Yupu [1]
Affiliations
[1] Shanghai Jiao Tong University, Department of Automation, Shanghai, People's Republic of China
DOI
10.1051/matecconf/20165601003
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Active learning is a family of machine learning algorithms that choose for themselves the data samples from which they learn. It is widely used in data mining tasks such as text classification, where large amounts of unlabelled data are available but labels are expensive to obtain. This paper proposes an improved active learning algorithm that exploits the distribution of the dataset to reduce labelling cost and increase accuracy. Before the active learning process begins, a spectral clustering algorithm divides the dataset into two clusters, and the instances located at the boundary between the two clusters are labelled to train the initial classifier. To reduce computational cost, an incremental method is added to the algorithm. The algorithm is applied to several text classification problems, and the results show that it is more effective and more accurate than the traditional active learning algorithm.
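The seeding step described in the abstract (a two-way spectral split, then labelling the instances at the boundary) could be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which affinity, eigen-solver, or boundary criterion the paper uses. Here the affinity is Gaussian, the second eigenvector is approximated by power iteration in pure Python, and "boundary" is taken to mean a small absolute value in that eigenvector. The names fiedler_embedding and select_boundary_seeds, and the parameters sigma, iters and per_cluster, are hypothetical.

```python
import math
import random


def fiedler_embedding(points, sigma=1.0, iters=300, seed=0):
    """Approximate the second eigenvector of the normalized affinity matrix.

    Assumption: Gaussian affinities and plain power iteration; the abstract
    does not specify either choice.
    """
    n = len(points)
    # Gaussian (RBF) affinity matrix with no self-loops.
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            w[i][j] = w[j][i] = math.exp(-dist2 / (2.0 * sigma ** 2))
    deg = [sum(row) for row in w]
    inv_sqrt = [1.0 / math.sqrt(d) for d in deg]
    # S = D^{-1/2} W D^{-1/2}; its top eigenvector is proportional to sqrt(deg).
    s = [[inv_sqrt[i] * w[i][j] * inv_sqrt[j] for j in range(n)]
         for i in range(n)]
    norm = math.sqrt(sum(deg))
    top = [math.sqrt(d) / norm for d in deg]

    rng = random.Random(seed)
    v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    for _ in range(iters):
        # Deflate: remove the component along the trivial top eigenvector.
        proj = sum(a * b for a, b in zip(v, top))
        v = [a - proj * b for a, b in zip(v, top)]
        # One power-iteration step on S + I (the shift keeps eigenvalues >= 0).
        v = [sum(s[i][j] * v[j] for j in range(n)) + v[i] for i in range(n)]
        length = math.sqrt(sum(a * a for a in v))
        v = [a / length for a in v]
    # Rescale to the random-walk eigenvector, whose sign splits the data.
    return [inv_sqrt[i] * v[i] for i in range(n)]


def select_boundary_seeds(points, per_cluster=1, sigma=1.0):
    """Split points into two clusters and pick the instances nearest the cut."""
    u = fiedler_embedding(points, sigma=sigma)
    labels = [1 if value > 0 else 0 for value in u]
    seeds = []
    for cluster in (0, 1):
        members = [i for i in range(len(points)) if labels[i] == cluster]
        # Assumption: a small |u_i| marks a point close to the cluster boundary.
        members.sort(key=lambda i: abs(u[i]))
        seeds.extend(members[:per_cluster])
    return labels, sorted(seeds)


# Toy one-dimensional data: two groups separated by a gap between 1.0 and 3.0.
points = [(0.0,), (0.5,), (1.0,), (3.0,), (3.5,), (4.0,)]
labels, seeds = select_boundary_seeds(points)
print(labels, seeds)
```

On this toy data the selected seeds are the two points on either side of the gap (1.0 and 3.0). In the setting the abstract describes, those instances would be sent for labelling and used to train the initial classifier before the active learning loop starts. A real text corpus would need a sparse document representation and a library eigen-solver rather than this dense pure-Python loop.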
Pages: 5
Related papers
50 in total
  • [1] Support vector machine active learning with applications to text classification
    Tong, S
    Koller, D
    JOURNAL OF MACHINE LEARNING RESEARCH, 2002, 2 (01) : 45 - 66
  • [2] Active Learning Based on Transfer Learning Techniques for Text Classification
    Onita, Daniela
    IEEE ACCESS, 2023, 11 : 28751 - 28761
  • [3] Text classification with active learning
    Novak, B
    Mladenic, D
    Grobelnik, M
    FROM DATA AND INFORMATION ANALYSIS TO KNOWLEDGE ENGINEERING, 2006, : 398 - +
  • [4] Learning ontologies to improve text clustering and classification
    Bloehdorn, S
    Cimiano, P
    Hotho, A
    FROM DATA AND INFORMATION ANALYSIS TO KNOWLEDGE ENGINEERING, 2006, : 334 - +
  • [5] An Active Learning Approach to Frequent Itemset-Based Text Clustering
    Marcacini, Ricardo M.
    Correa, Geraldo N.
    Rezende, Solange O.
    2012 21ST INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR 2012), 2012, : 3529 - 3532
  • [6] Active learning for text classification with reusability
    Hu, Rong
    Mac Namee, Brian
    Delany, Sarah Jane
    EXPERT SYSTEMS WITH APPLICATIONS, 2016, 45 : 438 - 449
  • [7] Active Learning for Turkish Text Classification
    Sapci, Ali Osman Berk
    Tastan, Oznur
    Yeniterzi, Reyyan
    2020 28TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2020,
  • [8] Deep Active Learning for Text Classification
    An, Bang
    Wu, Wenjun
    Han, Huimin
    PROCEEDINGS OF THE 2ND INTERNATIONAL CONFERENCE ON VISION, IMAGE AND SIGNAL PROCESSING (ICVISP 2018), 2018,
  • [9] ITERATIVE CLUSTERING BASED ACTIVE LEARNING FOR HYPERSPECTRAL IMAGE CLASSIFICATION
    Lu, Ting
    Li, Shutao
    Benediktsson, Jon Atli
    2017 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS), 2017, : 3664 - 3667
  • [10] Clustering Based Feature Selection using Extreme Learning Machines for Text Classification
    Roul, Rajendra Kumar
    Gugnani, Shashank
    Kalpeshbhai, Shah Mit
    2015 ANNUAL IEEE INDIA CONFERENCE (INDICON), 2015,