Semi-supervised Text Categorization Using Recursive K-means Clustering

Cited by: 1
Authors
Gowda, Harsha S. [1]
Suhil, Mahamad [1]
Guru, D. S. [1]
Raju, Lavanya Narayana [1]
Affiliation
[1] Univ Mysore, Dept Studies Comp Sci, Mysore, Karnataka, India
Keywords
Unlabeled text documents; Recursive K-means algorithm; Semi-supervised learning; Text categorization; CLASSIFICATION; DOCUMENTS; EM
DOI
10.1007/978-981-10-4859-3_20
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we present a semi-supervised learning algorithm for classifying text documents, together with a method for labeling unlabeled text documents. The method follows a divide-and-conquer strategy: it uses a recursive K-means algorithm to partition the combined collection of labeled and unlabeled documents. K-means is applied recursively to each partition until every partition contains labeled documents of a single class. Once these clusters are obtained, their centroids serve as cluster representatives, and the nearest neighbor rule is used to classify an unknown text document. A series of experiments on the 20Newsgroups dataset compares the proposed model with other recent state-of-the-art models, and the authors report that the proposed model performs better.
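The procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the choice of k = 2 per split, the toy 2-D points standing in for document vectors, the handling of clusters with no labeled documents, and the majority-label fallback when a cluster cannot be split are all assumptions made here.

```python
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))

def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's K-means; returns a cluster index for each point."""
    centroids = rng.sample(points, k)
    assign = [0] * len(points)
    for _ in range(iters):
        assign = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old centroid if a cluster goes empty
                centroids[c] = mean(members)
    return assign

def recursive_kmeans(points, labels, rng, k=2):
    """Split until each cluster holds labeled points of at most one class.

    labels[i] is a class name, or None for an unlabeled point.
    Returns a list of (centroid, class_label) representatives.
    """
    classes = {l for l in labels if l is not None}
    if len(classes) <= 1:
        # Pure cluster (assumption: a cluster with no labels gets None).
        return [(mean(points), next(iter(classes)) if classes else None)]
    assign = kmeans(points, min(k, len(points)), rng)
    groups = [[i for i, a in enumerate(assign) if a == c] for c in range(k)]
    groups = [g for g in groups if g]
    if len(groups) < 2:
        # Assumption: if K-means cannot separate the points, stop and
        # fall back to the majority label instead of recursing forever.
        known = [l for l in labels if l is not None]
        return [(mean(points), max(set(known), key=known.count))]
    reps = []
    for g in groups:
        reps += recursive_kmeans([points[i] for i in g],
                                 [labels[i] for i in g], rng, k)
    return reps

def classify(x, reps):
    """Nearest neighbor rule over the labeled cluster centroids."""
    labeled = [r for r in reps if r[1] is not None]
    return min(labeled, key=lambda r: sq_dist(x, r[0]))[1]

# Toy data: two labeled points and four unlabeled ones.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = ["a", None, None, "b", None, None]
reps = recursive_kmeans(points, labels, random.Random(0))
print(classify((0.5, 0.5), reps), classify((10.5, 10.5), reps))
```

In a real setting the points would be document feature vectors (for example term-weight vectors), and the distance measure and the number of clusters per split would need to follow the paper rather than the choices made in this sketch.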
Pages: 217-227
Page count: 11