A Modified Support Vector Clustering Method for Document Categorization

Authors
Harish, B. S. [1]
Revanasiddappa, M. B. [1]
Kumar, S. V. Aruna [1]
Affiliations
[1] JSS Sci & Technol Univ, Dept Informat Sci & Engn, Mysuru, Karnataka, India
Keywords
text categorization; support vector clustering; fuzzy C-Means; term document matrix;
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
In this paper, we propose a novel text categorization method based on a modified Support Vector Clustering (SVC). SVC is a density-based clustering approach that handles arbitrarily shaped clusters effectively. The main drawback of traditional SVC is that it treats unclassified documents as outliers. To overcome this problem, we employ Fuzzy C-Means (FCM) to cluster the unclassified documents, and we apply the resulting modified SVC (SVC-FCM) to categorize text documents. The proposed method consists of three steps. In the first step, Regularized Locality Preserving Indexing (RLPI) is applied to the Term Document Matrix (TDM) to reduce the dimensionality of the features. In the second step, we use SVC to find the base-cluster centers of the documents. In the third step, we use FCM to cluster the unclassified documents. To evaluate the performance of the proposed method, we conducted experiments on the standard 20-NewsGroup dataset.
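The FCM step named in the abstract can be illustrated with a minimal sketch of the standard Fuzzy C-Means updates in NumPy. This is not the authors' implementation: the abstract does not say how FCM is initialized or which distance it uses, so the function names, the Euclidean distance, the fuzzifier m = 2, and the choice to seed FCM with precomputed centers (standing in for the SVC base-cluster centers) are all assumptions made for illustration.

```python
import numpy as np


def fcm_memberships(X, centers, m=2.0):
    """Standard FCM membership update: u_ij is proportional to d_ij^(-2/(m-1))."""
    # Pairwise Euclidean distances, shape (n_points, n_clusters).
    dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    dist = np.fmax(dist, 1e-12)  # avoid division by zero at a center
    inv = dist ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)  # each row sums to 1


def fuzzy_c_means(X, init_centers, m=2.0, n_iter=100, tol=1e-6):
    """Run FCM from given initial centers (hypothetical stand-in for SVC output)."""
    X = np.asarray(X, dtype=float)
    centers = np.asarray(init_centers, dtype=float)
    U = fcm_memberships(X, centers, m)
    for _ in range(n_iter):
        W = U ** m
        # Center update: membership-weighted mean of the points.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        U_new = fcm_memberships(X, centers, m)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U
```

In a pipeline like the one described, X would hold the reduced feature vectors of the documents that SVC left unclassified, and each such document could be assigned to the cluster with the largest membership, for example with `U.argmax(axis=1)`.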
Pages: 1 / 5
Page count: 5