A Modified Support Vector Clustering Method for Document Categorization

Times cited: 0
Authors
Harish, B. S. [1 ]
Revanasiddappa, M. B. [1 ]
Kumar, S. V. Aruna [1 ]
Affiliations
[1] JSS Sci & Technol Univ, Dept Informat Sci & Engn, Mysuru, Karnataka, India
Keywords
text categorization; support vector clustering; Fuzzy C-Means; term document matrix
DOI
Not available
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification codes
081203 ; 0835 ;
Abstract
In this paper, we propose a novel text categorization method based on a modified Support Vector Clustering (SVC). SVC is a density-based clustering approach that handles arbitrarily shaped clusters effectively. The main drawback of traditional SVC is that it treats unclassified documents as outliers. To overcome this problem, we employ Fuzzy C-Means (FCM) to cluster the unclassified documents. The modified SVC (SVC-FCM) is then applied to categorize text documents. The proposed method consists of three steps. In the first step, Regularized Locality Preserving Indexing (RLPI) is applied to the Term Document Matrix (TDM) to reduce the dimensionality of the features. In the second step, we use SVC to find the base-cluster centers of the documents. Finally, we use FCM to cluster the unclassified documents. To evaluate the performance of the proposed method, we conducted experiments on the standard 20-NewsGroup dataset.
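The final step of the abstract relies on Fuzzy C-Means. As a rough illustration only, the standard FCM update rules can be sketched in plain Python as below. This is not the authors' implementation: the function name `fuzzy_c_means`, the fuzzifier `m=2.0`, the tolerance, the random initialization, and the toy two-dimensional data are all assumptions made for the example, and the RLPI and SVC steps are not shown.

```python
import math
import random


def fuzzy_c_means(points, n_clusters, m=2.0, max_iter=100, tol=1e-5, seed=0):
    """Textbook Fuzzy C-Means; returns (centers, memberships).

    Hypothetical helper for illustration; parameter defaults are assumptions.
    """
    rng = random.Random(seed)
    dim = len(points[0])
    # Random initial memberships; each row is normalized to sum to 1.
    U = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        U.append([u / total for u in row])
    centers = []
    for _ in range(max_iter):
        # Center update: mean of the points weighted by membership ** m.
        centers = []
        for j in range(n_clusters):
            weights = [U[i][j] ** m for i in range(len(points))]
            total = sum(weights)
            centers.append([
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dim)
            ])
        # Membership update: proportional to distance ** (-2 / (m - 1)).
        shift = 0.0
        for i, p in enumerate(points):
            inv = [max(math.dist(p, c), 1e-12) ** (-2.0 / (m - 1.0))
                   for c in centers]
            total = sum(inv)
            new_row = [v / total for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, U[i])))
            U[i] = new_row
        # Stop once no membership changed by more than the tolerance.
        if shift < tol:
            break
    return centers, U


# Toy data standing in for reduced document vectors (an assumption).
rng = random.Random(42)
blob_a = [[rng.gauss(0.0, 0.1), rng.gauss(0.0, 0.1)] for _ in range(20)]
blob_b = [[rng.gauss(5.0, 0.1), rng.gauss(5.0, 0.1)] for _ in range(20)]
centers, U = fuzzy_c_means(blob_a + blob_b, n_clusters=2)
# Hard labels: the cluster with the highest membership for each point.
labels = [row.index(max(row)) for row in U]
```

Each row of `U` holds one point's degrees of membership across all clusters, so a document is never discarded as an outlier: it receives a soft assignment to every cluster, and a hard label can be taken from the largest membership when one is needed.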
Pages: 1 / 5
Number of pages: 5
Related papers
50 in total
  • [1] Web Document Categorization by Support Vector Clustering
    Shi, Daming
    Tsui, Ming Hei
    Liu, Jigang
    [J]. 2008 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS (SMC), VOLS 1-6, 2008, : 1482 - 1487
  • [2] Hierarchically SVM classification based on support vector clustering method and its application to document categorization
    Hao, Pei-Yi
    Chiang, Jung-Hsien
    Tu, Yi-Kun
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2007, 33 (03) : 627 - 635
  • [3] Document categorization using support vector machines
    Villasana, Sergio
    Seijas, Cesar
    Caralli, Antonino
    Jimenez, Jesus
    Pacheco, Jose
    [J]. INGENIERIA UC, 2008, 15 (03): : 45 - 52
  • [4] Document clustering method using dimension reduction and support vector clustering to overcome sparseness
    Jun, Sunghae
    Park, Sang-Sung
    Jang, Dong-Sik
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2014, 41 (07) : 3204 - 3212
  • [5] A modified fuzzy clustering for documents retrieval: application to document categorization
    Nefti, S.
    Oussalah, M.
    Rezgui, Y.
    [J]. JOURNAL OF THE OPERATIONAL RESEARCH SOCIETY, 2009, 60 (03) : 384 - 394
  • [6] A support vector method for clustering
    Ben-Hur, A
    Horn, D
    Siegelmann, HT
    Vapnik, V
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 13, 2001, 13 : 367 - 373
  • [7] A support vector clustering method
    Ben-Hur, A
    Horn, D
    Siegelmann, HT
    Vapnik, V
    [J]. 15TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL 2, PROCEEDINGS: PATTERN RECOGNITION AND NEURAL NETWORKS, 2000, : 724 - 727
  • [8] Improving Support Vector Data Description for Document Clustering
    Wang, Ziqiang
    Sun, Xia
    [J]. ADVANCES IN FUTURE COMPUTER AND CONTROL SYSTEMS, VOL 2, 2012, 160 : 271 - 276
  • [9] A decomposition method for Support Vector Clustering
    Saradhi, VV
    Karnik, H
    Mitra, P
    [J]. 2005 INTERNATIONAL CONFERENCE ON INTELLIGENT SENSING AND INFORMATION PROCESSING, PROCEEDINGS, 2005, : 268 - 271
  • [10] Incremental fuzzy clustering for document categorization
    Mei, Jian-Ping
    Wang, Yangtao
    Chen, Lihui
    Miao, Chunyan
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE), 2014, : 1518 - 1525