Hierarchically SVM classification based on support vector clustering method and its application to document categorization

Times cited: 73
Authors
Hao, Pei-Yi [1]
Chiang, Jung-Hsien
Tu, Yi-Kun
Affiliations
[1] Natl Kaohsiung Univ Appl Sci, Dept Informat Management, Kaohsiung, Taiwan
[2] Natl Cheng Kung Univ, Dept Comp Sci & Informat Engn, Tainan 70101, Taiwan
Keywords
information retrieval; document categorization; hierarchical classification; support vector machines; support vector clustering method; machine learning;
DOI
10.1016/j.eswa.2006.06.009
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Automatic categorization of documents into pre-defined topic hierarchies or taxonomies is a crucial step in knowledge and content management. Standard machine learning techniques such as support vector machines and related large margin methods have been applied successfully to this task, although they ignore the inter-class relationships. In the context of document categorization, we face a large number of classes and a huge number of relevant features needed to distinguish between them, so the computational cost of training a classifier for a problem of this size is prohibitive. It has also been observed that obtaining a classifier that discriminates between two groups of classes is much easier than distinguishing simultaneously among all classes. This has prompted substantial research into using hierarchical classifiers to address single multi-class problems. In this paper, we propose a novel hierarchical classification method that generalizes support vector machine learning, is based on the results of the support vector clustering method, and is structured in a way that mirrors the class hierarchy. Compared with previous non-hierarchical SVM classifiers and well-known document categorization systems, the proposed hierarchical SVM classification achieves better classification accuracy on the standard Reuters corpus. © 2006 Elsevier Ltd. All rights reserved.
Pages: 627-635
Number of pages: 9