Document categorization using support vector machines

Authors
Villasana, Sergio [1]
Seijas, Cesar [1]
Caralli, Antonino [2]
Jimenez, Jesus [3]
Pacheco, Jose [4]
Affiliations
[1] Univ Carabobo, Fac Ingn, CITAEC, Valencia, Venezuela
[2] Univ Carabobo, Fac Ingn, Ctr Invest & Bioingn, Valencia, Venezuela
[3] Univ Carabobo, Fac Ingn, Dept Matemat, Estudios Basicos, Valencia, Venezuela
[4] Univ Carabobo, Fac Ingn, CPI, Valencia, Venezuela
Source
INGENIERIA UC | 2008 / Vol. 15 / No. 03
Keywords
support vector machine; text categorization; string kernel
DOI
Not available
Chinese Library Classification (CLC) number
T [Industrial Technology]
Discipline classification code
08
Abstract
In this study, a document categorizer based on nu-SVC and a string kernel was developed to classify undergraduate final projects. The training corpus was built from the single words and compound (two-word) phrases most representative of the two selected areas, Electrical Engineering and Civil Engineering, extracted from the document titles. The test set comprised the titles of all undergraduate final projects from 1997 through 2006. After tuning the parameters of the nu-SVC and the string kernel, the classifier performed well. The results indicate that support vector machines have considerable potential for text classification.
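The pipeline described in the abstract (short phrases as training data, titles as test data, a nu-SVC with a string kernel) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say which string kernel was used, so a cosine-normalised p-spectrum kernel (shared character substrings of length p) is assumed here. The training phrases, the test titles, the label names, and the values `p=3` and `nu=0.5` are all hypothetical. The classifier is scikit-learn's `NuSVC` with a precomputed Gram matrix.

```python
from collections import Counter

import numpy as np
from sklearn.svm import NuSVC


def spectrum_features(text, p=3):
    """Count every contiguous character substring of length p (lowercased)."""
    s = text.lower()
    return Counter(s[i:i + p] for i in range(len(s) - p + 1))


def spectrum_kernel(a, b, p=3):
    """Assumed string kernel: cosine-normalised p-spectrum kernel."""
    fa, fb = spectrum_features(a, p), spectrum_features(b, p)
    dot = sum(count * fb[gram_] for gram_, count in fa.items())
    norm = np.sqrt(sum(c * c for c in fa.values()) * sum(c * c for c in fb.values()))
    return float(dot / norm) if norm else 0.0


def gram(rows, cols, p=3):
    """Kernel matrix with one row per string in `rows`, one column per string in `cols`."""
    return np.array([[spectrum_kernel(r, c, p) for c in cols] for r in rows])


# Hypothetical training corpus: representative single words and two-word phrases.
train = [
    "power transformer", "electric motor", "voltage regulator", "transmission line",
    "reinforced concrete", "steel beam", "soil foundation", "bridge structure",
]
labels = ["electrical"] * 4 + ["civil"] * 4

# nu and p are the tunable parameters; the values here are placeholders.
clf = NuSVC(nu=0.5, kernel="precomputed")
clf.fit(gram(train, train), labels)

# Hypothetical test titles; the test Gram matrix has shape (n_test, n_train).
titles = [
    "Design of a voltage regulator for an electric motor",
    "Analysis of reinforced concrete beams in a bridge",
]
predicted = clf.predict(gram(titles, train))
print(list(zip(titles, predicted)))
```

Tuning in this setting would amount to repeating the fit over a grid of `nu` and `p` values and comparing accuracy on held-out titles.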
Pages: 45 - 52
Page count: 8