A wrapper approach with support vector machines for text categorization

Cited by: 0
Authors
Montanés, E [1]
Quevedo, JR [1]
Díaz, I [1]
Affiliations
[1] Univ Oviedo, Ctr Artificial Intelligence, Gijon, Asturias, Spain
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Text Categorization (TC), the assignment of predefined categories to the documents of a corpus, plays an important role in a wide variety of information organization and management tasks in Information Retrieval (IR). It involves handling a large amount of information, some of which may be noisy or irrelevant; hence, a prior feature reduction step can improve classification performance. In this paper we propose a wrapper approach. Wrappers are usually time-consuming and sometimes infeasible. Our wrapper, however, explores a reduced number of feature subsets and uses Support Vector Machines (SVM) as the evaluation system; these two properties make it fast enough to deal with the large number of features present in text domains. Using the Reuters-21578 corpus, we also compare this wrapper with the feature reduction approach most widely applied in TC, which consists of filtering according to scoring measures.
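The general idea in the abstract, a wrapper that scores only a few candidate feature subsets with an SVM, can be illustrated with a minimal sketch. Everything below is an assumption for illustration, not the authors' method: the abstract does not say how the subsets are generated, so this sketch uses nested top-k subsets taken from a simple filter ranking. The scoring function, the Pegasos-style linear SVM trainer, the synthetic corpus (not Reuters-21578), and all function names are hypothetical.

```python
import random

def train_linear_svm(X, y, lam=0.01, epochs=20):
    """Pegasos-style subgradient descent for a linear SVM (hinge loss, L2 penalty)."""
    w = [0.0] * len(X[0])
    t = 0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            t += 1
            eta = 1.0 / (lam * t)
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            # Shrink the weights (regularizer), then step on the hinge loss
            # only if the example violates the margin.
            w = [(1.0 - eta * lam) * wj for wj in w]
            if margin < 1.0:
                w = [wj + eta * yi * xj for wj, xj in zip(w, xi)]
    return w

def accuracy(w, X, y):
    hits = sum(1 for xi, yi in zip(X, y)
               if (1 if sum(a * b for a, b in zip(w, xi)) >= 0 else -1) == yi)
    return hits / len(y)

def project(X, cols):
    """Keep only the chosen feature columns and append a constant bias feature."""
    return [[row[c] for c in cols] + [1.0] for row in X]

def filter_ranking(X, y):
    """Assumed stand-in scoring measure: |mean in positive docs - mean in negative docs|."""
    pos = [x for x, t in zip(X, y) if t == 1]
    neg = [x for x, t in zip(X, y) if t == -1]
    score = [abs(sum(r[j] for r in pos) / len(pos) - sum(r[j] for r in neg) / len(neg))
             for j in range(len(X[0]))]
    return sorted(range(len(score)), key=lambda j: -score[j])

def wrapper_select(X_tr, y_tr, X_val, y_val, sizes):
    """Evaluate only len(sizes) nested subsets, each scored by an SVM on held-out data."""
    ranking = filter_ranking(X_tr, y_tr)
    best_acc, best_cols = -1.0, None
    for k in sizes:
        cols = ranking[:k]
        w = train_linear_svm(project(X_tr, cols), y_tr)
        acc = accuracy(w, project(X_val, cols), y_val)
        if acc > best_acc:
            best_acc, best_cols = acc, cols
    return best_acc, best_cols

def make_toy_corpus(n_docs, n_terms, n_informative, rng):
    """Synthetic binary term-presence vectors; only the first terms depend on the label."""
    X, y = [], []
    for i in range(n_docs):
        label = 1 if i % 2 == 0 else -1
        row = [float(rng.random() < 0.3) for _ in range(n_terms)]
        for j in range(n_informative):
            row[j] = float(rng.random() < (0.8 if label == 1 else 0.2))
        X.append(row)
        y.append(label)
    return X, y

rng = random.Random(0)
X, y = make_toy_corpus(200, 30, 4, rng)
X_tr, y_tr, X_val, y_val = X[:140], y[:140], X[140:], y[140:]
sizes = [1, 2, 4, 8, 16, 30]
best_acc, best_cols = wrapper_select(X_tr, y_tr, X_val, y_val, sizes)
print(f"selected {len(best_cols)} features, validation accuracy {best_acc:.2f}")
```

A pure filter would stop after `filter_ranking` and keep a fixed number of top-scoring terms. The wrapper step differs in that the classifier itself decides which subset size to keep, and restricting the search to a short list of sizes keeps the number of SVM trainings small.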
Pages: 230-237
Number of pages: 8
Related papers
50 in total
  • [1] Feature selection for support vector machines in text categorization
    Liu, Y
    Lu, HM
    Lu, ZX
    Wang, P
    [J]. MLMTA'03: INTERNATIONAL CONFERENCE ON MACHINE LEARNING; MODELS, TECHNOLOGIES AND APPLICATIONS, 2003, : 129 - 134
  • [2] Least Squares Twin Support Vector Machines for Text Categorization
    Kumar, M. Arun
    Gopal, M.
    [J]. PROCEEDINGS OF THE 2015 39TH NATIONAL SYSTEMS CONFERENCE (NSC), 2015,
  • [3] Support vector machines for text categorization in Chinese question classification
    Lin, Xu-Dong
    Peng, Hong
    Liu, Bo
    [J]. 2006 IEEE/WIC/ACM INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE, (WI 2006 MAIN CONFERENCE PROCEEDINGS), 2006, : 334 - +
  • [4] Support Vector Machines based on a semantic kernel for text categorization
    Siolas, G
    d'Alché-Buc, F
    [J]. IJCNN 2000: PROCEEDINGS OF THE IEEE-INNS-ENNS INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOL V, 2000, : 205 - 209
  • [5] Virtual relevant documents in text categorization with support vector machines
    Lee, Kyung-Soon
    Kageura, Kyo
    [J]. INFORMATION PROCESSING & MANAGEMENT, 2007, 43 (04) : 902 - 913
  • [6] Informative Vector Machines for text categorization
    Stankovic, Milos
    Stankovic, Srdan
    [J]. NEUREL 2006: EIGHT SEMINAR ON NEURAL NETWORK APPLICATIONS IN ELECTRICAL ENGINEERING, PROCEEDINGS, 2006, : 99 - +
  • [7] Improving performance of text categorization by combining filtering and support vector machines
    Díaz, I
    Ranilla, J
    Montañes, E
    Fernández, J
    Combarro, EF
    [J]. JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, 2004, 55 (07) : 579 - 592
  • [8] Experiments on kernel tree support vector PF machines for text categorization
    Methasate, Ithipan
    Theeramunkong, Thanaruk
    [J]. ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS, 2007, 4426 : 720 - +
  • [9] Fast text categorization with min-max modular support vector machines
    Liu, FY
    Wu, K
    Zhao, H
    Lu, BL
    [J]. PROCEEDINGS OF THE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), VOLS 1-5, 2005, : 570 - 575
  • [10] A new transductive support vector machine approach to text categorization
    Sun, F
    Sun, MS
    [J]. PROCEEDINGS OF THE 2005 IEEE INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND KNOWLEDGE ENGINEERING (IEEE NLP-KE'05), 2005, : 631 - 635