New methods for text categorization based on a new feature selection method and a new similarity measure between documents

Cited by: 0
Authors
Lee, Li-Wei [1]
Chen, Shyi-Ming [1]
Affiliation
[1] Natl Taiwan Univ Sci & Technol, Dept Comp Sci & Informat Engn, Taipei, Taiwan
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we present a new feature selection method based on document frequencies and statistical values, together with a new similarity measure for calculating the degree of similarity between documents. Based on the proposed feature selection method and similarity measure, we present three methods for text categorization on the top 10 categories of the Reuters-21578 collection. The proposed methods achieve higher performance on this task than the method presented in [4].
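The abstract does not give the formulas for the proposed feature selection method or similarity measure, so the following is only a generic illustration of the two kinds of components it names, not the authors' method. It is a minimal sketch that assumes plain document-frequency thresholding for feature selection and standard cosine similarity between term-frequency vectors; the function names, the `min_df` threshold, and the toy documents are all hypothetical.

```python
import math
from collections import Counter


def document_frequencies(docs):
    """Count, for each term, how many documents contain it at least once."""
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # set() so a term counts once per document
    return df


def select_features(docs, min_df=2):
    """Keep terms appearing in at least min_df documents (assumed threshold)."""
    df = document_frequencies(docs)
    return sorted(term for term, count in df.items() if count >= min_df)


def to_vector(tokens, features):
    """Term-frequency vector restricted to the selected features."""
    counts = Counter(tokens)
    return [counts[term] for term in features]


def cosine_similarity(u, v):
    """Standard cosine similarity; a stand-in, not the paper's measure."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy tokenized documents (hypothetical, not taken from Reuters-21578).
docs = [
    ["oil", "price", "rise"],
    ["oil", "price", "fall"],
    ["wheat", "crop", "rise"],
]
features = select_features(docs, min_df=2)
vectors = [to_vector(d, features) for d in docs]
```

With these toy documents, only terms shared by at least two documents survive selection, and documents that share no selected term get a similarity of zero. A categorization method could then assign a document to the category whose documents are most similar to it, although the three methods in the paper may work differently.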
Pages: 1280-1289
Number of pages: 10