New methods for text categorization based on a new feature selection method and a new similarity measure between documents

Citations: 0
Authors
Lee, Li-Wei [1]
Chen, Shyi-Ming [1]
Affiliations
[1] Natl Taiwan Univ Sci & Technol, Dept Comp Sci & Informat Engn, Taipei, Taiwan
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we present a new feature selection method based on document frequencies and statistical values. We also present a new similarity measure for calculating the degree of similarity between documents. Based on the proposed feature selection method and the proposed similarity measure, we present three methods for text categorization on the top 10 categories of the Reuters-21578 collection. On this task, the proposed methods achieve higher performance than the method presented in [4].
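The abstract does not state which statistical values the feature selection uses, nor how the similarity measure is defined, so the following is only a rough sketch of the general pipeline, not the authors' method. It assumes plain document-frequency thresholding for feature selection and ordinary cosine similarity as a stand-in for the proposed measure. The function names, the `min_df` and `max_df_ratio` parameters, and the toy documents are all hypothetical.

```python
import math
from collections import Counter


def select_features_by_df(docs, min_df=2, max_df_ratio=0.9):
    """Keep terms whose document frequency is at least min_df and at most
    max_df_ratio * number of documents (a stand-in selection rule)."""
    n_docs = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # count each term once per document
    return {t for t, c in df.items() if min_df <= c <= max_df_ratio * n_docs}


def term_vector(tokens, features):
    """Term-frequency vector restricted to the selected features."""
    return Counter(t for t in tokens if t in features)


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-frequency vectors
    (used here in place of the paper's unspecified measure)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy tokenized documents (hypothetical, not taken from Reuters-21578).
docs = [
    ["oil", "price", "rise", "market"],
    ["oil", "export", "price", "fall"],
    ["wheat", "grain", "export", "market"],
]

features = select_features_by_df(docs, min_df=2, max_df_ratio=0.9)
vectors = [term_vector(d, features) for d in docs]
sim_01 = cosine_similarity(vectors[0], vectors[1])
sim_02 = cosine_similarity(vectors[0], vectors[2])
```

In this toy setup the first two documents share two selected terms and so score higher than the first and third, which share one. A categorizer built on such a measure could, for instance, assign a document to the category of its most similar labeled documents; whether the paper's three methods work this way is not stated in the abstract.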
Pages: 1280-1289
Page count: 10
Related papers
50 in total
  • [1] New Feature Selection Methods Based on Context Similarity for Text Categorization
    Chen, Yifei
    Han, Bingqing
    Hou, Ping
    [J]. 2014 11TH INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY (FSKD), 2014, : 598 - 604
  • [2] A New Feature Selection Method for Text Categorization of Customer Reviews
    Liu, Miao
    Lu, Xiaoling
    Song, Jie
    [J]. COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2016, 45 (04) : 1397 - 1409
  • [3] A New Approach of Feature Selection for Text Categorization
    Cui, Zifeng
    [J]. Wuhan University Journal of Natural Sciences, 2006, (05) : 1335 - 1339
  • [4] A new approach to feature selection for text categorization
    Li, SS
    Zong, CQ
    [J]. PROCEEDINGS OF THE 2005 IEEE INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND KNOWLEDGE ENGINEERING (IEEE NLP-KE'05), 2005, : 626 - 630
  • [5] Five new feature selection metrics in text categorization
    Song, Fengxi
    Zhang, David
    Xu, Yong
    Wang, Jizhong
    [J]. INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2007, 21 (06) : 1085 - 1101
  • [6] A New Feature Selection Method based on Intuitionistic Fuzzy Entropy to Categorize Text Documents
    Revanasiddappa, M. B.
    Harish, B. S.
    [J]. INTERNATIONAL JOURNAL OF INTERACTIVE MULTIMEDIA AND ARTIFICIAL INTELLIGENCE, 2018, 5 (03) : 106 - 117
  • [7] Three New Feature Weighting Methods for Text Categorization
    Xue, Wei
    Xu, Xinshun
    [J]. WEB INFORMATION SYSTEMS AND MINING, 2010, 6318 : 352 - 359
  • [8] A NEW FEATURE SELECTION METHOD FOR TEXT CATEGORIZATION BASED ON INFORMATION GAIN AND PARTICLE SWARM OPTIMIZATION
    Yigit, Ferruh
    Baykan, Omer Kaan
    [J]. 2014 IEEE 3RD INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND INTELLIGENCE SYSTEMS (CCIS), 2014, : 523 - 529
  • [9] An Evaluation of Existing and New Feature Selection Metrics in Text Categorization
    Tasci, Serafettin
    Gungor, Tunga
    [J]. 23RD INTERNATIONAL SYMPOSIUM ON COMPUTER AND INFORMATION SCIENCES, 2008, : 238 - 243
  • [10] GU metric - A new feature selection algorithm for text categorization
    Uchyigit, Gulden
    Clark, Keith
    [J]. ICEIS 2007: PROCEEDINGS OF THE NINTH INTERNATIONAL CONFERENCE ON ENTERPRISE INFORMATION SYSTEMS: ARTIFICIAL INTELLIGENCE AND DECISION SUPPORT SYSTEMS, 2007, : 399 - 402