Enhancement of DTP feature selection method for text categorization

Cited by: 0
Authors
Moyotl-Hernández, E [1]
Jiménez-Salazar, H [1]
Affiliations
[1] Univ Autonoma Puebla, Fac Ciencias Comp, Puebla 72570, Mexico
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper studies the structure of the vectors obtained by applying term selection methods to high-dimensional text collections. We found that the distance to transition point (DTP) method omits commonly occurring terms, which are poor discriminators between documents but convey important information about a collection. Experimental results obtained on the Reuters-21578 collection with the k-NN classifier show that feature selection by DTP combined with common terms slightly outperforms simple document frequency.
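The abstract does not give formulas, so the following is only a minimal sketch of the idea it describes: select terms whose frequency lies close to a transition point, then add back the most common terms by document frequency. The transition point formula used here, TP = (sqrt(8 * I1 + 1) - 1) / 2 with I1 the number of terms occurring exactly once, is an assumption taken from the general transition point literature, not from this record. The names `transition_point`, `select_terms`, `n_dtp`, and `n_common` are hypothetical.

```python
import math
from collections import Counter


def transition_point(term_freqs):
    """Assumed formula (not stated in the abstract):
    TP = (sqrt(8 * I1 + 1) - 1) / 2, where I1 counts terms with frequency 1."""
    i1 = sum(1 for f in term_freqs.values() if f == 1)
    return (math.sqrt(8 * i1 + 1) - 1) / 2


def select_terms(docs, n_dtp, n_common):
    """Return (terms closest to the transition point, most common remaining terms).

    docs is a list of tokenized documents (lists of strings).
    Ties are broken alphabetically so the result is deterministic.
    """
    tf = Counter(t for d in docs for t in d)       # collection term frequency
    df = Counter(t for d in docs for t in set(d))  # document frequency
    tp = transition_point(tf)

    # DTP step: smaller distance to the transition point ranks higher.
    by_dtp = sorted(tf, key=lambda t: (abs(tf[t] - tp), t))[:n_dtp]

    # Common-terms step: highest document frequency among terms DTP left out.
    chosen = set(by_dtp)
    ranked_by_df = sorted(df, key=lambda t: (-df[t], t))
    common = [t for t in ranked_by_df if t not in chosen][:n_common]
    return by_dtp, common
```

The union of the two returned lists would serve as the feature set for a classifier such as k-NN. How many common terms to add, and how they are weighted, is not specified in the abstract and is left open here.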
Pages: 719-722
Page count: 4
Related papers
50 in total
  • [1] A hybrid feature selection method for text categorization
    Montanes, E.
    Quevedo, J. R.
    Combarro, E. F.
    Diaz, I.
    Ranilla, J.
    [J]. INTERNATIONAL JOURNAL OF UNCERTAINTY FUZZINESS AND KNOWLEDGE-BASED SYSTEMS, 2007, 15 (02) : 133 - 151
  • [2] An Effective Feature Selection Method for Text Categorization
    Qiu, Xipeng
    Zhou, Jinlong
    Huang, Xuanjing
    [J]. ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PT I: 15TH PACIFIC-ASIA CONFERENCE, PAKDD 2011, 2011, 6634 : 50 - 61
  • [3] A discriminative and semantic feature selection method for text categorization
    Zong, Wei
    Wu, Feng
    Chu, Lap-Keung
    Sculli, Domenic
    [J]. INTERNATIONAL JOURNAL OF PRODUCTION ECONOMICS, 2015, 165 : 215 - 222
  • [4] Comparison and Improvement of feature selection method for text categorization
    Shan, Li-Li
    Liu, Bing-Quan
    Sun, Cheng-Jie
    [J]. Harbin Gongye Daxue Xuebao/Journal of Harbin Institute of Technology, 2011, 43 (SUPPL. 1) : 319 - 324
  • [5] Feature Selection Method Based on Crossed Centroid for Text Categorization
    Yang, Jieming
    Liu, Zhiying
    Qu, Zhaoyang
    Wang, Jing
    [J]. 2014 15TH IEEE/ACIS INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, ARTIFICIAL INTELLIGENCE, NETWORKING AND PARALLEL/DISTRIBUTED COMPUTING (SNPD), 2014, : 11 - 15
  • [6] A New Feature Selection Method for Text Categorization of Customer Reviews
    Liu, Miao
    Lu, Xiaoling
    Song, Jie
    [J]. COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2016, 45 (04) : 1397 - 1409
  • [7] Improved Comprehensive Measurement Feature Selection Method for Text Categorization
    Feng, LiZhou
    Zuo, WanLi
    Wang, YouWei
    [J]. 2015 INTERNATIONAL CONFERENCE ON NETWORK AND INFORMATION SYSTEMS FOR COMPUTERS (ICNISC), 2015, : 125 - 128
  • [8] Trigonometric comparison measure: A feature selection method for text categorization
    Kim, Kyoungok
    Zzang, See Young
    [J]. DATA & KNOWLEDGE ENGINEERING, 2019, 119 : 1 - 21
  • [9] A two-stage feature selection method for text categorization
    Meng, Jiana
    Lin, Hongfei
    Yu, Yuhai
    [J]. COMPUTERS & MATHEMATICS WITH APPLICATIONS, 2011, 62 (07) : 2793 - 2800
  • [10] Feature selection in SVM text categorization
    Taira, H
    Haruno, M
    [J]. SIXTEENTH NATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE (AAAI-99)/ELEVENTH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE (IAAI-99), 1999, : 480 - 486