Discriminative Feature Analysis and Selection for Document Classification

Cited by: 0
Authors
Chinta, Punya Murthy [1]
Murty, M. Narasimha [1]
Affiliations
[1] Indian Inst Sci, Bangalore 560012, Karnataka, India
Keywords
Large document collection; Feature selection methods; Discriminative features; Classification; Scalability
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Classification of a large document collection involves dealing with a huge feature space in which each distinct word is a feature. In such an environment, classification is costly in terms of both running time and computing resources. Furthermore, it does not guarantee optimal results, because a classifier that considers every feature is likely to overfit. In such a context, feature selection is inevitable. This work analyses the feature selection methods, explores the relations among them, and attempts to find a minimal subset of features that are discriminative for document classification.
Pages: 366-374
Page count: 9
Related papers
50 in total
  • [1] Feature selection for document type classification
    Taghva, Kazem
    Vergara, Jason
    [J]. PROCEEDINGS OF THE FIFTH INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY: NEW GENERATIONS, 2008, : 179 - 182
  • [2] Discriminative Gabor Feature Selection for Hyperspectral Image Classification
    Shen, Linlin
    Zhu, Zexuan
    Jia, Sen
    Zhu, Jiasong
    Sun, Yiwen
    [J]. IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2013, 10 (01) : 29 - 33
  • [3] Discriminative Feature Combination Selection for Enhancing Multiclass Classification
    Song, Aibo
    Qian, Wei
    Wu, Zhiang
    Zhao, Jinghua
    [J]. PROCEEDINGS OF 2015 IEEE INTERNATIONAL CONFERENCE ON BEHAVIORAL, ECONOMIC, SOCIO-CULTURAL COMPUTING (BESC), 2015, : 89 - 95
  • [4] Feature selection for the classification of large document collections
    Brank, Janez
    Mladenic, Dunja
    Grobelnik, Marko
    Milic-Frayling, Natasa
    [J]. JOURNAL OF UNIVERSAL COMPUTER SCIENCE, 2008, 14 (10) : 1562 - 1596
  • [5] The impact of feature selection on medical document classification
    Parlak, Bekir
    Uysal, Alper Kursat
    [J]. 2016 11TH IBERIAN CONFERENCE ON INFORMATION SYSTEMS AND TECHNOLOGIES (CISTI), 2016,
  • [6] On the Feature Selection and Classification Based on Information Gain for Document Sentiment Analysis
    Pratiwi, Asriyanti Indah
    Adiwijaya
    [J]. APPLIED COMPUTATIONAL INTELLIGENCE AND SOFT COMPUTING, 2018, 2018
  • [7] Feature selection for document classification based on topology
    El Barbary, O. G.
    Salama, A. S.
    [J]. EGYPTIAN INFORMATICS JOURNAL, 2018, 19 (02) : 129 - 132
  • [8] Discriminative feature selection with directional outliers correcting for data classification
    Yuan, Lixin
    Yang, Guoqiang
    Xu, Qian
    Lu, Tong
    [J]. PATTERN RECOGNITION, 2022, 126
  • [9] Discriminative Least Squares Regression for Multiclass Classification and Feature Selection
    Xiang, Shiming
    Nie, Feiping
    Meng, Gaofeng
    Pan, Chunhong
    Zhang, Changshui
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2012, 23 (11) : 1738 - 1754
  • [10] A COMBINED APPROACH FOR FILTER FEATURE SELECTION IN DOCUMENT CLASSIFICATION
    Le Nguyen Hoai Nam
    Ho Bao Quoc
    [J]. 2015 IEEE 27TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2015), 2015, : 317 - 324