Feature selection strategies for automated classification of digital media content

Cited by: 5
Authors
Rocha, Rocio [1]
Cobo, Angel [2]
Affiliations
[1] Univ Cantabria, Dept Business Adm, E-39005 Santander, Spain
[2] Univ Cantabria, Dept Appl Math & Computat Sci, E-39005 Santander, Spain
Keywords
automatic classification; clustering; digital media; feature selection; machine learning; text mining
DOI
10.1177/0165551511412028
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper proposes feature selection strategies for digital news articles that enable learning algorithms to be applied effectively to the unsupervised classification of those articles. With an appropriately selected small subset of features, related news items can be identified correctly, enabling organizations and individual users to keep track of current events. The paper defines a quality measure of the discriminatory power of each feature and verifies that selecting a feature subset with higher quality values yields good classification results. A selection method based on Particle Swarm Optimization (PSO) is also proposed. Both proposals are validated on two collections of press clippings gathered from news search services in digital media. Experimental results show that good classification accuracy can be achieved with small subsets comprising between 3 per cent and 6 per cent of the features.
Pages: 418-428
Page count: 11
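
The abstract's idea of scoring each feature and keeping only a small, high-scoring subset can be illustrated with a minimal sketch. The abstract does not give the paper's quality measure, so the score below (variance of a term's TF-IDF weight across documents) is an assumed stand-in, not the authors' definition. The function names `tfidf_matrix`, `variance_quality` and `select_features` are hypothetical, and the PSO-based method is not covered.

```python
import math
from collections import Counter


def tfidf_matrix(docs):
    """Return (vocab, rows); rows[i][j] is the TF-IDF weight of term j in doc i."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(t for toks in tokenized for t in set(toks))
    rows = []
    for toks in tokenized:
        tf = Counter(toks)
        length = len(toks) or 1  # guard against empty documents
        rows.append([tf[t] / length * math.log(n_docs / df[t]) for t in vocab])
    return vocab, rows


def variance_quality(column):
    """Assumed stand-in score: variance of one term's weights across documents."""
    mean = sum(column) / len(column)
    return sum((x - mean) ** 2 for x in column) / len(column)


def select_features(docs, fraction=0.05):
    """Keep the top `fraction` of terms ranked by the stand-in quality score."""
    vocab, rows = tfidf_matrix(docs)
    scores = {
        term: variance_quality([row[j] for row in rows])
        for j, term in enumerate(vocab)
    }
    k = max(1, round(fraction * len(vocab)))
    # Sort by descending score; break ties alphabetically.
    return sorted(vocab, key=lambda t: (-scores[t], t))[:k]


if __name__ == "__main__":
    docs = [
        "election vote the news",
        "election poll the news",
        "football goal the news",
        "football match the news",
    ]
    print(select_features(docs, fraction=0.25))
```

The default `fraction=0.05` falls within the 3 to 6 per cent range reported in the abstract. Terms that occur in every document, such as "the" and "news" in the toy corpus, receive a score of zero and are never preferred over terms that vary between documents.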