Feature Selection for Fake News Classification

Cited by: 0
Authors
Sverdrup-Thygeson, Simen [1]
Haddow, Pauline C. [1]
Affiliations
[1] Norwegian Univ Sci & Technol, CRAB Lab, Trondheim, Norway
Keywords
Fake news; classification; feature selection; term frequency; sentiment analysis; text embeddings; BERT
DOI
10.1109/SSCI50451.2021.9660080
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
An explosive growth of misleading and untrustworthy news articles has been observed in recent years. These news articles are often referred to as fake news and have been found to severely impact fair elections and democratic values. Computational Intelligence models may be applied to the classification of news articles, assuming that an efficient feature set is available as input to the model. However, the selection of appropriate feature sets is an open question for such high-dimensional tasks. A further challenge is the general applicability of feature selection strategies, where testing on a single dataset may convey misleading results. The work herein evaluates a wide range of news article characteristics, resulting in twenty-five potential features. Feature selection, based on a combination of feature scoring, feature ranking and mutual information, is then applied and evaluated on multiple datasets: Kaggle, Liar and FakeNewsNet. An Artificial Immune System model is applied in the feature ranking and as the classification model. The accuracy obtained is compared to state-of-the-art fake news classification models, highlighting that the approach shows promise in terms of accuracy despite the small feature sets provided for classification.
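The abstract names mutual information as one ingredient of the feature selection step but does not give the procedure. As a rough illustration only, the sketch below scores discrete features by their mutual information with the class label and keeps the top-ranked ones. It is a generic filter-style example, not the authors' pipeline: the Artificial Immune System ranking is not modelled, and the feature names and toy data are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(feature, labels):
    """Mutual information (in bits) between a discrete feature and class labels."""
    n = len(labels)
    joint = Counter(zip(feature, labels))   # counts of (feature value, label) pairs
    count_x = Counter(feature)              # marginal counts of feature values
    count_y = Counter(labels)               # marginal counts of labels
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * log2(p_xy / ((count_x[x] / n) * (count_y[y] / n)))
    return mi


def rank_features(columns, labels, top_k):
    """Score every feature column and return the top_k names plus all scores."""
    scores = {name: mutual_information(values, labels)
              for name, values in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:top_k], scores


# Hypothetical toy data: 1 = fake, 0 = real; three binary article features.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
columns = {
    "has_exclamation":    [1, 1, 1, 0, 0, 0, 1, 0],  # identical to the labels
    "negative_sentiment": [1, 1, 0, 0, 0, 1, 1, 0],  # partially informative
    "long_title":         [1, 0, 1, 0, 1, 0, 0, 1],  # independent of the labels
}

selected, scores = rank_features(columns, labels, top_k=2)
print(selected)
```

On this toy data the feature that matches the labels exactly scores one bit, the independent feature scores zero, and the selection keeps the two most informative columns. Real article features such as term frequencies or embeddings are continuous and would need discretisation or a different estimator first.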
Pages: 8
Related papers
50 in total
  • [41] Active Learning for Text Classification and Fake News Detection
    Sahan, Marko
    Smidl, Vaclav
    Marik, Radek
    [J]. 2021 INTERNATIONAL SYMPOSIUM ON COMPUTER SCIENCE AND INTELLIGENT CONTROLS (ISCSIC 2021), 2021, : 87 - 94
  • [42] Application of Machine Learning Techniques for Fake News Classification
    Silva, Kim
    Paixao, Crysttian
    Rodrigues, Paulo Canas
    [J]. MEASUREMENT-INTERDISCIPLINARY RESEARCH AND PERSPECTIVES, 2024,
  • [43] A transformer-based architecture for fake news classification
    Mehta, Divyam
    Dwivedi, Aniket
    Patra, Arunabha
    Anand Kumar, M.
    [J]. SOCIAL NETWORK ANALYSIS AND MINING, 2021, 11 (01)
  • [44] Feature selection for classification
    Department of Information Systems and Computer Science, National University of Singapore, Singapore 119260, Singapore
    [J]. Intell. Data Anal., 3 (131-156):
  • [45] Fake news: The truth about fake news
    Jimenez-Rodriguez, Alvaro
    [J]. REVISTA MEDITERRANEA COMUNICACION-JOURNAL OF COMMUNICATION, 2020, 11 (02): : 331 - 333
  • [46] Fake news classification for Indonesian news using Extreme Gradient Boosting (XGBoost)
    Haumahu, J. P.
    Permana, S. D. H.
    Yaddarabullah, Y.
    [J]. 5TH ANNUAL APPLIED SCIENCE AND ENGINEERING CONFERENCE (AASEC 2020), 2021, 1098
  • [48] Bee swarm based feature selection for fake and real fingerprint classification using neural network classifiers
    Sasikala, V.
    Lakshmi Prabha, V.
    [J]. IAENG International Journal of Computer Science, 2015, 42 (04) : 389 - 403
  • [49] Feature selection by integrating document frequency with genetic algorithm for Amharic news document classification
    Endalie, Demeke
    Haile, Getamesay
    Abebe, Wondmagegn Taye
    [J]. PEERJ COMPUTER SCIENCE, 2022, 8
  • [50] Fake News. The truth about fake news
    Lopez Jimenez, David
    [J]. REVISTA LATINA DE COMUNICACION SOCIAL, 2021, 79