Text Classification Algorithms: A Survey

Cited by: 693
Authors
Kowsari, Kamran [1, 2]
Meimandi, Kiana Jafari [1]
Heidarysafa, Mojtaba [1]
Mendu, Sanjana [1]
Barnes, Laura [1, 2, 3]
Brown, Donald [1, 3]
Affiliations
[1] Univ Virginia, Dept Syst & Informat Engn, Charlottesville, VA 22904 USA
[2] Univ Virginia, Sensing Syst Hlth Lab, Charlottesville, VA 22911 USA
[3] Univ Virginia, Sch Data Sci, Charlottesville, VA 22904 USA
Keywords
text classification; text mining; text representation; text categorization; text analysis; document classification; ROC CURVE; DIMENSIONALITY REDUCTION; LOGISTIC-REGRESSION; COMPONENT ANALYSIS; NEURAL-NETWORK; BAYES THEOREM; NAIVE BAYES; AREA; MODELS; TREE;
DOI
10.3390/info10040150
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In recent years, the number of complex documents and texts has grown exponentially, and classifying them accurately in many applications requires a deeper understanding of machine learning methods. Many machine learning approaches have achieved strong results in natural language processing. The success of these learning algorithms relies on their capacity to model complex, non-linear relationships within data. However, finding suitable structures, architectures, and techniques for text classification remains a challenge for researchers. This paper gives a brief overview of text classification algorithms. The overview covers text feature extraction, dimensionality reduction methods, existing algorithms and techniques, and evaluation methods. Finally, the limitations of each technique and their applications to real-world problems are discussed.
Pages: 68
Related papers
50 in total
  • [1] A Survey on Text Classification Algorithms: From Text to Predictions
    Gasparetto, Andrea
    Marcuzzo, Matteo
    Zangari, Alessandro
    Albarelli, Andrea
    INFORMATION, 2022, 13 (02)
  • [2] A Comprehensive Study of Text Classification Algorithms
    Vijayan, Vikas K.
    Bindu, K. R.
    Parameswaran, Latha
    2017 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI), 2017, : 1109 - 1113
  • [3] A survey of Arabic text classification approaches
    Sayed, Mostafa
    Salem, Rashed K.
    Khder, Ayman E.
    INTERNATIONAL JOURNAL OF COMPUTER APPLICATIONS IN TECHNOLOGY, 2019, 59 (03) : 236 - 251
  • [5] Text classification using embeddings: a survey
    da Costa, Liliane Soares
    Oliveira, Italo L.
    Fileto, Renato
    KNOWLEDGE AND INFORMATION SYSTEMS, 2023, 65 (07) : 2761 - 2803
  • [6] A SURVEY ON CLASSIFICATION TECHNIQUES FOR TEXT MINING
    Brindha, S.
    Sukumaran, S.
    Prabha, K.
    2016 3RD INTERNATIONAL CONFERENCE ON ADVANCED COMPUTING AND COMMUNICATION SYSTEMS (ICACCS), 2016,
  • [7] A survey on text classification and its applications
    Zhou, Xujuan
    Gururajan, Raj
    Li, Yuefeng
    Venkataraman, Revathi
    Tao, Xiaohui
    Bargshady, Ghazal
    Barua, Prabal D.
    Kondalsamy-Chennakesavan, Srinivas
    WEB INTELLIGENCE, 2020, 18 (03) : 205 - 216
  • [8] A Survey of Topic Models in Text Classification
    Xia, Linzhong
    Luo, Dean
    Zhang, Chunxiao
    Wu, Zhou
    2019 2ND INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND BIG DATA (ICAIBD 2019), 2019, : 244 - 250
  • [9] A Survey on Data Augmentation for Text Classification
    Bayer, Markus
    Kaufhold, Marc-Andre
    Reuter, Christian
    ACM COMPUTING SURVEYS, 2023, 55 (07)
  • [10] Two scalable algorithms for associative text classification
    Yoon, Yongwook
    Lee, Gary G.
    INFORMATION PROCESSING & MANAGEMENT, 2013, 49 (02) : 484 - 496