Text Classification Algorithms: A Survey

Cited by: 693
Authors
Kowsari, Kamran [1, 2]
Meimandi, Kiana Jafari [1]
Heidarysafa, Mojtaba [1]
Mendu, Sanjana [1]
Barnes, Laura [1, 2, 3]
Brown, Donald [1, 3]
Affiliations
[1] University of Virginia, Department of Systems and Information Engineering, Charlottesville, VA 22904, USA
[2] University of Virginia, Sensing Systems for Health Lab, Charlottesville, VA 22911, USA
[3] University of Virginia, School of Data Science, Charlottesville, VA 22904, USA
Keywords
text classification; text mining; text representation; text categorization; text analysis; document classification; ROC curve; dimensionality reduction; logistic regression; component analysis; neural network; Bayes theorem; naive Bayes; area; models; tree
DOI
10.3390/info10040150
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract
In recent years, there has been an exponential growth in the number of complex documents and texts that require a deeper understanding of machine learning methods to be able to accurately classify texts in many applications. Many machine learning approaches have achieved surpassing results in natural language processing. The success of these learning algorithms relies on their capacity to understand complex models and non-linear relationships within data. However, finding suitable structures, architectures, and techniques for text classification is a challenge for researchers. In this paper, a brief overview of text classification algorithms is discussed. This overview covers different text feature extractions, dimensionality reduction methods, existing algorithms and techniques, and evaluation methods. Finally, the limitations of each technique and their application in real-world problems are discussed.
Pages: 68