A recent overview of the state-of-the-art elements of text classification

Cited by: 166
Authors
Mironczuk, Marcin Michal [1]
Protasiewicz, Jaroslaw [1]
Affiliations
[1] Natl Informat Proc Inst, Al Niepodleglosci 188 B, PL-00608 Warsaw, Poland
Keywords
Text classification; Document classification; Text classification overview; Document classification overview; FEATURE-SELECTION METHOD; LINGUAL SENTIMENT CLASSIFICATION; COMBINING MULTIPLE CLASSIFIERS; PROPAGATION NEURAL-NETWORK; TERM WEIGHTING SCHEMES; DOCUMENT CLASSIFICATION; NAIVE BAYES; AUTOMATIC CLASSIFICATION; DIMENSION REDUCTION; INSTANCE SELECTION;
DOI
10.1016/j.eswa.2018.03.058
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The aim of this study is to provide an overview of the state-of-the-art elements of text classification. For this purpose, we first select and investigate the primary and recent studies and objectives in this field. Next, we examine the state-of-the-art elements of text classification. In the following steps, we qualitatively and quantitatively analyse the related works. Herein, we describe six baseline elements of text classification: data collection, data analysis for labelling, feature construction and weighting, feature selection and projection, training of a classification model, and solution evaluation. This study will help readers acquire the necessary information about these elements and their associated techniques. Thus, we believe that this study will assist other researchers and professionals in proposing new studies in the field of text classification. © 2018 The Authors. Published by Elsevier Ltd.
Pages: 36-54
Number of pages: 19