A novel semi supervised approach for text classification

Cited by: 3
Authors
Barman D. [1]
Chowdhury N. [2]
Affiliations
[1] Department of Computer and System Sciences, Visva-Bharati, Santiniketan
[2] Department of Computer Science and Engineering, Jadavpur University, Kolkata
Keywords
Decision tree; Kohonen self organizing map; Naïve Bayes; Semi supervised learning; Support vector machine; Text categorization
DOI
10.1007/s41870-018-0137-9
Abstract
Text categorization, also known as text classification, is a supervised classification problem. It aims to assign a predefined class label or group to a new or unseen text document. Training the classifier usually requires a large collection of data from each class, yet labelled text data is hard and expensive to collect. In most cases the labels are assigned manually, which is neither cost-effective nor efficient. In this paper, we introduce a semi-supervised classification approach in which the learner needs only a small amount of labelled data, together with a large amount of unlabelled data, to assign a class label to a new or unseen text document. The proposed method uses a Kohonen self-organizing map (SOM) to label the unlabelled data, and three classifiers, namely support vector machine (SVM), Naïve Bayes (NB), and a decision tree (DT) of the classification and regression tree (CART) type, to measure classification accuracy. The experimental results show the effectiveness of the proposed method. © 2018, Bharati Vidyapeeth's Institute of Computer Applications and Management.
Pages: 1147-1157
Number of pages: 10
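The abstract describes a two-stage pipeline: a SOM labels the unlabelled documents from a few labelled seeds, and the enlarged training set then feeds a classifier. The following is a minimal sketch of that idea under stated assumptions, not the authors' implementation. The toy term-frequency vectors, the grid size, the decay schedules, the majority-vote node labelling, the nearest-labelled-node rule, and all function names are hypothetical. A nearest-centroid classifier stands in for the SVM, NB, and CART classifiers named in the abstract so that the example needs only the standard library.

```python
import math
import random
from collections import Counter, defaultdict


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_matching_unit(weights, x):
    """Grid coordinate of the SOM node whose weight vector is closest to x."""
    return min(weights, key=lambda node: sq_dist(weights[node], x))


def train_som(data, rows=3, cols=3, epochs=30, lr0=0.5, seed=0):
    """Fit a small Kohonen SOM; returns {(row, col): weight vector}."""
    rng = random.Random(seed)
    # Assumption: initialise each node from a randomly chosen training vector.
    weights = {(r, c): list(rng.choice(data))
               for r in range(rows) for c in range(cols)}
    sigma0 = max(rows, cols) / 2.0
    order = list(data)
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)              # linearly decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 0.5  # shrinking neighbourhood radius
        rng.shuffle(order)
        for x in order:
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma * sigma))
                for i in range(len(w)):
                    w[i] += lr * h * (x[i] - w[i])
    return weights


def pseudo_label(weights, labelled, unlabelled):
    """Label SOM nodes by majority vote of the seeds, then label the rest."""
    votes = defaultdict(Counter)
    for x, y in labelled:
        votes[best_matching_unit(weights, x)][y] += 1
    node_label = {n: c.most_common(1)[0][0] for n, c in votes.items()}
    result = []
    for x in unlabelled:
        # Assumption: use the nearest node (in weight space) that has a label.
        nearest = min(node_label, key=lambda n: sq_dist(weights[n], x))
        result.append(node_label[nearest])
    return result


def nearest_centroid_fit(vectors, labels):
    """Stand-in classifier (not SVM/NB/CART): one mean vector per class."""
    groups = defaultdict(list)
    for x, y in zip(vectors, labels):
        groups[y].append(x)
    return {y: [sum(col) / len(xs) for col in zip(*xs)]
            for y, xs in groups.items()}


def nearest_centroid_predict(centroids, x):
    """Return the class whose centroid is closest to x."""
    return min(centroids, key=lambda y: sq_dist(centroids[y], x))


def make_docs(rng, centre, n):
    """Hypothetical toy data: noisy term-frequency vectors around a centre."""
    return [[c + rng.uniform(0.0, 0.1) for c in centre] for _ in range(n)]


rng = random.Random(42)
sports, politics = [0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]
labelled = ([(x, "sports") for x in make_docs(rng, sports, 2)]
            + [(x, "politics") for x in make_docs(rng, politics, 2)])
unlabelled = make_docs(rng, sports, 20) + make_docs(rng, politics, 20)
truth = ["sports"] * 20 + ["politics"] * 20

# Stage 1: fit the SOM on all vectors, then pseudo-label the unlabelled ones.
som = train_som([x for x, _ in labelled] + unlabelled)
pseudo = pseudo_label(som, labelled, unlabelled)

# Stage 2: train the stand-in classifier on seeds plus pseudo-labelled data.
train_x = [x for x, _ in labelled] + unlabelled
train_y = [y for _, y in labelled] + pseudo
model = nearest_centroid_fit(train_x, train_y)

accuracy = sum(p == t for p, t in zip(pseudo, truth)) / len(truth)
print(f"pseudo-label accuracy: {accuracy:.2f}")
print(nearest_centroid_predict(model, [0.55, 0.5, 0.05, 0.0]))
```

The second stage only consumes `train_x` and `train_y`, so any supervised learner with a fit/predict interface could replace the stand-in. The two toy classes are deliberately well separated, which says nothing about accuracy on real text.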