Cooperative Hybrid Semi-Supervised Learning for Text Sentiment Classification

Cited by: 5
Authors
Li, Yang [1]
Lv, Ying [2]
Wang, Suge [1,3]
Liang, Jiye [1,3]
Li, Juanzi [4]
Li, Xiaoli [5]
Affiliations
[1] Shanxi Univ, Sch Comp & Informat Technol, Taiyuan 030006, Shanxi, Peoples R China
[2] Shanghai Univ, Sch Comp Engn & Sci, Shanghai 200444, Peoples R China
[3] Shanxi Univ, Key Lab Computat Intelligence & Chinese Informat, Minist Educ, Taiyuan 030006, Shanxi, Peoples R China
[4] Tsinghua Univ, Comp Sci Dept, Beijing 100084, Peoples R China
[5] ASTAR, Inst Infocomm Res, Singapore 138632, Singapore
Source
SYMMETRY-BASEL | 2019 / Vol. 11 / Issue 02
Funding
National Natural Science Foundation of China;
Keywords
text sentiment classification; semi-supervised learning; seed selecting; training data updating; alternately co-training;
DOI
10.3390/sym11020133
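Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07; 0710; 09;
Abstract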
A large-scale, high-quality training dataset is essential for learning a good classifier for text sentiment classification. However, manually constructing such a training dataset with sentiment labels is labor-intensive and time-consuming. Therefore, to make effective use of unlabeled samples, this paper proposes a comprehensive framework for text sentiment classification that covers the whole semi-supervised learning process: seed selection, iterative modification of the training text set, and the co-training strategy of the classifier. To provide a basis for selecting the seed texts and modifying the training text set, three measures are defined: the cluster similarity degree of an unlabeled text, the cluster uncertainty degree of a pseudo-label text to a learner, and the reliability degree of a pseudo-label text to a learner. With these measures, a seed selection method based on Random Swap clustering, a hybrid modification method of the training text set based on active learning and self-learning, and an alternating co-training strategy for an ensemble classifier of Maximum Entropy and Support Vector Machine are proposed and combined into the framework. Experimental results on three real-world Chinese datasets (COAE2014, COAE2015, and a Hotel review dataset) and five real-world English datasets (Books, DVD, Electronics, Kitchen, and MR) verify the effectiveness of the proposed framework.
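The alternating co-training idea named in the abstract can be illustrated with a minimal sketch. This is not the paper's method: the abstract does not define the reliability or uncertainty measures, so the distance of a score from 0.5 is used here as an assumed stand-in for reliability, and the hypothetical `CentroidLearner` is a toy replacement for the Maximum Entropy and Support Vector Machine learners. The names `alternate_co_training`, `rounds`, and `k` are likewise illustrative.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]
Example = Tuple[Vector, int]  # (features, label in {0, 1})


class CentroidLearner:
    """Toy stand-in learner: nearest centroid over a chosen subset of features."""

    def __init__(self, dims: Sequence[int]):
        self.dims = list(dims)
        self.centroids: Dict[int, List[float]] = {}

    def fit(self, data: List[Example]) -> None:
        # Assumes the data contains at least one example of each label.
        self.centroids = {}
        for label in (0, 1):
            rows = [x for x, y in data if y == label]
            self.centroids[label] = [
                sum(r[d] for r in rows) / len(rows) for d in self.dims
            ]

    def predict_proba(self, x: Vector) -> float:
        """Score in [0, 1] for label 1, from relative distances to the centroids."""
        dist = {
            label: sum((x[d] - c) ** 2 for d, c in zip(self.dims, cen)) ** 0.5
            for label, cen in self.centroids.items()
        }
        total = dist[0] + dist[1]
        return 0.5 if total == 0 else dist[0] / total


def alternate_co_training(learner_a, learner_b, labeled, unlabeled, rounds=4, k=1):
    """Alternate roles each round: one learner pseudo-labels texts for the other."""
    train_a, train_b = list(labeled), list(labeled)
    pool = list(unlabeled)
    learner_a.fit(train_a)
    learner_b.fit(train_b)
    for step in range(rounds):
        if not pool:
            break
        # Even rounds: A teaches B. Odd rounds: B teaches A.
        if step % 2 == 0:
            teacher, student_train = learner_a, train_b
        else:
            teacher, student_train = learner_b, train_a
        # Assumed reliability proxy: how far the teacher's score is from 0.5.
        ranked = sorted(
            pool, key=lambda x: abs(teacher.predict_proba(x) - 0.5), reverse=True
        )
        for x in ranked[:k]:
            student_train.append((x, int(teacher.predict_proba(x) >= 0.5)))
            pool.remove(x)
        learner_a.fit(train_a)
        learner_b.fit(train_b)
    return learner_a, learner_b
```

In this sketch each learner sees a different feature subset, so the pseudo-labels one learner contributes carry information the other does not already have. A fuller version would also need the seed selection and active-learning steps that the abstract mentions but does not specify.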
Pages: 17