Using Informative Score for Instance Selection Strategy in Semi-Supervised Sentiment Classification

Cited by: 0
Authors
Shan, Vivian Lee Lay [1]
Hoon, Gan Keng [1]
Ping, Tan Tien [1]
Abdullah, Rosni [1]
Affiliations
[1] Univ Sains Malaysia, Sch Comp Sci, George Town 11800, Malaysia
Source
CMC-COMPUTERS MATERIALS & CONTINUA | 2023 / Vol. 74 / No. 03
Keywords
Document-level sentiment classification; semi-supervised learning; instance selection; informative score;
DOI
10.32604/cmc.2023.033752
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Subject classification code
0812;
Abstract
Sentiment classification is a useful tool to classify reviews about sentiments and attitudes towards a product or service. Existing studies rely heavily on sentiment classification methods that require fully annotated inputs. However, labelled text is limited, making the acquisition of fully annotated input costly and labour-intensive. Recently, semi-supervised methods have emerged, as they require only partially labelled input yet perform comparably to supervised methods. Nevertheless, some works have reported that the performance of a semi-supervised model degraded after unlabelled instances were added to training. The literature also shows that not all unlabelled instances are equally useful; thus, identifying the informative unlabelled instances is beneficial when training a semi-supervised model. To achieve this, an informative score is proposed and incorporated into semi-supervised sentiment classification. The evaluation compares a semi-supervised method with and without the informative score. When the informative score is used in the instance selection strategy to identify informative unlabelled instances, semi-supervised models perform better than models that do not incorporate informative scores into their training. Although semi-supervised models incorporating an informative score do not surpass the supervised models, the results are still promising: the performance gap is small, at 2% to 5%, while the proportion of labelled instances used is reduced from 100% to 40%. The best result for the proposed instance selection strategy is achieved when the informative score is combined with a baseline confidence score at a 0.5:0.5 ratio using only 40% labelled data.
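The 0.5:0.5 combination mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not define how the informative score is computed, so the code below treats it as a precomputed value in [0, 1] for each unlabelled instance. The function names (`combined_score`, `select_instances`), the weighted-sum form, and the top-k selection rule are assumptions for illustration, not the authors' implementation.

```python
from typing import List, Sequence


def combined_score(confidence: float, informative: float, weight: float = 0.5) -> float:
    """Weighted mix of the two scores (assumed form).

    weight is the share given to the informative score;
    weight=0.5 corresponds to a 0.5:0.5 ratio.
    """
    return weight * informative + (1.0 - weight) * confidence


def select_instances(
    confidences: Sequence[float],
    informatives: Sequence[float],
    k: int,
    weight: float = 0.5,
) -> List[int]:
    """Return the indices of the k unlabelled instances with the highest combined score."""
    scores = [combined_score(c, i, weight) for c, i in zip(confidences, informatives)]
    ranked = sorted(range(len(scores)), key=lambda idx: scores[idx], reverse=True)
    return ranked[:k]


# Hypothetical scores for four unlabelled reviews.
confidences = [0.9, 0.6, 0.8, 0.55]
informatives = [0.1, 0.9, 0.6, 0.2]

mixed = select_instances(confidences, informatives, k=2)                   # 0.5:0.5 ratio
conf_only = select_instances(confidences, informatives, k=2, weight=0.0)  # confidence only
```

In this toy input, the confidence-only baseline picks the two most confidently classified instances, whereas the mixed score promotes an instance with lower confidence but a high informative score.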
Pages: 4801-4818
Page count: 18
Related papers
50 in total
  • [1] Instance Selection in Semi-supervised Learning
    Guo, Yuanyuan
    Zhang, Harry
    Liu, Xiaobo
    [J]. ADVANCES IN ARTIFICIAL INTELLIGENCE, 2011, 6657 : 158 - 169
  • [2] Semi-Stacking for Semi-supervised Sentiment Classification
    Li, Shoushan
    Huang, Lei
    Wang, Jingjing
    Zhou, Guodong
    [J]. PROCEEDINGS OF THE 53RD ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL) AND THE 7TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (IJCNLP), VOL 2, 2015, : 27 - 31
  • [3] Using unsupervised information to improve semi-supervised tweet sentiment classification
    Felipe da Silva, Nadia Felix
    Coletta, Luiz F. S.
    Hruschka, Eduardo R.
    Hruschka, Estevam R., Jr.
    [J]. INFORMATION SCIENCES, 2016, 355 : 348 - 365
  • [4] A Semi-supervised Learning Approach for Microblog Sentiment Classification
    Yu, Zhiwei
    Wong, Raymond K.
    Chi, Chi-Hung
    Chen, Fang
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON SMART CITY/SOCIALCOM/SUSTAINCOM (SMARTCITY), 2015, : 339 - 344
  • [5] Semi-supervised target-oriented sentiment classification
    Xu, Weidi
    Tan, Ying
    [J]. NEUROCOMPUTING, 2019, 337 : 120 - 128
  • [6] Leveraging Emotional Consistency for Semi-supervised Sentiment Classification
    Minh Luan Nguyen
    [J]. ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2016, PT I, 2016, 9651 : 369 - 381
  • [7] Employing Personal/Impersonal Views in Supervised and Semi-supervised Sentiment Classification
    Li, Shoushan
    Huang, Chu-Ren
    Zhou, Guodong
    Lee, Sophia Yat Mei
    [J]. ACL 2010: 48TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, 2010, : 414 - 423
  • [8] Semi-Supervised Document Classification using Heterogeneous Rule Selection
    Wong, William Xiu Shun
    Kim, Namgyu
    [J]. ICEC'17: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON ELECTRONIC COMMERCE, 2017,
  • [9] Enhancing TripleGAN for Semi-Supervised Conditional Instance Synthesis and Classification
    Wu, Si
    Deng, Guangchang
    Li, Jichang
    Li, Rui
    Yu, Zhiwen
    Wong, Hau-San
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 10083 - 10092
  • [10] Using Multiple Resources in Graph-Based Semi-supervised Sentiment Classification
    Xu, Ge
    Wang, Houfeng
    [J]. 2012 IEEE/WIC/ACM INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE AND INTELLIGENT AGENT TECHNOLOGY WORKSHOPS (WI-IAT WORKSHOPS 2012), VOL 3, 2012, : 132 - 136