Sentiment analysis in Turkish: Supervised, semi-supervised, and unsupervised techniques

Cited by: 6
Authors
Aydin, Cem Rifki [1]
Gungor, Tunga [1]
Affiliation
[1] Bogazici Univ, Dept Comp Engn, TR-34342 Istanbul, Turkey
Keywords
Sentiment analysis; Opinion mining; Machine learning; Text classification; Morphological analysis; Classifiers
DOI
10.1017/S1351324920000200
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Although sentiment analysis has been studied extensively for widely spoken languages, research on Turkish is still immature. Most work on Turkish focuses on supervised models, which require comprehensive annotated corpora. The few unsupervised methods rely on sentiment lexicons that are either translated from English lexicons or built from corpora. This yields improper word polarities because language and domain characteristics are ignored. In this paper, we develop unsupervised (domain-independent) and semi-supervised (domain-specific) methods for Turkish, both based on a set of antonym word pairs as seeds. We make a comprehensive analysis of supervised methods under several feature weighting schemes. We then form an ensemble of supervised classifiers and also combine the unsupervised and supervised methods. Since Turkish is an agglutinative language, we perform morphological analysis and use different word forms. The methods were tested on two Turkish datasets of different styles and on English datasets to show the portability of the approaches across languages. We observed that the combination of the unsupervised and supervised approaches outperforms the other methods, and we obtained a significant improvement over the state-of-the-art results for both Turkish and English.
Pages: 455-483
Number of pages: 29