Sentiment Analysis for Thai Language in Hotel Domain Using Machine Learning Algorithms

Cited by: 7
Authors
Khamphakdee, Nattawat [1]
Seresangtakul, Pusadee [1]
Affiliations
[1] Khon Kaen Univ, Nat Language & Speech Proc Lab, Fac Sci, Dept Comp Sci, 123 Mittraparb Rd, Khon Kaen 40002, Thailand
Keywords
Feature extraction; Machine learning algorithms; Natural language processing; Sentiment analysis; Reviews
DOI
10.18267/j.aip.155
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Sentiment analysis is one of the most frequently used applications of Natural Language Processing (NLP); it classifies the polarity of reviews expressed at the aspect, sentence or document level. Many businesses and organizations use this technique to improve production as well as employee and service efficiency. However, the user reviews in our study were unstructured and contained spelling errors, which makes classification difficult for both users and machines. To address this problem, supervised Machine Learning (ML) algorithms can be applied to classify the polarity of the extracted data as positive, negative or neutral. In this research, we compared nine ML algorithms to determine the most suitable one for sentiment polarity classification of customer reviews in Thai, which is a low-resource language. The dataset was collected manually from two online agencies (Agoda.com and Booking.com) and consists of reviews written in Thai. We employed 11 preprocessing steps to clean the data and handle the large amount of noise. Next, the Delta TF-IDF, TF-IDF, N-Gram, and Word2Vec techniques were applied to convert the text reviews into vectors, which were then processed with the different ML algorithms to classify sentiment polarity and allow an accurate comparison. All ML algorithms were evaluated with ten-fold cross-validation, comparing recall, precision, F1-score and accuracy. The experimental results show that the Support Vector Machine (SVM) using the Delta TF-IDF technique was the best ML algorithm for polarity classification of hotel reviews in the Thai language, with the highest accuracy of 89.96%. The results of this research can be applied as a tool for small and medium-sized enterprises within the field of sentiment analysis of the Thai language in the hotel domain.
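The Delta TF-IDF weighting named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a binary positive/negative setting, reviews that are already segmented into tokens, add-one smoothing, and a sign convention in which positive-leaning terms receive positive weights (some formulations use the opposite sign). The function name `delta_tfidf` and the toy English tokens are hypothetical.

```python
import math
from collections import Counter


def delta_tfidf(docs, labels):
    """Weight each term by its count times the log-ratio of its
    document frequency in positive versus negative training reviews.

    docs   : list of token lists (assumed already word-segmented)
    labels : list of 1 (positive) or 0 (negative), one per document
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos

    # Document frequency of every term within each class.
    df_pos, df_neg = Counter(), Counter()
    for tokens, y in zip(docs, labels):
        (df_pos if y == 1 else df_neg).update(set(tokens))

    vectors = []
    for tokens in docs:
        vec = {}
        for term, count in Counter(tokens).items():
            # Add-one smoothing (an assumption of this sketch) keeps the
            # ratio finite when a term never occurs in one of the classes.
            pos_ratio = (df_pos[term] + 1) / (n_pos + 1)
            neg_ratio = (df_neg[term] + 1) / (n_neg + 1)
            vec[term] = count * math.log2(pos_ratio / neg_ratio)
        vectors.append(vec)
    return vectors


# Toy data; real input would be segmented Thai hotel reviews.
docs = [["good", "room"], ["good", "clean"], ["bad", "room"], ["bad", "dirty"]]
labels = [1, 1, 0, 0]
vectors = delta_tfidf(docs, labels)
```

Terms that appear equally often in both classes (here "room") receive a weight of zero, so a downstream classifier such as an SVM sees only the terms that discriminate between polarities.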
Pages: 155-171
Number of pages: 17
Related papers
50 in total
  • [1] Sentiment Analysis Using Machine Learning Algorithms
    Jemai, Fatma
    Hayouni, Mohamed
    Baccar, Sahbi
    [J]. IWCMC 2021: 2021 17TH INTERNATIONAL WIRELESS COMMUNICATIONS & MOBILE COMPUTING CONFERENCE (IWCMC), 2021: 775-779
  • [2] Sentiment Analysis of Twitter Posts using Machine Learning Algorithms
    Gupta, Ashutosh
    Singh, Anusha
    Pandita, Ishan
    Parashar, Harsh
    [J]. PROCEEDINGS OF THE 2019 6TH INTERNATIONAL CONFERENCE ON COMPUTING FOR SUSTAINABLE GLOBAL DEVELOPMENT (INDIACOM), 2019: 980-983
  • [3] Sentiment Analysis on Different Domains Using Machine Learning Algorithms
    Ahuja, Ravinder
    Sharma, S. C.
    [J]. ADVANCES IN DATA AND INFORMATION SCIENCES, 2022, 318: 143-153
  • [4] Domain Based Sentiment Analysis in Regional Language-Kannada using Machine Learning Algorithm
    Rohini, V
    Thomas, Merin
    Latha, C. A.
    [J]. 2016 IEEE INTERNATIONAL CONFERENCE ON RECENT TRENDS IN ELECTRONICS, INFORMATION & COMMUNICATION TECHNOLOGY (RTEICT), 2016: 503-507
  • [5] Thai Sentiment Analysis for Social Media Monitoring using Machine Learning Approach
    Srikamdee, Supawadee
    Suksawatchon, Ureerat
    Suksawatchon, Jakkarin
    [J]. 2022 37TH INTERNATIONAL TECHNICAL CONFERENCE ON CIRCUITS/SYSTEMS, COMPUTERS AND COMMUNICATIONS (ITC-CSCC 2022), 2022: 832-835
  • [6] Evaluation of Social Human Sentiment Analysis Using Machine Learning Algorithms
    Agarwal, Anjali
    Das, Ajanta
    Das, Roshni Rupali
    [J]. PROCEEDINGS OF SECOND INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTER ENGINEERING AND COMMUNICATION SYSTEMS, ICACECS 2021, 2022: 255-263
  • [7] Sentiment Analysis of Social Media Comments Using Machine Learning Algorithms
    Taghiyeva, Laman
    Hasanova, Narmin
    Omarova, Masuda
    Rustamov, Samir
    [J]. 2023 5th International Conference on Problems of Cybernetics and Informatics, PCI 2023, 2023
  • [8] Sentiment Analysis for Arabic Reviews using Machine Learning Classification Algorithms
    Sayed, Awny A.
    Elgeldawi, Enas
    Zaki, Alaa M.
    Galal, Ahmed R.
    [J]. PROCEEDINGS OF 2020 INTERNATIONAL CONFERENCE ON INNOVATIVE TRENDS IN COMMUNICATION AND COMPUTER ENGINEERING (ITCE), 2020: 56-63
  • [9] Cross Domain Sentiment Analysis Using Different Machine Learning Techniques
    Mahalakshmi, S.
    Sivasankar, E.
    [J]. PROCEEDINGS OF THE FIFTH INTERNATIONAL CONFERENCE ON FUZZY AND NEURO COMPUTING (FANCCO - 2015), 2015, 415: 77-87
  • [10] Sentiment Analysis of Movie Reviews in Hindi Language using Machine Learning
    Nanda, Charu
    Dua, Mohit
    Nanda, Garima
    [J]. PROCEEDINGS OF THE 2018 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATION AND SIGNAL PROCESSING (ICCSP), 2018: 1069-1072