Online Unstructured Data Analysis Models with KoBERT and Word2vec: A Study on Sentiment Analysis of Public Opinion in Korean

Cited by: 2
Authors
Baek, Changwon [1]
Kang, Jiho [2]
Choi, Sangsoo [1]
Affiliations
[1] Korea Inst Sci & Technol KIST, Technol Convergence Ctr, Seoul, South Korea
[2] Korea Univ, Inst Engn Res, Seoul, South Korea
Keywords
KoBERT; Word2vec; Public opinion analysis; Sentiment classification; INTERNET
DOI
10.5391/IJFIS.2023.23.3.244
Chinese Library Classification (CLC) number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
Online news articles and comments play a vital role in shaping public opinion, and numerous studies have used them as raw data for online opinion analysis. Sentiment analysis of public opinion based on Bidirectional Encoder Representations from Transformers (BERT) has recently attracted significant attention. However, applying BERT to Korean is challenging because of its limited linguistic versatility and its low accuracy in domains with insufficient training data. Conventional public opinion analysis focuses on term frequency; hence, low-frequency words are likely to be excluded because their importance is underestimated. This study aimed to address these issues and facilitate the analysis of public opinion regarding Korean news articles and comments. We propose a method for analyzing public opinion that uses word2vec to overcome the limitations of word-frequency-centered analysis, in conjunction with KoBERT, a BERT variant optimized for the Korean language. Naver news articles and comments were analyzed using a sentiment classification model developed on the KoBERT framework. The experiment demonstrated a sentiment classification accuracy of over 90%. Thus, the proposed method yields faster and more precise results than conventional methods. In addition, words with a low frequency of occurrence but high relevance can be identified using word2vec.
Pages: 244-258
Number of pages: 15
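
Illustrative sketch (not taken from the paper): the abstract states that word2vec can surface words that occur rarely but are highly relevant. One way to picture this is to rank words by the cosine similarity of their embeddings to a query word and keep only those below a frequency cutoff. The vectors, counts, thresholds, and the function name `low_freq_high_relevance` below are all hypothetical toy values chosen for illustration; in practice the vectors would come from a word2vec model trained on the news and comment corpus.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def low_freq_high_relevance(query, vectors, counts, max_count, min_sim):
    """Return (word, similarity, count) for rare words whose embedding is
    close to the embedding of `query`, most similar first."""
    q = vectors[query]
    hits = []
    for word, vec in vectors.items():
        if word == query:
            continue
        sim = cosine(q, vec)
        # Keep words that a pure frequency ranking would drop
        # but that sit near the query in embedding space.
        if counts[word] <= max_count and sim >= min_sim:
            hits.append((word, sim, counts[word]))
    return sorted(hits, key=lambda h: h[1], reverse=True)


# Hypothetical toy data: made-up term counts and 3-dimensional embeddings.
counts = Counter(
    {"vaccine": 120, "government": 95, "side_effect": 3, "weather": 4, "booster": 2}
)
vectors = {
    "vaccine": [0.9, 0.1, 0.0],
    "government": [0.5, 0.8, 0.1],
    "side_effect": [0.8, 0.2, 0.1],
    "weather": [0.0, 0.1, 0.9],
    "booster": [0.85, 0.05, 0.05],
}

hits = low_freq_high_relevance("vaccine", vectors, counts, max_count=5, min_sim=0.8)
for word, sim, count in hits:
    print(f"{word}: similarity={sim:.3f}, count={count}")
```

In this toy setup, "weather" is rare but unrelated to the query and "government" is frequent, so neither is returned; only the rare words near "vaccine" remain. The cutoffs `max_count` and `min_sim` are assumptions that would need tuning on real data.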