Emo-SL Framework: Emoji Sentiment Lexicon Using Text-Based Features and Machine Learning for Sentiment Analysis

Cited by: 3
Authors
Alfreihat, Manar [1]
Almousa, Omar Saad [1]
Tashtoush, Yahya [1]
Alsobeh, Anas [2, 3]
Mansour, Khalid [4]
Migdady, Hazem [5]
Affiliations
[1] Jordan Univ Sci & Technol, Dept Comp Sci, Irbid 22110, Jordan
[2] Yarmouk Univ, Fac Informat Technol & Comp Sci, Irbid 21163, Jordan
[3] Southern Illinois Univ, Sch Comp, Carbondale, IL 62901 USA
[4] Kingdom Univ, Coll Informat Technol, Riffa 3903, Bahrain
[5] Oman Coll Management & Technol, Barka 320, Oman
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Emoji sentiment lexicon for Arabic (Emo-SL); Arabic-language tweets; machine learning (ML); social media analysis; VADER model; data modeling and analysis; X tweets; EMOTICON; TWEETS
DOI
10.1109/ACCESS.2024.3382836
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
With the rise of social media networks, the analysis of sentiment and opinions in textual data has gained significant importance. However, sentiment analysis of informal Arabic text is challenging because of morphological complexity and dialectal variation. This research aims to develop an Emoji Sentiment Lexicon (Emo-SL) tailored to Arabic-language tweets and to demonstrate performance improvements from combining emoji-based features with machine learning (ML) for sentiment classification. We constructed the Emo-SL from a corpus of 58K Arabic tweets containing emojis, calculating sentiment scores for 222 frequently occurring emojis based on their distribution across positive and negative categories. Emoji weighting is integrated with lexicon-based text feature extraction to train classifiers on an Arabic tweet dataset. ML models, including Support Vector Machines (SVM), Naive Bayes, Random Forests, and K-Nearest Neighbors (KNN), are evaluated after optimal preprocessing and normalization. The results show that adding Emo-SL-derived emoji features to ML classifiers can improve accuracy by 26.7% over textual features alone. The emoji-aware integrated approach achieves an 89% F1 score, outperforming the rule-based VADER sentiment analyzer. Analysis of n-gram impacts further confirms the value of fusing emoji and text semantics for Arabic sentiment classification. The Emo-SL lexicon provides an effective framework for extracting nuanced emotional insights from noisy micro-text, demonstrating the potential of contextualized emoji understanding to advance multilingual sentiment analysis.
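The abstract says emoji scores are derived from how each emoji is distributed across positive and negative tweets, but it does not give the formula. The following is a minimal sketch of one plausible scoring rule, assuming a normalized difference (pos - neg) / (pos + neg); the function names, the `min_count` threshold, and the toy tweets are hypothetical and are not taken from the paper.

```python
from collections import Counter

def build_emoji_lexicon(labeled_tweets, emojis, min_count=1):
    """Score each emoji in [-1, 1] from its positive/negative tweet counts.

    Assumed rule (not confirmed by the paper): (pos - neg) / (pos + neg).
    """
    pos, neg = Counter(), Counter()
    for text, label in labeled_tweets:
        # Count each emoji at most once per tweet.
        present = {e for e in emojis if e in text}
        if label == "positive":
            pos.update(present)
        elif label == "negative":
            neg.update(present)
    lexicon = {}
    for e in emojis:
        total = pos[e] + neg[e]
        if total >= min_count:  # skip emojis that are too rare to score
            lexicon[e] = (pos[e] - neg[e]) / total
    return lexicon

def emoji_score(text, lexicon):
    """Average lexicon score of the emojis found in a tweet (0.0 if none)."""
    hits = [score for e, score in lexicon.items() if e in text]
    return sum(hits) / len(hits) if hits else 0.0

# Toy labeled tweets, invented for illustration only.
tweets = [
    ("يوم جميل 😍", "positive"),
    ("خدمة ممتازة 😍👍", "positive"),
    ("تجربة سيئة 😡", "negative"),
    ("تأخير مرة أخرى 😡👍", "negative"),
]
lexicon = build_emoji_lexicon(tweets, ["😍", "😡", "👍"])
feature = emoji_score("شكرا 😍👍", lexicon)
```

In a pipeline like the one described, a value such as `feature` could be appended to the text-based feature vector before training a classifier; how the paper actually weights and fuses the two is not stated in this record.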
Pages: 81793-81812
Number of pages: 20