Sentiment Analysis of Code-Mixed Roman Urdu-English Social Media Text using Deep Learning Approaches

Cited by: 12
Authors
Younas, Aqsa [1]
Nasim, Raheela [1]
Ali, Saqib [1, 2]
Wang, Guojun [2]
Qi, Fang [3]
Affiliations
[1] Univ Agr Faisalabad, Dept Comp Sci, Faisalabad 38000, Pakistan
[2] Guangzhou Univ, Sch Comp Sci, Guangzhou 510006, Peoples R China
[3] Cent South Univ, Sch Comp Sci & Engn, Changsha 410083, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Sentiment analysis; Code-mixed text; Roman Urdu; Deep learning; XLM-RoBERTa; Multilingual BERT
DOI
10.1109/CSE50738.2020.00017
Chinese Library Classification
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Sentiment analysis is the computational study of attitudes, opinions, and sentiments toward issues, products, individuals, and organizations. Companies and customers increasingly base decisions on opinions expressed in social media text. Sentiment analysis is becoming more capable with advances in artificial intelligence and natural language processing. With the rapid growth of social media use, a large volume of the text on these platforms is written in imperfect and informal language, such as Roman Urdu mixed with English. Existing sentiment analysis techniques do not perform accurately on such code-mixed, informal, and poorly resourced languages. A promising solution is to apply deep learning models to code-mixed Roman Urdu and English text. The objective of this paper is therefore to perform sentiment analysis of code-mixed Roman Urdu and English social media text using state-of-the-art deep learning models. Our work is independent of lexical normalization, language dictionaries, and code transfer indication. We perform sentiment analysis using the Multilingual BERT (mBERT) and XLM-RoBERTa (XLM-R) models. The results show that the XLM-R model with tuned hyperparameters performs better on code-mixed Roman Urdu and English social media text than the mBERT model, with an F1 score of 71%.
Pages: 66-71
Number of pages: 6
Related papers
50 in total
  • [1] Emotion Detection in Code-Mixed Roman Urdu - English Text
    Ilyas, Abdullah
    Shahzad, Khurram
    Malik, Muhammad Kamran
    [J]. ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2023, 22 (02)
  • [2] Social media text analytics of Malayalam–English code-mixed using deep learning
    S. Thara
    Prabaharan Poornachandran
    [J]. Journal of Big Data, 9
  • [3] Bilingual Sentiment Analysis for a Code-mixed Punjabi English Social Media Text
    Yadav, Konark
    Lamba, Aashish
    Gupta, Dhruv
    Gupta, Ansh
    Karmakar, Purnendu
    Saini, Sandeep
    [J]. PROCEEDINGS OF THE 2020 5TH INTERNATIONAL CONFERENCE ON COMPUTING, COMMUNICATION AND SECURITY (ICCCS-2020), 2020
  • [4] Sentiment Analysis of Code-Mixed Bambara-French Social Media Text Using Deep Learning Techniques
    Arouna KONATE
    DU Ruiying
    [J]. Wuhan University Journal of Natural Sciences, 2018, 23 (03) : 237 - 243
  • [5] Social media text analytics of Malayalam-English code-mixed using deep learning
    Thara, S.
    Poornachandran, Prabaharan
    [J]. JOURNAL OF BIG DATA, 2022, 9 (01)
  • [6] Deep Learning Based Sentiment Analysis in a Code-Mixed English-Hindi and English-Bengali Social Media Corpus
    Jamatia, Anupam
    Swamy, Steve Durairaj
    Gamback, Bjorn
    Das, Amitava
    Debbarma, Swapan
    [J]. INTERNATIONAL JOURNAL ON ARTIFICIAL INTELLIGENCE TOOLS, 2020, 29 (05)
  • [7] Deep Sentiment Analysis Using CNN-LSTM Architecture of English and Roman Urdu Text Shared in Social Media
    Khan, Lal
    Amjad, Ammar
    Afaq, Kanwar Muhammad
    Chang, Hsien-Tsung
    [J]. APPLIED SCIENCES-BASEL, 2022, 12 (05)
  • [8] Sentiment Analysis for Code-Mixed Indian Social Media Text With Distributed Representation
    Shalini, K.
    Ganesh, Barathi H. B.
    Kumar, Anand M.
    Soman, K. P.
    [J]. 2018 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI), 2018, : 1126 - 1131
  • [9] An Unsupervised Approach for Sentiment Analysis on Social Media Short Text Classification in Roman Urdu
    Rana, Toqir A.
    Shahzadi, Kiran
    Rana, Tauseef
    Arshad, Ahsan
    Tubishat, Mohammad
    [J]. ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2022, 21 (02)
  • [10] Code-Mixed Sentiment Analysis using Transformer for Twitter Social Media Data
    Astuti, Laksmita Widya
    Sari, Yunita
    Suprapto
    [J]. INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2023, 14 (10) : 498 - 504