Evaluating Ensembled Transformers for Multilingual Code-Switched Sentiment Analysis

Authors
Aryal, Saurav K. [1]
Prioleau, Howard [1]
Washington, Gloria [1]
Burge, Legand [1]
Affiliations
[1] Howard Univ, Comp Sci, Washington, DC 20059 USA
Funding
National Institutes of Health (NIH)
Keywords
Code Switching; Ensembling; BERT; Transformers
DOI
10.1109/CSCI62032.2023.00032
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Sentiment analysis is essential for understanding human-authored texts, especially in multilingual communities where code-switching is common. Most existing research focuses on single-language pair sentiment analysis. We introduce a three-step approach for sentiment analysis on code-switched data: translating the code-switched data into English at word and sentence levels, training on Transformer models, and utilizing a stacking classifier to ensemble the models for sentiment classification. We establish a performance benchmark for binary and ternary sentiment classification by applying this to five datasets featuring English mixed with Spanish, Tamil, Telugu, Hindi, and Malayalam. Our method emphasizes the potential of ensembled Transformer models in this domain, paving the way for future advancements.
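The stacking step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the base-model probabilities below are made-up stand-ins for held-out Transformer outputs, the function names are hypothetical, and the choice of a logistic-regression meta-learner is an assumption, since the abstract does not say which meta-learner was used. The idea shown is only the general one: each base model's class probabilities become input features for a second-level classifier.

```python
import math


def stack_features(per_model_probs):
    """Concatenate every base model's class probabilities, example by example."""
    n_examples = len(per_model_probs[0])
    return [
        [p for model in per_model_probs for p in model[i]]
        for i in range(n_examples)
    ]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(weights, bias, x):
    """Linear score of one stacked feature vector."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def train_meta_learner(features, labels, lr=0.5, epochs=500):
    """Fit a binary logistic-regression meta-learner by batch gradient descent."""
    n = len(features)
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(features, labels):
            err = sigmoid(score(weights, bias, x)) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(weights, bias, features):
    """Return 1 (positive) when the meta-learner's probability is at least 0.5."""
    return [int(sigmoid(score(weights, bias, x)) >= 0.5) for x in features]


# Hypothetical held-out outputs [P(negative), P(positive)] from two base models.
# model_b is deliberately wrong on the second and fourth examples.
model_a = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
model_b = [[0.8, 0.2], [0.4, 0.6], [0.2, 0.8], [0.6, 0.4]]
labels = [0, 0, 1, 1]

features = stack_features([model_a, model_b])
weights, bias = train_meta_learner(features, labels)
predictions = predict(weights, bias, features)
print(predictions)
```

In this toy setup the meta-learner ends up weighting the more reliable base model more heavily, which is the usual motivation for stacking over simple majority voting. In practice the meta-learner should be fit on predictions for data the base models did not train on, and a ternary task would need a multiclass meta-learner rather than the binary one sketched here.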
Pages: 165-173
Page count: 9
Related papers
  • [1] Late Fusion of Transformers for Sentiment Analysis of Code-Switched Data
    Sharma, Gagan
    Chinmay, R.
    Sharma, Raksha
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS - EMNLP 2023, 2023: 6485-6490
  • [2] Towards Zero-Shot Multilingual Transfer for Code-Switched Responses
    Wu, Ting-Wei
    Zhao, Changsheng
    Chang, Ernie
    Shi, Yangyang
    Chuang, Pierce
    Chandra, Vikas
    Juang, Biing
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 1, 2023: 7551-7563
  • [3] Sentiment Analysis on Code-Switched Dravidian Languages with Kernel Based Extreme Learning Machines
    Kumar, Mithun S. R.
    Kumar, Lov
    Malapati, Aruna
    PROCEEDINGS OF THE SECOND WORKSHOP ON SPEECH AND LANGUAGE TECHNOLOGIES FOR DRAVIDIAN LANGUAGES (DRAVIDIANLANGTECH 2022), 2022: 184-190
  • [4] Representativeness as a Forgotten Lesson for Multilingual and Code-switched Data Collection and Preparation
    Dogruoz, A. Seza
    Sitaram, Sunayana
    Yong, Zheng-Xin
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS - EMNLP 2023, 2023: 5751-5767
  • [5] Sentiment Analysis of Code-Switched Tunisian Dialect: Exploring RNN-Based Techniques
    Jerbi, Mohamed Amine
    Achour, Hadhemi
    Souissi, Emna
    ARABIC LANGUAGE PROCESSING: FROM THEORY TO PRACTICE, ICALP 2019, 2019, 1108: 122-131
  • [6] Approaches for Multilingual Phone Recognition in Code-switched and Non-code-switched Scenarios Using Indian Languages
    Manjunath, K. E.
    Raghavan, Srinivasa K. M.
    Rao, K. Sreenivasa
    Jayagopi, Dinesh Babu
    Ramasubramanian, V
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2021, 20 (04)
  • [7] A First South African Corpus of Multilingual Code-switched Soap Opera Speech
    van der Westhuizen, Ewald
    Niesler, Thomas
    PROCEEDINGS OF THE ELEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2018), 2018: 2854-2859
  • [8] Adapting Deep Learning for Sentiment Classification of Code-Switched Informal Short Text
    Shakeel, Muhammad Haroon
    Karim, Asim
    PROCEEDINGS OF THE 35TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING (SAC'20), 2020: 903-906
  • [9] Improving Low Resource Code-switched ASR using Augmented Code-switched TTS
    Sharma, Yash
    Abraham, Basil
    Taneja, Karan
    Jyothi, Preethi
    INTERSPEECH 2020, 2020: 4771-4775
  • [10] Multi-label Masked Language Modeling on Zero-shot Code-switched Sentiment Analysis
    Li, Zhi
    Gao, Xing
    Zhang, Ji
    Zhang, Yin
    PROCEEDINGS OF THE 45TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '22), 2022: 2663-2668