A Roman Urdu Corpus for sentiment analysis

Authors
Khan, Marwa [1]
Naseer, Asma [1]
Wali, Aamir [1]
Tamoor, Maria [2]
Affiliations
[1] Natl Univ Comp & Emerging Sci, FAST Sch Comp, 852-B, Lahore, Pakistan
[2] Forman Christian Coll Univ, Dept Comp Sci, Zahoor Ilahi Rd, Lahore, Pakistan
DOI
10.1093/comjnl/bxae052
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Sentiment analysis is a dynamic field focused on understanding and predicting emotional sentiments in text or images. With the prevalence of smartphones, e-commerce and social networks, individuals readily express opinions, aiding businesses, political analysts and organizations in decision-making. Despite extensive research in sentiment analysis for various languages, challenges persist in low-resource languages like Roman Urdu. Roman Urdu, the use of Roman script to write Urdu, has gained popularity, yet limited linguistic resources hinder sentiment analysis research. This study addresses this gap by developing a bidirectional long short-term memory network with FastText embeddings and additional layers. A large Roman Urdu corpus for sentiment analysis, consisting of over 51 000 reviews, is created and the proposed model is trained and compared with 14 other models, demonstrating an accuracy of 0.854 and an F1-score of 0.84.
Pages: 13