A Roman Urdu Corpus for sentiment analysis

Cited by: 1
Authors
Khan, Marwa [1]
Naseer, Asma [1]
Wali, Aamir [1]
Tamoor, Maria [2]
Affiliations
[1] Natl Univ Comp & Emerging Sci, FAST Sch Comp, 852-B, Lahore, Pakistan
[2] Forman Christian Coll Univ, Dept Comp Sci, Zahoor Ilahi Rd, Lahore, Pakistan
DOI
10.1093/comjnl/bxae052
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Sentiment analysis is a dynamic field focused on understanding and predicting emotional sentiments in text or images. With the prevalence of smartphones, e-commerce and social networks, individuals readily express opinions, aiding businesses, political analysts and organizations in decision-making. Despite extensive research in sentiment analysis for various languages, challenges persist in low-resource languages like Roman Urdu. Roman Urdu, the use of Roman script to write Urdu, has gained popularity, yet limited linguistic resources hinder sentiment analysis research. This study addresses this gap by developing a bidirectional long short-term memory network with FastText embeddings and additional layers. A large Roman Urdu corpus for sentiment analysis, consisting of over 51 000 reviews, is created, and the proposed model is trained and compared with 14 other models, demonstrating an accuracy of 0.854 and an F1-score of 0.84.
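The abstract reports both an accuracy and an F1-score. As a minimal sketch of how these two metrics are commonly computed for a multi-class sentiment task, the example below uses plain Python. The label names (`pos`, `neg`, `neu`), the toy data, and the choice of macro averaging are assumptions for illustration only; the record does not state the corpus's label set or which F1 averaging the authors used.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the gold labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging, an assumption)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # gold class t was missed
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels, not taken from the corpus
gold = ["pos", "neg", "neu", "pos", "neg", "pos"]
pred = ["pos", "neg", "pos", "pos", "neu", "pos"]

print(f"accuracy = {accuracy(gold, pred):.3f}")
print(f"macro F1 = {macro_f1(gold, pred):.3f}")
```

Accuracy and F1 can diverge when classes are imbalanced, which may be one reason a record like this lists both figures.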
Pages: 13