Urdu Sentiment Analysis

Cited by: 5
Authors
Rehman, Iffraah [1]
Soomro, Tariq Rahim [1]
Affiliations
[1] Inst Business Management IoBM, CCSIS, Karachi, Pakistan
Keywords
Machine learning algorithms; sentiment analysis; Tweepy; WEKA; TEXT
DOI
10.2478/acss-2022-0004
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
The world is heading towards more modernized and digitalized data, and the number of active social media users is therefore growing significantly with each passing day. Each post and comment can give insight into valuable information about a certain topic or issue, a product or a brand, etc. The process of uncovering the underlying information from the opinion that a person holds about an entity is called sentiment analysis. The analysis can be carried out through two main approaches, i.e., either lexicon-based or machine learning algorithms. A significant amount of sentiment analysis work has been done in different domains and in numerous languages, but minimal research has been conducted on Urdu, the national language of Pakistan. Twitter users who are familiar with Urdu post tweets in two different textual formats, either in Urdu script (Nastaleeq) or in Roman Urdu. Thus, the paper is an attempt to perform sentiment analysis on the Urdu language by extracting tweets (both Nastaleeq and Roman Urdu) from Twitter using the Tweepy API. A machine learning-based approach has been adopted for this study, and the tool chosen for the purpose is WEKA. The best algorithm was identified based on evaluation metrics, which comprise the number of correctly and incorrectly classified instances, accuracy, precision, and recall. SMO was found to be the most suitable machine learning algorithm for performing sentiment analysis on Urdu (Nastaleeq) tweets, while for Roman Urdu the Random Forest algorithm was identified as the best one.
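The evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes a two-class labelling ("positive" / "negative") and a small made-up set of labels; the function name `evaluate` and the sample data are hypothetical and are not taken from the paper, which reports these figures from WEKA rather than from custom code.

```python
from collections import Counter


def evaluate(actual, predicted, positive="positive"):
    """Compute the metrics listed in the abstract for one target class.

    `actual` and `predicted` are equal-length lists of class labels.
    `positive` is the label treated as the class of interest (an assumption).
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")

    counts = Counter()
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            counts["tp"] += 1  # true positive
        elif p == positive:
            counts["fp"] += 1  # false positive
        elif a == positive:
            counts["fn"] += 1  # false negative

    total = len(actual)
    correct = sum(a == p for a, p in zip(actual, predicted))
    predicted_pos = counts["tp"] + counts["fp"]
    actual_pos = counts["tp"] + counts["fn"]

    return {
        "correct": correct,
        "incorrect": total - correct,
        "accuracy": correct / total if total else 0.0,
        "precision": counts["tp"] / predicted_pos if predicted_pos else 0.0,
        "recall": counts["tp"] / actual_pos if actual_pos else 0.0,
    }


# Illustrative labels only; not data from the study.
actual = ["positive", "negative", "positive", "positive", "negative", "negative"]
predicted = ["positive", "positive", "positive", "negative", "negative", "negative"]
metrics = evaluate(actual, predicted)
print(metrics)
```

Precision and recall are computed per class here; averaging them across classes, as classification tools commonly do in their summaries, would be a separate step.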
Pages: 30 - 42
Number of pages: 13
Related papers
50 in total
  • [1] Urdu Sentiment Analysis
    Khan, Khairullah
    Rahman, Atta Ur
    Khan, Aurangzeb
    Khan, Ashraf Ullah
    Saqia, Bibi
    Khan, Wahab
    Khans, Asfandyar
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2018, 9 (09) : 646 - 651
  • [2] Sentiment Analysis for Roman Urdu
    Rafique, Ayesha
    Malik, Muhammad Kamran
    Nawaz, Zubair
    Bukhari, Faisal
    Jalbani, Akhtar Hussain
    MEHRAN UNIVERSITY RESEARCH JOURNAL OF ENGINEERING AND TECHNOLOGY, 2019, 38 (02) : 463 - 470
  • [3] A Roman Urdu Corpus for sentiment analysis
    Khan, Marwa
    Naseer, Asma
    Wali, Aamir
    Tamoor, Maria
    COMPUTER JOURNAL, 2024, 67 (09) : 2864 - 2876
  • [4] A Roman Urdu Corpus for sentiment analysis
    Khan, Marwa
    Naseer, Asma
    Wali, Aamir
    Tamoor, Maria
    COMPUTER JOURNAL, 2024
  • [5] Sentiment Analysis System for Roman Urdu
    Mehmood, Khawar
    Essam, Daryl
    Shafi, Kamran
    INTELLIGENT COMPUTING, VOL 1, 2019, 858 : 29 - 42
  • [6] A Review of Urdu Sentiment Analysis with Multilingual Perspective: A Case of Urdu and Roman Urdu Language
    Khan, Ihsan Ullah
    Khan, Aurangzeb
    Khan, Wahab
    Su'ud, Mazliham Mohd
    Alam, Muhammad Mansoor
    Subhan, Fazli
    Asghar, Muhammad Zubair
    COMPUTERS, 2022, 11 (01)
  • [7] Urdu Sentiment Analysis With Deep Learning Methods
    Khan, Lal
    Amjad, Ammar
    Ashraf, Noman
    Chang, Hsien-Tsung
    Gelbukh, Alexander
    IEEE ACCESS, 2021, 9 : 97803 - 97812
  • [8] RUSAS: Roman Urdu Sentiment Analysis System
    Jawad, Kazim
    Ahmad, Muhammad
    Alvi, Majdah
    Alvi, Muhammad Bux
    CMC-COMPUTERS MATERIALS & CONTINUA, 2024, 79 (01) : 1463 - 1480
  • [9] A machine learning approach for urdu text sentiment analysis
    Akhtar, Muhammad
    Shoukat, Rana Saud
    Rehman, Saif Ur
    MEHRAN UNIVERSITY RESEARCH JOURNAL OF ENGINEERING AND TECHNOLOGY, 2023, 42 (02) : 75 - 87
  • [10] Medical assistant chatbot Urdu text sentiment analysis
    Syeda Haneen Ashfaq
    Muhammad Ameen Chhajro
    Shahbaz Khan
    Asif Ali Laghari
    Human-Intelligent Systems Integration, 2024, 6 (1) : 131 - 144