Creating sentiment lexicon for sentiment analysis in Urdu: The case of a resource-poor language

Cited by: 35
Authors
Asghar, Muhammad Zubair [1]
Sattar, Anum [1]
Khan, Aurangzeb [2]
Ali, Amjad [3]
Kundi, Fazal Masud [1]
Ahmad, Shakeel [4]
Affiliations
[1] Gomal Univ, ICIT, Dera Ismail Khan, KP, Pakistan
[2] Univ Sci & Technol, Dept Comp Sci, Bannu, Pakistan
[3] Univ Swat, Dept Comp & Software Technol, Saidu Sharif, Pakistan
[4] King Abdul Aziz Univ KAU, FCITR, Jeddah, Saudi Arabia
Keywords
polarity lexicon; sentiment analysis; Urdu sentiment lexicon; Urdu SentiWordNet; FRAMEWORK;
DOI
10.1111/exsy.12397
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Sentiment analysis (SA) applications are becoming popular among individuals and organizations for gathering and analysing users' sentiments about products, services, policies, and current affairs. Due to the availability of a wide range of English lexical resources, such as part-of-speech taggers, parsers, and polarity lexicons, the development of sophisticated SA applications for the English language has attracted many researchers. Although there have been efforts to create polarity lexicons in non-English languages such as Urdu, they suffer from many deficiencies, such as the lack of publicly available sentiment lexicons with a proper scoring mechanism for opinion words and modifiers. In this work, we present a word-level translation scheme for creating a first comprehensive Urdu polarity resource, "Urdu Lexicon", by merging existing resources: a list of English opinion words, SentiWordNet, an English-Urdu bilingual dictionary, and a collection of Urdu modifiers. We assign two polarity scores, positive and negative, to each Urdu opinion word. Moreover, modifiers are collected, classified, and tagged with proper polarity scores. We also perform an extrinsic evaluation in terms of subjectivity detection and sentiment classification, and the evaluation results show that the polarity scores assigned by this technique are more accurate than those of the baseline methods.
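The word-level translation idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The word list, the (positive, negative) score pairs, and the romanized Urdu entries are hypothetical toy data standing in for the English opinion-word list, SentiWordNet, and the English-Urdu bilingual dictionary. Averaging the scores when several English words map to the same Urdu word is an assumption made here for illustration; the abstract does not say how such cases are resolved.

```python
from collections import defaultdict

# Hypothetical toy stand-ins for the three resources named in the abstract.
english_opinion_words = ["good", "bad", "beautiful", "nice"]
# (positive, negative) score pairs; illustrative values, not real SentiWordNet data.
english_scores = {
    "good": (0.75, 0.0),
    "bad": (0.0, 0.625),
    "beautiful": (0.75, 0.0),
    "nice": (0.25, 0.0),
}
# Romanized placeholders for Urdu translations (assumed for this example).
english_to_urdu = {
    "good": ["acha"],
    "nice": ["acha"],
    "bad": ["bura", "kharab"],
    "beautiful": ["khubsurat"],
}


def build_urdu_lexicon(opinion_words, scores, dictionary):
    """Transfer (positive, negative) scores from English words to their Urdu translations."""
    collected = defaultdict(list)
    for word in opinion_words:
        if word not in scores:
            continue  # no polarity information available for this word
        for urdu_word in dictionary.get(word, []):
            collected[urdu_word].append(scores[word])

    lexicon = {}
    for urdu_word, pairs in collected.items():
        # Assumption: average the scores of all English sources of an Urdu word.
        positive = sum(p for p, _ in pairs) / len(pairs)
        negative = sum(n for _, n in pairs) / len(pairs)
        lexicon[urdu_word] = (positive, negative)
    return lexicon


urdu_lexicon = build_urdu_lexicon(english_opinion_words, english_scores, english_to_urdu)
for urdu_word, (positive, negative) in sorted(urdu_lexicon.items()):
    print(f"{urdu_word}: positive={positive:.3f}, negative={negative:.3f}")
```

Each Urdu entry ends up with two scores, matching the abstract's statement that every opinion word receives a positive and a negative polarity score. Modifier handling is left out of this sketch.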
Pages: 19