SentiALG: Automated Corpus Annotation for Algerian Sentiment Analysis

Cited by: 21
Authors
Guellil, Imane [1, 2]
Adeel, Ahsan [3]
Azouaou, Faical [2]
Hussain, Amir [3]
Affiliations
[1] Ecole Super Sci Appl Alger ESSA, Algiers, Algeria
[2] Ecole Natl Super Informat, Lab Methodes Concept Syst LMCS, BP 68M, Algiers 16309, Algeria
[3] Univ Stirling, Inst Comp Sci & Math, Sch Nat Sci, Stirling, Scotland
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK;
Keywords
Arabic sentiment analysis; Algerian dialect; Sentiment lexicon; Sentiment corpus; Sentiment classification;
DOI
10.1007/978-3-030-00563-4_54
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Data annotation is an important but time-consuming and costly procedure. To sort a text into two classes, the very first thing we need is a good annotation guideline that establishes what is required to qualify for each class. In the literature, the difficulties associated with appropriate data annotation have been underestimated. In this paper, we present a novel approach to automatically construct an annotated sentiment corpus for the Algerian dialect (a Maghrebi Arabic dialect). The construction of this corpus is based on an Algerian sentiment lexicon that is also constructed automatically. The presented work deals with the two scripts widely used on Arabic social media: Arabic and Arabizi. The proposed approach automatically constructs a sentiment corpus containing 8000 messages (4000 in Arabic and 4000 in Arabizi). The achieved F1-score is up to 72% and 78% for the Arabic and Arabizi test sets, respectively. Ongoing work is aimed at integrating a transliteration process for Arabizi messages to further improve the obtained results.
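The abstract does not spell out how lexicon entries are turned into message labels, so the following is only a generic sketch of lexicon-based automatic annotation, not the authors' procedure. The lexicon entries, their scores, the whitespace tokenizer, and the function names `label_message` and `build_corpus` are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Toy lexicon: hypothetical entries and scores, for illustration only.
# A real lexicon would be far larger and is not reproduced here.
TOY_LEXICON: Dict[str, float] = {
    "mlih": 1.0,     # assumed positive Arabizi token
    "khayeb": -1.0,  # assumed negative Arabizi token
}


def label_message(message: str, lexicon: Dict[str, float]) -> Optional[str]:
    """Sum the scores of known tokens and map the sign to a label.

    Returns None when the total is zero (no known tokens, or the
    positive and negative scores cancel out).
    """
    tokens = message.lower().split()  # naive whitespace tokenization
    score = sum(lexicon.get(token, 0.0) for token in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return None


def build_corpus(
    messages: List[str], lexicon: Dict[str, float]
) -> List[Tuple[str, str]]:
    """Keep only the messages that receive a non-neutral label."""
    corpus = []
    for message in messages:
        label = label_message(message, lexicon)
        if label is not None:
            corpus.append((message, label))
    return corpus
```

In this sketch, messages with a zero score are discarded rather than labeled, which is one simple way to keep an automatically built two-class corpus free of undecided examples.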
Pages: 557-567
Page count: 11