AWATIF: A Multi-Genre Corpus for Modern Standard Arabic Subjectivity and Sentiment Analysis

Authors
Abdul-Mageed, Muhammad [1, 2, 3]
Diab, Mona [3]
Affiliations
[1] Indiana Univ, Dept Linguist, Bloomington, IN 47405 USA
[2] Indiana Univ, Sch Lib & Informat Sci, Bloomington, IN 47405 USA
[3] Indiana Univ, Ctr Computat Learning Syst, Bloomington, IN 47405 USA
Keywords
Arabic; sentiment analysis; opinion mining; opinions
DOI
Not available
Chinese Library Classification number
H0 [Linguistics]
Discipline classification codes
030303; 0501; 050102
Abstract
We present AWATIF, a multi-genre corpus of Modern Standard Arabic (MSA) labeled for subjectivity and sentiment analysis (SSA) at the sentence level. The corpus is labeled using both regular and crowdsourcing methods, under three different conditions and with two types of annotation guidelines. We describe the sub-corpora that constitute the corpus and provide examples from the various SSA categories. In the process, we present our linguistically motivated and genre-nuanced annotation guidelines and provide evidence of their impact on the labeling task.
Pages: 3907-3914
Page count: 8