OCA: Opinion Corpus for Arabic

Cited by: 154
Authors
Rushdi-Saleh, Mohammed [1]
Martin-Valdivia, M. Teresa [1]
Urena-Lopez, L. Alfonso [1]
Perea-Ortega, Jose M. [1]
Affiliations
[1] Univ Jaen, Dept Comp Sci, SINAI Res Grp, Jaen 23071, Spain
Keywords
Data mining - Learning systems - Websites - Learning algorithms
DOI
10.1002/asi.21598
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Sentiment analysis is a challenging new task related to text mining and natural language processing. Although there are, at present, several studies related to this theme, most of these focus mainly on English texts. The resources available for opinion mining (OM) in other languages are still limited. In this article, we present a new Arabic corpus for the OM task that has been made available to the scientific community for research purposes. The corpus contains 500 movie reviews collected from different web pages and blogs in Arabic, 250 of them considered as positive reviews, and the other 250 as negative opinions. Furthermore, different experiments have been carried out on this corpus, using machine learning algorithms such as support vector machines and Naive Bayes. The results obtained are very promising and we are encouraged to continue this line of research.
Pages: 2045 - 2054
Page count: 10
Related papers
50 items in total
  • [31] Overview of Opinion Detection Approaches in Arabic
    Nejjari, Manal
    Meziane, Abdelouafi
    [J]. PROCEEDINGS OF THE 2ND INTERNATIONAL CONFERENCE ON NETWORKING, INFORMATION SYSTEMS & SECURITY (NISS19), 2019,
  • [32] Speech Recognition System of Arabic Alphabet Based on a Telephony Arabic Corpus
    Alotaibi, Yousef Ajami
    Alghamdi, Mansour
    Alotaiby, Fahad
    [J]. IMAGE AND SIGNAL PROCESSING, PROCEEDINGS, 2010, 6134 : 122 - +
  • [33] The WAW Corpus: The First Corpus of Interpreted Speeches and their Translations for English and Arabic
    Abdelali, Ahmed
    Temnikova, Irina
    Hedaya, Samy
    Vogel, Stephan
    [J]. PROCEEDINGS OF THE ELEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2018), 2018, : 2135 - 2140
  • [34] Annotating Arguments in a Corpus of Opinion Articles
    Rocha, Gil
    Trigo, Luis
    Cardoso, Henrique Lopes
    Sousa-Silva, Rui
    Carvalho, Paula
    Martins, Bruno
    Won, Miguel
    [J]. LREC 2022: THIRTEEN INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 1890 - 1899
  • [35] COMFO: Multilingual Corpus for Opinion Mining
    Faty, Lamine
    Drame, Khadim
    Sarr, Edouard Ngor
    Ndiaye, Marie
    Diop, Ibrahima
    Dia, Yoro
    Sall, Ousmane
    [J]. ARTIFICIAL GENERAL INTELLIGENCE, AGI 2022, 2023, 13539 : 14 - 19
  • [36] Opinion mining in a telephone survey corpus
    Camelin, Nathalie
    Damnati, Geraldine
    Bechet, Frederic
    De Mori, Renato
    [J]. INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 1041 - +
  • [37] Refine Crude Corpus for Opinion Mining
    Bhattacharyya, Debnath
    Das, Poulami
    Mitra, Kheyali
    Mukherjee, Swarnendu
    Ganguly, Debashis
    Bandyopadhyay, Samir Kumar
    Kim, Tai-hoon
    [J]. 2009 1ST INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE, COMMUNICATION SYSTEMS AND NETWORKS (CICSYN 2009), 2009, : 17 - +
  • [38] Multilingual Corpus Development for Opinion Mining
    Schulz, Julia Maria
    Womser-Hacker, Christa
    Mandl, Thomas
    [J]. LREC 2010 - SEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2010, : 3409 - 3412
  • [39] Development of a Bilingual Corpus of Arabic and Arabic Sign Language based on a Signed Content
    El Maazouzi, Zakaria
    El Mohajir, Badr Eddine
    Al Achhab, Mohammed
    Souri, Adnan
    [J]. 2016 4TH IEEE INTERNATIONAL COLLOQUIUM ON INFORMATION SCIENCE AND TECHNOLOGY (CIST), 2016, : 349 - 354
  • [40] Altruistic Crowdsourcing for Arabic Speech Corpus Annotation
    Bougrine, Soumia
    Cherroun, Hadda
    Abdelali, Ahmed
    [J]. ARABIC COMPUTATIONAL LINGUISTICS (ACLING 2017), 2017, 117 : 137 - 144