A System for Extracting Sentiment from Large-Scale Arabic Social Data

Cited by: 2
Authors
Wang, Hao [1]
Bommireddipalli, Vijay R. [1]
Hanafy, Ayman [2]
Bahgat, Mohamed [2]
Noeman, Sara [2]
Emam, Ossama S. [2]
Affiliations
[1] IBM Corp, Silicon Valley Lab, San Jose, CA 95120 USA
[2] IBM Corp, Cairo Human Language Technol Grp, Cairo, Egypt
Keywords
Arabic; Sentiment Analysis; Social Data; Big Data
DOI
10.1109/ACLing.2015.17
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Arabic-language social media data is becoming increasingly abundant, and it is widely agreed that such data holds valuable information. Mining this data, and making that process easier, is gaining momentum in industry. This paper describes an enterprise system we developed for extracting sentiment from large volumes of social data in Arabic dialects. First, we give an overview of the Big Data system for extracting information from multilingual social data drawn from a variety of sources. Then, we focus on the Arabic sentiment analysis capability built on top of the system, including normalization of written Arabic dialects, construction of sentiment lexicons, sentiment classification, and performance evaluation. Lastly, we demonstrate the value of enriching sentiment results with user profiles for understanding the sentiments of a specific user group.
Pages: 71-77
Number of pages: 7
Related papers
50 in total
  • [41] A Large-Scale Leveled Readability Lexicon for Standard Arabic
    Al Khalil, Muhamed
    Habash, Nizar
    Jiang, Zhengyang
    PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020), 2020, : 3053 - 3062
  • [42] BANGLABOOK: A Large-scale Bangla Dataset for Sentiment Analysis from Book Reviews
    Kabir, Mohsinul
    Bin Mahfuz, Obayed
    Raiyan, Syed Rifat
    Mahmud, Hasan
    Hasan, Md Kamrul
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, 2023, : 1237 - 1247
  • [43] LANS: Large-scale Arabic News Summarization Corpus
    Alhamadani, Abdulaziz
    Zhang, Xuchao
    He, Jianfeng
    Khatri, Aadyant
    Lu, Chang-Tien
    ArabicNLP 2023 - 1st Arabic Natural Language Processing Conference, Proceedings, 2023, : 89 - 100
  • [44] Extracting actionable knowledge from large scale in vitro pharmacology data
    Griffen, Edward
    Leach, Andrew
    Dossetter, Alexander
    Reid, Lauren
    ABSTRACTS OF PAPERS OF THE AMERICAN CHEMICAL SOCIETY, 2016, 251
  • [45] An interactive system for Extracting Arabic Lexicon from Arabic Newspaper Text
    Ben Halima, Mohamed
    Alimi, Adel M.
    IIT: 2008 INTERNATIONAL CONFERENCE ON INNOVATIONS IN INFORMATION TECHNOLOGY, 2008, : 449 - 453
  • [46] A distributed data management system to support large-scale data analysis
    Emara, Tamer Z.
    Huang, Joshua Zhexue
    JOURNAL OF SYSTEMS AND SOFTWARE, 2019, 148 : 105 - 115
  • [47] Extracting text from scanned Arabic books: a large-scale benchmark dataset and a fine-tuned Faster-R-CNN model
    Elanwar, Randa
    Qin, Wenda
    Betke, Margrit
    Wijaya, Derry
    INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION, 2021, 24 (04) : 349 - 362
  • [48] Intelligent Technologies for Large-Scale Social System Sustainable Development
    Tsyganov, Vladimir V.
    CREATIVITY IN INTELLIGENT TECHNOLOGIES AND DATA SCIENCE, (CIT&DS), 2017, 754 : 107 - 118
  • [49] Who post more negatively on social media? A large-scale sentiment analysis of Weibo users
    Zeyang Yang
    Wenting Xu
    Current Psychology, 2023, 42 : 25270 - 25278
  • [50] Extracting text from scanned Arabic books: a large-scale benchmark dataset and a fine-tuned Faster-R-CNN model
    Randa Elanwar
    Wenda Qin
    Margrit Betke
    Derry Wijaya
    International Journal on Document Analysis and Recognition (IJDAR), 2021, 24 : 349 - 362