Distributed Supervised Sentiment Analysis of Tweets: Integrating Machine Learning and Streaming Analytics for Big Data Challenges in Communication and Audience Research

Cited by: 0
Authors
Arcila Calderon, Carlos [1]
Ortega Mohedano, Felix [1]
Alvarez, Mateo [2]
Vicente Marino, Miguel [3]
Affiliations
[1] Univ Salamanca, Salamanca, Spain
[2] Univ Rey Juan Carlos, Madrid, Spain
[3] Univ Valladolid, Valladolid, Spain
Source
EMPIRIA | 2019 / Issue 42
Keywords
Sentiment Analysis; Twitter; Big Data; Streaming; Machine Learning; Communication and Audience Research; Apache Spark
DOI
Not available
Chinese Library Classification number
C [Social Sciences, General]
Discipline classification code
03; 0303
Abstract
The large-scale, real-time analysis of tweets using supervised sentiment analysis represents a unique opportunity for communication and audience research. Bringing together machine learning and streaming analytics in a distributed environment might help scholars obtain valuable data from Twitter and classify messages immediately according to their context, without restrictions of time or storage, enriching cross-sectional, longitudinal and experimental designs with new inputs. Although communication and audience researchers have begun to use computational methods, most remain unfamiliar with the distributed technologies needed to face big data challenges. This paper describes the implementation of parallelized machine learning methods in Apache Spark to predict sentiment in real-time tweets and explains how this process can be scaled up using academic or commercial distributed computing when personal computers cannot support the required computation and storage. We discuss the limitations of these methods and their implications for communication, audience and media studies.
Pages: 113-136
Page count: 24