Tweets opinion analysis integration: ETL modeling with MapReduce

Authors
Afef Walha [1]
Hana Mallek [2]
Faiza Ghozzi [1]
Faiez Gargouri [1]
Affiliations
[1] University of Sfax, Multimedia, InfoRmation systems and Advanced Computing Laboratory (MIRACL)
[2] University of Gabes, Higher Institute of Information Science and Multimedia of Gabes (ISIMG)
[3] University of Sfax, Higher Institute of Information Science and Multimedia of Sfax (ISIMS)
Keywords
Social media analytics; Twitter text analysis; Opinion mining; Big data integration; ETL modeling; MapReduce; Distributed computing
DOI
10.1007/s10586-024-04983-6
Abstract
The advent of social media has revolutionized the way people communicate and share information, creating new business opportunities and challenges. Social media platforms offer a valuable resource in user-generated content (UGC), which is widely used for opinion analysis and business intelligence. However, traditional integration methods, particularly Extract-Transform-Load (ETL) tasks, struggle to keep up with the volume, variety, and velocity of the big data generated on these platforms. This paper proposes MR_ETLSent, an ETL process model that adopts MapReduce to integrate opinions from large volumes of UGC. It highlights the time, cost, and complexity involved in semantically analyzing informal, unstructured UGC texts and transforming them into sentiments. The approach provides reusable models for the complex process that extracts UGC text, cleans it, semantically analyzes it to detect sentiment, and transforms it for loading into the data warehouse. Experiments with the MR_ETLSent components on the Hadoop framework indicate that the proposed MapReduce-based sentiment analysis method performs well on large sets of UGC while minimizing time and computing resources. Overall, the approach is scalable, efficient, and cost-effective, and it can be integrated into decision-making systems that analyze opinions and handle large volumes of data.
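
The extract, clean, analyze, and aggregate flow described in the abstract can be illustrated with a small in-memory sketch of the map, shuffle, and reduce phases. This is not the authors' MR_ETLSent model: the toy lexicon `LEXICON`, the `(topic, text)` record shape, and all function names are assumptions made for illustration, and a real deployment would run the map and reduce functions on Hadoop rather than in a single Python process.

```python
import re
from collections import defaultdict
from itertools import chain

# Hypothetical toy lexicon; a real pipeline would use a full sentiment resource.
LEXICON = {"good": 1, "great": 1, "love": 1, "bad": -1, "awful": -1, "hate": -1}


def clean(text):
    """Cleaning step: lowercase, drop URLs and @mentions, keep alphabetic tokens."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return re.findall(r"[a-z']+", text)


def map_tweet(record):
    """Map: emit one (topic, polarity) pair for an extracted tweet."""
    topic, text = record
    score = sum(LEXICON.get(token, 0) for token in clean(text))
    polarity = "positive" if score > 0 else "negative" if score < 0 else "neutral"
    yield topic, polarity


def shuffle(pairs):
    """Shuffle: group the mapped values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_topic(topic, polarities):
    """Reduce: count polarities per topic, ready to load into a fact table."""
    counts = {"positive": 0, "negative": 0, "neutral": 0}
    for polarity in polarities:
        counts[polarity] += 1
    return topic, counts


def run(records):
    """Chain map, shuffle, and reduce over an iterable of (topic, text) records."""
    pairs = chain.from_iterable(map_tweet(record) for record in records)
    return dict(reduce_topic(t, ps) for t, ps in shuffle(pairs).items())


if __name__ == "__main__":
    sample = [
        ("phone", "I love this phone! http://t.co/x"),
        ("phone", "@shop awful battery, bad screen"),
        ("laptop", "Great laptop #happy"),
    ]
    print(run(sample))
```

Because each tweet is scored independently in the map step and only small per-topic counts reach the reduce step, the work can be split across many nodes, which is the property a MapReduce-based ETL design relies on.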