Role of twitter user profile features in retweet prediction for big data streams

Cited by: 0
Authors
Saurabh Sharma
Vishal Gupta
Affiliations
[1] University Institute of Engineering and Technology
[2] Panjab University
Source
Multimedia Tools and Applications, 2022, 81 (19)
Keywords
Twitter; Social media analysis; Retweet prediction; User behavior; User profiling; Big data analysis
DOI
Not available
Abstract
The factors that influence information sharing on Twitter are a very active research area. This paper explores the impact of numerical features extracted from user profiles on retweet prediction from the real-time raw feed of tweets. The originality of the work lies in the proposed model's reliance on simple numerical features with minimal computational complexity, which makes it a scalable solution for big data analysis. The work proposes three new features derived from the tweet author's profile to capture the user's unique behavioral pattern: “Author total activity”, “Author total activity per year”, and “Author tweets per year”. The feature set is tested on a dataset of 100 million random tweets collected through the Twitter API. Regression on binary labels gave an accuracy of 0.98 for user-profile features and 0.99 when these were combined with tweet content features. Regression analysis to predict the retweet count gave an R-squared value of 0.98 with the combined features. Multi-label classification gave an accuracy of 0.9 for the combined features and 0.89 for user-profile features alone. The user-profile features performed better than the tweet content features, and the combination performed better still. The model is suitable for near-real-time analysis of live streaming data from the Twitter API and provides a baseline pattern of user behavior based only on numerical features available from user profiles.
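The abstract names the three proposed profile features but does not define them. As a rough illustration only, the sketch below assumes that "total activity" is the sum of tweets posted and tweets liked, and that the per-year features divide by account age. These definitions, the function name, and the sample numbers are assumptions, not the paper's method. The argument names mirror the `statuses_count`, `favourites_count`, and `created_at` fields of the Twitter API user object.

```python
from datetime import datetime, timezone


def profile_activity_features(statuses_count, favourites_count, created_at, now):
    """Derive three per-author activity features from profile counters.

    Assumed definitions (hypothetical, not taken from the paper):
      total activity          = tweets posted + tweets liked
      total activity per year = total activity / account age in years
      tweets per year         = tweets posted / account age in years
    """
    age_days = (now - created_at).total_seconds() / 86400.0
    # Floor the age at one day so brand-new accounts do not divide by zero.
    age_years = max(age_days, 1.0) / 365.25
    total_activity = statuses_count + favourites_count
    return {
        "author_total_activity": total_activity,
        "author_total_activity_per_year": total_activity / age_years,
        "author_tweets_per_year": statuses_count / age_years,
    }


# Made-up profile: an account exactly 12 years old (4383 days).
features = profile_activity_features(
    statuses_count=6000,
    favourites_count=1200,
    created_at=datetime(2010, 1, 1, tzinfo=timezone.utc),
    now=datetime(2022, 1, 1, tzinfo=timezone.utc),
)
print(features)
```

Features of this kind need only a few arithmetic operations per tweet, which is consistent with the abstract's emphasis on low computational cost for streaming data.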
Pages: 27309 - 27338
Page count: 29
Related papers
47 in total
  • [1] Role of twitter user profile features in retweet prediction for big data streams
    Sharma, Saurabh
    Gupta, Vishal
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2022, 81 (19) : 27309 - 27338
  • [2] Language independent Big-Data system for the prediction of user location on Twitter
    Alonso-Lorenzo, Jaime
    Costa-Montenegro, Enrique
    Fernandez-Gavilanes, Milagros
    [J]. 2016 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2016 : 2437 - 2446
  • [3] Prediction of Interest for Dynamic Profile of Twitter User
    Siswanto, Elisafina
    Khodra, Masayu Leylia
    Dewi, Luh Joni Erawati
    [J]. 2014 INTERNATIONAL CONFERENCE OF ADVANCED INFORMATICS: CONCEPT, THEORY AND APPLICATION (ICAICTA), 2014 : 266 - 271
  • [4] Twitter Streams Fuel Big Data Approaches to Health Forecasting
    Kuehn, Bridget M.
    [J]. JAMA-JOURNAL OF THE AMERICAN MEDICAL ASSOCIATION, 2015, 314 (19) : 2010 - 2012
  • [5] Processing Big Trajectory and Twitter Data Streams using Apache STORM
    Stojanovic, Dragan
    Stojanovic, Natalija
    Turanjanin, Jovan
    [J]. 2015 12TH INTERNATIONAL CONFERENCE ON TELECOMMUNICATIONS IN MODERN SATELLITE, CABLE AND BROADCASTING SERVICES (TELSIKS), 2015 : 301 - 304
  • [6] Internet user perception on data privacy protection: Big data analytics on twitter
    Soonthornphisaj, Nuanwan
    Tuomchomtam, Sarach
    [J]. Frontiers in Artificial Intelligence and Applications, 2019, 320 : 170 - 180
  • [7] Internet User Perception on Data Privacy Protection: Big Data Analytics on Twitter
    Soonthornphisaj, Nuanwan
    Tuomchomtam, Sarach
    [J]. FUZZY SYSTEMS AND DATA MINING V (FSDM 2019), 2019, 320 : 170 - 180
  • [8] Retweet Prediction Based on Heterogeneous Data Sources: The Combination of Text and Multilayer Network Features
    Mestrovic, Ana
    Petrovic, Milan
    Beliga, Slobodan
    [J]. APPLIED SCIENCES-BASEL, 2022, 12 (21)
  • [9] Security Attack Prediction Based on User Sentiment Analysis of Twitter Data
    Hernandez, Aldo
    Sanchez, Victor
    Sanchez, Gabriel
    Perez, Hector
    Olivares, Jesus
    Toscano, Karina
    Nakano, Mariko
    Martinez, Victor
    [J]. PROCEEDINGS 2016 IEEE INTERNATIONAL CONFERENCE ON INDUSTRIAL TECHNOLOGY (ICIT), 2016 : 610 - 617
  • [10] Sentiment Analysis of Twitter Data within Big Data Distributed Environment for Stock Prediction
    Skuza, Michal
    Romanowski, Andrzej
    [J]. PROCEEDINGS OF THE 2015 FEDERATED CONFERENCE ON COMPUTER SCIENCE AND INFORMATION SYSTEMS, 2015, 5 : 1349 - 1354