Big data analytics for critical information classification in online social networks using classifier chains

Cited by: 0
Authors
Silva, Douglas H. [1 ]
Maziero, Erick G. [1 ]
Saadi, Muhammad [2 ]
Rosa, Renata L. [1 ]
Silva, Juan C. [3 ]
Rodriguez, Demostenes Z. [1 ]
Igorevich, Kostromitin K. [4 ]
Affiliations
[1] Univ Fed Lavras, Dept Comp Sci, BR-37200900 Lavras, Brazil
[2] Univ Cent Punjab, Fac Engn, Dept Elect Engn, Lahore, Pakistan
[3] Pontifical Catholic Univ Peru, Dept Sci, Lima, Peru
[4] South Ural State Univ, 76 Lenin Ave, Chelyabinsk 454080, Russia
Keywords
Big data; Age-group classifier; Gender classifier; Feature selection; Feature transformation; Multi-label classification; GENDER IDENTIFICATION; BEHAVIOR; AGE;
DOI
10.1007/s12083-021-01269-1
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Industrial and academic organizations use online social networks (OSNs) for different purposes, including social and economic ones. OSNs have become a new means of obtaining information from people about their preferences and interests. Because of the large volume of user-generated content, researchers use techniques such as sentiment analysis and data mining to evaluate this information automatically. Although sentiment analysis of OSN content is performed with different methods, obtaining highly reliable results remains difficult, mainly because user profile information such as gender and age is lacking. In this work, a novel dataset is built that contains the writing characteristics of 160,000 users of the Twitter OSN. Before classification models are created with Machine Learning (ML) techniques, feature transformation and feature selection methods are applied to determine the most relevant set of characteristics. To create the models, the Classifier Chain (CC) transformation technique and different machine learning algorithms are applied to the training set. Simulation results show that the Random Forest, XGBoost, and Decision Tree algorithms obtain the best performance. In the testing phase, these algorithms reached Hamming Loss values of 0.033, 0.033, and 0.034, respectively, and all of them reached the same micro-averaged F1 value of 0.976. Therefore, our proposal, based on a multidimensional learning technique using CC transformation, outperforms other similar proposals.
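The Classifier Chain idea and the two reported metrics can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the base learner here is a toy nearest-centroid classifier (the abstract names Random Forest, XGBoost, and Decision Tree instead), both labels are assumed to be binary (for example, gender and a two-way age split), and the feature values and all names (`NearestCentroid`, `ClassifierChain`, `X_train`, `Y_train`) are hypothetical.

```python
from typing import List

class NearestCentroid:
    """Toy binary base learner (an assumption, not the paper's algorithm)."""

    def fit(self, X: List[List[float]], y: List[int]) -> "NearestCentroid":
        self.centroids = {}
        for label in (0, 1):
            rows = [x for x, t in zip(X, y) if t == label]
            if rows:
                # Column-wise mean of all rows that carry this label.
                self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_one(self, x: List[float]) -> int:
        def sq_dist(label: int) -> float:
            return sum((a - b) ** 2 for a, b in zip(x, self.centroids[label]))
        return min(self.centroids, key=sq_dist)

class ClassifierChain:
    """Model j sees the input features plus the values of labels 0..j-1."""

    def __init__(self, n_labels: int) -> None:
        self.models = [NearestCentroid() for _ in range(n_labels)]

    def fit(self, X: List[List[float]], Y: List[List[int]]) -> "ClassifierChain":
        for j, model in enumerate(self.models):
            # Training uses the true values of the earlier labels.
            X_aug = [list(x) + list(y[:j]) for x, y in zip(X, Y)]
            model.fit(X_aug, [y[j] for y in Y])
        return self

    def predict(self, X: List[List[float]]) -> List[List[int]]:
        results = []
        for x in X:
            preds: List[int] = []
            for model in self.models:
                # Prediction feeds earlier predictions down the chain.
                preds.append(model.predict_one(list(x) + preds))
            results.append(preds)
        return results

def hamming_loss(Y_true: List[List[int]], Y_pred: List[List[int]]) -> float:
    """Fraction of individual label assignments that are wrong."""
    pairs = [(t, p) for yt, yp in zip(Y_true, Y_pred) for t, p in zip(yt, yp)]
    return sum(1 for t, p in pairs if t != p) / len(pairs)

def f1_micro(Y_true: List[List[int]], Y_pred: List[List[int]]) -> float:
    """F1 computed from counts pooled over all labels and samples."""
    tp = fp = fn = 0
    for yt, yp in zip(Y_true, Y_pred):
        for t, p in zip(yt, yp):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Hypothetical toy data: two features per user, two binary labels per user.
X_train = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9],
           [0.1, 0.9], [0.2, 0.8], [0.9, 0.1], [0.8, 0.2]]
Y_train = [[0, 0], [0, 0], [1, 1], [1, 1],
           [0, 1], [0, 1], [1, 0], [1, 0]]

chain = ClassifierChain(n_labels=2).fit(X_train, Y_train)
Y_hat = chain.predict(X_train)
print("Hamming loss:", hamming_loss(Y_train, Y_hat))
print("F1 micro:", f1_micro(Y_train, Y_hat))
```

Passing earlier labels down the chain is what lets a later model exploit dependencies between labels, such as a relation between gender and age group. A lower Hamming Loss and a higher micro-averaged F1 both indicate better multi-label predictions.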
Pages: 626-641
Number of pages: 16
Related papers
  • [1] Big data analytics for critical information classification in online social networks using classifier chains
    Douglas H. Silva
    Erick G. Maziero
    Muhammad Saadi
    Renata L. Rosa
    Juan C. Silva
    Demostenes Z. Rodriguez
    Kostromitin K. Igorevich
    [J]. Peer-to-Peer Networking and Applications, 2022, 15 : 626 - 641
  • [2] Research on opinion polarization by big data analytics capabilities in online social networks
    Xing, Yunfei
    Wang, Xiwei
    Qiu, Chengcheng
    Li, Yueqi
    He, Wu
    [J]. TECHNOLOGY IN SOCIETY, 2022, 68
  • [3] Incremental Ant-Miner Classifier for Online Big Data Analytics
    Al-Dawsari, Amal
    Al-Turaiki, Isra
    Kurdi, Heba
    [J]. SENSORS, 2022, 22 (06)
  • [4] Distributed Online Big Data Classification Using Context Information
    Tekin, Cem
    van der Schaar, Mihaela
    [J]. 2013 51ST ANNUAL ALLERTON CONFERENCE ON COMMUNICATION, CONTROL, AND COMPUTING (ALLERTON), 2013, : 1435 - 1442
  • [5] Analysis of Dengue Outbreaks Using Big Data Analytics and Social Networks
    Carlos, Marcelo Aparecido
    Nogueira, Marcelo
    Machado, Ricardo J.
    [J]. 2017 4TH INTERNATIONAL CONFERENCE ON SYSTEMS AND INFORMATICS (ICSAI), 2017, : 1592 - 1597
  • [6] The convergence of new computing paradigms and big data analytics methodologies for online social networks
    Zhang, Zhiyong
    Choo, Kim-Kwang Raymond
    Gupta, Brij B.
    [J]. JOURNAL OF COMPUTATIONAL SCIENCE, 2018, 26 : 453 - 455
  • [7] Big Data Driven Information Diffusion Analysis and Control in Online Social Networks
    Zhang, Kai
    Wang, Jingjing
    Jiang, Chunxiao
    Wei, Zhongxiang
    Ren, Yong
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS (ICC), 2017,
  • [8] Tutorial on big spectrum data analytics for space information networks
    Guoru Ding
    Lin Li
    Juzhen Wang
    Yumeng Wang
    Lei Chen
    [J]. EURASIP Journal on Wireless Communications and Networking, 2018
  • [9] Tutorial on big spectrum data analytics for space information networks
    Ding, Guoru
    Li, Lin
    Wang, Juzhen
    Wang, Yumeng
    Chen, Lei
    [J]. EURASIP JOURNAL ON WIRELESS COMMUNICATIONS AND NETWORKING, 2018,
  • [10] Big Data Analytics using Multi-Classifier Approach with RHadoop
    Hiranandani, Priyanka
    Pilli, Emmanuel S.
    Chand, Nanak
    Ramakrishna, C.
    Gupta, Madhuri
    [J]. PROCEEDINGS OF THE 8TH INTERNATIONAL CONFERENCE CONFLUENCE 2018 ON CLOUD COMPUTING, DATA SCIENCE AND ENGINEERING, 2018, : 478 - 484