Big data analytics for critical information classification in online social networks using classifier chains

Cited by: 0
Authors
Silva, Douglas H. [1 ]
Maziero, Erick G. [1 ]
Saadi, Muhammad [2 ]
Rosa, Renata L. [1 ]
Silva, Juan C. [3 ]
Rodriguez, Demostenes Z. [1 ]
Igorevich, Kostromitin K. [4 ]
Affiliations
[1] Univ Fed Lavras, Dept Comp Sci, BR-37200900 Lavras, Brazil
[2] Univ Cent Punjab, Fac Engn, Dept Elect Engn, Lahore, Pakistan
[3] Pontifical Catholic Univ Peru, Dept Sci, Lima, Peru
[4] South Ural State Univ, 76 Lenin Ave, Chelyabinsk 454080, Russia
Keywords
Big data; Age-group classifier; Gender classifier; Feature selection; Feature transformation; Multi-label classification; GENDER IDENTIFICATION; BEHAVIOR; AGE;
DOI
10.1007/s12083-021-01269-1
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Industrial and academic organizations use online social networks (OSNs) for different purposes, such as social and economic ones. OSNs have become a new means of obtaining information from people about their preferences and interests. Due to the large volume of user-generated content, researchers use various techniques, such as sentiment analysis or data mining, to evaluate this information automatically. Although sentiment analysis of OSN content is performed by different methods, highly reliable results are difficult to obtain, mainly because user profile information, such as gender and age, is lacking. In this work, a novel dataset is built, which contains the writing characteristics of 160,000 users of the Twitter OSN. Before creating classification models with Machine Learning (ML) techniques, feature transformation and feature selection methods are applied to determine the most relevant set of characteristics. To create the models, the Classifier Chain (CC) transformation technique and different ML algorithms are applied to the training set. Simulation results show that the Random Forest, XGBoost, and Decision Tree algorithms obtain the best performance. In the testing phase, these algorithms reached Hamming Loss values of 0.033, 0.033, and 0.034, respectively, and all of them reached the same micro-averaged F1 value of 0.976. Therefore, our proposal, based on a multidimensional learning technique using CC transformation, outperforms other similar proposals.
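For readers unfamiliar with the two metrics reported in the abstract, the following is a minimal sketch of how Hamming Loss and the micro-averaged F1 score are commonly computed for binary indicator labels. The toy label matrices and the function names `hamming_loss` and `f1_micro` are illustrative assumptions, not taken from the paper, and the paper's own evaluation setup (which is multidimensional) may differ in detail.

```python
# Illustrative sketch only: toy data, not the paper's dataset or code.
# Each row is one user; each column is one hypothetical binary label.

def hamming_loss(y_true, y_pred):
    """Fraction of individual label slots that are predicted incorrectly."""
    total = sum(len(row) for row in y_true)
    wrong = sum(
        t != p
        for true_row, pred_row in zip(y_true, y_pred)
        for t, p in zip(true_row, pred_row)
    )
    return wrong / total


def f1_micro(y_true, y_pred):
    """F1 computed from counts pooled over all users and all labels."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical ground truth and predictions for 4 users and 3 labels.
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 1], [0, 1, 1], [1, 0, 0], [0, 0, 1]]

print(hamming_loss(y_true, y_pred))  # 2 wrong slots out of 12
print(f1_micro(y_true, y_pred))      # 2*5 / (2*5 + 1 + 1)
```

A lower Hamming Loss and a higher micro-averaged F1 both indicate better predictions, which is the direction of the values the abstract reports.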
Pages: 626-641
Page count: 16
Related papers
50 items in total
  • [41] A Classifier Using Online Bagging Ensemble Method for Big Data Stream Learning
    Lv, Yanxia
    Peng, Sancheng
    Yuan, Ying
    Wang, Cong
    Yin, Pengfei
    Liu, Jiemin
    Wang, Cuirong
    [J]. TSINGHUA SCIENCE AND TECHNOLOGY, 2019, 24 (04) : 379 - 388
  • [42] Measuring Radicalization in Online Social Networks Using Markov Chains
    Wadhwa, Pooja
    Bhatia, M. P. S.
    [J]. JOURNAL OF APPLIED SECURITY RESEARCH, 2015, 10 (01) : 23 - 47
  • [43] A Big Data Model supporting Information Recommendation in Social Networks
    Han, Xiaoyue
    Tian, Lianhua
    Yoon, Minjoo
    Lee, Minsoo
    [J]. SECOND INTERNATIONAL CONFERENCE ON CLOUD AND GREEN COMPUTING / SECOND INTERNATIONAL CONFERENCE ON SOCIAL COMPUTING AND ITS APPLICATIONS (CGC/SCA 2012), 2012, : 810 - 813
  • [44] Fake Account Detection in Social Media Using Big Data Analytics
    Mujeeb, Shaik
    Gupta, Sangeeta
    [J]. PROCEEDINGS OF SECOND INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTER ENGINEERING AND COMMUNICATION SYSTEMS, ICACECS 2021, 2022, : 587 - 596
  • [45] Big Data Analytics in Mobile Cellular Networks
    He, Ying
    Yu, Fei Richard
    Zhao, Nan
    Yin, Hongxi
    Yao, Haipeng
    Qiu, Robert C.
    [J]. IEEE ACCESS, 2016, 4 : 1985 - 1996
  • [46] Online Forum Authenticity: Big Data Analytics in Healthcare
    Zhan, Ge
    [J]. ICMLC 2019: 2019 11TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND COMPUTING, 2019, : 290 - 294
  • [47] Online learning algorithms for big data analytics: A survey
    Li, Zhijie
    Li, Yuanxiang
    Wang, Feng
    He, Guoliang
    Kuang, Li
    [J]. Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2015, 52 (08) : 1707 - 1721
  • [48] Social Media Analytics Based on Big Data
    Shaikh, Farzana
    Rangrez, Firdaus
    Khan, Afsha
    Shaikh, Uzma
    [J]. PROCEEDINGS OF 2017 INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING AND CONTROL (I2C2), 2017,
  • [49] Social media big data analytics: A survey
    Ghani, Norjihan Abdul
    Hamid, Suraya
    Hashem, Ibrahim Abaker Targio
    Ahmed, Ejaz
    [J]. COMPUTERS IN HUMAN BEHAVIOR, 2019, 101 : 417 - 428
  • [50] A Framework for the Efficient Collection of Big Data from Online Social Networks
    Petrillo, Umberto Ferraro
    Consolo, Stefano
    [J]. 2014 INTERNATIONAL CONFERENCE ON INTELLIGENT NETWORKING AND COLLABORATIVE SYSTEMS (INCOS), 2014, : 34 - 41