How Machine Learning Classification Accuracy Changes in a Happiness Dataset with Different Demographic Groups

Cited by: 5
Authors
Sweeney, Colm [1]
Ennis, Edel [1]
Mulvenna, Maurice [2]
Bond, Raymond [2]
O'Neill, Siobhan [1]
Affiliations
[1] Ulster Univ, Sch Psychol, Coleraine BT52 1SA, Londonderry, North Ireland
[2] Ulster Univ, Sch Comp, Jordanstown BT37 0QB, North Ireland
Keywords
machine learning; classification; positive psychology; gender
DOI
10.3390/computers11050083
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
This study explores how machine learning classification accuracy changes across demographic groups. HappyDB is a dataset of over 100,000 happy statements, with demographic information that includes marital status, gender, age, and parenthood status. Using the happiness category field, we test different types of machine learning classifiers to predict which category of happiness each statement belongs to, for example, whether it indicates happiness relating to achievement or affection. The tests were initially conducted with three distinct classifiers, and the best-performing model was the convolutional neural network (CNN), a deep learning algorithm, which achieved an F1 score of 0.897 on the complete dataset. This model was then used as the main classifier to further analyze the results and to establish any variation in performance when tested on different demographic groups. We analyzed the results to see whether classification accuracy improved for particular demographic groups, and found that the accuracy of prediction within this dataset declined with age, with the exception of the single parent subgroup. The results also showed improved performance for the married and parent subgroups, and lower performance for the non-parent and unmarried subgroups, even when investigating a balanced sample.
Pages: 15