Computing Happiness from Textual Data

Cited by: 3
Authors
Mohamed, Emad [1]
Mostafa, Sayed A. [2]
Affiliations
[1] Univ Wolverhampton, Res Grp Computat Linguist, Wolverhampton WV1 1LY, England
[2] North Carolina A&T State Univ, Dept Math & Stat, Greensboro, NC 27411 USA
Source
STATS | 2019 / Vol. 2 / Issue 03
Keywords
fastText; gradient boosting; happiness; lemmatization; lexical analysis; logistic regression; parsing; topic modeling
DOI
10.3390/stats2030025
Chinese Library Classification
O1 [Mathematics]
Discipline classification codes
0701; 070101
Abstract
In this paper, we use a corpus of about 100,000 happy moments written by people of different genders, marital statuses, parenthood statuses, and ages to explore the following questions: Are there differences between men and women, married and unmarried individuals, parents and non-parents, and people of different age groups in terms of their causes of happiness and how they express happiness? Can gender, marital status, parenthood status, and/or age be predicted from textual data expressing happiness? The first question is tackled in two steps: first, we transform the happy moments into a set of topics, lemmas, part-of-speech sequences, and dependency relations; then, we use each set as predictors in multi-variable binary and multinomial logistic regressions to rank these predictors in terms of their influence on each outcome variable (gender, marital status, parenthood status, and age). For the prediction task, we use character, lexical, grammatical, semantic, and syntactic features in a machine learning document classification approach. The classification algorithms used include logistic regression, gradient boosting, and fastText. Our results show that textual data expressing moments of happiness can be quite beneficial in understanding the "causes of happiness" for different social groups, and that social characteristics like gender, marital status, parenthood status, and, to some extent, age can be successfully predicted from such textual data. This research aims to bring together elements from philosophy and psychology to be examined by computational corpus linguistics methods in a way that promotes the use of Natural Language Processing for the Humanities.
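The two-step idea in the abstract (turn each happy moment into lexical features, then fit a logistic regression and rank the features by their coefficients) can be illustrated with a minimal sketch. This is not the authors' implementation: the six toy sentences, the "parent" / "non-parent" labels, the whitespace tokenizer, the learning rate, and the regularization strength are all assumptions made up for illustration, and plain Python gradient descent stands in for whatever statistical software the paper actually used.

```python
import math
from collections import Counter

# Hypothetical toy data invented for illustration; NOT taken from the paper's corpus.
# Label 1 = "parent", label 0 = "non-parent".
MOMENTS = [
    ("my daughter smiled at me this morning", 1),
    ("my son won his first football game", 1),
    ("i took my daughter to the park", 1),
    ("i bought a new video game", 0),
    ("i went to a party with my friends", 0),
    ("my friends and i watched a movie", 0),
]

def tokenize(text):
    # Whitespace tokenization stands in for the lemmatization step (an assumption).
    return text.lower().split()

def build_vocab(texts):
    # Map every distinct token to a column index.
    tokens = sorted({tok for text in texts for tok in tokenize(text)})
    return {tok: i for i, tok in enumerate(tokens)}

def vectorize(text, vocab):
    # Bag-of-words count vector; tokens outside the vocabulary are ignored.
    vec = [0.0] * len(vocab)
    for tok, count in Counter(tokenize(text)).items():
        if tok in vocab:
            vec[vocab[tok]] = float(count)
    return vec

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(X, y, lr=0.1, epochs=1000, l2=0.01):
    # Batch gradient descent on the L2-regularized logistic loss.
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * (gj / n + l2 * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(text, vocab, w, b):
    # Estimated probability that the text belongs to class 1 ("parent").
    x = vectorize(text, vocab)
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))

def rank_features(vocab, w, top=3):
    # Tokens sorted by coefficient: the first list leans toward class 1,
    # the second toward class 0.
    ranked = sorted(vocab, key=lambda tok: w[vocab[tok]], reverse=True)
    return ranked[:top], ranked[-top:]

texts = [text for text, _ in MOMENTS]
labels = [label for _, label in MOMENTS]
vocab = build_vocab(texts)
X = [vectorize(text, vocab) for text in texts]
w, b = train_logreg(X, labels)
toward_parent, toward_nonparent = rank_features(vocab, w)
print("tokens leaning toward 'parent':", toward_parent)
print("tokens leaning toward 'non-parent':", toward_nonparent)
```

On data this small the ranking only shows the mechanics. With a corpus of the size described in the abstract, the same coefficient ranking would be read alongside significance tests and applied separately to topics, part-of-speech sequences, and dependency relations.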
Pages: 347-370
Number of pages: 24