Assessing Suicide Risk and Emotional Distress in Chinese Social Media: A Text Mining and Machine Learning Study

Cited by: 125
Authors
Cheng, Qijin [1]
Li, Tim M. H. [2]
Kwok, Chi-Leung [1]
Zhu, Tingshao [3, 4]
Yip, Paul S. F. [1]
Affiliations
[1] Univ Hong Kong, HKJC Ctr Suicide Res & Prevent, 2-F Hong Kong Jockey Club Bldg Interdisciplinary, Hong Kong, Hong Kong, Peoples R China
[2] Univ Hong Kong, LKS Fac Med, Dept Paediat & Adolescent Med, Hong Kong, Hong Kong, Peoples R China
[3] Chinese Acad Sci, Inst Psychol, Beijing, Peoples R China
[4] Chinese Acad Sci, Inst Comp Technol, Beijing, Peoples R China
Keywords
suicide; psychological stress; social media; Chinese; natural language; machine learning; LARGE NONCLINICAL SAMPLE; MENTAL-HEALTH PROBLEMS; OUT CROSS-VALIDATION; PROBABILITY SCALE; NORMATIVE DATA; YOUNG-PEOPLE; ADOLESCENTS; ACHIEVEMENT; ANXIETY; SERVICES;
DOI
10.2196/jmir.7276
Chinese Library Classification (CLC) number
R19 [Health care organization and services (health services administration)];
Abstract
Background: Early identification and intervention are imperative for suicide prevention. However, at-risk people often neither seek help nor undergo professional assessment. A tool that automatically assesses their risk levels in natural settings can increase the opportunity for early intervention.
Objective: The aim of this study was to explore whether computerized language analysis methods can be used to assess a person's suicide risk and emotional distress in Chinese social media.
Methods: A Web-based survey of Chinese social media (ie, Weibo) users was conducted to measure their suicide risk factors, including suicide probability, Weibo suicide communication (WSC), depression, anxiety, and stress levels. Participants' Weibo posts published in the public domain were also downloaded with their consent. The Weibo posts were parsed and fitted into Simplified Chinese-Linguistic Inquiry and Word Count (SC-LIWC) categories. The associations between SC-LIWC features and the 5 suicide risk factors were examined by logistic regression. Furthermore, a support vector machine (SVM) model was applied to the language features to automatically classify whether a Weibo user exhibited any of the 5 risk factors.
Results: A total of 974 Weibo users participated in the survey. Those with high suicide probability were marked by a higher usage of pronouns (odds ratio, OR=1.18, P=.001), prepend words (OR=1.49, P=.02), and multifunction words (OR=1.12, P=.04), a lower usage of verbs (OR=0.78, P<.001), and a greater total word count (OR=1.007, P=.008). Second-person plural was positively associated with severe depression (OR=8.36, P=.01) and stress (OR=11, P=.005), whereas work-related words were negatively associated with WSC (OR=0.71, P=.008), severe depression (OR=0.56, P=.005), and anxiety (OR=0.77, P=.02). Inconsistently, third-person plural was negatively associated with WSC (OR=0.02, P=.047) but positively associated with severe stress (OR=41.3, P=.04). Achievement-related words were positively associated with depression (OR=1.68, P=.003), whereas health-related (OR=2.36, P=.004) and death-related (OR=2.60, P=.01) words were positively associated with stress. The machine classifiers did not achieve satisfactory performance on the full sample but could classify high suicide probability (area under the curve, AUC=0.61, P=.04) and severe anxiety (AUC=0.75, P<.001) among those who had exhibited WSC.
Conclusions: SC-LIWC is useful for examining language markers of suicide risk and emotional distress in Chinese social media and can identify characteristics that differ from previous findings in the English literature. Some findings suggest new hypotheses for future verification. Machine classifiers based on SC-LIWC features are promising but still require further optimization before real-life application.
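The quantities reported in the abstract (dictionary-category features, odds ratios, and AUC) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the toy lexicon, the function names, and the assumption that posts are already segmented into tokens are all hypothetical, and the real SC-LIWC dictionary is not reproduced here. The sketch assumes each feature is the share of a user's tokens that fall in a category, that an odds ratio multiplies the baseline odds per unit increase in a feature, and that AUC is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case.

```python
from collections import Counter

# Hypothetical toy lexicon (category -> words); NOT the SC-LIWC dictionary.
TOY_LEXICON = {
    "pronoun": {"i", "you", "we", "they"},
    "work": {"job", "office", "boss"},
    "death": {"die", "death", "dead"},
}


def category_proportions(tokens, lexicon):
    """Return the share of tokens that fall in each lexicon category."""
    total = len(tokens)
    counts = Counter()
    for tok in tokens:
        for cat, words in lexicon.items():
            if tok in words:
                counts[cat] += 1
    return {cat: (counts[cat] / total if total else 0.0) for cat in lexicon}


def scale_odds(baseline_prob, odds_ratio, units=1.0):
    """Probability after multiplying the baseline odds by odds_ratio**units."""
    odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = odds * odds_ratio ** units
    return new_odds / (1.0 + new_odds)


def auc(scores, labels):
    """AUC as P(random positive outscores random negative); ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Tokens are assumed to be pre-segmented; English stands in for Chinese.
    tokens = "i think they hate my job".split()
    print(category_proportions(tokens, TOY_LEXICON))
    # Effect of OR=1.18 on an illustrative 10% baseline probability.
    print(round(scale_odds(0.10, 1.18), 4))
    # Illustrative classifier scores and true labels (made-up values).
    print(auc([0.9, 0.3, 0.6, 0.5], [1, 1, 0, 0]))
```

Read this way, an AUC of 0.5 corresponds to chance-level ranking, which gives context for the reported values of 0.61 and 0.75. The 10% baseline probability and the example scores are illustrative only and do not come from the study.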
Pages: 10