Automatic Construction of a Depression-Domain Lexicon Based on Microblogs: Text Mining Study

Cited by: 16
Authors
Li, Genghao [1]
Li, Bing [1]
Huang, Langlin [1]
Hou, Sibing [2]
Affiliations
[1] Univ Int Business & Econ, Sch Informat Technol & Management, Huixin East St, Beijing 100029, Peoples R China
[2] Columbia Univ, Grad Sch Art & Sci, New York, NY USA
Keywords
depression detection; depression diagnosis; social media; automatic construction; domain-specific lexicon; depression lexicon; label propagation
DOI
10.2196/17650
Chinese Library Classification
R-058
Abstract
Background: According to a World Health Organization report in 2017, almost one in every 20 people in China had depression. However, clinical diagnosis of depression is usually difficult owing to slow observation, high cost, and patient resistance. Meanwhile, with the rapid emergence of social networking sites, people frequently share their daily lives and disclose their inner feelings online, making it possible to identify mental conditions effectively from this rich text information. Much has been achieved with English web-based corpora, but in research in China so far, the extraction of language features from web-related depression signals is still at a relatively early stage.
Objective: The purpose of this study was to propose an effective approach for constructing a depression-domain lexicon. This lexicon will contain language features that could help identify social media users who potentially have depression. Our study also compared the performance of detection with and without our lexicon.
Methods: We autoconstructed a depression-domain lexicon using Word2Vec, a semantic relationship graph, and the label propagation algorithm. These two methods combined performed well in a specific corpus during construction. The lexicon was obtained based on 111,052 Weibo microblogs from 1868 users who were depressed or nondepressed. During depression detection, we considered six features, and we used five classification methods to test the detection performance.
Results: The experiment results showed that in terms of the F1 value, our autoconstruction method performed 1% to 6% better than baseline approaches and was more effective and steadier. When applied to detection models like logistic regression and support vector machine, our lexicon helped the models perform 2% to 9% better and improved the final accuracy of potential depression detection.
Conclusions: Our depression-domain lexicon was proven to be a meaningful input for classification algorithms, providing linguistic insights on the depressive status of test subjects. We believe that this lexicon will enhance early depression detection in people on social media. Future work will need to be carried out on a larger corpus and with more complex methods.
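The Methods summary names the ingredients (word embeddings, a semantic relationship graph, label propagation) without showing how they fit together. The following is a minimal sketch of that general idea, not the authors' implementation. Everything concrete in it is an assumption made for illustration: the toy 2-dimensional vectors stand in for Word2Vec embeddings, the English words and the two seed words are hypothetical, and the k-nearest-neighbor cosine graph with clamped seeds is one common way to set up label propagation.

```python
import numpy as np


def build_similarity_graph(vectors, k=2):
    """Build a symmetric k-nearest-neighbor graph weighted by cosine similarity."""
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    sim = unit @ unit.T
    np.fill_diagonal(sim, 0.0)          # no self-loops
    sim = np.clip(sim, 0.0, None)       # ignore negative similarities
    weights = np.zeros_like(sim)
    for i in range(len(sim)):
        nearest = np.argsort(sim[i])[-k:]   # indices of the k most similar words
        weights[i, nearest] = sim[i, nearest]
    return np.maximum(weights, weights.T)   # make the graph undirected


def propagate_labels(weights, seed_labels, n_iter=50):
    """Spread seed labels over the graph; seeds are re-clamped every iteration.

    seed_labels maps a node index to a class (1 = depression-related, 0 = other).
    Returns an (n_words, 2) matrix of class scores.
    """
    n = weights.shape[0]
    row_sums = weights.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0           # isolated nodes keep their scores at zero
    transition = weights / row_sums
    scores = np.full((n, 2), 0.5)           # unlabeled words start undecided
    one_hot = np.eye(2)
    for _ in range(n_iter):
        for idx, cls in seed_labels.items():
            scores[idx] = one_hot[cls]      # clamp the seed words
        scores = transition @ scores        # each word averages its neighbors
    for idx, cls in seed_labels.items():
        scores[idx] = one_hot[cls]
    return scores


# Hypothetical toy data: in practice the vectors would come from a trained
# Word2Vec model and the seeds from a hand-checked word list.
words = ["sad", "hopeless", "tired", "happy", "sunny", "picnic"]
vectors = np.array([
    [1.0, 0.1], [0.9, 0.2], [0.8, 0.3],     # cluster around the seed "sad"
    [0.1, 1.0], [0.2, 0.9], [0.3, 0.8],     # cluster around the seed "happy"
])
seeds = {words.index("sad"): 1, words.index("happy"): 0}

scores = propagate_labels(build_similarity_graph(vectors, k=2), seeds)
predicted = {word: int(np.argmax(scores[i])) for i, word in enumerate(words)}
lexicon = [word for word, cls in predicted.items() if cls == 1]
print(lexicon)
```

In this sketch the candidate lexicon is simply the set of words whose propagated score favors class 1; a real pipeline would likely also apply a score threshold and manual review before using the words as classifier features.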
Pages: 17
Related papers
50 in total
  • [1] Automatic construction of a core lexicon for specific domain
    Ji, Luning
    Lu, Qin
    Li, Wenjie
    Chen, YiRong
    ALPIT 2007: PROCEEDINGS OF THE 6TH INTERNATIONAL CONFERENCE ON ADVANCED LANGUAGE PROCESSING AND WEB INFORMATION TECHNOLOGY, 2007, : 183 - +
  • [2] Automatic construction of domain sentiment lexicon for semantic disambiguation
    Yanyan Wang
    Fulian Yin
    Jianbo Liu
    Marco Tosato
    Multimedia Tools and Applications, 2020, 79 : 22355 - 22373
  • [3] Automatic construction of domain sentiment lexicon for semantic disambiguation
    Wang, Yanyan
    Yin, Fulian
    Liu, Jianbo
    Tosato, Marco
    MULTIMEDIA TOOLS AND APPLICATIONS, 2020, 79 (31-32) : 22355 - 22373
  • [4] Automatic Construction of Domain-specific Sentiment Lexicon Based on the Semantics Graph
    Xiong, Gen
    Fang, Yilin
    Liu, Quan
    2017 IEEE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, COMMUNICATIONS AND COMPUTING (ICSPCC), 2017,
  • [5] A text mining approach for automatic construction of hypertexts
    Yang, HC
    Lee, CH
    EXPERT SYSTEMS WITH APPLICATIONS, 2005, 29 (04) : 723 - 734
  • [6] On method and automatic construction theory of domain ontology based on depended text
    Liu Yao
    Sui Zhifang
    Chen Xuefei
    ICICIC 2006: FIRST INTERNATIONAL CONFERENCE ON INNOVATIVE COMPUTING, INFORMATION AND CONTROL, VOL 2, PROCEEDINGS, 2006, : 63 - +
  • [7] Constructing a broad-coverage lexicon for text mining in the patent domain
    Oostdijk, Nelleke
    Verberne, Suzan
    Koster, Cornelis
    LREC 2010 - SEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2010, : 2292 - 2298
  • [8] Automatic construction of domain-specific sentiment lexicon based on constrained label propagation
    Huang, Sheng
    Niu, Zhendong
    Shi, Chongyang
    KNOWLEDGE-BASED SYSTEMS, 2014, 56 : 191 - 200
  • [9] Automatic construction of domain-specific sentiment lexicon for unsupervised domain adaptation and sentiment classification
    Beigi, Omid Mohamad
    Moattar, Mohammad H.
    KNOWLEDGE-BASED SYSTEMS, 2021, 213
  • [10] A random walk algorithm for automatic construction of domain-oriented sentiment lexicon
    Tan, Songbo
    Wu, Qiong
    EXPERT SYSTEMS WITH APPLICATIONS, 2011, 38 (10) : 12094 - 12100