Models of Gender Dysphoria Using Social Media Data for Use in Technology-Delivered Interventions: Machine Learning and Natural Language Processing Validation Study

Times cited: 1
Authors
Cascalheira, Cory J. [1 ,2 ]
Flinn, Ryan E. [3 ,4 ]
Zhao, Yuxuan [1 ]
Klooster, Dannie [5 ]
Laprade, Danica [6 ]
Hamdi, Shah Muhammad [7 ]
Scheer, Jillian R. [2 ]
Gonzalez, Alejandra [8 ]
Lund, Emily M. [9 ,10 ]
Gomez, Ivan N. [1 ]
Saha, Koustuv [11 ]
De Choudhury, Munmun [1 ,12 ]
Affiliations
[1] New Mexico State Univ, Dept Counseling & Educ Psychol, 1220 Stewart St, Las Cruces, NM 88003 USA
[2] Syracuse Univ, Dept Psychol, Syracuse, NY USA
[3] Augusta Univ, Augusta, GA USA
[4] Univ North Dakota, Grand Forks, ND USA
[5] Oklahoma State Univ, Stillwater, OK USA
[6] No Arizona Univ, Flagstaff, AZ USA
[7] Utah State Univ, Dept Comp Sci, Logan, UT USA
[8] Xavier Univ, Cincinnati, OH USA
[9] Univ Alabama, Tuscaloosa, AL USA
[10] Ewha Womans Univ, Seoul, South Korea
[11] Univ Illinois, Champaign, IL USA
[12] Georgia Inst Technol, Atlanta, GA USA
Funding
National Institutes of Health (NIH); National Science Foundation (NSF)
Keywords
gender diverse; gender dysphoria; social media; social computing; digital health; mobile phone; HEALTH-CARE; ADOLESCENTS; BARRIERS; SUPPORT; PEOPLE;
DOI
10.2196/47256
Chinese Library Classification (CLC) number
R19 [Health care organization and services (health services administration)]
Abstract
Background: The optimal treatment for gender dysphoria is medical intervention, but many transgender and nonbinary people face significant treatment barriers when seeking help for gender dysphoria. When untreated, gender dysphoria is associated with depression, anxiety, suicidality, and substance misuse. Technology-delivered interventions for transgender and nonbinary people can be used discreetly, safely, and flexibly, thereby reducing treatment barriers and increasing access to psychological interventions to manage the distress that accompanies gender dysphoria. Technology-delivered interventions are beginning to incorporate machine learning (ML) and natural language processing (NLP) to automate intervention components and tailor intervention content. A critical step in using ML and NLP in technology-delivered interventions is demonstrating how accurately these methods model clinical constructs.
Objective: This study aimed to determine the preliminary effectiveness of modeling gender dysphoria with ML and NLP, using transgender and nonbinary people's social media data.
Methods: Overall, 6 ML models and 949 NLP-generated independent variables were used to model gender dysphoria from the text data of 1573 Reddit (Reddit Inc) posts created on transgender- and nonbinary-specific web-based forums. After developing a codebook grounded in clinical science, a research team of clinicians and students experienced in working with transgender and nonbinary clients used qualitative content analysis to determine whether gender dysphoria was present in each Reddit post (ie, the dependent variable). NLP (eg, n-grams, Linguistic Inquiry and Word Count, word embedding, sentiment, and transfer learning) was used to transform the linguistic content of each post into predictors for ML algorithms. A k-fold cross-validation was performed. Hyperparameters were tuned with random search. Feature selection was performed to demonstrate the relative importance of each NLP-generated independent variable in predicting gender dysphoria. Misclassified posts were analyzed to improve future modeling of gender dysphoria.
Results: A supervised ML algorithm (ie, optimized extreme gradient boosting [XGBoost]) modeled gender dysphoria with a high degree of accuracy (0.84), precision (0.83), and speed (1.23 seconds). Of the NLP-generated independent variables, Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5) clinical keywords (eg, dysphoria and disorder) were most predictive of gender dysphoria. Misclassifications of gender dysphoria were common in posts that expressed uncertainty, featured a stressful experience unrelated to gender dysphoria, were incorrectly coded, expressed insufficient linguistic markers of gender dysphoria, described past experiences of gender dysphoria, showed evidence of identity exploration, expressed aspects of human sexuality unrelated to gender dysphoria, described socially based gender dysphoria, expressed strong affective or cognitive reactions unrelated to gender dysphoria, or discussed body image.
Conclusions: Findings suggest that ML- and NLP-based models of gender dysphoria have significant potential to be integrated into technology-delivered interventions. The results contribute to the growing evidence on the importance of incorporating ML and NLP designs in clinical science, especially when studying marginalized populations.
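Illustrative sketch (not part of the published record): the abstract reports k-fold cross-validation with accuracy and precision as evaluation metrics. The Python below shows, under stated assumptions, how those two metrics can be averaged across folds. The keyword-based classifier is a deliberately simple stand-in, not the study's XGBoost model or its NLP features; all function names and the toy posts are hypothetical.

```python
import random
from collections import Counter


def k_fold_indices(n_items, k, seed=0):
    """Shuffle indices 0..n_items-1 and deal them into k nearly equal folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def accuracy_and_precision(y_true, y_pred):
    """Accuracy = correct / total; precision = TP / (TP + FP)."""
    pairs = list(zip(y_true, y_pred))
    correct = sum(1 for t, p in pairs if t == p)
    true_pos = sum(1 for t, p in pairs if t == 1 and p == 1)
    false_pos = sum(1 for t, p in pairs if t == 0 and p == 1)
    predicted_pos = true_pos + false_pos
    accuracy = correct / len(pairs)
    # Precision is reported as 0.0 when nothing was predicted positive.
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    return accuracy, precision


def train_keyword_model(posts, labels):
    """Toy stand-in model: keep words seen in more positive than negative posts."""
    pos, neg = Counter(), Counter()
    for text, label in zip(posts, labels):
        (pos if label == 1 else neg).update(set(text.lower().split()))
    return {word for word in pos if pos[word] > neg[word]}


def predict(model, text):
    """Predict 1 if the post shares any word with the learned keyword set."""
    return 1 if model & set(text.lower().split()) else 0


def cross_validate(posts, labels, k=3, seed=0):
    """Mean accuracy and precision over k held-out folds (assumes len(posts) >= k)."""
    scores = []
    for held_out in k_fold_indices(len(posts), k, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(posts)) if i not in held]
        model = train_keyword_model(
            [posts[i] for i in train_idx], [labels[i] for i in train_idx]
        )
        y_true = [labels[i] for i in held_out]
        y_pred = [predict(model, posts[i]) for i in held_out]
        scores.append(accuracy_and_precision(y_true, y_pred))
    mean_accuracy = sum(a for a, _ in scores) / k
    mean_precision = sum(p for _, p in scores) / k
    return mean_accuracy, mean_precision


if __name__ == "__main__":
    # Hypothetical placeholder posts; 1 = construct coded present, 0 = absent.
    toy_posts = ["marker alpha", "marker beta", "plain gamma",
                 "plain delta", "marker epsilon", "plain zeta"]
    toy_labels = [1, 1, 0, 0, 1, 0]
    print(cross_validate(toy_posts, toy_labels, k=3))
```

In a realistic pipeline the stand-in classifier would be replaced by the tuned model and the held-out scores would be computed the same way; stratifying the folds by label is a common refinement that this sketch omits.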
Pages: 16