KLOSURE: Closing in on open-ended patient questionnaires with text mining

Cited by: 7
Authors
Spasic, Irena [1 ]
Owen, David [1 ]
Smith, Andrew [2 ]
Button, Kate [3 ]
Affiliations
[1] Cardiff Univ, Sch Comp Sci & Informat, Cardiff, S Glam, Wales
[2] Cardiff Univ, Sch Psychol, Cardiff, S Glam, Wales
[3] Cardiff Univ, Sch Healthcare Sci, Cardiff, S Glam, Wales
Funding
Wellcome Trust (UK); Engineering and Physical Sciences Research Council (EPSRC, UK);
Keywords
Text mining; Natural language processing; Text classification; Named entity recognition; Sentiment analysis; Patient reported outcome measure; Open-ended questionnaire; BIOMEDICAL TEXT; UMLS;
DOI
10.1186/s13326-019-0215-3
Chinese Library Classification
Q [Biological sciences];
Subject classification codes
07 ; 0710 ; 09 ;
Abstract
Background: The Knee injury and Osteoarthritis Outcome Score (KOOS) is an instrument used to quantify patients' perceptions of their knee condition and associated problems. It is administered as a 42-item closed-ended questionnaire in which patients are asked to self-assess five outcomes: pain, other symptoms, activities of daily living, sport and recreation activities, and quality of life. We developed KLOG as a 10-item open-ended version of the KOOS questionnaire in an attempt to obtain deeper insight into patients' opinions, including their unmet needs. However, the open-ended nature of the questionnaire incurs analytical overhead associated with the interpretation of responses. The goal of this study was to automate such analysis. We implemented KLOSURE as a system for mining free-text responses to the KLOG questionnaire. It consists of two subsystems, one concerned with feature extraction and the other with classification of feature vectors. Feature extraction is performed by a set of four modules whose main functionalities are linguistic pre-processing, sentiment analysis, named entity recognition and lexicon lookup, respectively. Outputs produced by each module are combined into feature vectors, whose structure varies across the KLOG questions. Finally, Weka, a machine learning workbench, was used for classification of feature vectors.

Results: The precision of the system varied between 62.8% and 95.3%, whereas the recall varied between 58.3% and 87.6% across the 10 questions. The overall performance in terms of F-measure varied between 59.0% and 91.3%, with an average of 74.4% and a standard deviation of 8.8.

Conclusions: We demonstrated the feasibility of mining open-ended patient questionnaires. By automatically mapping free-text answers onto a Likert scale, we can effectively measure the progress of rehabilitation over time. In comparison to traditional closed-ended questionnaires, our approach offers much richer information that can be utilised to support clinical decision making. Overall, we demonstrated how text mining can be used to combine the benefits of qualitative and quantitative analysis of patient experiences.
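
The feature extraction step described in the abstract, where the outputs of several modules are merged into one feature vector per response, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the word lists, the function names and the feature names are all hypothetical stand-ins, each module is reduced to simple word counting, and the classification step (performed with Weka in the study) is left out.

```python
import re
from typing import Dict, List

# Hypothetical toy resources; the abstract does not specify the real ones.
POSITIVE = {"better", "good", "improved", "fine"}
NEGATIVE = {"pain", "painful", "worse", "stiff", "swollen"}
ACTIVITIES = {"walking", "running", "stairs", "kneeling"}
NEGATIONS = {"no", "not", "never"}


def preprocess(text: str) -> List[str]:
    """Linguistic pre-processing, reduced here to lowercasing and tokenisation."""
    return re.findall(r"[a-z']+", text.lower())


def sentiment(tokens: List[str]) -> Dict[str, int]:
    """Stand-in for sentiment analysis: counts of positive and negative words."""
    return {
        "pos": sum(t in POSITIVE for t in tokens),
        "neg": sum(t in NEGATIVE for t in tokens),
    }


def entities(tokens: List[str]) -> Dict[str, int]:
    """Stand-in for named entity recognition: counts of activity mentions."""
    return {"n_activities": sum(t in ACTIVITIES for t in tokens)}


def lexicon(tokens: List[str]) -> Dict[str, int]:
    """Stand-in for lexicon lookup: counts of negation words."""
    return {"n_negations": sum(t in NEGATIONS for t in tokens)}


def feature_vector(text: str) -> Dict[str, int]:
    """Combine the outputs of all modules into a single feature vector."""
    tokens = preprocess(text)
    features: Dict[str, int] = {"n_tokens": len(tokens)}
    for module in (sentiment, entities, lexicon):
        features.update(module(tokens))
    return features


if __name__ == "__main__":
    print(feature_vector("My knee is painful when walking, not better."))
```

Because the abstract notes that the structure of feature vectors varies across the KLOG questions, a fuller version would presumably select a different set of modules or features per question; here a single fixed set is used for brevity.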
Pages: 11