Supporting Qualitative Analysis with Large Language Models: Combining Codebook with GPT-3 for Deductive Coding

Cited by: 53

Authors:

Xiao, Ziang [1]

Yuan, Xingdi [1]

Liao, Q. Vera [1]

Abdelghani, Rania [2]

Oudeyer, Pierre-Yves [2]

Affiliations:

[1] Microsoft Res, Montreal, PQ, Canada

[2] INRIA, Paris, France

Source:

COMPANION PROCEEDINGS OF 2023 28TH ANNUAL CONFERENCE ON INTELLIGENT USER INTERFACES, IUI 2023 COMPANION | 2023

Keywords:

Qualitative Analysis; Deductive Coding; Large Language Model; GPT-3;

DOI:

10.1145/3581754.3584136

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Qualitative analysis of textual contents unpacks rich and valuable information by assigning labels to the data. However, this process is often labor-intensive, particularly when working with large datasets. While recent AI-based tools demonstrate utility, researchers may not have readily available AI resources and expertise, let alone be challenged by the limited generalizability of those task-specific models. In this study, we explored the use of large language models (LLMs) in supporting deductive coding, a major category of qualitative analysis where researchers use pre-determined codebooks to label the data into a fixed set of codes. Instead of training task-specific models, a pre-trained LLM could be used directly for various tasks without fine-tuning through prompt learning. Using a curiosity-driven questions coding task as a case study, we found, by combining GPT-3 with expert-drafted codebooks, our proposed approach achieved fair to substantial agreements with expert-coded results. We lay out challenges and opportunities in using LLMs to support qualitative coding and beyond.
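
To make the workflow in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a codebook can be rendered as a list of code names with descriptions inside a prompt, and that agreement with expert labels is measured with Cohen's kappa (the statistic for which "fair" and "substantial" are conventional descriptors). The codebook entries, the prompt wording, and the function names `build_prompt` and `cohen_kappa` are all hypothetical; no model is called.

```python
from collections import Counter

# Hypothetical codebook (code name -> description); not taken from the paper.
CODEBOOK = {
    "fact": "The question asks for a simple fact or definition.",
    "explanation": "The question asks why or how something happens.",
}


def build_prompt(codebook, text):
    """Assemble a codebook-centered prompt for a single data item."""
    lines = ["Assign exactly one code to the question below.", "", "Codebook:"]
    for code, description in codebook.items():
        lines.append(f"- {code}: {description}")
    lines += ["", f"Question: {text}", "Code:"]
    return "\n".join(lines)


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa between two equally long label sequences."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equally long")
    n = len(labels_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Made-up labels for illustration only.
expert = ["fact", "fact", "explanation", "explanation"]
model = ["fact", "explanation", "explanation", "explanation"]
prompt = build_prompt(CODEBOOK, "Why is the sky blue?")
kappa = cohen_kappa(expert, model)
```

In practice the string returned by `build_prompt` would be sent to a language model, and the returned code compared against the expert's label for each item before computing agreement.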


Pages: 75-78

Page count: 4

