Comparing human text classification performance and explainability with large language and machine learning models using eye-tracking

Cited by: 0

Authors:

Venkatesh, Jeevithashree Divya [1]

Jaiswal, Aparajita [2]

Nanda, Gaurav [1]

Affiliations:

[1] Purdue Univ, Sch Engn Technol, W Lafayette, IN 47907 USA

[2] Purdue Univ, Ctr Intercultural Learning Mentorship Assessment &, W Lafayette, IN 47907 USA

Source:

SCIENTIFIC REPORTS | 2024 / Volume 14 / Issue 01

Keywords:

Human-AI alignment; Large language models; Explainable AI; Eye tracking; Cognitive engineering; Human-computer interaction; MOVEMENTS; GAZE;

DOI:

10.1038/s41598-024-65080-7

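Chinese Library Classification (CLC):

O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];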

Discipline classification codes:

07 ; 0710 ; 09 ;

Abstract:

To understand how well the reasoning of humans aligns with that of artificial intelligence (AI) models, this empirical study compared human text classification performance and explainability with those of a traditional machine learning (ML) model and a large language model (LLM). A domain-specific, noisy textual dataset of 204 injury narratives had to be classified into 6 cause-of-injury codes. The narratives varied in complexity and ease of categorization, depending on how distinctive the cause-of-injury code was. The user study involved 51 participants whose eye-tracking data were recorded while they performed the text classification task. While the ML model was trained on 120,000 pre-labelled injury narratives, the LLM and the humans did not receive any specialized training. The explainability of the different approaches was compared based on the top words each used to make classification decisions. These words were identified using eye-tracking for humans, the explainable AI approach LIME for the ML model, and prompts for the LLM. The classification performance of the ML model was observed to be relatively better than that of the zero-shot LLM and the non-expert humans, both overall and particularly for narratives with high complexity and difficult categorization. The top-3 predictive words used by the ML model and the LLM agreed with those used by humans to a greater extent than lower-ranked predictive words did.
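
The abstract does not state how agreement between human and model top words was measured. As a minimal sketch, assuming agreement is taken as the share of a model's top-k words that also appear among the human top-k words, the comparison could look like the following. The function name `top_k_overlap` and the word lists are hypothetical illustrations, not data or code from the study.

```python
def top_k_overlap(human_words, model_words, k):
    """Share of the model's top-k words that also appear in the human top-k.

    Both inputs are lists ranked from most to least important.
    This overlap measure is an assumption, not the paper's stated metric.
    """
    human_top = {w.lower() for w in human_words[:k]}
    model_top = {w.lower() for w in model_words[:k]}
    if not model_top:
        return 0.0
    return len(human_top & model_top) / len(model_top)


# Hypothetical ranked word lists for a single injury narrative.
# Human ranking would come from eye-tracking; model ranking from LIME or a prompt.
human_ranked = ["fell", "ladder", "roof", "worker", "ankle"]
model_ranked = ["ladder", "fell", "ankle", "from", "while"]

overlap_top3 = top_k_overlap(human_ranked, model_ranked, 3)
overlap_top5 = top_k_overlap(human_ranked, model_ranked, 5)
print(f"top-3 overlap: {overlap_top3:.2f}, top-5 overlap: {overlap_top5:.2f}")
```

Comparing the value at k = 3 with the value at larger k is one simple way to check whether the highest-ranked words agree more than lower-ranked ones.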


Pages: 12

