Dietary Assessment With Multimodal ChatGPT: A Systematic Analysis

Cited by: 2
Authors
Lo, Frank P.-W. [1]
Qiu, Jianing [2]
Wang, Zeyu [1]
Chen, Junhong [1]
Xiao, Bo [1]
Yuan, Wu [2]
Giannarou, Stamatia [1]
Frost, Gary [3]
Lo, Benny [3]
Affiliations
[1] Imperial Coll London, Hamlyn Ctr, London SW7 2AZ, England
[2] Chinese Univ Hong Kong, Dept Biomed Engn, Hong Kong, Peoples R China
[3] Imperial Coll London, Fac Med, Dept Metab Digest & Reprod, London SW7 2AZ, England
Funding
Bill & Melinda Gates Foundation; UK Medical Research Council; UK Biotechnology and Biological Sciences Research Council
Keywords
Artificial intelligence; Estimation; Task analysis; Chatbots; Monitoring; Accuracy; Visualization; ChatGPT; deep learning; dietary assessment; food recognition; foundation model; GPT-4V; passive monitoring; COUNTING BITES; FOOD;
DOI
10.1109/JBHI.2024.3417280
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Conventional approaches to dietary assessment are primarily grounded in self-reporting methods or structured interviews conducted under the supervision of dietitians. These methods, however, are often subjective, inaccurate, and time-intensive. Although artificial intelligence (AI)-based solutions have been devised to automate the dietary assessment process, prior AI methodologies tackle dietary assessment in a fragmented landscape (e.g., merely recognizing food types or estimating portion size) and encounter challenges in their ability to generalize across a diverse range of food categories, dietary behaviors, and cultural contexts. Recently, the emergence of multimodal foundation models, such as GPT-4V, has exhibited transformative potential across a wide range of tasks in various research domains. These models have demonstrated remarkable generalist intelligence and accuracy, owing to their large-scale pre-training on broad datasets and substantially scaled model size. In this study, we explore the application of GPT-4V powering multimodal ChatGPT for dietary assessment, along with prompt engineering and passive monitoring techniques. We evaluated the proposed pipeline using a self-collected, semi free-living dietary intake dataset, captured through wearable cameras. Our findings reveal that GPT-4V excels in food detection under challenging conditions without any fine-tuning or adaptation using food-specific datasets. By guiding the model with specific language prompts (e.g., African cuisine), it shifts from recognizing common staples like rice and bread to accurately identifying regional dishes like banku and ugali. Another standout feature of GPT-4V is its contextual awareness. GPT-4V can leverage surrounding objects as scale references to deduce the portion sizes of food items, further facilitating the process of dietary assessment.
Pages: 7577-7587
Number of pages: 11