Assessing the Impact of GPT-4 Turbo in Generating Defeaters for Assurance Cases

Cited by: 2

Authors:

Shahandashti, Kimya Khakzad ^{[1]}

Sivakumar, Mithila ^{[1]}

Mohajer, Mohammad Mahdi ^{[1]}

Belle, Alvine B. ^{[1]}

Wang, Song ^{[1]}

Lethbridge, Timothy C. ^{[2]}

Affiliations:

[1] York Univ, Toronto, ON, Canada

[2] Univ Ottawa, Ottawa, ON, Canada

Source:

PROCEEDINGS 2024 IEEE/ACM FIRST INTERNATIONAL CONFERENCE ON AI FOUNDATION MODELS AND SOFTWARE ENGINEERING, FORGE 2024 | 2024

Keywords:

Large Language Models; assurance cases; assurance defeaters; system certification; FM for Requirement Engineering;

DOI:

10.1145/3650105.3652291

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Assurance cases (ACs) are structured arguments that allow verifying the correct implementation of the created systems' non-functional requirements (e.g., safety, security). This helps prevent system failures, which may result in catastrophic outcomes (e.g., loss of lives). ACs support the certification of systems in compliance with industrial standards, e.g., DO-178C and ISO 26262. Identifying defeaters, i.e., arguments that challenge these ACs, is crucial for enhancing ACs' robustness and confidence. To automatically support that task, we propose a novel approach that explores the potential of GPT-4 Turbo, an advanced Large Language Model (LLM) developed by OpenAI, in identifying defeaters within ACs formalized using the Eliminative Argumentation (EA) notation. Our preliminary evaluation assesses the model's ability to comprehend and generate arguments in this context, and the results show that GPT-4 Turbo is very proficient in EA notation and can generate different types of defeaters.

Pages: 52-56

Page count: 5
