Commonsense Reasoning and Explainable Artificial Intelligence Using Large Language Models

Cited by: 1
Authors
Krause, Stefanie [1]
Stolzenburg, Frieder [1]
Affiliation
[1] Harz Univ Appl Sci, Automat & Comp Sci Dept, Friedrichstr 57-59, D-38855 Wernigerode, Germany
Keywords
large language models; explainable AI; commonsense reasoning; question answering; ChatGPT; KNOWLEDGE
DOI
10.1007/978-3-031-50396-2_17
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Commonsense reasoning is a difficult task for a computer, but a critical skill for an artificial intelligence (AI). It can enhance the explainability of AI models by enabling them to provide intuitive and human-like explanations for their decisions. This is necessary in many areas but especially in the field of question answering (QA), which is one of the most important tasks of natural language processing (NLP). Over time, a multitude of methods have emerged for solving commonsense reasoning problems, such as knowledge-based approaches using formal logic or linguistic analysis. In this paper, we investigate the effectiveness of large language models (LLMs) on different QA tasks, with a focus on their abilities in reasoning and producing explanations. For this, we study the recent and very prominent LLM ChatGPT and evaluate the results by means of a questionnaire. We demonstrate ChatGPT's ability to reason with common sense, and although ChatGPT's accuracy ranges from 56% to 93% on various QA benchmarks, it outperforms human accuracy. Furthermore, we find that, in the sense of explainable artificial intelligence (XAI), ChatGPT gives good explanations for its decisions. In our questionnaire, 68% of the participants rated ChatGPT's explanations as "good" or "excellent". Taken together, these findings enrich our understanding of current LLMs and pave the way for future investigations of reasoning and explainability.
Pages: 302-319
Number of pages: 18