DeCoT: Debiasing Chain-of-Thought for Knowledge-Intensive Tasks in Large Language Models via Causal Intervention

Citations: 0
Authors
Wu, Junda [1 ]
Yu, Tong [2 ]
Chen, Xiang [2 ]
Wang, Haoliang [2 ]
Rossi, Ryan A. [2 ]
Kim, Sungchul [2 ]
Rao, Anup [2 ]
McAuley, Julian [1 ]
Affiliations
[1] Univ Calif San Diego, La Jolla, CA 92093 USA
[2] Adobe Res, San Jose, CA USA
Keywords
DOI: not available
CLC number
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Large language models (LLMs) often require task-relevant external knowledge, supplied through prompts, to augment their internal knowledge. However, simply injecting external knowledge into prompts does not guarantee that LLMs can identify and use the relevant information to conduct chain-of-thought reasoning, especially when the LLMs' internal knowledge is derived from biased information in the pretraining data. In this paper, we propose a novel causal view that formally explains the internal knowledge bias of LLMs via a Structural Causal Model (SCM). We revisit chain-of-thought (CoT) prompting from a causal perspective and find that biased knowledge from pretrained models can impair LLMs' reasoning abilities. When CoT reasoning paths are misled by irrelevant information in the prompt and become logically incorrect, simply editing factual information is insufficient to reach the correct answer. To estimate the confounding effect on CoT reasoning in LLMs, we use external knowledge as an instrumental variable. We further introduce the CoT as a mediator to conduct front-door adjustment and generate logically correct CoTs in which the spurious correlation between LLMs' pretrained knowledge and task queries is reduced. With extensive experiments, we validate that our approach enables more accurate CoT reasoning and enhances LLM generation on knowledge-intensive tasks.
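The front-door adjustment mentioned in the abstract can be illustrated on a toy discrete SCM. The sketch below is not the paper's implementation; the variable names (Q for the task query, C for the CoT mediator, A for the answer, U for the hidden pretraining bias) and all probability tables are illustrative assumptions. It checks the standard identity P(A | do(Q=q)) = Σ_c P(c | q) Σ_q' P(A | q', c) P(q'), which recovers the interventional distribution from observational quantities even though the confounder U is never observed.

```python
import itertools

# Toy SCM with binary variables:
#   U (hidden bias) -> Q (query)   and   U -> A (answer)
#   Q -> C (chain of thought) -> A
# C satisfies the front-door criterion: it intercepts the only
# directed path Q -> A, and the confounder U touches neither edge
# of Q -> C directly.
p_u = {0: 0.6, 1: 0.4}                      # P(U)
p_q_given_u = {0: {0: 0.8, 1: 0.2},          # P(Q | U)
               1: {0: 0.3, 1: 0.7}}
p_c_given_q = {0: {0: 0.9, 1: 0.1},          # P(C | Q)
               1: {0: 0.2, 1: 0.8}}
p_a1_given_cu = {(0, 0): 0.1, (0, 1): 0.5,   # P(A=1 | C, U)
                 (1, 0): 0.7, (1, 1): 0.9}

def joint(u, q, c, a):
    """Observational joint P(u, q, c, a) factored along the SCM."""
    pa1 = p_a1_given_cu[(c, u)]
    pa = pa1 if a == 1 else 1.0 - pa1
    return p_u[u] * p_q_given_u[u][q] * p_c_given_q[q][c] * pa

def p_q(q):
    return sum(joint(u, q, c, a)
               for u, c, a in itertools.product((0, 1), repeat=3))

def p_c_obs(c, q):
    """Observational P(c | q)."""
    return sum(joint(u, q, c, a)
               for u, a in itertools.product((0, 1), repeat=2)) / p_q(q)

def p_a_given_qc(a, q, c):
    """Observational P(a | q, c)."""
    num = sum(joint(u, q, c, a) for u in (0, 1))
    den = sum(joint(u, q, c, aa)
              for u, aa in itertools.product((0, 1), repeat=2))
    return num / den

def front_door(a, q):
    """P(a | do(Q=q)) via front-door adjustment over mediator C."""
    return sum(
        p_c_obs(c, q)
        * sum(p_a_given_qc(a, qp, c) * p_q(qp) for qp in (0, 1))
        for c in (0, 1)
    )

def p_a_do_q(a, q):
    """Ground truth: intervene on Q directly (cut the U -> Q edge)."""
    total = 0.0
    for u, c in itertools.product((0, 1), repeat=2):
        pa1 = p_a1_given_cu[(c, u)]
        pa = pa1 if a == 1 else 1.0 - pa1
        total += p_u[u] * p_c_given_q[q][c] * pa
    return total

for q in (0, 1):
    assert abs(front_door(1, q) - p_a_do_q(1, q)) < 1e-12
```

Naively conditioning P(A | Q) would mix in the U -> A backdoor path (the pretraining bias); the adjustment removes it using only observational terms, which is the role the CoT mediator plays in the paper's framing.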
Pages: 14073-14087 (15 pages)