Retrieval-Augmented Generation Approach: Document Question Answering using Large Language Model

Authors
Muludi, Kurnia [1]
Fitria, Kaira Milani [1]
Triloka, Joko [1]
Sutedi [1]
Affiliations
[1] Darmajaya Informat & Business Inst, Informat Engn Grad Program, Bandar Lampung, Indonesia
Keywords
Natural Language Processing; Large Language Model; Retrieval Augmented Generation; Question Answering; GPT
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
This study applies the Retrieval-Augmented Generation (RAG) method to improve Question-Answering (QA) systems for document processing, a common Natural Language Processing (NLP) problem. It presents a recent advance in applying RAG to document question answering and addresses obstacles faced by earlier QA systems. RAG combines search over a vector store with the text generation mechanism of Large Language Models (LLMs), offering a time-efficient alternative to reading documents manually. The research evaluates a RAG system that uses Generative Pre-trained Transformer 3.5 (GPT-3.5-turbo, from ChatGPT) and its impact on document data processing, comparing it with other applications. The research also provides datasets to test the capabilities of document QA systems. Both the proposed dataset and the Stanford Question Answering Dataset (SQuAD) are used for performance testing. The study contributes theoretically by advancing methodologies and knowledge representation and by supporting benchmarking in research communities. The results favor RAG on every reported metric: a precision of 0.74 in Recall-Oriented Understudy for Gisting Evaluation (ROUGE) testing, against 0.5 for the other applications; an F1 score of 0.88 in BERTScore, against 0.81; a precision of 0.28 in Bilingual Evaluation Understudy (BLEU) testing, against 0.09; and a Jaccard Similarity of 0.33, against 0.04. These findings underscore RAG's efficiency and competitiveness and suggest a positive impact on various industrial sectors through advanced Artificial Intelligence (AI) technology.
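The retrieve-then-generate flow and the Jaccard Similarity metric named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the record does not specify the embedding model, vector store, prompt, or tokenization, so the bag-of-words "embedding", the regex tokenizer, the prompt wording, the example chunks, and every function name below are assumptions chosen for illustration. The call to GPT-3.5-turbo is left out; the sketch stops at the prompt that would be sent to the model.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters (assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text):
    """Toy bag-of-words vector; a real system would use a learned embedding model."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question (stand-in for a vector store search)."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question, context_chunks):
    """Assemble the retrieved context and the question into one prompt (hypothetical wording)."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )


def jaccard_similarity(candidate, reference):
    """Token-set Jaccard similarity: |A ∩ B| / |A ∪ B|."""
    a, b = set(tokenize(candidate)), set(tokenize(reference))
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical document chunks, for illustration only.
chunks = [
    "The warranty period for the device is two years.",
    "Returns are accepted within thirty days of purchase.",
    "The device ships with a USB-C charging cable.",
]
question = "How long is the warranty period?"
top = retrieve(question, chunks, k=1)
prompt = build_prompt(question, top)
```

In this sketch the prompt would be passed to the language model, and the generated answer would then be compared with a reference answer using `jaccard_similarity` (or ROUGE, BLEU, and BERTScore from their respective libraries).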
Pages: 776-785
Page count: 10