SLIDE: A Framework Integrating Small and Large Language Models for Open-Domain Dialogues Evaluation

Cited by: 0

Authors:

Zhao, Kun [1]

Yang, Bohao [2]

Tang, Chen [2]

Lin, Chenghua [2]

Zhan, Liang [1]

Affiliations:

[1] Univ Pittsburgh, Dept Elect & Comp Engn, Pittsburgh, PA 15213 USA

[2] Univ Manchester, Dept Comp Sci, Manchester, Lancs, England

Source:

FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024 | 2024

Funding:

National Science Foundation (USA);

Keywords:

ENERGY;

DOI:

Not available

Chinese Library Classification number:

Subject classification code:

Abstract:

The long-standing one-to-many problem of gold standard responses in open-domain dialogue systems presents challenges for automatic evaluation metrics. Though prior works have demonstrated some success by applying powerful Large Language Models (LLMs), existing approaches still struggle with the one-to-many problem and exhibit subpar performance in domain-specific scenarios. We assume the commonsense reasoning biases within LLMs may hinder their performance in domain-specific evaluations. To address both issues, we propose a novel framework, SLIDE (Small and Large Integrated for Dialogue Evaluation), that leverages both a small, specialised model (SLM) and LLMs for the evaluation of open-domain dialogues. Our approach introduces several techniques: (1) contrastive learning to differentiate between robust and non-robust response embeddings; (2) a novel metric for semantic sensitivity that combines embedding cosine distances with similarity learned through neural networks; and (3) a strategy for incorporating the evaluation results from both the SLM and LLMs. Our empirical results demonstrate that our approach achieves state-of-the-art performance in both the classification and evaluation tasks, and additionally the SLIDE evaluator exhibits better correlation with human judgements. Our code is available at https://github.com/hegehongcha/SLIDE-ACL2024.
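Techniques (2) and (3) in the abstract can be pictured with a minimal sketch. The abstract does not give the actual formulas, so everything below is an assumption made for illustration: the function names, the linear blending, and the weights `alpha` and `weight` are hypothetical and are not taken from the SLIDE paper or its repository.

```python
import math


def cosine_distance(u, v):
    """Cosine distance (1 - cosine similarity) between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


def slm_score(context_emb, response_emb, learned_similarity, alpha=0.5):
    """Hypothetical SLM score: blend embedding closeness with a learned similarity.

    learned_similarity stands in for the output of a trained network and is
    assumed to lie in [0, 1]; alpha is an assumed mixing weight.
    """
    # Cosine distance lies in [0, 2]; rescale it to a closeness value in [0, 1].
    closeness = 1.0 - cosine_distance(context_emb, response_emb) / 2.0
    return alpha * closeness + (1.0 - alpha) * learned_similarity


def combined_score(slm, llm, weight=0.5):
    """Hypothetical integration step: weighted average of SLM and LLM scores."""
    return weight * slm + (1.0 - weight) * llm


if __name__ == "__main__":
    # Toy embeddings and scores, chosen only to exercise the functions.
    s = slm_score([1.0, 0.0], [1.0, 0.0], learned_similarity=0.8)
    print(combined_score(s, llm=0.6))
```

A weighted average is only one possible way to merge the two evaluators; the paper's own integration strategy may differ.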

Pages: 15421-15435

Page count: 15
