ACCENT: An Automatic Event Commonsense Evaluation Metric for Open-Domain Dialogue Systems

Authors
Ghazarian, Sarik [1 ]
Shao, Yijia [2 ]
Han, Rujun [3 ]
Galstyan, Aram [1 ]
Peng, Nanyun [4 ]
Affiliations
[1] Univ Southern Calif, Inst Informat Sci, Los Angeles, CA 90007 USA
[2] Peking Univ, Beijing, Peoples R China
[3] AWS AI Labs, Santa Clara, CA USA
[4] Univ Calif Los Angeles, Comp Sci Dept, Los Angeles, CA USA
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Commonsense reasoning is omnipresent in human communication and is thus an important feature for open-domain dialogue systems. However, evaluating commonsense in dialogue systems is still an open challenge. We take the first step by focusing on event commonsense, which considers events and their relations and is crucial in both dialogues and general commonsense reasoning. We propose ACCENT, an event commonsense evaluation metric empowered by commonsense knowledge bases (CSKBs). ACCENT first extracts event-relation tuples from a dialogue, and then evaluates the response by scoring the tuples in terms of their compatibility with the CSKB. To evaluate ACCENT, we construct the first public event commonsense evaluation dataset for open-domain dialogues. Our experiments show that ACCENT is an efficient metric for event commonsense evaluation, achieving higher correlations with human judgments than existing baselines.
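The two-stage idea in the abstract (extract event-relation tuples, then score them for compatibility with a CSKB) can be illustrated with a toy sketch. This is not the authors' implementation: the knowledge base `TOY_CSKB`, the relation names, the token-overlap similarity, and the functions `tuple_score` and `response_score` are all hypothetical, and the extraction stage is assumed to have already produced the tuples.

```python
from typing import List, Set, Tuple

# (head event, relation, tail event); an assumed representation
Triple = Tuple[str, str, str]

# Hypothetical toy knowledge base; a real CSKB would be far larger.
TOY_CSKB: List[Triple] = [
    ("person goes to gym", "xEffect", "person gets tired"),
    ("person goes to gym", "xIntent", "to be healthy"),
    ("person eats dinner", "xEffect", "person feels full"),
]


def _tokens(text: str) -> Set[str]:
    """Lowercased whitespace tokens of a phrase."""
    return set(text.lower().split())


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity in [0, 1]; a stand-in for a learned similarity."""
    ta, tb = _tokens(a), _tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def tuple_score(triple: Triple, cskb: List[Triple]) -> float:
    """Best match of one extracted tuple against CSKB entries with the same relation."""
    head, rel, tail = triple
    candidates = [
        jaccard(head, kb_head) * jaccard(tail, kb_tail)
        for kb_head, kb_rel, kb_tail in cskb
        if kb_rel == rel
    ]
    return max(candidates, default=0.0)


def response_score(triples: List[Triple], cskb: List[Triple] = TOY_CSKB) -> float:
    """Average tuple compatibility; returns 0.0 when no tuples were extracted (an assumption)."""
    if not triples:
        return 0.0
    return sum(tuple_score(t, cskb) for t in triples) / len(triples)
```

A tuple that appears verbatim in the toy knowledge base scores 1.0, while a tuple whose tail event does not fit the head event under the same relation scores lower. Averaging over tuples is one simple aggregation choice; the paper's actual scoring may differ.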
Pages: 4398-4419
Page count: 22
Related papers
50 in total
  • [1] Predictive Engagement: An Efficient Metric for Automatic Evaluation of Open-Domain Dialogue Systems
    Ghazarian, Sarik
    Weischedel, Ralph
    Galstyan, Aram
    Peng, Nanyun
    [J]. THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 7789 - 7796
  • [2] PONE: A Novel Automatic Evaluation Metric for Open-domain Generative Dialogue Systems
    Lan, Tian
    Mao, Xian-Ling
    Wei, Wei
    Gao, Xiaoyan
    Huang, Heyan
    [J]. ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2020, 39 (01)
  • [3] Towards Multilingual Automatic Open-Domain Dialogue Evaluation
    Mendonca, John
    Lavie, Alon
    Trancoso, Isabel
    [J]. 24TH MEETING OF THE SPECIAL INTEREST GROUP ON DISCOURSE AND DIALOGUE, SIGDIAL 2023, 2023, : 130 - 141
  • [4] GRADE: Automatic Graph-Enhanced Coherence Metric for Evaluating Open-Domain Dialogue Systems
    Huang, Lishan
    Ye, Zheng
    Qin, Jinghui
    Lin, Liang
    Liang, Xiaodan
    [J]. PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 9230 - 9240
  • [5] Towards Holistic and Automatic Evaluation of Open-Domain Dialogue Generation
    Pang, Bo
    Nijkamp, Erik
    Han, Wenjuan
    Zhou, Linqi
    Liu, Yixian
    Tu, Kewei
    [J]. 58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), 2020, : 3619 - 3629
  • [6] An Automatic Evaluation Method for Open-domain Dialogue Based on BLEURT
    Wu, Shih-Hung
    Lee, Jia-Jun
    [J]. 2022 IEEE 23RD INTERNATIONAL CONFERENCE ON INFORMATION REUSE AND INTEGRATION FOR DATA SCIENCE (IRI 2022), 2022, : 83 - 89
  • [7] uBLEU: Uncertainty-Aware Automatic Evaluation Method for Open-Domain Dialogue Systems
    Tsuta, Yuma
    Yoshinaga, Naoki
    Toyoda, Masashi
    [J]. 58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020): STUDENT RESEARCH WORKSHOP, 2020, : 199 - 206
  • [8] Adversarial Evaluation for Open-Domain Dialogue Generation
    Bruni, Elia
    Fernandez, Raquel
    [J]. 18TH ANNUAL MEETING OF THE SPECIAL INTEREST GROUP ON DISCOURSE AND DIALOGUE (SIGDIAL 2017), 2017, : 284 - 288
  • [9] Generative commonsense knowledge subgraph retrieval for open-domain dialogue response generation
    Wu, Sixing
    Yu, Jiong
    Chen, Jiahao
    Zhou, Wei
    [J]. NEURAL NETWORKS, 2024, 180
  • [10] MixEI: Mixing explicit and implicit commonsense knowledge in open-domain dialogue response generation
    Wu, Sixing
    Yu, Jiong
    Zhou, Wei
    [J]. Neurocomputing, 2025, 618