Probing the Robustness of Trained Metrics for Conversational Dialogue Systems

Cited by: 0
Authors
Deriu, Jan [1]
Tuggener, Don [1]
von Daeniken, Pius [1]
Cieliebak, Mark [1]
Affiliations
[1] Zurich Univ Appl Sci ZHAW, Winterthur, Switzerland
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper introduces an adversarial method for stress-testing trained metrics used to evaluate conversational dialogue systems. The method leverages Reinforcement Learning to find response strategies that elicit optimal scores from the trained metrics. We apply our method to test recently proposed trained metrics. We find that they are all susceptible to giving high scores to responses generated by the relatively simple and obviously flawed strategies that our method converges on. For instance, simply copying parts of the conversation context to form a response yields scores that are competitive with, or even higher than, those of responses written by humans.
Pages: 750-761
Page count: 12