A Study of Automatic Metrics for the Evaluation of Natural Language Explanations

Citations: 0

Authors:

Clinciu, Miruna-Adriana [1, 2, 3]

Eshghi, Arash [2]

Hastie, Helen [2]

Affiliations:

[1] Edinburgh Ctr Robot, Edinburgh, Midlothian, Scotland

[2] Heriot Watt Univ, Edinburgh, Midlothian, Scotland

[3] Univ Edinburgh, Edinburgh, Midlothian, Scotland

Source:

16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2021) | 2021

Funding:

Engineering and Physical Sciences Research Council (EPSRC), UK;

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

As transparency becomes key for robotics and AI, it will be necessary to evaluate the methods through which transparency is provided, including automatically generated natural language (NL) explanations. Here, we explore parallels between the generation of such explanations and the much-studied field of evaluation of Natural Language Generation (NLG). Specifically, we investigate which of the NLG evaluation measures map well to explanations. We present the ExBAN corpus: a crowd-sourced corpus of NL explanations for Bayesian Networks. We run correlations comparing human subjective ratings with NLG automatic measures. We find that embedding-based automatic NLG evaluation methods, such as BERTScore and BLEURT, have a higher correlation with human ratings, compared to word-overlap metrics, such as BLEU and ROUGE. This work has implications for Explainable AI and transparent robotic and autonomous systems.


Pages: 2376-2387

Page count: 12

