Learning to Compare for Better Training and Evaluation of Open Domain Natural Language Generation Models

Cited by: 0
Authors
Zhou, Wangchunshu [1 ]
Xu, Ke [1 ]
Affiliations
[1] Beihang University, State Key Laboratory of Software Development Environment, Beijing, People's Republic of China
Keywords
DOI
Not available
CLC Number
TP18 [Theory of Artificial Intelligence];
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Automated evaluation of open-domain natural language generation (NLG) models remains a challenge, and widely used metrics such as BLEU and perplexity can be misleading in some cases. In this paper, we propose to evaluate NLG models by learning to compare pairs of generated sentences with a fine-tuned BERT model, which has been shown to have strong natural language understanding ability. We also propose to evaluate the model-level quality of NLG models by aggregating sample-level comparison results with a skill rating system. While our model can be trained in a fully self-supervised fashion, it can be further fine-tuned with a small amount of human preference annotations to better imitate human judgment. In addition to evaluating trained models, we propose to apply our model as a performance indicator during training for better hyperparameter tuning and early stopping. We evaluate our approach on both story generation and chit-chat dialogue response generation. Experimental results show that our model correlates better with human preferences than previous automated evaluation approaches. Training with the proposed metric also yields better performance in human evaluation, which further demonstrates the effectiveness of the proposed model.
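
To make the approach concrete, the sketch below illustrates the two components the abstract describes: a pairwise comparator obtained by fine-tuning BERT on concatenated sentence pairs, and an Elo-style skill rating update that aggregates sample-level comparisons into a model-level score. This is a minimal illustration, not the authors' released implementation; the three-way label scheme, the K-factor, and all function names are assumptions made for this example.

    # Minimal sketch of pairwise comparison + skill rating (illustrative only,
    # not the paper's code). Requires: pip install torch transformers
    import torch
    from transformers import BertTokenizer, BertForSequenceClassification

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    # Assumed label scheme: 0 = sentence A preferred, 1 = tie, 2 = B preferred.
    # Self-supervised training pairs could be built by, e.g., preferring
    # human-written references over model samples, per the abstract's setup.
    comparator = BertForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=3)
    comparator.eval()  # disable dropout for inference

    def compare(sentence_a: str, sentence_b: str) -> int:
        """Encode the pair as one BERT input; return the argmax label (0/1/2)."""
        inputs = tokenizer(sentence_a, sentence_b, return_tensors="pt",
                           truncation=True, max_length=128)
        with torch.no_grad():
            logits = comparator(**inputs).logits
        return int(logits.argmax(dim=-1))

    def elo_update(rating_a, rating_b, score_a, k=16.0):
        """One Elo-style step; score_a is 1.0 (A wins), 0.5 (tie), 0.0 (B wins)."""
        expected_a = 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))
        delta = k * (score_a - expected_a)
        return rating_a + delta, rating_b - delta  # zero-sum update

Running compare over many sample pairs drawn from two NLG models, mapping each outcome to 1.0/0.5/0.0, and feeding the results through elo_update yields the model-level skill ratings the abstract refers to; the exact rating system and hyperparameters used in the paper may differ.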
Pages: 9717-9724
Number of pages: 8
Related Papers (50 in total)
  • [1] Using reinforcement learning with external rewards for open-domain natural language generation
    Srinivasan, Vidhushini
    Santhanam, Sashank
    Shaikh, Samira
    JOURNAL OF INTELLIGENT INFORMATION SYSTEMS, 2021, 56 (01) : 189 - 206
  • [2] Language Models Learning for Domain-Specific Natural Language User Interaction
    Bai, Shuanhu
    Huang, Chien-Lin
    Tan, Yeow-Kee
    Ma, Bin
    2009 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND BIOMIMETICS (ROBIO 2009), VOLS 1-4, 2009, : 2480 - 2485
  • [3] Learning Strategies for Open-Domain Natural Language Question Answering
    Grois, Eugene
    Wilkins, David C.
    19TH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE (IJCAI-05), 2005, : 1054 - 1060
  • [4] Natural Language Generation Using Transformer Network in an Open-Domain Setting
    Varshney, Deeksha
    Ekbal, Asif
    Nagaraja, Ganesh Prasad
    Tiwari, Mrigank
    Gopinath, Abhijith Athreya Mysore
    Bhattacharyya, Pushpak
    NATURAL LANGUAGE PROCESSING AND INFORMATION SYSTEMS (NLDB 2020), 2020, 12089 : 82 - 93
  • [5] VQGAN-CLIP: Open Domain Image Generation and Editing with Natural Language Guidance
    Crowson, Katherine
    Biderman, Stella
    Kornis, Daniel
    Stander, Dashiell
    Hallahan, Eric
    Castricato, Louis
    Raff, Edward
    COMPUTER VISION, ECCV 2022, PT XXXVII, 2022, 13697 : 88 - 105
  • [6] A Natural Bias for Language Generation Models
    Meister, Clara
    Stokowiec, Wojciech
    Pimentel, Tiago
    Yu, Lei
    Rimell, Laura
    Kuncoro, Adhiguna
    61ST CONFERENCE OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 2, 2023, : 243 - 255
  • [7] Joint Energy-based Model Training for Better Calibrated Natural Language Understanding Models
    He, Tianxing
    McCann, Bryan
    Xiong, Caiming
    Hosseini-Asl, Ehsan
    16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2021), 2021, : 1754 - 1761
  • [8] Active Learning for Natural Language Generation
    Perlitz, Yotam
    Gera, Ariel
    Shmueli-Scheuer, Michal
    Sheinwald, Dafna
    Slonim, Noam
    Ein-Dor, Liat
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2023), 2023, : 9862 - 9876
  • [9] Deduplicating Training Data Makes Language Models Better
    Lee, Katherine
    Ippolito, Daphne
    Nystrom, Andrew
    Zhang, Chiyuan
    Eck, Douglas
    Callison-Burch, Chris
    Carlini, Nicholas
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 8424 - 8445