Controlled Hallucinations: Learning to Generate Faithfully from Noisy Data

Cited by: 0
Author
Filippova, Katja [1]
Affiliation
[1] Google Res, Berlin, Germany
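Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract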
Neural text generation (data- or text-to-text) demonstrates remarkable performance when training data is abundant, which for many applications is not the case. To collect a large corpus of parallel data, heuristic rules are often used, but they inevitably let noise into the data, such as phrases in the output which cannot be explained by the input. Consequently, models pick up on the noise and may hallucinate, that is, generate fluent but unsupported text. Our contribution is a simple but powerful technique to treat such hallucinations as a controllable aspect of the generated text, without dismissing any input and without modifying the model architecture. On the WikiBio corpus (Lebret et al., 2016), a particularly noisy dataset, we demonstrate the efficacy of the technique both in an automatic and in a human evaluation.
Pages: 864 - 870
Page count: 7