Detecting Source Contextual Barriers for Understanding Neural Machine Translation

Cited by: 1
Authors
Li, Guanlin [1]
Liu, Lemao [2]
Zhu, Conghui [1]
Wang, Rui [3]
Zhao, Tiejun [1]
Shi, Shuming [2]
Affiliations
[1] Harbin Inst Technol, Sch Comp Sci Technol, Machine Intelligence & Translat Lab, Harbin 150001, Peoples R China
[2] Tencent AI Lab, Shenzhen, Guangdong, Peoples R China
[3] Natl Inst Informat & Commun Technol NICT, Adv Translat Technol Lab, Kyoto 6190289, Japan
Keywords
Neural machine translation; evaluation; generalization analysis
DOI
10.1109/TASLP.2021.3085119
Chinese Library Classification number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
In machine translation evaluation, conventional practice measures a model's generalization ability in an average sense, for example with corpus-level BLEU. However, corpus-level BLEU statistics cannot provide a comprehensive understanding or fine-grained analysis of a model's generalization ability. As a remedy, this paper attempts to understand NMT at a fine-grained level by detecting contextual barriers within an unseen input sentence that cause degradation in the model's translation quality. It proposes a principled definition of source contextual barriers, as well as a modified version that is computationally tractable and operates at the word level. Based on the modified definition, three simple methods are proposed for barrier detection by search-aware risk estimation through counterfactual generation. Extensive analyses are conducted on the detected contextual barrier words on both Zh⇔En NIST benchmarks. Potential uses motivated by barrier words are also discussed.
Pages: 3158-3169
Number of pages: 12