Transformer-based Automatic Post-Editing Model with Joint Encoder and Multi-source Attention of Decoder

Times Cited: 0
Authors
Lee, WonKee [1 ]
Shin, Jaehun [1 ]
Lee, Jong-Hyeok [1 ]
Affiliations
[1] Pohang Univ Sci & Technol POSTECH, Dept Comp Sci & Engn, Pohang, South Korea
Keywords
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper describes POSTECH's submission to the WMT 2019 shared task on Automatic Post-Editing (APE). In this paper, we propose a new multi-source APE model by extending the Transformer. The main contributions of our study are that we 1) reconstruct the encoder to generate a joint representation of the machine translation (mt) and its source (src) context, in addition to the conventional src encoding, and 2) suggest two types of multi-source attention layers in the decoder to compute attention between the two encoder outputs and the decoder state. Furthermore, we train our model by applying various teacher-forcing ratios to alleviate exposure bias. Finally, we adopt the ensemble technique across variations of our model. Experiments on the WMT19 English-German APE dataset show improvements in terms of both TER and BLEU scores over the baseline. Our primary submission achieves -0.73 in TER and +1.49 in BLEU compared to the baseline, and ranks second among all submitted systems.
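
The record contains no code, so the following is a minimal Python (PyTorch) sketch of one plausible form of the multi-source decoder layer the abstract describes: masked self-attention, then cross-attention over the src encoding, then cross-attention over the joint (mt, src) encoding. The class and parameter names (MultiSourceDecoderLayer, src_mem, joint_mem, d_model, n_heads) and the sequential ordering of the two cross-attention blocks are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn

class MultiSourceDecoderLayer(nn.Module):
    """Transformer decoder layer with two cross-attention blocks applied
    sequentially; one of several plausible multi-source variants."""

    def __init__(self, d_model: int = 512, n_heads: int = 8):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.src_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.joint_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ffn = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.ReLU(), nn.Linear(4 * d_model, d_model)
        )
        self.norms = nn.ModuleList([nn.LayerNorm(d_model) for _ in range(4)])

    def forward(self, x, src_mem, joint_mem, tgt_mask=None):
        # Masked self-attention over the partially generated post-edit.
        h, _ = self.self_attn(x, x, x, attn_mask=tgt_mask)
        x = self.norms[0](x + h)
        # Cross-attention over the source-sentence encoding.
        h, _ = self.src_attn(x, src_mem, src_mem)
        x = self.norms[1](x + h)
        # Cross-attention over the joint (mt, src) encoding.
        h, _ = self.joint_attn(x, joint_mem, joint_mem)
        x = self.norms[2](x + h)
        # Position-wise feed-forward network.
        h = self.ffn(x)
        return self.norms[3](x + h)

# Tiny shape check with random inputs (batch=2; sequence lengths arbitrary).
layer = MultiSourceDecoderLayer()
pe = torch.randn(2, 7, 512)         # decoder states for the post-edit so far
src = torch.randn(2, 9, 512)        # encoder memory for src
joint = torch.randn(2, 16, 512)     # joint encoder memory for (mt, src)
print(layer(pe, src, joint).shape)  # torch.Size([2, 7, 512])

The alternative the abstract implies, a parallel variant, would attend to both memories independently from the same decoder state and combine the two attention outputs (e.g., by summation or gating) instead of stacking the blocks.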
Pages: 112-117
Number of pages: 6