Reducing Position Bias in Simultaneous Machine Translation with Length-Aware Framework

Authors
Zhang, Shaolei [1, 2]
Feng, Yang [1, 2]
Affiliations
[1] Chinese Acad Sci ICT CAS, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Beijing, Peoples R China
Funding
National Key Research and Development Program of China;
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Simultaneous machine translation (SiMT) starts translating while still receiving the streaming source input, so the source sentence is always incomplete during translation. Unlike full-sentence MT, which uses the conventional seq-to-seq architecture, SiMT often applies a prefix-to-prefix architecture, which forces each target word to align only with a partial source prefix in order to adapt to the incomplete source in streaming input. However, the source words in the front positions are always illusorily considered more important because they appear in more prefixes. This results in position bias, which makes the model pay more attention to the front source positions at test time. In this paper, we first analyze the phenomenon of position bias in SiMT, and then develop a Length-Aware Framework that reduces position bias by bridging the structural gap between SiMT and full-sentence MT. Specifically, given the streaming input, we first predict the full-sentence length and then fill the future source positions with positional encoding, thereby turning the streaming input into a pseudo full-sentence. The proposed framework can be integrated into most existing SiMT methods to further improve performance. Experiments on two representative SiMT methods, including the state-of-the-art adaptive policy, show that our method successfully reduces position bias and thereby achieves better SiMT performance.
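The pseudo full-sentence step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the length is predicted or which positional encoding is used, so the sketch assumes the predicted length is already available as an integer, uses the standard sinusoidal encoding, and treats the toy token embeddings as placeholders. The names `sinusoidal_encoding` and `build_pseudo_full_sentence` are hypothetical.

```python
import math
from typing import List


def sinusoidal_encoding(position: int, d_model: int) -> List[float]:
    """Sinusoidal positional encoding for one position (assumed choice)."""
    vec = []
    for i in range(d_model):
        angle = position / (10000 ** (2 * (i // 2) / d_model))
        vec.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return vec


def build_pseudo_full_sentence(
    prefix_embeddings: List[List[float]],
    predicted_length: int,
    d_model: int,
) -> List[List[float]]:
    """Turn a received source prefix into a pseudo full-sentence.

    Received positions carry token embedding + positional encoding.
    Future positions (not yet received) carry positional encoding only.
    """
    received = len(prefix_embeddings)
    # Assumption: never shrink below what has already been received.
    total = max(predicted_length, received)
    rows = []
    for pos in range(total):
        pe = sinusoidal_encoding(pos, d_model)
        if pos < received:
            rows.append([e + p for e, p in zip(prefix_embeddings[pos], pe)])
        else:
            rows.append(pe)
    return rows


# Toy example: 3 source tokens received, predicted full length of 5.
prefix = [[0.1] * 4, [0.2] * 4, [0.3] * 4]
pseudo = build_pseudo_full_sentence(prefix, predicted_length=5, d_model=4)
print(len(pseudo))  # number of positions the encoder would see
```

In this sketch the encoder input always spans the predicted sentence length, so the two future positions are filled with position information only, while the three received positions keep their token content.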
Pages: 6775-6788
Page count: 14