Effect of Linguistic Information in Neural Machine Translation

Authors
Nakamura, Naomichi [1]
Isahara, Hitoshi [2]
Affiliations
[1] Toyohashi Univ Technol, Dept Comp Sci & Engn, Toyohashi, Aichi, Japan
[2] Toyohashi Univ Technol, Informat & Media Ctr, Toyohashi, Aichi, Japan
DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Deep neural networks (DNNs) outperform previous approaches in many fields, including natural language processing. Neural machine translation (NMT) likewise outperforms statistical machine translation (SMT), which relies on complex features and rules. However, NMT requires a large corpus and a long computation time. To reduce the computational cost, recent studies have replaced low-frequency words with symbols. These symbols, however, make sentences ambiguous and degrade translation accuracy. To solve this problem, sub-word units such as Byte Pair Encoding (BPE) and the Wordpiece Model (WPM), which build vocabularies of a prespecified size, have been proposed. Nevertheless, these tokenization methods break words apart and treat the pieces as symbols. Such symbols are compatible with neural networks, and NMT performance has improved. This result suggests that linguistic correctness is not necessarily important in NMT. If that is the case, we ask to what extent linguistic correctness contributes to NMT accuracy. In this research, we experiment with incorporating linguistic information into sub-word units. Experimentally, we demonstrate that morphemes, as linguistic information, are a helpful factor for sub-word units.
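To make the idea of frequency-driven sub-word units concrete, the following is a minimal sketch of how BPE-style merge rules can be learned and applied. It is not the authors' implementation: the function names (`learn_bpe`, `merge_pair`, `segment`), the tie-breaking rule, and the toy word counts are all assumptions made for illustration, and practical tokenizers add details such as end-of-word markers that are omitted here.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    merged, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return tuple(merged)


def learn_bpe(word_freqs, num_merges):
    """Learn up to `num_merges` merge rules from a {word: count} mapping (hypothetical helper)."""
    # Start with every word split into single characters.
    vocab = {tuple(word): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Most frequent pair wins; ties go to the lexicographically largest pair
        # (an arbitrary but deterministic choice made for this sketch).
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        new_vocab = {}
        for symbols, freq in vocab.items():
            new_symbols = merge_pair(symbols, best)
            new_vocab[new_symbols] = new_vocab.get(new_symbols, 0) + freq
        vocab = new_vocab
    return merges


def segment(word, merges):
    """Split `word` into sub-word units by replaying the learned merges in order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


# Toy word counts, invented purely for illustration.
toy_counts = {"low": 5, "lower": 2, "newest": 6, "widest": 3}
toy_merges = learn_bpe(toy_counts, 2)
print(toy_merges)
print(segment("widest", toy_merges))
```

In this toy run the merges are chosen purely by frequency, yet the unit `est` that emerges happens to coincide with an English suffix. Whether such units align with morphemes in general is not guaranteed, which is the kind of question the abstract raises.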
Pages: 6