Research and Implementation on Machine Translation System with Online Corpora Extraction Technology

Cited by: 1
Author
Lin Chirong [1]
Affiliation
[1] Changsha Aeronaut Vocat & Tech Coll, Changsha 410014, Hunan, Peoples R China
Keywords
corpora; extraction; bilingual parallel; MTS; webpages;
DOI
10.1109/ISDEA.2014.172
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Bilingual parallel sentence pairs are an important resource for machine translation. Because the ways of obtaining them are limited, sentence-level parallel corpora are not only small in quantity but also concentrated in specific fields, which makes them difficult to adapt to real application requirements. This paper introduces a Web-based system for the automatic acquisition of bilingual parallel sentence pairs. The system integrates the advantages of existing systems and improves their key technologies. We propose a URL naming method for the automatic discovery of bilingual websites and improve the technology for extracting bilingual parallel sentence pairs. Experimental results show that the proposed methods greatly improve the recall of candidate bilingual website discovery. For the acquisition of bilingual parallel sentence pairs, the system achieves a recall of 93% and an accuracy of 96%, which demonstrates its effectiveness. In addition, this paper studies the bilingual parallel sentence pairs found inside bilingual websites and obtains some preliminary results. Multiple groups of statistical machine translation experiments show that our method can improve the performance of a machine translation system, so that it can contribute to the practical application of online corpora.
Pages: 759-763
Page count: 5