Research and Implementation on Machine Translation System with Online Corpora Extraction Technology

Cited by: 1
Authors
Lin Chirong [1]
Affiliations
[1] Changsha Aeronaut Vocat & Tech Coll, Changsha 410014, Hunan, Peoples R China
Keywords
corpora; extraction; bilingual parallel; MTS; webpages
DOI
10.1109/ISDEA.2014.172
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Bilingual parallel sentence pairs are an important resource for machine translation. Because the means of obtaining them are limited, sentence-level parallel corpora are not only small in quantity but also concentrated in specific domains, which makes them difficult to adapt to real application requirements. This paper introduces a Web-based system for the automatic acquisition of bilingual parallel sentence pairs. The system integrates the advantages of existing systems and improves their key technologies. We propose a URL naming method for the automatic discovery of bilingual websites and improve the technique for extracting bilingual parallel sentence pairs. Experimental results show that the proposed methods greatly improve the recall of candidate bilingual website discovery. For the acquisition of bilingual parallel sentence pairs, the system achieves a recall of 93% and an accuracy of 96%, which demonstrates its effectiveness. In addition, the paper studies bilingual parallel sentence pairs within bilingual websites and obtains some preliminary results. Multiple groups of statistical machine translation experiments show that the method can improve the performance of a machine translation system, so that it can contribute to the practical application of online corpora.
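The abstract does not spell out the URL naming method, so the following is only a minimal sketch of one common way such discovery can work, not the paper's algorithm. It assumes that parallel pages differ only by a language marker in the URL path. The marker lists in `LANG_MARKERS`, the helper names `normalize` and `pair_candidates`, and the `example.com` URLs are all hypothetical.

```python
import re
from collections import defaultdict
from urllib.parse import urlsplit

# Hypothetical language markers (an assumption for illustration);
# the paper's actual naming rules are not given in the abstract.
LANG_MARKERS = {
    "en": ("en", "eng", "english"),
    "zh": ("zh", "cn", "chs", "chinese"),
}

# Capturing group so that re.split keeps both tokens and separators.
_TOKEN_RE = re.compile(r"([A-Za-z0-9]+)")


def normalize(url):
    """Return (language-neutral key, detected language or None).

    Only the path and query are inspected, so a country-code domain
    such as example.cn is not mistaken for a language marker.
    """
    split = urlsplit(url)
    tail = split.path + ("?" + split.query if split.query else "")
    parts = _TOKEN_RE.split(tail)
    lang = None
    for i, part in enumerate(parts):
        for code, markers in LANG_MARKERS.items():
            if part.lower() in markers:
                parts[i] = "*"      # mask the marker
                if lang is None:    # keep the first language seen
                    lang = code
    return split.netloc + "".join(parts), lang


def pair_candidates(urls):
    """Group URLs by masked key and return sorted (en_url, zh_url) pairs."""
    groups = defaultdict(dict)
    for url in urls:
        key, lang = normalize(url)
        if lang is not None:
            groups[key].setdefault(lang, url)
    return sorted(
        (by_lang["en"], by_lang["zh"])
        for by_lang in groups.values()
        if "en" in by_lang and "zh" in by_lang
    )


if __name__ == "__main__":
    urls = [
        "http://example.com/en/news/1.html",
        "http://example.com/cn/news/1.html",
        "http://example.com/en/about.html",
        "http://example.com/contact.html",
    ]
    for en_url, zh_url in pair_candidates(urls):
        print(en_url, "<->", zh_url)
```

Pairs found this way are only candidates: a real pipeline would still need to verify that the two pages are translations of each other before extracting sentence pairs.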
Pages: 759-763
Page count: 5