Research and Implementation on Machine Translation System with Online Corpora Extraction Technology

Cited by: 1
Authors
Lin Chirong [1]
Affiliations
[1] Changsha Aeronaut Vocat & Tech Coll, Changsha 410014, Hunan, Peoples R China
Keywords
corpora; extraction; bilingual parallel; MTS; webpages
DOI
10.1109/ISDEA.2014.172
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Bilingual parallel sentence pairs are an important resource for machine translation. Because the ways of obtaining them are limited, sentence-level parallel corpora are not only small in quantity but also concentrated in specific domains, which makes them difficult to adapt to real application requirements. This paper introduces a Web-based system for automatically acquiring bilingual parallel sentence pairs. The system integrates the advantages of existing systems and improves their key technologies. We propose a URL naming method for the automatic discovery of bilingual websites and improve the technique for extracting bilingual parallel sentence pairs. Experimental results show that the proposed methods greatly improve the recall of candidate bilingual website discovery. For the acquisition of bilingual parallel sentence pairs, the recall is 93% and the accuracy is 96%, which demonstrates the effectiveness of the approach. In addition, this paper studies bilingual parallel sentence pairs within bilingual websites and obtains some preliminary results. Multiple groups of statistical machine translation experiments show that our method can improve the performance of a machine translation system, so that it can play a part in the practical application of online corpora.
Pages: 759-763
Number of pages: 5
Related papers
50 in total
  • [31] AUTOMATED IMPLEMENTATION PROCESS OF MACHINE TRANSLATION SYSTEM FOR RELATED LANGUAGES
    Vicic, Jernej
    Homola, Petr
    Kubon, Vladislav
    COMPUTING AND INFORMATICS, 2016, 35 (02) : 441 - 469
  • [32] Research on system combination of machine translation based on Transformer
    刘文斌
    HE Yanqing
    LAN Tian
    WU Zhenfeng
    High Technology Letters, 2023, 29 (03) : 310 - 317
  • [33] Applications Research of Machine Learning Algorithm in Translation System
    Yang, Lu
    Chen, Da
    Wu, Wenxue
    PROCEEDINGS OF THE 2018 6TH INTERNATIONAL CONFERENCE ON MACHINERY, MATERIALS AND COMPUTING TECHNOLOGY (ICMMCT 2018), 2018, 152 : 73 - 80
  • [34] Research on system combination of machine translation based on Transformer
    Liu W.
    He Y.
    Lan T.
    Wu Z.
    High Technology Letters, 2023, 29 (03) : 310 - 317
  • [35] Research on English machine translation system based on the internet
    Zhang Y.
    Zhang, Yu (zhagyu_123@sina.com), 1600, Springer Science and Business Media, LLC (20): : 1017 - 1022
  • [36] Parallel Corpora Preparation for English-Amharic Machine Translation
    Biadgligne, Yohanens
    Smaili, Kamel
    ADVANCES IN COMPUTATIONAL INTELLIGENCE, IWANN 2021, PT I, 2021, 12861 : 443 - 455
  • [37] Design of removable vending machine and research on the key implementation technology
    Shen, Longzhang
    Qiu, Changjun
    Wu, Xiaoyan
    Han, Changxing
    Hu, Liangbin
    JOURNAL OF ENGINEERING-JOE, 2019, (13): : 402 - 405
  • [38] Web-based parallel corpora for statistical machine translation
    Li, Bo
    Liu, Juan
    Shi, Wenjuan
    ICMLA 2007: SIXTH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS, PROCEEDINGS, 2007, : 444 - 449
  • [39] Learning Curve with Machine Translation Based on Parallel, Bilingual Corpora
    Kowalski, Maciej
    MACHINE INTELLIGENCE AND BIG DATA IN INDUSTRY, 2016, 19 : 11 - 21
  • [40] Mining Parallel Resources for Machine Translation from Comparable Corpora
    Pal, Santanu
    Pakray, Partha
    Gelbukh, Alexander
    van Genabith, Josef
    COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING (CICLING 2015), PT I, 2015, 9041 : 534 - 544