Bilingual chunk alignment in statistical machine translation

Cited by: 0
Authors
Zhou, Y [1]
Zong, CQ [1]
Xu, B [1]
Affiliation
[1] Chinese Acad Sci, Inst Automat, NLPR, Beijing 100864, Peoples R China
Keywords
alignment; chunking; multi-layer filtering; statistical machine translation
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper proposes a new algorithm, Multi-Layer Filtering (MLF), for automatically extracting aligned bilingual chunks from a Chinese-English parallel corpus. Multiple layers are used to extract bilingual chunks according to different features of the chunks in the bilingual corpus, and the aligned chunks correspond one-to-one. Unlike most conventional algorithms, the chunking and alignment algorithm does not rely on tagging, parsing, syntactic analysis, or word segmentation of the Chinese corpus. Preliminary experimental results show that the algorithm performs well in both chunking and alignment. Moreover, the translations generated with this algorithm are much better than those generated by the baseline (word-based statistical machine translation).
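The abstract does not say what the individual layers are, so the following is only an illustrative sketch of the general idea of layered filtering over candidate chunk pairs, not the authors' MLF algorithm. The three layers (a co-occurrence count threshold, a Dice-coefficient threshold, and a greedy one-to-one selection), the function names, the thresholds, and the toy corpus are all assumptions made for this example. The Chinese side is handled as raw characters, in keeping with the claim that no word segmentation is required.

```python
from collections import Counter
from itertools import product


def spans(tokens, max_len):
    """Return every contiguous span of up to max_len tokens as a tuple."""
    return {tuple(tokens[i:i + n])
            for n in range(1, max_len + 1)
            for i in range(len(tokens) - n + 1)}


def extract_chunk_pairs(corpus, max_len=2, min_count=2, min_dice=0.8):
    """Hypothetical three-layer filter over candidate chunk pairs."""
    src_count, tgt_count, pair_count = Counter(), Counter(), Counter()
    for src, tgt in corpus:
        s, t = spans(src, max_len), spans(tgt, max_len)
        src_count.update(s)
        tgt_count.update(t)
        pair_count.update(product(s, t))  # each pair counted once per sentence

    # Layer 1 (assumed): drop pairs that co-occur in too few sentence pairs.
    frequent = {p: c for p, c in pair_count.items() if c >= min_count}

    # Layer 2 (assumed): drop pairs with a weak Dice association score.
    scored = {}
    for (s, t), c in frequent.items():
        dice = 2 * c / (src_count[s] + tgt_count[t])
        if dice >= min_dice:
            scored[(s, t)] = dice

    # Layer 3 (assumed): greedy one-to-one selection, best score first.
    aligned, used_src, used_tgt = [], set(), set()
    for (s, t), _ in sorted(scored.items(), key=lambda kv: (-kv[1], kv[0])):
        if s not in used_src and t not in used_tgt:
            aligned.append((s, t))
            used_src.add(s)
            used_tgt.add(t)
    return aligned


# Toy parallel corpus: Chinese as unsegmented characters, English as words.
corpus = [
    (list("我喜欢茶"), "i like tea".split()),
    (list("我喜欢书"), "i like books".split()),
    (list("他喝茶"), "he drinks tea".split()),
]
pairs = extract_chunk_pairs(corpus)
```

On this toy corpus the pair 茶 / tea passes all three layers, and every source and target chunk appears in at most one pair. With so little data many other candidates tie on score and are resolved arbitrarily by sort order, which is one reason a real system would need richer features per layer.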
Pages: 1401-1406
Page count: 6
Related papers
50 in total
  • [1] Bilingual Segmenter for Statistical Machine Translation
    Huang, Chung-Chi
    Chen, Wei-Teh
    Chang, Jason S.
    [J]. PROCEEDINGS OF THE SECOND INTERNATIONAL SYMPOSIUM ON UNIVERSAL COMMUNICATION, 2008, : 97 - +
  • [2] Bilingual phrases for statistical machine translation
    Garcia-Varea, I.
    Nevado, F.
    Ortiz, D.
    Tomas, J.
    Casacuberta, F.
    [J]. PROCESAMIENTO DEL LENGUAJE NATURAL, 2005, (35): : 93 - 100
  • [3] Construction of Chunk-Aligned Bilingual Lecture Corpus for Simultaneous Machine Translation
    Murata, Masaki
    Ohno, Tomohiro
    Matsubara, Shigeki
    Inagaki, Yasuyoshi
    [J]. LREC 2010 - SEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2010, : 1765 - 1770
  • [4] Bilingual Sense Similarity for Statistical Machine Translation
    Chen, Boxing
    Foster, George
    Kuhn, Roland
    [J]. ACL 2010: 48TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, 2010, : 834 - 843
  • [5] Bilingual knowledge extraction using chunk alignment
    Hwang, Young-Sook
    Paik, Kyonghee
    Sasaki, Yutaka
    [J]. PACLIC 18: Proceedings of the 18th Pacific Asia Conference on Language, Information and Computation, 2004, : 127 - 137
  • [6] Bilingual cluster based models for statistical machine translation
    Yamamoto, Hirofumi
    Sumita, Eiichiro
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2008, E91D (03) : 588 - 597
  • [7] Automatic filtering of bilingual corpora for statistical machine translation
    Khadivi, S
    Ney, H
    [J]. NATURAL LANGUAGE PROCESSING AND INFORMATION SYSTEMS, PROCEEDINGS, 2005, 3513 : 263 - 274
  • [8] Phrase Alignment Confidence for Statistical Machine Translation
    Ananthakrishnan, Sankaranarayanan
    Prasad, Rohit
    Natarajan, Prem
    [J]. 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4, 2010, : 2878 - 2881
  • [9] The alignment template approach to statistical machine translation
    Och, FJ
    Ney, H
    [J]. COMPUTATIONAL LINGUISTICS, 2004, 30 (04) : 417 - 449
  • [10] Chunk-based statistical translation
    Watanabe, T
    Sumita, M
    Okuno, HG
    [J]. 41ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, PROCEEDINGS OF THE CONFERENCE, 2003, : 303 - 310