DUSTer: A method for unraveling cross-language divergences for statistical word-level alignment

Cited by: 0
Authors
Dorr, BJ [1]
Pearl, L [1]
Hwa, R [1]
Habash, N [1]
Affiliation
[1] Univ Maryland, Inst Adv Comp Studies, College Pk, MD 20740 USA
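Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract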
The frequent occurrence of divergences (structural differences between languages) presents a great challenge for statistical word-level alignment. In this paper, we introduce DUSTer, a method for systematically identifying common divergence types and transforming an English sentence structure to bear a closer resemblance to that of another language. Our ultimate goal is to enable more accurate alignment and projection of dependency trees in another language without requiring any training on dependency-tree data in that language. We present an empirical analysis comparing the complexities of performing word-level alignments with and without divergence handling. Our results suggest that our approach facilitates word-level alignment, particularly for sentence pairs containing divergences.
Pages: 31-43
Page count: 13
Related papers
50 in total
  • [1] Cross-language message- and word-level transfer effects in bilingual text processing
    Deanna C. Friesen
    Debra Jared
    [J]. Memory & Cognition, 2007, 35 : 1542 - 1556
  • [2] Cross-language message- and word-level transfer effects in bilingual text processing
    Friesen, Deanna C.
    Jared, Debra
    [J]. MEMORY & COGNITION, 2007, 35 (07) : 1542 - 1556
  • [3] Incorporating Linguistic Information to Statistical Word-Level Alignment
    Cendejas, Eduardo
    Barcelo, Grettel
    Gelbukh, Alexander
    Sidorov, Grigori
    [J]. PROGRESS IN PATTERN RECOGNITION, IMAGE ANALYSIS, COMPUTER VISION, AND APPLICATIONS, PROCEEDINGS, 2009, 5856 : 387 - 394
  • [4] A Supervised Word Alignment Method based on Cross-Language Span Prediction using Multilingual BERT
    Nagata, Masaaki
    Chousa, Katsuki
    Nishino, Masaaki
    [J]. PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 555 - 565
  • [5] One language or two? Navigating cross-language conflict in statistical word segmentation
    Antovich, Dylan M.
    Graf Estes, Katharine
    [J]. DEVELOPMENTAL SCIENCE, 2020, 23 (06)
  • [6] Hybrid Algorithm for Word-Level Alignment of Parallel Texts
    Cendejas, Eduardo
    Barcelo, Grettel
    Gelbukh, Alexander
    Sidorov, Grigori
    [J]. NATURAL LANGUAGE PROCESSING AND INFORMATION SYSTEMS, 2010, 5723 : 293 - 294
  • [7] Enhancing the Bilingual Concordancer TransSearch with Word-Level Alignment
    Bourdaillet, Julien
    Huet, Stephane
    Gotti, Fabrizio
    Lapalme, Guy
    Langlais, Philippe
    [J]. ADVANCES IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2009, 5549 : 27 - +
  • [8] Unsupervised Deep Cross-Language Entity Alignment
    Jiang, Chuanyu
    Qian, Yiming
    Chen, Lijun
    Gu, Yang
    Xie, Xia
    [J]. MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES: RESEARCH TRACK, ECML PKDD 2023, PT IV, 2023, 14172 : 3 - 19
  • [9] Perception of Word-level Prominence in Free Word Order Language Discourse
    Luchkina, Tatiana
    Cole, Jennifer S.
    [J]. LANGUAGE AND SPEECH, 2021, 64 (02) : 381 - 412
  • [10] Creating word-level language models for handwriting recognition
    Pitrelli, JF
    Roy, A
    [J]. SIXTH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION, PROCEEDINGS, 2001, : 721 - 725