A Discriminative Neural Model for Cross-Lingual Word Alignment

Authors
Stengel-Eskin, Elias [1]
Su, Tzu-Ray [1]
Post, Matt [1]
Van Durme, Benjamin [1]
Affiliations
[1] Johns Hopkins University, Baltimore, MD 21218, USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We introduce a novel discriminative word alignment model, which we integrate into a Transformer-based machine translation model. In experiments based on a small number of labeled examples (~1.7K-5K sentences), we evaluate its performance intrinsically on both English-Chinese and English-Arabic alignment, where we achieve major improvements over unsupervised baselines (11-27 F1). We evaluate the model extrinsically on data projection for Chinese NER, showing that our alignments lead to higher performance when used to project NER tags from English to Chinese. Finally, we perform an ablation analysis and an annotation experiment that jointly support the utility and feasibility of future manual alignment elicitation.
Pages: 910-920
Page count: 11