Multilingual Training of Crosslingual Word Embeddings

Cited by: 0
Authors
Duong, Long [1]
Kanayama, Hiroshi [2]
Ma, Tengfei [3]
Bird, Steven [1, 4]
Cohn, Trevor [1]
Affiliations
[1] Univ Melbourne, Dept Comp & Informat Syst, Melbourne, Vic, Australia
[2] IBM Res Tokyo, Tokyo, Japan
[3] IBM TJ Watson Res Ctr, Ossining, NY USA
[4] Univ Calif Berkeley, Int Comp Sci Inst, Berkeley, CA USA
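Funding
US National Science Foundation
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract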
Crosslingual word embeddings represent lexical items from different languages using the same vector space, enabling crosslingual transfer. Most prior work constructs embeddings for a pair of languages, with English on one side. We investigate methods for building high-quality crosslingual word embeddings for many languages in a unified vector space. In this way, we can exploit and combine information from many languages. We report competitive performance on bilingual lexicon induction, monolingual similarity, and crosslingual document classification tasks.
Pages: 894-904
Number of pages: 11
Related papers
50 in total
  • [1] Unsupervised Multilingual Word Embeddings
    Chen, Xilun
    Cardie, Claire
    2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), 2018, : 261 - 270
  • [2] Morphological Segmentation to Improve Crosslingual Word Embeddings for Low Resource Languages
    Chimalamarri, Santwana
    Sitaram, Dinkar
    Jain, Ashritha
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2020, 19 (05)
  • [3] Learning Unsupervised Multilingual Word Embeddings with Incremental Multilingual Hubs
    Heyman, Geert
    Verreet, Bregt
    Vulic, Ivan
    Moens, Marie-Francine
    2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, 2019, : 1890 - 1902
  • [4] MULTILINGUAL AND CROSSLINGUAL SPEECH RECOGNITION USING PHONOLOGICAL-VECTOR BASED PHONE EMBEDDINGS
    Zhu, Chengrui
    An, Keyu
    Zheng, Huahuan
    Ou, Zhijian
    2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2021, : 1034 - 1041
  • [5] GlobalTrait: Personality Alignment of Multilingual Word Embeddings
    Bin Siddique, Farhad
    Bertero, Dario
    Fung, Pascale
    THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 7015 - 7022
  • [6] Crosslingual interrogation of multilingual catalogs
    Fluhr, C
    Schmit, D
    Andrieux, C
    Ortet, P
    Bisson, F
    Combet, V
    RESEARCH AND ADVANCED TECHNOLOGY FOR DIGITAL LIBRARIES, PROCEEDINGS, 1999, 1696 : 294 - 310
  • [7] NORMA: Neighborhood Sensitive Maps for Multilingual Word Embeddings
    Nakashole, Ndapa
    2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), 2018, : 512 - 522
  • [8] Multilingual Financial Word Embeddings for Arabic, English and French
    Zmandar, Nadhem
    El-Haj, Mahmoud
    Rayson, Paul
    2021 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2021, : 4584 - 4589
  • [9] Multilingual Jointly Trained Acoustic and Written Word Embeddings
    Hu, Yushi
    Settle, Shane
    Livescu, Karen
    INTERSPEECH 2020, 2020, : 1052 - 1056
  • [10] Joint learning of frequency and word embeddings for multilingual readability assessment
    Le, Dieu-Thu
    Nguyen, Cam-Tu
    Wang, Xiaoliang
    NATURAL LANGUAGE PROCESSING TECHNIQUES FOR EDUCATIONAL APPLICATIONS, 2018, : 103 - 107