GENERATING SEMANTIC SIMILARITY ATLAS FOR NATURAL LANGUAGES

Cited by: 0
Authors
Senel, Lutfi Kerem [1, 2, 3]
Utlu, Ihsan [1, 2]
Yucesoy, Veysel [1]
Koc, Aykut [1]
Cukur, Tolga [2, 3, 4]
Affiliations
[1] ASELSAN Res Ctr, Ankara, Turkey
[2] Bilkent Univ, Dept Elect & Elect Engn, Ankara, Turkey
[3] Bilkent Univ, UMRAM, Sabuncu Brain Res Ctr, Ankara, Turkey
[4] Bilkent Univ, Neurosci Program, Ankara, Turkey
Keywords
cross-lingual semantic similarity; natural language processing; semantic similarity; word embedding; computational linguistics
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Cross-lingual studies are attracting growing interest in natural language processing (NLP) research, and several studies have shown that, when transferring knowledge, similar languages are more advantageous to work with than fundamentally different ones. Researchers from different domains have proposed different similarity measures for languages. However, a similarity measure that focuses on the semantic structures of languages can be useful for selecting pairs or groups of languages to work with, especially for tasks that require semantic knowledge, such as sentiment analysis or word sense disambiguation. For this purpose, we leverage a recently proposed word-embedding-based method to generate a language similarity atlas for 76 languages from around the world. This atlas can help researchers select similar language pairs or groups in cross-lingual applications. Our findings suggest that semantic similarity between two languages is strongly correlated with the geographic proximity of the countries in which they are used.
Pages: 795-799
Page count: 5
Related papers
50 in total
  • [1] Semantic similarity of short texts in languages with a deficient natural language processing support
    Furlan, Bojan
    Batanovic, Vuk
    Nikolic, Bosko
    DECISION SUPPORT SYSTEMS, 2013, 55 (03) : 710 - 719
  • [2] A UNIFIED AND SYNCHRONOUS GENERATING SYSTEM FOR MULTIPLE NATURAL LANGUAGES BASED ON CFG AND SEMANTIC LANGUAGE
    Li, Li
    Liu, Honglai
    Gao, Qingshi
    Wang, Peifeng
    INTERNATIONAL JOURNAL OF WAVELETS MULTIRESOLUTION AND INFORMATION PROCESSING, 2012, 10 (04)
  • [3] Wordform Similarity Increases With Semantic Similarity: An Analysis of 100 Languages
    Dautriche, Isabelle
    Mahowald, Kyle
    Gibson, Edward
    Piantadosi, Steven T.
    COGNITIVE SCIENCE, 2017, 41 (08) : 2149 - 2169
  • [4] Word Semantic Similarity for Morphologically Rich Languages
    Zervanou, Kalliopi
    Iosif, Elias
    Potamianos, Alexandros
    LREC 2014 - NINTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2014 : 1642 - 1648
  • [5] Similarity for Natural Semantic Networks
    Torres, Francisco
    Garza, Sara E.
    SIMILARITY SEARCH AND APPLICATIONS, 2014, 8821 : 195 - 200
  • [6] A Free Energy Foundation of Semantic Similarity in Automata and Languages
    Cui, Cewei
    Dang, Zhe
    SIMILARITY SEARCH AND APPLICATIONS, SISAP 2016, 2016, 9939 : 34 - 47
  • [7] The abessive in the Permian languages: similarity and difference in semantic structure
    Nekrasova, G. A.
    VESTNIK UGROVEDENIYA-BULLETIN OF UGRIC STUDIES, 2022, 12 (02) : 264 - 271
  • [8] Generating Fluent Adversarial Examples for Natural Languages
    Zhang, Huangzhao
    Zhou, Hao
    Miao, Ning
    Li, Lei
    57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019 : 5564 - 5569
  • [9] Rank distance with applications in similarity of natural languages
    Dinu, LP
    FUNDAMENTA INFORMATICAE, 2005, 64 (1-4) : 135 - 149
  • [10] Framework for Measuring the Similarity of Visual and Semantic Structures in Sign Languages
    Silva de Lima, Matheus
    Sato, Ryota
    K. Shimomoto, Erica
    Alves Beleza, Suzana Rita
    Kato, Nobuko
    Fukui, Kazuhiro
    Communications in Computer and Information Science, 2024, 2143 CCIS : 93 - 107