Using k-Way Co-Occurrences for Learning Word Embeddings

Cited by: 0
Authors
Bollegala, Danushka [1]
Yoshida, Yuichi [2]
Kawarabayashi, Ken-ichi [2, 3]
Affiliations
[1] Univ Liverpool, Liverpool L69 3BX, Merseyside, England
[2] Natl Inst Informat, Chiyoda Ku, 2-1-2 Hitotsubashi, Tokyo 1018430, Japan
[3] Japan Sci & Technol Agcy, ERATO, Kawarabayashi Large Graph Project, Kawaguchi, Saitama, Japan
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Co-occurrences between two words provide useful insights into the semantics of those words. Consequently, much prior work on word embedding learning has used co-occurrences between two words as the training signal for learning word embeddings. However, in natural language texts it is common for multiple words to be related and to co-occur in the same context. We extend the notion of co-occurrences to cover k-way (k ≥ 2) co-occurrences among a set of k words. Specifically, we prove a theoretical relationship between the joint probability of k (k ≥ 2) words and the sum of ℓ2 norms of their embeddings. Next, we propose a learning objective, motivated by our theoretical result, that utilises k-way co-occurrences for learning word embeddings. Our experimental results show that the derived theoretical relationship does indeed hold empirically, and that, despite data sparsity, for some smaller values of k (k ≤ 5), k-way embeddings perform comparably to or better than 2-way embeddings in a range of tasks.
Pages: 5037-5044
Number of pages: 8
Related papers
50 in total
  • [41] Learning Word Embeddings Using Spatial Information
    Joko, Hideaki
    Oka, Ryunosuke
    Uchide, Hayato
    Itsui, Hiroyasu
    Otsuka, Takahiro
    2019 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS (SMC), 2019, : 2959 - 2964
  • [42] Learning Turkish Hypernymy Using Word Embeddings
    Yildirim, Savas
    Yildiz, Tugba
    INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE SYSTEMS, 2018, 11 (01) : 371 - 383
  • [43] Online multiclass learning with k-way limited feedback and an application to utterance classification
    Alshawi, H
    MACHINE LEARNING, 2005, 60 (1-3) : 97 - 115
  • [44] A CORPUS BASED TECHNIQUE FOR REPAIRING ILL-FORMED SENTENCES WITH WORD ORDER ERRORS USING CO-OCCURRENCES OF N-GRAMS
    Athanaselis, Theologos
    Mamouras, Konstantinos
    Bakamidis, Stelios
    Dologlou, Ioannis
    INTERNATIONAL JOURNAL ON ARTIFICIAL INTELLIGENCE TOOLS, 2011, 20 (03) : 401 - 424
  • [45] Developmental changes in the word co-occurrences of Spanish-English bilingual children with and without developmental language disorder
    Shivabasappa, Prarthana
    Pena, Elizabeth D.
    Bedore, Lisa M.
    INTERNATIONAL JOURNAL OF SPEECH-LANGUAGE PATHOLOGY, 2024,
  • [46] Online Multiclass Learning with k-Way Limited Feedback and an Application to Utterance Classification
    Hiyan Alshawi
    Machine Learning, 2005, 60 : 97 - 115
  • [47] The Effect of Unobserved Word-Context Co-occurrences on a Vector-Mixture Approach for Compositional Distributional Semantics
    Bakarov, Amir
    PROCEEDINGS OF THE THIRD INTERNATIONAL CONFERENCE COMPUTATIONAL LINGUISTICS IN BULGARIA (CLIB '18), 2018, : 153 - 161
  • [48] JIGSAW PUZZLE SOLVING USING LOCAL FEATURE CO-OCCURRENCES IN DEEP NEURAL NETWORKS
    Paumard, Marie-Morgane
    Picard, David
    Tabia, Hedi
    2018 25TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2018, : 1018 - 1022
  • [49] Using toponym co-occurrences to measure relationships between places: review, application and evaluation
    Meijers, Evert
    Peris, Antoine
    INTERNATIONAL JOURNAL OF URBAN SCIENCES, 2019, 23 (02) : 246 - 268
  • [50] Learning Bilingual Word Embeddings Using Lexical Definitions
    Shi, Weijia
    Chen, Muhao
    Tian, Yingtao
    Chang, Kai-Wei
    4TH WORKSHOP ON REPRESENTATION LEARNING FOR NLP (REPL4NLP-2019), 2019, : 142 - 147