On Approximately Searching for Similar Word Embeddings

Authors
Sugawara, Kohei [1]
Kobayashi, Hayato [1]
Iwasaki, Masajiro [1]
Affiliation
[1] Yahoo Japan Corp, Chiyoda Ku, 1-3 Kioicho, Tokyo 1028282, Japan
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
We discuss approximate similarity search for word embeddings, an operation that approximately finds the embeddings closest to a given vector. We compared several metric-based search algorithms that use hash-, tree-, and graph-based indexing from different aspects. Our experimental results showed that graph-based indexing exhibits robust performance, and they additionally provided useful information, e.g., that vector normalization achieves an efficient search with cosine similarity.
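The remark about vector normalization rests on a standard identity: for unit-length vectors a and b, ||a - b||^2 = 2 - 2 cos(a, b), so ranking by ascending Euclidean distance gives the same order as ranking by descending cosine similarity. The sketch below checks this with an exact brute-force scan in plain Python. It is only an illustration under stated assumptions: the random data, the dimension, and all function names are hypothetical, and it does not reproduce the hash-, tree-, or graph-based indexes compared in the paper.

```python
import math
import random


def normalize(v):
    """Scale v to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def squared_euclidean(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def top_k_by_cosine(query, vectors, k):
    """Exact top-k indices by descending cosine similarity (raw vectors)."""
    order = sorted(range(len(vectors)), key=lambda i: -cosine(query, vectors[i]))
    return order[:k]


def top_k_by_euclidean(query, unit_vectors, k):
    """Exact top-k indices by ascending distance (pre-normalized vectors)."""
    q = normalize(query)
    order = sorted(
        range(len(unit_vectors)),
        key=lambda i: squared_euclidean(q, unit_vectors[i]),
    )
    return order[:k]


# Hypothetical data: 200 random 16-dimensional "embeddings" and one query.
rng = random.Random(0)
embeddings = [[rng.gauss(0, 1) for _ in range(16)] for _ in range(200)]
unit_embeddings = [normalize(v) for v in embeddings]  # normalize once, up front
query = [rng.gauss(0, 1) for _ in range(16)]

by_cosine = top_k_by_cosine(query, embeddings, 5)
by_distance = top_k_by_euclidean(query, unit_embeddings, 5)
print(by_cosine == by_distance)
```

Because the two rankings coincide, an index built for Euclidean distance can serve cosine-similarity queries once the stored vectors and the query are normalized, without computing norms at search time.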
Pages: 2265-2275
Number of pages: 11