Performance Improvement of Semantic Search Using Sentence Embeddings by Dimensionality Reduction

Authors
Tsumuraya, Kenshin [1]
Uehara, Minoru [1]
Adachi, Yoshihiro [2]
Affiliations
[1] Toyo Univ, Grad Sch Informat Sci & Arts, Kawagoe, Saitama, Japan
[2] Toyo Univ, RIIT, Kawagoe, Saitama, Japan
DOI
10.1007/978-3-031-57870-0_11
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Semantic search, which retrieves sentences whose meaning is highly similar to that of the query, allows a user to find the desired sentences even when they cannot think of appropriate keywords for a lexical search. Moreover, the search function can appropriately handle synonyms and spelling variations. We previously reported a semantic search method for Japanese sentences using sentence embeddings that appropriately processed queries in which sentences were combined using the logical operators AND, OR, and NOT. Reducing the dimensionality of sentence embeddings is expected to make semantic search more robust to noise in the embeddings, resulting in improved search accuracy and faster semantic search computation. In this study, we experimentally verified the improvement in semantic search performance obtained by reducing the dimensionality of sentence embeddings generated by Japanese SimCSE. We also evaluated the runtimes for generating sentence embeddings and for reducing their dimensionality with principal component analysis (PCA).
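The pipeline outlined in the abstract (embed sentences, reduce dimensionality with PCA, then rank by similarity) might be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the random vectors stand in for Japanese SimCSE embeddings, and the 768-dimensional input size, the 64-dimensional target, cosine similarity as the ranking score, and the helper names `fit_pca`, `project`, and `cosine_top_k` are all assumptions introduced here.

```python
import numpy as np


def fit_pca(X, k):
    """Fit PCA on the rows of X; return the mean and the top-k principal axes."""
    mean = X.mean(axis=0)
    # SVD of the centered data: rows of Vt are principal axes,
    # ordered by decreasing explained variance.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]


def project(X, mean, components):
    """Project the rows of X onto the fitted principal axes."""
    return (X - mean) @ components.T


def cosine_top_k(query, corpus, top_k=3):
    """Return indices and cosine scores of the top_k corpus rows for one query."""
    q = query / np.linalg.norm(query)
    c = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    scores = c @ q
    order = np.argsort(-scores)[:top_k]
    return order, scores[order]


# Placeholder data (assumption): random vectors instead of real sentence embeddings.
rng = np.random.default_rng(0)
corpus = rng.normal(size=(200, 768))
# Hypothetical query: a slightly noisy copy of corpus sentence 17.
query = corpus[17] + 0.1 * rng.normal(size=768)

# Fit PCA on the corpus once, then apply the same projection to the query.
mean, components = fit_pca(corpus, k=64)
corpus_low = project(corpus, mean, components)
query_low = project(query[None, :], mean, components)[0]

idx, scores = cosine_top_k(query_low, corpus_low)
print(idx, scores)
```

The projection is fitted on the corpus embeddings only and reused for every query, so the per-query cost is one matrix-vector product plus a similarity computation in the reduced space.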
Pages: 123-132
Page count: 10
Related papers
  • [1] Performance Improvement of Semantic Search Using Sentence Embeddings by Dimensionality Reduction
    Tsumuraya, Kenshin
    Uehara, Minoru
    Adachi, Yoshihiro
    Lecture Notes on Data Engineering and Communications Technologies, 2024, 201 : 123 - 132