Performance Improvement of Semantic Search Using Sentence Embeddings by Dimensionality Reduction

Authors
Tsumuraya, Kenshin [1]
Uehara, Minoru [1]
Adachi, Yoshihiro [2]
Affiliations
[1] Toyo Univ, Grad Sch Informat Sci & Arts, Kawagoe, Saitama, Japan
[2] Toyo Univ, RIIT, Kawagoe, Saitama, Japan
DOI
10.1007/978-3-031-57870-0_11
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Semantic search retrieves sentences whose meaning is highly similar to that of a query, so a user can find the desired sentences even when they cannot think of appropriate keywords for a lexical search. It also handles synonyms and spelling variations appropriately. We previously reported a semantic search method for Japanese sentences based on sentence embeddings that correctly processes queries in which sentences are combined with the logical operators AND, OR, and NOT. Reducing the dimensionality of sentence embeddings is expected to make semantic search more robust to noise in the embeddings, which should improve search accuracy and speed up the search computation. In this study, we experimentally verified the improvement in semantic search performance obtained by reducing the dimensionality of sentence embeddings generated by Japanese SimCSE. We also evaluated the runtimes for generating sentence embeddings and for reducing their dimensionality with PCA.
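The general idea in the abstract (reduce embedding dimensionality with PCA, then rank sentences by similarity to the query) can be illustrated with a minimal sketch. This is not the authors' implementation: the vectors below are synthetic stand-ins for Japanese SimCSE embeddings, the sizes (768 input dimensions, 64 reduced dimensions, 200 sentences) are arbitrary assumptions, cosine similarity is assumed as the ranking measure, and the helper names `fit_pca`, `reduce_dim`, and `cosine_rank` are hypothetical. The logical-operator handling mentioned in the abstract is not covered.

```python
import numpy as np


def fit_pca(embeddings, n_components):
    """Fit PCA on row-vector embeddings; return (mean, components)."""
    mean = embeddings.mean(axis=0)
    centered = embeddings - mean
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]


def reduce_dim(embeddings, mean, components):
    """Project embeddings onto the retained principal directions."""
    return (embeddings - mean) @ components.T


def cosine_rank(query, corpus):
    """Return corpus indices sorted by decreasing cosine similarity."""
    q = query / np.linalg.norm(query)
    c = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    scores = c @ q
    return np.argsort(-scores), scores


# Synthetic stand-ins for sentence embeddings (assumption: 768 dimensions).
rng = np.random.default_rng(0)
corpus = rng.normal(size=(200, 768))
# A query that is a slightly noisy copy of sentence 17.
query = corpus[17] + 0.05 * rng.normal(size=768)

# Fit PCA on the corpus, then apply the same projection to the query.
mean, components = fit_pca(corpus, n_components=64)
corpus_low = reduce_dim(corpus, mean, components)
query_low = reduce_dim(query[None, :], mean, components)[0]

order, scores = cosine_rank(query_low, corpus_low)
print("best match:", int(order[0]))
```

The projection is fitted once on the corpus and reused for every query, so the per-query cost is one matrix-vector product plus a similarity computation over shorter vectors.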
Pages: 123-132
Number of pages: 10