Query expansion based on clustering and personalized information retrieval

Cited by: 7
Authors
Khalifi, Hamid [1]
Cherif, Walid [2]
El Qadi, Abderrahim [3]
Ghanou, Youssef [1]
Affiliations
[1] Moulay Ismail Univ, High Sch Technol, TIM Team, Meknes, Morocco
[2] Natl Inst Stat & Appl Econ, Lab SI2M, Rabat, Morocco
[3] Mohammed V Univ, High Sch Technol, Rabat, Morocco
Keywords
Information retrieval; Personalized information retrieval; Automatic query completion; Clustering; Performance evaluation; Support vector machines; MODELS;
DOI
10.1007/s13748-019-00178-y
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The term "information retrieval system" covers a variety of processes involved in delivering information to the people who need it. Several mathematical approaches have been studied to formalize the main components of such a system: query representation, information item representation, and the retrieval process. Even so, these systems still struggle to extract relevant information for users, especially when the processed data are text, because of the complex nature of text databases. Generally, an information retrieval system reformulates queries according to associations among information items before matching them to dataset items. Semantic relationships or machine learning techniques can then be applied to refine the returned results. This paper presents a formal model to organize data and a new search algorithm to browse it. The approach incorporates a natural language preprocessing stage, a statistical representation of short documents and queries, and a machine learning model to select relevant results. We then propose two further optimizations that returned satisfactory results on two datasets in reasonable computation time: the first concerns query expansion, and the second concerns dataset restructuring. We formally evaluate the impact of each optimization by computing the performance of the information retrieval system with and without it; the highest recall and precision reached were 96.2% and 99.2%, respectively.
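The "with and without" evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' model: the term-overlap matcher, the hand-written term clusters, the toy documents, and all function names below are assumptions made for illustration only. It shows how precision and recall might be compared for a query before and after cluster-based expansion.

```python
from typing import Dict, List, Set, Tuple


def tokenize(text: str) -> Set[str]:
    """Lowercase and split on whitespace (a stand-in for real preprocessing)."""
    return set(text.lower().split())


def expand_query(query: str, clusters: List[Set[str]]) -> Set[str]:
    """Add every term that shares a cluster with at least one query term."""
    terms = tokenize(query)
    expanded = set(terms)
    for cluster in clusters:
        if terms & cluster:
            expanded |= cluster
    return expanded


def retrieve(terms: Set[str], docs: Dict[str, str]) -> Set[str]:
    """Return ids of documents sharing at least one term with the query."""
    return {doc_id for doc_id, text in docs.items() if terms & tokenize(text)}


def precision_recall(retrieved: Set[str], relevant: Set[str]) -> Tuple[float, float]:
    """Precision = hits / retrieved, recall = hits / relevant."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical toy data, not taken from the paper's datasets.
DOCS = {
    "d1": "cheap car rental",
    "d2": "automobile hire deals",
    "d3": "vehicle insurance quotes",
    "d4": "cooking pasta recipes",
}
RELEVANT = {"d1", "d2"}
# Hypothetical term clusters; in practice these would come from a clustering step.
CLUSTERS = [{"car", "automobile"}, {"pasta", "noodles"}]

baseline = retrieve(tokenize("car"), DOCS)
expanded = retrieve(expand_query("car", CLUSTERS), DOCS)

print("without expansion:", precision_recall(baseline, RELEVANT))
print("with expansion:   ", precision_recall(expanded, RELEVANT))
```

On this toy data the unexpanded query misses the document that uses "automobile", so expansion raises recall without lowering precision. Real collections rarely behave this cleanly, which is why measuring both settings is useful.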
Pages: 241-251
Page count: 11
Related papers
50 in total
  • [1] Query expansion based on clustering and personalized information retrieval
    Hamid Khalifi
    Walid Cherif
    Abderrahim El Qadi
    Youssef Ghanou
    [J]. Progress in Artificial Intelligence, 2019, 8 : 241 - 251
  • [2] Clustering Algorithms for Query Expansion Based Information Retrieval
    Khennak, Ilyes
    Drias, Habiba
    Kechid, Amine
    Moulai, Hadjer
    [J]. COMPUTATIONAL COLLECTIVE INTELLIGENCE, PT II, 2019, 11684 : 261 - 272
  • [3] Query Expansion for Personalized Cross-Language Information Retrieval
    Zhou, Dong
    Lawless, Seamus
    Liu, Jianxun
    Zhang, Sanrong
    Xu, Yu
    [J]. 10TH INTERNATIONAL WORKSHOP ON SEMANTIC AND SOCIAL MEDIA ADAPTATION AND PERSONALIZATION SMAP 2015, 2015, : 18 - 22
  • [4] An information retrieval model based on query expansion
    Huang, Mingxuan
    Zhang, Shichao
    Yan, Xiaowei
    Huang, Faliang
    [J]. RECENT ADVANCE OF CHINESE COMPUTING TECHNOLOGIES, 2007, : 217 - 221
  • [5] Integrating query expansion and conceptual relevance feedback for personalized Web information retrieval
    Chang, CH
    Hsu, CC
    [J]. COMPUTER NETWORKS AND ISDN SYSTEMS, 1998, 30 (1-7): : 621 - 623
  • [6] Parallel information retrieval with query expansion
    Chung, YJ
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2004, E87D (06) : 1593 - 1595
  • [7] Parallel information retrieval with query expansion
    Chung, Y
    [J]. APPLIED PARALLEL COMPUTING: ADVANCED SCIENTIFIC COMPUTING, 2002, 2367 : 195 - 202
  • [9] Personalized query suggestion diversification in information retrieval
    Wanyu Chen
    Fei Cai
    Honghui Chen
    Maarten De Rijke
    [J]. Frontiers of Computer Science, 2020, 14
  • [10] Personalized query suggestion diversification in information retrieval
    Chen, Wanyu
    Cai, Fei
    Chen, Honghui
    De Rijke, Maarten
    [J]. FRONTIERS OF COMPUTER SCIENCE, 2020, 14 (03)