Finding Relevant Documents in a Search Engine Using N-Grams Model and Reinforcement Learning

Cited by: 0
Authors
El Hadi, Amine [1]
Madani, Youness [1]
El Ayachi, Rachid [2]
Erritali, Mohamed [2]
Affiliations
[1] Sultan Moulay Slimane Univ, Fac Sci & Tech, Beni Mellal, Morocco
[2] Sultan Moulay Slimane Univ, Fac Sci & Tech, Lab TIAD, Beni Mellal, Morocco
Keywords
N-Grams Model; Query Reformulation; Reinforcement Learning; Search Engine; Semantic Similarity;
DOI
10.4018/JITR.299930
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203; 0835;
Abstract
Information retrieval (IR) is an important area of computer science that helps users find the information they are interested in within a large volume of data. A search engine is a prime example of applying information retrieval to return the most relevant results. In this paper, the authors propose a new approach for recommending relevant documents to a search engine's users, based on calculating the similarity between a user query and a list of documents. The proposed method uses a new reinforcement learning algorithm based on the n-grams model (an n-gram being a sub-sequence of n elements taken from a given sequence) and a similarity measure. Results show that the method outperforms some methods from the literature, achieving high accuracy.
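
To make the n-gram idea in the abstract concrete, the following is a minimal sketch, not the authors' method. The abstract does not name the similarity measure or describe the reinforcement learning algorithm, so this example assumes word-level n-grams and Jaccard similarity, and leaves the reinforcement learning step out entirely. The function names `word_ngrams`, `jaccard`, and `rank_documents` are hypothetical.

```python
from typing import List, Set, Tuple


def word_ngrams(text: str, n: int) -> Set[Tuple[str, ...]]:
    """Return the set of contiguous word n-grams in `text` (lowercased)."""
    tokens = text.lower().split()
    # If the text has fewer than n tokens, the range is empty and so is the set.
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def jaccard(a: Set, b: Set) -> float:
    """Jaccard similarity |a & b| / |a | b|; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_documents(query: str, docs: List[str], n: int = 2) -> List[Tuple[float, str]]:
    """Score each document against the query and sort by descending score.

    Assumption: Jaccard similarity over word n-grams stands in for the
    paper's similarity measure, which the abstract does not specify.
    """
    query_grams = word_ngrams(query, n)
    scored = [(jaccard(query_grams, word_ngrams(doc, n)), doc) for doc in docs]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    documents = [
        "information retrieval system design",
        "cooking recipes for dinner",
    ]
    for score, doc in rank_documents("information retrieval system", documents):
        print(f"{score:.3f}  {doc}")
```

With bigrams (`n=2`), documents that share more consecutive word pairs with the query rank higher; a different `n` or a different set-overlap measure could be swapped in without changing the structure.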
Pages: 17