Finding Relevant Documents in a Search Engine Using N-Grams Model and Reinforcement Learning

Times cited: 0

Authors:

El Hadi, Amine [1]

Madani, Youness [1]

El Ayachi, Rachid [2]

Erritali, Mohamed [2]

Affiliations:

[1] Sultan Moulay Slimane Univ, Fac Sci & Tech, Beni Mellal, Morocco

[2] Sultan Moulay Slimane Univ, Fac Sci & Tech, Lab TIAD, Beni Mellal, Morocco

Source:

JOURNAL OF INFORMATION TECHNOLOGY RESEARCH | 2022 / Vol. 15 / No. 1

Keywords:

N-Grams Model; Query Reformulation; Reinforcement Learning; Search Engine; Semantic Similarity

DOI:

10.4018/JITR.299930

Chinese Library Classification (CLC) number:

TP39 [Computer applications]

Discipline classification codes:

081203; 0835

Abstract:

Information retrieval (IR) is an important area of computer science. It helps users find the information they are interested in within a large volume of data. A search engine is the most familiar application of information retrieval, aiming to return the most relevant results. In this paper, the authors propose a new approach for recommending relevant documents to a search engine's users, based on calculating the similarity between a user query and a list of documents. The proposed method uses a new reinforcement learning algorithm based on the n-grams model (an n-gram being a contiguous subsequence of n elements taken from a given sequence) and a similarity measure. Results show that the method outperforms some methods from the literature, achieving high accuracy.
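
To make the n-gram idea concrete, the following is a minimal Python sketch of scoring documents against a query by n-gram overlap. It is an illustration only, not the authors' algorithm: the abstract does not specify the similarity measure or the reinforcement learning component, so Jaccard similarity over word n-grams is an assumption here, and the names `word_ngrams`, `jaccard`, and `rank_documents` are hypothetical.

```python
from typing import List, Set, Tuple

NGram = Tuple[str, ...]


def word_ngrams(text: str, n: int) -> Set[NGram]:
    """Return the set of contiguous word n-grams in `text` (lowercased)."""
    tokens = text.lower().split()
    # range() is empty when the text has fewer than n tokens.
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def jaccard(a: Set[NGram], b: Set[NGram]) -> float:
    """Jaccard similarity |a & b| / |a | b| (an assumed measure, see lead-in)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_documents(query: str, documents: List[str], n: int = 2) -> List[Tuple[float, str]]:
    """Score each document against the query and sort by descending score."""
    query_grams = word_ngrams(query, n)
    scored = [(jaccard(query_grams, word_ngrams(doc, n)), doc) for doc in documents]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    docs = [
        "cooking recipes",
        "an information retrieval system for text",
    ]
    for score, doc in rank_documents("information retrieval system", docs):
        print(f"{score:.2f}  {doc}")
```

A learning-based method such as the one the abstract describes would presumably adjust the ranking from feedback rather than use a fixed score; that part is not sketched here because the abstract gives no details.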


Pages: 17
