Method of Lexical Enrichment in Information Retrieval System in Arabic

Cited by: 9
Authors
Mallat, Souheyl [1 ]
Zouaghi, Anis [2 ]
Hkiri, Emna [1 ]
Zrigui, Mounir [1 ]
Affiliations
[1] Univ Monastir, Dept Comp Sci, Monastir, Tunisia
[2] Sousse Univ, Dept Comp Sci, Higher Inst Appl Sci & Technol Sousse, Sousse, Tunisia
Keywords
Arabic NL; Information Retrieval; Lexical Enrichment; Query Enrichment; Weighting;
DOI
10.4018/ijirr.2013100103
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812 ;
Abstract
In this paper, the authors propose a method for the lexical enrichment of Arabic queries in order to improve the performance of information retrieval systems (SRI). The method combines two types of enrichment: linguistic and contextual. The first is based on linguistic analysis (lemmatization, morphological, syntactic and semantic analysis), whose goal is to generate a descriptive list (list-desc). This list contains the set of linguistic lexical entries associated with each significant term in the query. The second enrichment integrates contextual information derived from the corpus documents. It is based on statistical analysis using Salton's weighting functions, TF-IDF and TF-IEF. The TF-IDF function is applied to the list-desc and the documents in the corpus in order to identify relevant documents. The TF-IEF function is then applied to the list-desc and the sentences of those relevant documents in order to identify relevant sentences. Finally, the terms in these sentences are weighted, and those with the highest weights, which are considered the richest in informative and contextual importance, are added to the original query. The authors' lexical enrichment method was evaluated on a corpus of documents belonging to a specialized domain, and the results indicate its benefit in terms of precision and recall.
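The contextual enrichment stage described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the weighting formulas, so TF-IEF is assumed here to be the TF-IDF formula with sentences taken as the counting unit, tokenization is plain whitespace splitting in place of the paper's linguistic analysis, and the names `tokenize`, `tf_idf_scores`, `enrich_query` and the `top_*` cut-offs are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: whitespace tokenization stands in for the paper's
    # lemmatization and morphological analysis, which are not reproduced here.
    return text.lower().split()


def tf_idf_scores(query_terms, units):
    """Score each unit (a list of tokens) as the sum of tf * log(N / df)."""
    n = len(units)
    df = Counter()
    for unit in units:
        df.update(set(unit))  # count each term once per unit
    scores = []
    for unit in units:
        tf = Counter(unit)
        scores.append(
            sum(tf[t] * math.log(n / df[t]) for t in query_terms if df[t] > 0)
        )
    return scores


def enrich_query(list_desc, documents, top_docs=1, top_sents=1, top_terms=2):
    """Return list_desc extended with the highest-weighted context terms."""
    # Step 1: rank documents against the descriptive list (TF-IDF).
    doc_scores = tf_idf_scores(list_desc, [tokenize(d) for d in documents])
    best_docs = sorted(
        range(len(documents)), key=lambda i: doc_scores[i], reverse=True
    )[:top_docs]

    # Step 2: rank the sentences of those documents (assumed TF-IEF:
    # the same formula, with sentences as the counting unit).
    sentences = [
        tokenize(s) for i in best_docs for s in documents[i].split(".") if s.strip()
    ]
    sent_scores = tf_idf_scores(list_desc, sentences)
    best_sents = sorted(
        range(len(sentences)), key=lambda i: sent_scores[i], reverse=True
    )[:top_sents]

    # Step 3: weight the terms of the selected sentences and keep the best.
    ef = Counter()
    for sent in sentences:
        ef.update(set(sent))
    weights = Counter()
    for i in best_sents:
        for term, tf in Counter(sentences[i]).items():
            if term not in list_desc:
                weights[term] += tf * math.log(len(sentences) / ef[term])
    return list(list_desc) + [t for t, _ in weights.most_common(top_terms)]


if __name__ == "__main__":
    docs = [
        "water pollution harms rivers. clean water matters",
        "football match tonight. the team won",
    ]
    print(enrich_query(["water", "pollution"], docs))
```

English placeholder tokens are used only to keep the example readable; the same functions operate on any whitespace-separated tokens, including Arabic text that has already been normalized.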
Pages: 35-51
Number of pages: 17
Related papers
50 in total
  • [21] Current techniques in lexical information retrieval and manipulation
    Ortiz, AJM
    PROCEEDINGS OF THE XIXTH INTERNATIONAL CONFERENCE ON AEDEAN (ASOCIACION ESPANOLA DE ESTUDIOS ANGLONORTEAMERICANOS), 1996, : 425 - 429
  • [22] Stemming methodologies over individual query words for an Arabic Information Retrieval System
    Abu-Salem, H
    Al-Omari, M
    Evens, MW
    JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE, 1999, 50 (06): : 524 - 529
  • [23] AXON: a personalized retrieval information system in Arabic texts based on linguistic features
    Houssem, Safi
    Maher, Jaoua
    Lamia, Belguith Hadrich
    2015 6TH INTERNATIONAL CONFERENCE ON INFORMATION SYSTEMS AND ECONOMIC INTELLIGENCE (SIIE), 2015, : 165 - 172
  • [24] Arabic Stemmer for Search Engines Information Retrieval
    Khalid, Ahmed
    Hussain, Zakir
    Baig, Mirza Anwarullah
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2016, 7 (01) : 407 - 411
  • [25] Combining Indexing Units for Arabic Information Retrieval
    Ben Guirat, Souheila
    Bounhas, Ibrahim
    Slimani, Yahya
    INTERNATIONAL JOURNAL OF SOFTWARE INNOVATION, 2016, 4 (04) : 1 - 14
  • [26] Arabic Word Sense Disambiguation for Information Retrieval
    Abderrahim, Mohammed Alaeddine
    Abderrahim, Mohammed El-Amine
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2022, 21 (04)
  • [27] Modern information retrieval in Arabic - catering to standard and colloquial Arabic users
    Azmi, Aqil M.
    Aljafari, Eman A.
    JOURNAL OF INFORMATION SCIENCE, 2015, 41 (04) : 506 - 517
  • [28] Arabic WordNet semantic relations enrichment through morpho-lexical patterns
    Boudabous, Mohamed Mahdi
    Kammoun, Nouha Chaaben
    Khedher, Nacef
    Belguith, Lamia Hadrich
    Sadat, Fatiha
    2013 FIRST INTERNATIONAL CONFERENCE ON COMMUNICATIONS SIGNAL PROCESSING, AND THEIR APPLICATIONS (ICCSPA'13), 2013,
  • [29] Lexical Scoring System of Lexical Chain for Quranic Document Retrieval
    Rad, Hamed Zakeri
    Tiun, Sabrina
    Saad, Saidah
    GEMA ONLINE JOURNAL OF LANGUAGE STUDIES, 2018, 18 (02): : 59 - 79
  • [30] Lexical retrieval after Arabic aphasia: Syntactic access and predictors of spoken naming
    Khwaileh, Tariq
    Body, Richard
    Herbert, Ruth
    JOURNAL OF NEUROLINGUISTICS, 2017, 42 : 140 - 155