Pre-indexing Techniques in Arabic Information Retrieval

Cited by: 1
Authors
Ben Guirat, Souheila [1, 2, 4]
Bounhas, Ibrahim [2, 4]
Slimani, Yahia [2, 3, 4]
Affiliations
[1] Prince Sattam Bin Abdulaziz Univ, Comp Sci Dept, Al Kharj, Saudi Arabia
[2] Carthage Univ, Lab Comp Sci Ind Syst, Tunis, Tunisia
[3] Manouba Univ, Higher Inst Multimedia Arts Manouba ISAMM, Manouba, Tunisia
[4] JARIR Joint Grp Artificial Reasoning & Informat R, Manouba, Tunisia
Keywords
Arabic Information Retrieval; Hybrid Index; Statistical Modeling; Smoothing; ALGORITHM; SEARCH;
DOI
10.5220/0007393402370246
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Arabic document indexing remains challenging given the morphological specificities of the language. Although much effort has been devoted to the field, the demand for more efficient indexing approaches keeps growing. One of the most important issues concerns the choice of indexing units (e.g., stems, roots, lemmas) that both enhance retrieval efficiency and optimize the indexing process. The question is how to process Arabic texts to retrieve the basic forms that best reflect the meaning of words and documents. In the literature, several indexing units have been compared, and combining multiple indexes seems promising. In our previous work, we showed that hybrid indexes based on stems, patterns, and roots enhance results. However, the optimal weight of each indexing unit still needs to be found. This paper therefore contributes to optimizing hybrid indexing: we compare and evaluate four pre-indexing methods.
Pages: 237 - 246
Number of pages: 10
Related papers
50 in total
  • [1] PRE-INDEXING AND CONVERSATIONAL ORGANIZATION
    BEACH, WA
    DUNNING, DG
    [J]. QUARTERLY JOURNAL OF SPEECH, 1982, 68 (02) : 170 - 185
  • [2] Combining Indexing Units for Arabic Information Retrieval
    Ben Guirat, Souheila
    Bounhas, Ibrahim
    Slimani, Yahya
    [J]. INTERNATIONAL JOURNAL OF SOFTWARE INNOVATION, 2016, 4 (04) : 1 - 14
  • [3] Semantic indexing of Arabic texts for information retrieval system
    Abderrahim, Mohammed Alaeddine
    Dib, Mohammed
    Abderrahim, Mohammed El-Amine
    Chikh, Mohammed Amine
    [J]. INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2016, 19 (02) : 229 - 236
  • [4] Fast Information Retrieval using Indexing Techniques
    Stoica Spahiu, Cosmin
    Stanescu, Liana
    Brezovan, Marius
    [J]. INTELLIGENT INTERACTIVE MULTIMEDIA SYSTEMS AND SERVICES, 2013, 254 : 89 - 98
  • [5] The Impact of Online Indexing in Improving Arabic Information Retrieval Systems
    Dilekh, Tahar
    Benharzallah, Saber
    Behloul, Ali
    [J]. INFORMATICA-JOURNAL OF COMPUTING AND INFORMATICS, 2018, 42 (04) : 607 - 616
  • [6] Design and implementation of automatic indexing for information retrieval with Arabic documents
    Hmeidi, I
    Kanaan, G
    Evens, M
    [J]. JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE, 1997, 48 (10) : 867 - 881
  • [7] Pre-indexing for fast partial shape matching of vertebrae images
    Xu, Xiaoqian
    Lee, D. J.
    Antani, S.
    Long, L. R.
    [J]. 19TH IEEE INTERNATIONAL SYMPOSIUM ON COMPUTER-BASED MEDICAL SYSTEMS, PROCEEDINGS, 2006, : 105 - +
  • [8] Vector Space Model for Arabic Information Retrieval - Application to "Hadith" Indexing
    Harrag, Fouzi
    Hamdi-Cherif, Aboubekeur
    El-Qawasmeh, Eyas
    [J]. 2008 FIRST INTERNATIONAL CONFERENCE ON THE APPLICATIONS OF DIGITAL INFORMATION AND WEB TECHNOLOGIES, VOLS 1 AND 2, 2008, : 114 - +
  • [9] Comprehensive Study and Comparison of Information Retrieval Indexing Techniques
    Malki, Zohair
    [J]. INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2016, 7 (01) : 132 - 140
  • [10] A methodology for comparison of any indexing techniques for information retrieval system
    Arshad, S
    Shoaib, M
    Shah, A
    [J]. SERP '05: Proceedings of the 2005 International Conference on Software Engineering Research and Practice, Vols 1 and 2, 2005, : 897 - 904