Design and implementation of automatic indexing for information retrieval with Arabic documents

Authors
Hmeidi, I
Kanaan, G
Evens, M
Affiliations
[1] IIT,DEPT COMP SCI,CHICAGO,IL 60616
[2] JORDAN INST SCI & TECHNOL,IRBID,JORDAN
DOI
10.1002/(SICI)1097-4571(199710)48:10<867::AID-ASI3>3.0.CO;2-#
Chinese Library Classification
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
We have put together a corpus of 242 abstracts of Arabic documents using the Proceedings of the Saudi Arabian National Conferences as a source. All these abstracts involve computer science and information systems. We also designed and built an automatic information retrieval system from scratch to handle Arabic data. The system was implemented in the C language using the GCC compiler and runs on IBM/PCs and compatible microcomputers. We have implemented both automatic and manual indexing techniques for this corpus. A long series of experiments using measures of recall and precision has demonstrated that automatic indexing is at least as effective as manual indexing and more effective in some cases. Since automatic indexing is both cheaper and faster, our results suggest that we can achieve a wider coverage of the literature with less money and produce as good results as with manual indexing. We have also compared the retrieval results using words as index terms versus stems and roots, and confirmed the results obtained by Al-Kharashi and Abu-Salem with smaller corpora that root indexing is more effective than word indexing.
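The recall and precision measures mentioned in the abstract can be illustrated with a minimal sketch. It uses the standard set-based definitions for a single query; the abstract does not describe the authors' exact evaluation procedure, so the function name `precision_recall` and the document IDs below are hypothetical and chosen only for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: iterable of document IDs returned by the system
    relevant:  iterable of document IDs judged relevant to the query
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were retrieved
    # Precision: fraction of retrieved documents that are relevant.
    precision = hits / len(retrieved) if retrieved else 0.0
    # Recall: fraction of relevant documents that were retrieved.
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical IDs: 4 documents retrieved, 3 relevant, 2 in common.
p, r = precision_recall([3, 17, 42, 101], {17, 42, 200})
print(f"precision={p:.2f} recall={r:.2f}")
```

Comparing indexing schemes (manual versus automatic, or words versus stems versus roots) would then amount to computing these two values for each scheme over the same set of queries.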
Pages: 867-881
Page count: 15