Arabic text data mining: A root-based hierarchical indexing model

Author
Eldos, T.M. [1]
Affiliation
[1] Department of Computer Engineering, Fac. of Comp./Information Technology, Jordan Univ. of Sci. and Technology, Irbid 22110-3030, Jordan
Keywords
Digital libraries; Indexing (of information); Information retrieval; Linguistics
DOI
10.1080/02286203.2003.11442267
Abstract
The world has recently witnessed tremendous growth in the volume of text documents available on the Internet, in digital libraries, news sources, and company-wide intranets. Text data mining is a multidisciplinary field involving information retrieval, text analysis, information extraction, clustering, categorization, linguistics, database technology, machine learning, and data mining. It is becoming more significant, and research has intensified in areas such as information retrieval, whose practical applications are increasingly necessary to end users and to the scientific community itself for fetching the growing body of available information efficiently. In the past few years, not only have new documents been produced directly in digital form, making them suitable for automatic indexing, but many older documents have also been ported from their physical medium to a digital one. The meaning of a document is represented by a vector of features, which are weighted according to a measure that best estimates relevance. Text categorization presents unique challenges because of the large number of attributes in the data set, the large number of training samples, and attribute dependencies. This article focuses on speeding up information retrieval in an Arabic document base by using a root-based hierarchical indexing model. Simulation results demonstrated that a speed gain in the range of 50-100 can be achieved for typical queries.
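The abstract does not describe the internals of the indexing model. As a rough illustration only, one way to picture a root-based hierarchical index is a two-level mapping from a root to its surface word forms, and from each form to the documents that contain it. Everything in the sketch below is an assumption made for illustration: the `TOY_ROOTS` lookup table (a stand-in for a real morphological analyzer), the transliterated sample words, and the function names `build_index` and `search`. None of it is taken from the article.

```python
from collections import defaultdict

# Hypothetical lookup from (transliterated) surface word to its root.
# A real system would use an Arabic morphological analyzer instead.
TOY_ROOTS = {
    "kitab": "ktb",    # book
    "katib": "ktb",    # writer
    "maktaba": "ktb",  # library
    "madrasa": "drs",  # school
    "daris": "drs",    # student
}


def build_index(docs):
    """Build a two-level index: root -> surface form -> set of doc ids."""
    index = defaultdict(lambda: defaultdict(set))
    for doc_id, text in docs.items():
        for word in text.split():
            # Words missing from the toy table are indexed under themselves.
            root = TOY_ROOTS.get(word, word)
            index[root][word].add(doc_id)
    return index


def search(index, word, expand=False):
    """Look up a word; with expand=True, return docs for every form of its root."""
    root = TOY_ROOTS.get(word, word)
    forms = index.get(root, {})
    if expand:
        return set().union(*forms.values())
    return set(forms.get(word, set()))


docs = {
    1: "kitab madrasa",
    2: "katib",
    3: "maktaba daris",
}
index = build_index(docs)
exact_hits = search(index, "kitab")
root_hits = search(index, "kitab", expand=True)
```

In this sketch a query first narrows to a single root bucket and only then inspects the word forms inside it, so an exact lookup touches a small part of the index, while `expand=True` retrieves every document that shares the query's root.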
Pages: 158-166