Advanced text documents information retrieval system for search services

Cited by: 4
Authors
Chiranjeevi, H. S. [1 ]
Shenoy, Manjula K. [1 ]
Affiliations
[1] Manipal Inst Technol, Manipal Acad Higher Educ, ICT, Manipal 576104, India
Source
COGENT ENGINEERING | 2020 / Vol. 7 / Issue 01
Keywords
information technology; text documents; search engine; information retrieval; tokenization; recurrent convolutional neural network; retrieval efficiency;
DOI
10.1080/23311916.2020.1856467
Chinese Library Classification number
T [Industrial Technology];
Discipline classification code
08 ;
Abstract
Information technology has driven the growth of text document data in many organizations, and the structural arrangement of such voluminous data is a complex task. Handling text document data is a challenging process involving not only the training of models but also numerous additional procedures, e.g., data pre-processing, transformation, and dimensionality reduction. In this paper, we describe the system's architecture, the technical challenges, and the novel solution we have built. We propose a Recurrent Convolutional Neural Network (RCNN)-based text information retrieval system which efficiently retrieves text documents and information for the user query. The system implements pre-processing using tokenization and stemming, retrieval using TF-IDF (Term Frequency-Inverse Document Frequency), and an RCNN classifier which captures contextual information. A real-time advanced search system is developed on a large MAHE University dataset. The performance of the proposed text document retrieval system is compared with other existing algorithms, and the efficacy of the method is discussed. The proposed RCNN-based text document information retrieval model performs better in terms of precision, recall, and F-measure. A high-quality and high-performance text document retrieval search system is presented.
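As a rough illustration of the TF-IDF retrieval step and the evaluation metrics named in the abstract, the following is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the smoothed IDF formula, the cosine-similarity ranking, and the toy documents are all assumptions made for this example, and the stemming and RCNN classification stages are omitted.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters (no stemming here)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, plus the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed IDF (an assumed variant; other formulations exist).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: (c / len(toks)) * idf[t] for t, c in tf.items()})
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, docs):
    """Return indices of documents with non-zero similarity, best match first."""
    vectors, idf = tfidf_vectors(docs)
    q_tf = Counter(tokenize(query))
    total = sum(q_tf.values())
    # Query terms unseen in the collection are ignored.
    q_vec = {t: (c / total) * idf[t] for t, c in q_tf.items() if t in idf}
    scores = [(cosine(q_vec, v), i) for i, v in enumerate(vectors)]
    ranked = sorted(scores, key=lambda pair: (-pair[0], pair[1]))
    return [i for score, i in ranked if score > 0]


def precision_recall_f1(retrieved, relevant):
    """Set-based precision, recall, and F-measure for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    p = hits / len(retrieved) if retrieved else 0.0
    r = hits / len(relevant) if relevant else 0.0
    f = 2 * p * r / (p + r) if (p + r) else 0.0
    return p, r, f


# Hypothetical toy collection, not the MAHE University dataset.
docs = [
    "exam timetable for engineering students",
    "library opening hours",
    "engineering admission requirements",
]
results = search("engineering exam", docs)
print(results)
print(precision_recall_f1(results, relevant=[0]))
```

In this sketch the document that matches both query terms ranks above the one that matches a single term, and documents sharing no terms with the query are not returned.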
Pages: 16
Related papers
50 in total
  • [1] INFORMATION RETRIEVAL IN CLINICAL FREE TEXT DOCUMENTS
    Spat, S.
    Cadonna, B.
    Rakovac, I
    Guetl, C.
    Leitner, H.
    Stark, G.
    Beck, P.
    [J]. EHEALTH2008 - MEDICAL INFORMATICS MEETS EHEALTH, 2008, : 205 - 210
  • [2] Information retrieval system for handwritten documents
    Srihari, S
    Ganesh, A
    Tomai, C
    Shin, YC
    Huang, C
    [J]. DOCUMENT ANALYSIS SYSTEMS VI, PROCEEDINGS, 2004, 3163 : 298 - 309
  • [3] System of information retrieval in XML documents
    Smadhi, S
    [J]. ISSUES AND TRENDS OF INFORMATION TECHNOLOGY MANAGEMENT IN CONTEMPORARY ORGANIZATIONS, VOLS 1 AND 2, 2002, : 736 - 739
  • [4] The problem of automatic understanding of full text documents in information retrieval
    Zabezhailo, MI
    [J]. JOURNAL OF COMPUTER AND SYSTEMS SCIENCES INTERNATIONAL, 1998, 37 (05) : 822 - 830
  • [5] Information Retrieval for Unstructured Text Documents in Serbian into the Crime Domain
    Nikolic, Vojkan
    Markoski, Branko
    Ivkovic, Miodrag
    Kuk, Kristijan
    Djikanovic, Predrag
    [J]. 2015 16TH IEEE INTERNATIONAL SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND INFORMATICS (CINTI), 2015, : 267 - 271
  • [6] The problem of automatic understanding of full text documents in information retrieval
    Zabezhailo, M.I.
    [J]. Izvestiya Akademii Nauk. Teoriya i Sistemy Upravleniya, 1998, 37 (05):
  • [7] An Information Retrieval System for Medical Records & Documents
    Chou, Shihchieh
    Chang, Weiping
    Cheng, Chin-Yi
    Jehng, Jihn-Chang
    Chang, Chenchao
    [J]. 2008 30TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY, VOLS 1-8, 2008, : 1474 - +
  • [8] Waypoint: An integrated search and retrieval system for engineering documents
    McMahon, C
    Lowe, A
    Culley, S
    Corderoy, M
    Crossland, R
    Shah, T
    Stewart, D
    [J]. JOURNAL OF COMPUTING AND INFORMATION SCIENCE IN ENGINEERING, 2004, 4 (04) : 329 - 338
  • [9] Enrichment of text documents using information retrieval techniques in a distributed environment
    Bueno, Francisco
    Garcia-Serrano, Ana
    Martinez-Fernandez, Jose L.
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2010, 37 (12) : 8348 - 8358
  • [10] Learning To Rank Relevant Documents for Information Retrieval in Bioengineering Text Corpora
    Cheng, Kowk Sun
    Song, Myoungkyu
    [J]. 2021 IEEE 45TH ANNUAL COMPUTERS, SOFTWARE, AND APPLICATIONS CONFERENCE (COMPSAC 2021), 2021, : 1565 - 1572