An End-to-End Efficient Lucene-Based Framework of Document/Information Retrieval

Cited by: 0
Authors
Ben Ayed, Alaidine [1]
Biskri, Ismail [2]
Meunier, Jean-Guy [3]
Affiliations
[1] Univ Quebec Montreal, Cognit Comp Sci, Montreal, PQ, Canada
[2] Univ Quebec Trois Rivieres, Comp Sci Dept, Computat Linguist & Artificial Intelligence, Trois Rivieres, PQ, Canada
[3] Univ Quebec Montreal, Montreal, PQ, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
Data and Knowledge Representation; Document Retrieval; Internet and Web Applications; Mono/Multi-Document Summarization; RELEVANCE;
DOI
10.4018/IJIRR.289950
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
In the context of big data and the Industrial Revolution 4.0 era, enhancing document/information retrieval framework efficiency to handle the ever-growing volume of text data in an ever more digital world is a must. This article describes a double-stage system of document/information retrieval. First, a Lucene-based document retrieval tool is implemented, and a couple of query expansion techniques using a comparable corpus (Wikipedia) and word embeddings are proposed and tested. Second, a retention-fidelity summarization protocol is performed on top of the retrieved documents to create a short, accurate, and fluent extract of a longer retrieved single document (or a set of top retrieved documents). Obtained results show that using word embeddings is an excellent way to achieve higher precision rates and retrieve more accurate documents. Also, obtained summaries satisfy the retention and fidelity criteria of relevant summaries.
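The abstract does not say how the word embeddings are used for query expansion, so the following is only a minimal sketch of one common approach, not the authors' method: each query term is expanded with its nearest neighbours by cosine similarity. The vectors in `EMBEDDINGS`, the function names, and the `top_k` and `min_sim` parameters are all hypothetical and chosen for illustration.

```python
import math

# Toy, hand-made vectors for illustration only (an assumption, not data
# from the article). A real system would load pretrained embeddings.
EMBEDDINGS = {
    "car": [0.9, 0.1, 0.0],
    "automobile": [0.85, 0.15, 0.05],
    "vehicle": [0.7, 0.3, 0.1],
    "banana": [0.0, 0.2, 0.9],
    "fruit": [0.1, 0.3, 0.8],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_query(terms, embeddings, top_k=2, min_sim=0.8):
    """Return the original terms followed by their nearest neighbours.

    For each query term with a known vector, the top_k most similar
    vocabulary words are appended if their similarity is at least min_sim.
    Terms without a vector are kept unchanged.
    """
    expanded = list(terms)
    for term in terms:
        if term not in embeddings:
            continue
        scored = [
            (cosine(embeddings[term], vec), word)
            for word, vec in embeddings.items()
            if word != term
        ]
        scored.sort(reverse=True)  # highest similarity first
        for sim, word in scored[:top_k]:
            if sim >= min_sim and word not in expanded:
                expanded.append(word)
    return expanded


if __name__ == "__main__":
    print(expand_query(["car"], EMBEDDINGS))
```

In a pipeline like the one the abstract describes, the expanded term list would then be submitted to the Lucene-based retrieval stage in place of the original query; the similarity threshold keeps unrelated neighbours from diluting the query.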
Pages: 14
Related papers
50 items in total
  • [1] End-to-End Adaptive Framework for Multimedia Information Retrieval
    Sokhn, Maria
    Mugellini, Elena
    Khaled, OmarAbou
    Serhrouchni, Ahmed
    WIRED/WIRELESS INTERNET COMMUNICATIONS, 2011, 6649 : 197 - 206
  • [2] An end-to-end pseudo relevance feedback framework for neural document retrieval
    Wang, Le
    Luo, Ze
    Li, Canjia
    He, Ben
    Sun, Le
    Yu, Hao
    Sun, Yingfei
    INFORMATION PROCESSING & MANAGEMENT, 2020, 57 (02)
  • [3] Auto Search Indexer for End-to-End Document Retrieval
    Yang, Tianchi
    Song, Minghui
    Zhang, Zihan
    Huang, Haizhen
    Deng, Weiwei
    Sun, Feng
    Zhang, Qi
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS - EMNLP 2023, 2023, : 6955 - 6970
  • [4] End-to-End Contextualized Document Indexing and Retrieval with Neural Networks
    Hofstaetter, Sebastian
    PROCEEDINGS OF THE 43RD INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '20), 2020, : 2481 - 2481
  • [5] END-TO-END LEARNING OF PARSING MODELS FOR INFORMATION RETRIEVAL
    Gillenwater, Jennifer
    He, Xiaodong
    Gao, Jianfeng
    Deng, Li
    2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 3312 - 3316
  • [6] End-to-end Distantly Supervised Information Extraction with Retrieval Augmentation
    Zhang, Yue
    Fei, Hongliang
    Li, Ping
    PROCEEDINGS OF THE 45TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '22), 2022, : 2449 - 2455
  • [7] An End-To-End Emotion Recognition Framework Based on Temporal Aggregation of Multimodal Information
    Radoi, Anamaria
    Birhala, Andreea
    Ristea, Nicolae-Catalin
    Dutu, Liviu-Cristian
    IEEE ACCESS, 2021, 9 : 135559 - 135570
  • [8] Efficient genomics-based 'end-to-end' selective tree breeding framework
    El-Kassaby, Yousry A.
    Cappa, Eduardo P.
    Chen, Charles
    Ratcliffe, Blaise
    Porth, Ilga M.
    HEREDITY, 2024, 132 (02) : 98 - 105
  • [9] Efficient genomics-based ‘end-to-end’ selective tree breeding framework
    Yousry A. El-Kassaby
    Eduardo P. Cappa
    Charles Chen
    Blaise Ratcliffe
    Ilga M. Porth
    Heredity, 2024, 132 : 98 - 105
  • [10] End-to-end learning of representations for instance-level document image retrieval
    Liu, Li
    Lu, Yue
    Suen, Ching Y.
    APPLIED SOFT COMPUTING, 2023, 136