Text segmentation based on document understanding for information retrieval

Cited by: 0
Authors
Prince, Violaine [1 ]
Labadie, Alexandre [1 ]
Affiliation
[1] LIRMM, 161 Rue Ada, F-34392 Montpellier 5, France
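Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract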
Information retrieval needs to match relevant texts with a given query. Selecting appropriate parts is useful when documents are long and only portions are of interest to the user. In this paper, we describe a method that makes extensive use of natural language processing techniques for text segmentation based on topic change detection. The method requires an NLP parser and a semantic representation in Roget-based vectors. We ran the experiment on French documents, for which we have the appropriate tools, but the method could be transposed to any other language with the same requirements. The article sketches an overview of the functionalities of the NL understanding environment and the algorithms related to our text segmentation method. An experiment in text segmentation is also presented, and its result in an information retrieval task is shown.
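The general idea of topic change detection over semantic vectors can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes each sentence has already been mapped to a numeric vector (a stand-in for the Roget-based vectors the abstract mentions), and it places a boundary wherever the cosine similarity between adjacent sentences drops below a threshold. The names `cosine` and `segment`, the threshold value, and the toy vectors are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def segment(vectors, threshold=0.5):
    """Return the indices at which a new segment is assumed to start.

    A boundary is placed before sentence i when its vector is less similar
    than `threshold` to the vector of sentence i - 1. The threshold is an
    illustrative assumption, not a value taken from the paper.
    """
    boundaries = []
    for i in range(1, len(vectors)):
        if cosine(vectors[i - 1], vectors[i]) < threshold:
            boundaries.append(i)
    return boundaries


# Toy sentence vectors (hypothetical): two sentences on one topic, then two on another.
toy = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.9, 0.1],
]
print(segment(toy))  # [2]
```

In a retrieval setting, the resulting segments, rather than whole documents, would then be matched against the query.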
Pages: 295 / +
Page count: 3