Rhetorical structure theory for content-based indexing and retrieval of Web documents

Cited by: 1
Authors
Marir, F [1]
Haouam, K [1]
Affiliations
[1] Univ N London, Sch Informat & Multimedia Technol, London N7 8DB, England
Keywords
document indexing and retrieval; rhetorical structure theory
DOI
10.1109/ITRE.2004.1393667
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The amount of information available on the Internet is currently growing at an incredible rate. However, the lack of efficient indexing is still a major barrier to effective information retrieval on the Web. This paper presents the design of a technique for content-based indexing and retrieval of relevant documents from a large collection of documents such as the Internet. The technique aims to improve the quality of retrieval by capturing the semantics of the documents. It identifies thematic relationships between parts of a text using a linguistic theory called Rhetorical Structure Theory (RST), relying on cue phrases to determine the set of rhetorical relations. Once these structures are determined, they can be saved in a database. We can then query that collection using not only keywords, as traditional information retrieval systems do, but also rhetorical relations. The indexing and retrieval technique described in this paper is under development, and initial results on a small number of documents have been very successful.
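
The pipeline outlined in the abstract (cue phrases signal rhetorical relations, the relations are stored in a database, and queries combine a keyword with a relation) might be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the cue-phrase table, the relation names, the table schema, and the functions `find_relations`, `build_index`, and `search` are all hypothetical, and sentence-level cue matching stands in for full RST analysis.

```python
import re
import sqlite3

# Hypothetical cue-phrase table: these phrases and relation names are
# illustrative assumptions, not the list used in the paper.
CUE_PHRASES = {
    "because": "CAUSE",
    "however": "CONTRAST",
    "for example": "ELABORATION",
    "if": "CONDITION",
}


def find_relations(sentence):
    """Return the relation labels whose cue phrase occurs in the sentence."""
    lowered = sentence.lower()
    found = []
    for cue, relation in CUE_PHRASES.items():
        # Word boundaries keep "if" from matching inside "different".
        if re.search(r"\b" + re.escape(cue) + r"\b", lowered):
            found.append(relation)
    return found


def build_index(documents):
    """Store one row per (document, sentence, relation) in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE relations (doc_id TEXT, sentence TEXT, relation TEXT)")
    for doc_id, text in documents.items():
        # Naive sentence split after '.', '!' or '?' followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
            for relation in find_relations(sentence):
                conn.execute(
                    "INSERT INTO relations VALUES (?, ?, ?)",
                    (doc_id, sentence, relation),
                )
    conn.commit()
    return conn


def search(conn, keyword, relation):
    """Return sorted ids of documents with a sentence that contains the
    keyword and carries the given relation."""
    rows = conn.execute(
        "SELECT DISTINCT doc_id FROM relations "
        "WHERE relation = ? AND LOWER(sentence) LIKE ? ORDER BY doc_id",
        (relation, "%" + keyword.lower() + "%"),
    )
    return [row[0] for row in rows]


# Toy collection, invented for this example.
docs = {
    "d1": "Indexing is slow because the collection is large. Retrieval is fast.",
    "d2": "Indexing is fast. However, retrieval is slow.",
}
conn = build_index(docs)
print(search(conn, "indexing", "CAUSE"))
print(search(conn, "retrieval", "CONTRAST"))
```

A keyword-only system would return both toy documents for "indexing"; adding the relation restricts the match to sentences where the keyword takes part in the requested rhetorical relation.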
Pages: 160-164
Page count: 5
Related papers
50 in total
  • [1] An intelligent agent for content-based indexing and retrieval of documents.
    Mimouni, NK
    Marir, F
    Meziane, F
    KES'2000: FOURTH INTERNATIONAL CONFERENCE ON KNOWLEDGE-BASED INTELLIGENT ENGINEERING SYSTEMS & ALLIED TECHNOLOGIES, VOLS 1 AND 2, PROCEEDINGS, 2000, : 831 - 834
  • [2] Content-based multimedia indexing and retrieval
    Djeraba, C
    IEEE MULTIMEDIA, 2002, 9 (02) : 18 - 22
  • [3] Extended symbolic projection for content-based indexing and retrieval of audio, video, and multimedia documents
    Arndt, T
    Guercio, A
    STORAGE AND RETRIEVAL FOR IMAGE AND VIDEO DATABASES V, 1997, 3022 : 417 - 426
  • [4] Content-based image and video indexing and retrieval
    Lu, Hong
    Xue, Xiangyang
    Tan, Yap-Peng
    COGNITIVE SYSTEMS, 2007, 4429 : 118 - +
  • [6] Content-based indexing and retrieval of TV news
    Bertini, M
    Del Bimbo, A
    Pala, P
    PATTERN RECOGNITION LETTERS, 2001, 22 (05) : 503 - 516
  • [7] Content-based image indexing and retrieval in ImageRoadMap
    Golshani, F
    Park, Y
    MULTIMEDIA STORAGE AND ARCHIVING SYSTEMS II, 1997, 3229 : 194 - 205
  • [8] Toward efficient indexing structure for scalable content-based music retrieval
    Shen, Jialie
    Tao, Mei
    Qu, Qiang
    Tao, Dacheng
    Rui, Yong
    MULTIMEDIA SYSTEMS, 2019, 25 (06) : 639 - 653
  • [9] Multidimensional indexing for content-based image retrieval
    Zhao, JL
    Kwok, SH
    ELECTRONIC IMAGING AND MULTIMEDIA SYSTEMS II, 1998, 3561 : 14 - 21
  • [10] A Bayesian framework for content-based indexing and retrieval
    Vasconcelos, N
    Lippman, A
    DCC '98 - DATA COMPRESSION CONFERENCE, 1998, : 580 - 580