Rhetorical structure theory for content-based indexing and retrieval of Web documents

Cited by: 1
Authors
Marir, F [1]
Haouam, K [1]
Affiliation
[1] Univ N London, Sch Informat & Multimedia Technol, London N7 8DB, England
Keywords
document indexing and retrieval; rhetorical structure theory
DOI
10.1109/ITRE.2004.1393667
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The amount of information available on the Internet is growing at an incredible rate. However, the lack of efficient indexing remains a major barrier to effective information retrieval on the Web. This paper presents the design of a technique for content-based indexing and retrieval of relevant documents from a large collection such as the Internet. The technique aims to improve retrieval quality by capturing the semantics of the documents. It identifies thematic relationships between parts of a text using a linguistic theory called Rhetorical Structure Theory (RST), relying on cue phrases to determine the set of rhetorical relations. Once these structures are determined, they can be saved in a database. The collection can then be queried using not only keywords, as in traditional information retrieval systems, but also rhetorical relations. The indexing and retrieval technique described in this paper is under development, and initial results on a small number of documents have been very successful.
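The pipeline outlined in the abstract (cue phrases, then rhetorical relations, then a database queried by keyword and relation) might look roughly like the sketch below. It is not the authors' implementation: the cue-phrase table, the relation names, the one-table schema, and the function names `label_sentences`, `build_index`, and `search` are all assumptions made for illustration. It also simplifies RST by labelling the single sentence that contains a cue phrase, whereas a rhetorical relation properly holds between two text spans.

```python
import re
import sqlite3

# Hypothetical cue-phrase table; the abstract does not list the actual cue phrases.
CUE_PHRASES = {
    "however": "contrast",
    "because": "cause",
    "for example": "elaboration",
    "therefore": "result",
}


def label_sentences(text):
    """Split text into sentences and pair each with the relations its cue phrases signal.

    A sentence without any known cue phrase is kept with relation None so that
    keyword-only queries can still find it.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    rows = []
    for sentence in sentences:
        lowered = sentence.lower()
        relations = [
            relation
            for cue, relation in CUE_PHRASES.items()
            if re.search(r"\b" + re.escape(cue) + r"\b", lowered)
        ]
        for relation in relations or [None]:
            rows.append((relation, sentence))
    return rows


def build_index(docs):
    """Store every labelled sentence of every document in an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE spans (doc_id TEXT, relation TEXT, span TEXT)")
    for doc_id, text in docs.items():
        conn.executemany(
            "INSERT INTO spans VALUES (?, ?, ?)",
            [(doc_id, relation, span) for relation, span in label_sentences(text)],
        )
    return conn


def search(conn, keyword, relation=None):
    """Return ids of documents with a sentence matching the keyword and, optionally, the relation."""
    sql = "SELECT DISTINCT doc_id FROM spans WHERE lower(span) LIKE ?"
    params = ["%" + keyword.lower() + "%"]
    if relation is not None:
        sql += " AND relation = ?"
        params.append(relation)
    sql += " ORDER BY doc_id"
    return [row[0] for row in conn.execute(sql, params)]


# Toy documents, invented for this example.
docs = {
    "d1": "Indexing is slow. However, caching helps indexing.",
    "d2": "Indexing matters because retrieval depends on it.",
}
conn = build_index(docs)
keyword_hits = search(conn, "indexing")
contrast_hits = search(conn, "indexing", "contrast")
```

With these toy documents, the keyword-only query matches both documents, while adding the relation filter keeps only the document whose matching sentence carries a contrast cue. Storing unlabelled sentences with a NULL relation is a design choice of this sketch so that a plain keyword query behaves like a conventional keyword index.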
Pages: 160-164
Page count: 5