Exploiting concept clusters for content-based information retrieval

Cited by: 28
Authors
Kang, BY [1]
Kim, DW
Lee, SJ
Affiliations
[1] Kyungpook Natl Univ, Dept Comp Engn, Taegu 702701, South Korea
[2] Korea Adv Inst Sci & Technol, Dept Comp Sci, Taejon 305701, South Korea
Keywords
information retrieval; indexing; term frequency; weighting function
DOI
10.1016/j.ins.2004.03.013
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Current approaches to index weighting for information retrieval from texts are based on statistical analysis of the texts' contents. A key shortcoming of these indexing schemes, which consider only the terms in a document, is that they cannot extract semantically exact indexes that represent the semantic content of a document. To address this issue, we propose a new indexing formalism that considers not only the terms in a document, but also the concepts. In the proposed method, concepts are extracted by exploiting clusters of semantically related terms, referred to as concept clusters. Through experiments on the TREC-2 collection of Wall Street Journal documents, we show that the proposed method outperforms an indexing method based on term frequency (TF), especially with regard to the highest-ranked documents. Moreover, the index term dimension was 53.3% lower for the proposed method than for the TF-based method, which is expected to significantly reduce the document search time in a real environment. (C) 2004 Elsevier Inc. All rights reserved.
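The contrast the abstract draws between term-level and concept-level indexing can be illustrated with a minimal sketch. This is not the authors' formalism, which the abstract does not specify: the cluster table `TERM_TO_CLUSTER`, the cluster labels, and the simple count-merging rule are all hypothetical assumptions, chosen only to show how grouping semantically related terms lowers the index dimension.

```python
from collections import Counter

# Hypothetical concept clusters (an assumption for illustration only):
# each term maps to the label of the cluster it belongs to.
TERM_TO_CLUSTER = {
    "stock": "C_market", "share": "C_market", "equity": "C_market",
    "bank": "C_finance", "loan": "C_finance",
}


def tf_index(tokens):
    """Term-frequency index: every distinct term is its own dimension."""
    return Counter(tokens)


def concept_index(tokens, term_to_cluster):
    """Merge terms sharing a cluster into one dimension.

    Terms that belong to no cluster keep their own dimension.
    """
    index = Counter()
    for tok in tokens:
        index[term_to_cluster.get(tok, tok)] += 1
    return index


doc = "stock share equity bank loan stock merger".split()
tf = tf_index(doc)
ci = concept_index(doc, TERM_TO_CLUSTER)

print(len(tf), dict(tf))  # dimension and weights of the term-level index
print(len(ci), dict(ci))  # dimension and weights of the concept-level index
```

In this toy document the six distinct terms collapse into three index entries, because the market-related and finance-related terms each share one dimension. A real system would need a principled way to build the clusters and to weight concepts, which is what the paper itself addresses.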
Pages: 443-462
Number of pages: 20