Concept-based information retrieval using explicit semantic analysis

Cited by: 14
Authors
Egozi O. [1]
Markovitch S. [1]
Gabrilovich E. [2]
Affiliations
[1] Department of Computer Science, Israel Institute of Technology, Technion, Haifa
[2] Technion, Israel Institute of Technology; Yahoo! Research, Santa Clara, CA 95054
Keywords
Concept-based retrieval; Explicit semantic analysis; Feature selection; Semantic search
DOI
10.1145/1961209.1961211
Abstract
Information retrieval systems traditionally rely on textual keywords to index and retrieve documents. Keyword-based retrieval may return inaccurate and incomplete results when different keywords are used to describe the same concept in the documents and in the queries. Furthermore, the relationship between these related keywords may be semantic rather than syntactic, and capturing it thus requires access to comprehensive human world knowledge. Concept-based retrieval methods have attempted to tackle these difficulties by using manually built thesauri, by relying on term co-occurrence data, or by extracting latent word relationships and concepts from a corpus. In this article we introduce a new concept-based retrieval approach based on Explicit Semantic Analysis (ESA), a recently proposed method that augments keyword-based text representation with concept-based features, automatically extracted from massive human knowledge repositories such as Wikipedia. Our approach generates new text features automatically, and we have found that high-quality feature selection becomes crucial in this setting to make the retrieval more focused. However, due to the lack of labeled data, traditional feature selection methods cannot be used, hence we propose new methods that use self-generated labeled training data. The resulting system is evaluated on several TREC datasets, showing superior performance over previous state-of-the-art results. © 2011 ACM.
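The core idea in the abstract, mapping text onto concepts drawn from a knowledge repository so that a query and a document can match without sharing keywords, can be illustrated with a toy sketch. This is not the authors' system: the three-entry `CONCEPTS` dictionary is a hypothetical stand-in for Wikipedia, the smoothed TF-IDF weighting and the function names (`build_inverted_index`, `esa_vector`, `cosine`) are assumptions made for illustration, and the feature selection step described in the abstract is omitted.

```python
import math
from collections import Counter, defaultdict

# Hypothetical stand-in for Wikipedia: concept title -> article text.
CONCEPTS = {
    "Automobile": "car engine wheel road driver vehicle",
    "Jaguar (animal)": "jaguar cat predator jungle animal",
    "Jaguar Cars": "jaguar car luxury vehicle engine",
}


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; real systems do more)."""
    return text.lower().split()


def build_inverted_index(concepts):
    """Map each word to {concept: weight} using a smoothed TF-IDF (assumed)."""
    n = len(concepts)
    tf = {name: Counter(tokenize(body)) for name, body in concepts.items()}
    df = Counter(word for counts in tf.values() for word in counts)
    index = defaultdict(dict)
    for name, counts in tf.items():
        for word, count in counts.items():
            index[word][name] = count * math.log(1 + n / df[word])
    return index


def esa_vector(text, index):
    """Represent text as a weighted vector over concepts."""
    vec = defaultdict(float)
    for word in tokenize(text):
        for concept, weight in index.get(word, {}).items():
            vec[concept] += weight
    return dict(vec)


def cosine(a, b):
    """Cosine similarity between two sparse concept vectors."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


INDEX = build_inverted_index(CONCEPTS)
query_vec = esa_vector("luxury vehicle", INDEX)
doc_vec = esa_vector("car engine", INDEX)
similarity = cosine(query_vec, doc_vec)
```

In this sketch the query "luxury vehicle" and the document "car engine" have no words in common, yet both activate the "Automobile" and "Jaguar Cars" concepts, so `similarity` is positive where a pure keyword match would score zero.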
Related papers
50 in total
  • [1] Concept-based Document Models using Explicit Semantic Analysis
    Luo, Jing
    Meng, Bo
    Tu, Xinhui
    Liu, Maofu
    [J]. 2012 IEEE INTERNATIONAL CONFERENCE ON GRANULAR COMPUTING (GRC 2012), 2012, : 338 - 342
  • [2] Concept Based Information Retrieval Using Semantic Analysis
    Sherimon, P. C.
    Saad, Youssef
    Krishnan, Reshmy
    Sherimon, Vinu
    [J]. 13TH MIDDLE EASTERN SIMULATION & MODELLING MULTICONFERENCE (MESM 2012) 3RD GAMEON-ARABIA CONFERENCE, 2012, : 74 - 78
  • [3] Concept-based approach for information retrieval
    Wu, Chen
    Zhang, Quan
    Jia, Ning
    [J]. Journal of Southeast University (English Edition), 2006, 22 (03) : 324 - 329
  • [4] Enhancing information retrieval through concept-based language modeling and semantic smoothing
    Lhadj, Lynda Said
    Boughanem, Mohand
    Amrouche, Karima
    [J]. JOURNAL OF THE ASSOCIATION FOR INFORMATION SCIENCE AND TECHNOLOGY, 2016, 67 (12) : 2909 - 2927
  • [5] Semantic annotation for concept-based cross-language medical information retrieval
    Volk, M
    Ripplinger, B
    Vintar, S
    Buitelaar, P
    Raileanu, D
    Sacaleanu, B
    [J]. INTERNATIONAL JOURNAL OF MEDICAL INFORMATICS, 2002, 67 (1-3) : 97 - 112
  • [6] CONCEPT-BASED RETRIEVAL OF HYPERMEDIA INFORMATION - FROM TERM INDEXING TO SEMANTIC HYPERINDEXING
    ARENTS, HC
    BOGAERTS, WFL
    [J]. INFORMATION PROCESSING & MANAGEMENT, 1993, 29 (03) : 373 - 386
  • [7] Concept-based image retrieval using the new semantic similarity measurement
    Choi, JH
    Cho, MY
    Park, SH
    Kim, P
    [J]. COMPUTATIONAL SCIENCE AND ITS APPLICATIONS - ICCSA 2003, PT 1, PROCEEDINGS, 2003, 2667 : 79 - 88
  • [8] Using WordNet for Concept-Based Document Indexing in Information Retrieval
    Boubekeur, Fatiha
    Boughanem, Mohand
    Tamine, Lynda
    Daoud, Mariam
    [J]. SEMAPRO 2010: THE FOURTH INTERNATIONAL CONFERENCE ON ADVANCES IN SEMANTIC PROCESSING, 2010, : 151 - 157
  • [9] Improving information retrieval by concept-based ranking
    Mehlitz, M
    Li, F
    [J]. HUMAN INTERACTION WITH MACHINES, 2006, : 167 - +
  • [10] Towards semantic search and inference in electronic medical records: An approach using concept-based information retrieval
    Koopman, Bevan
    Bruza, Peter
    Sitbon, Laurianne
    Lawley, Michael
    [J]. AUSTRALASIAN MEDICAL JOURNAL, 2012, 5 (09) : 482 - 488