A faceted approach to information retrieval

Cited by: 1
Author
Chen, Patrick S. [1]
Affiliation
[1] Tatung Univ, Dept Informat Management, 40 Sect 3,Zhongshan N Rd, Taipei 104, Taiwan
Keywords
Facet analysis method; search engine; text retrieval; faceted query; e-Detective
DOI
10.1080/02522667.2008.10699825
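Chinese Library Classification (CLC) number
G25 [Library science, librarianship]; G35 [Information science, information work]
Discipline classification code
1205; 120501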
Abstract
Information retrieval is at the core of many applications, such as natural language understanding and document clustering. The Boolean model is one of the basic approaches to information retrieval, and faceted query belongs to this model. Unfortunately, its binary decision characteristic has kept faceted query from wide use: a query often returns a large quantity of data that must then be filtered manually. In this paper we propose a new faceted method, the Facet Analysis Method, to cope with this problem. We give the Boolean operators an algebraic interpretation to facilitate partial matching, a key feature of information retrieval. In our approach, a query specifies an ideal document, and retrieval is a ranking of the documents in the collection by their similarity to that ideal document. The Facet Analysis Method has the properties of reflexivity, symmetry, and conditioned transitivity, which permits a large variety of applications. To illustrate the applicability and novelty of our approach, we have applied the Facet Analysis Method to construct a high-precision, special-purpose search engine, the e-Detective system, which collects crime information from the Internet automatically. We describe the match program of the search engine: we first organize search concepts to represent our data request, and then use the Facet Analysis Method to calculate similarities between the search target and Web pages. We further describe the error correction and feedback mechanisms used to tune term weights and improve retrieval efficacy. The system is tested on a difficult and complex search task, finding Web pages that auction pirated compact discs. The experimental results are evaluated using the well-recognized recall/precision measure; we obtain an average search precision of 0.59, showing the superiority of this new method.
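The abstract does not give the paper's formulas, so the following is only a minimal sketch of the general idea of reading Boolean operators algebraically to allow partial matches. It assumes, purely for illustration, that OR within a facet is taken as the highest weight among matching terms and that AND across facets is taken as the mean of the facet scores. The names `facet_score`, `similarity`, and `rank`, the term weights, and the sample documents are all hypothetical and are not taken from the Facet Analysis Method or the e-Detective system.

```python
from typing import Dict, List, Set, Tuple

# One facet = a group of alternative terms with illustrative (assumed) weights.
Facet = Dict[str, float]


def facet_score(facet: Facet, doc_terms: Set[str]) -> float:
    """OR within a facet, read here as the best weight among matching terms."""
    return max((w for term, w in facet.items() if term in doc_terms), default=0.0)


def similarity(query: List[Facet], doc_terms: Set[str]) -> float:
    """AND across facets, read here as the mean facet score (an assumption)."""
    if not query:
        return 0.0
    return sum(facet_score(f, doc_terms) for f in query) / len(query)


def rank(query: List[Facet], docs: Dict[str, Set[str]]) -> List[Tuple[str, float]]:
    """Order documents by decreasing similarity to the 'ideal document'."""
    scored = [(name, similarity(query, terms)) for name, terms in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical query: (pirated OR bootleg) AND (cd OR disc) AND (auction OR bid)
query = [
    {"pirated": 1.0, "bootleg": 0.8},
    {"cd": 1.0, "disc": 0.9},
    {"auction": 1.0, "bid": 0.7},
]
docs = {
    "a": {"pirated", "cd", "auction"},  # matches every facet
    "b": {"bootleg", "disc"},           # misses the auction facet
    "c": {"weather"},                   # matches nothing
}
ranking = rank(query, docs)
print(ranking)
```

Under a strict Boolean reading, document `b` would be rejected outright because one facet is unmatched. With the algebraic reading assumed here it receives an intermediate score and is ranked between `a` and `c`, which is the kind of partial match the abstract describes.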
Pages: 631-658
Page count: 28