Reasoning about Textual Similarity in a Web-Based Information Access System

Cited by: 4
Authors
Cohen W.W. [1]
Affiliations
[1] AT&T Labs - Research, Florham Park, NJ 07932
Keywords
Information agent; Information extraction; Information integration; Information retrieval; Ranked retrieval; Similarity
DOI
10.1023/A:1010031208520
Abstract
The degree to which information sources are pre-processed by Web-based information systems varies greatly. In search engines like Altavista, little pre-processing is done, while in "knowledge integration" systems, complex site-specific "wrappers" are used to integrate different information sources into a common database representation. In this paper we describe an intermediate point between these two models. In our system, information sources are converted into a highly structured collection of small fragments of text. Database-like queries to this structured collection of text fragments are approximated using a novel logic called WHIRL, which combines inference in the style of deductive databases with ranked retrieval methods from information retrieval (IR). WHIRL allows queries that integrate information from multiple Web sites, without requiring the extraction and normalization of object identifiers that can be used as keys; instead, operations that in conventional databases require equality tests on keys are approximated using IR similarity metrics for text. This leads to a reduction in the amount of human engineering required to field a knowledge integration system. Experimental evidence is given showing that many information sources can be easily modeled with WHIRL, and that inferences in the logic are both accurate and efficient.
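The idea of replacing an equality test on keys with an IR similarity score can be illustrated with a small sketch. This is not WHIRL itself: the abstract does not give the logic's syntax or its search procedure, so the code below simply scores every pair of text fragments by TF-IDF cosine similarity and returns a ranked list. The function names, the whitespace tokenizer, the smoothed IDF formula, and the sample company names are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization stands in for real IR preprocessing.
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one unit-length TF-IDF vector (a dict term -> weight) per string."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: number of fragments in which each term occurs.
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        # Assumption: log-scaled TF and a smoothed IDF, log((n + 1) / df),
        # so that every weight is strictly positive.
        vec = {t: (1 + math.log(c)) * math.log((n + 1) / df[t]) for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors.append({t: w / norm for t, w in vec.items()} if norm else {})
    return vectors


def cosine(u, v):
    # Both vectors are unit length, so the dot product is the cosine.
    if len(u) > len(v):
        u, v = v, u
    return sum(w * v.get(t, 0.0) for t, w in u.items())


def similarity_join(left, right, k=3):
    """Rank (left, right) pairs by textual similarity instead of key equality."""
    vecs = tfidf_vectors(left + right)
    lv, rv = vecs[:len(left)], vecs[len(left):]
    scored = [
        (cosine(a, b), l, r)
        for l, a in zip(left, lv)
        for r, b in zip(right, rv)
    ]
    scored = [s for s in scored if s[0] > 0]
    scored.sort(key=lambda s: s[0], reverse=True)
    return scored[:k]


# Hypothetical name fields taken from two different Web sites.
site_a = ["AT&T Labs Research", "Bell Atlantic Corporation", "Lucent Technologies"]
site_b = ["Lucent Technologies Inc", "AT&T Labs", "Bell Atlantic Corp"]

for score, a, b in similarity_join(site_a, site_b, k=3):
    print(f"{score:.3f}  {a!r} ~ {b!r}")
```

No shared key links the two lists, yet the ranked output pairs each name with its variant on the other site. Scoring every pair is quadratic in the number of fragments and is used here only for clarity; the abstract's efficiency claim refers to WHIRL's own inference method, which this sketch does not reproduce.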
Pages: 65 - 86
Page count: 21
Related papers
50 in total
  • [41] Student teachers' acceptance of a web-based information system
    Cheung, Emily Yee Man
    Sachs, John
    PSYCHOLOGIA, 2006, 49 (02) : 132 - 141
  • [42] A GIS web-based traffic accident information system
    Evangelidis, K.
    Basbas, S.
    Papaioannou, P.
    INTERNET SOCIETY II: ADVANCES IN EDUCATION, COMMERCE & GOVERNANCE, 2006, 36 : 363 - 372
  • [43] Web-Based Geographic Information System for UWRSR Evaluations
    Zeng, W. H.
    Zhang, Y. J.
    Liu, J. L.
    Yang, Z. F.
    JOURNAL OF ENVIRONMENTAL INFORMATICS, 2007, 10 (02) : 75 - 81
  • [44] The information feedback system of Tsinghua web-based school
    Huang, L
    Jiang, DX
    Luo, NL
    INTERNATIONAL CONFERENCE ON COMPUTERS IN EDUCATION, VOLS I AND II, PROCEEDINGS, 2002, : 627 - 631
  • [45] WebMIRS: Web-based Medical Information Retrieval System
    Long, LR
    Pillemer, SR
    Lawrence, RC
    Goh, GH
    Neve, L
    Thoma, GR
    STORAGE AND RETRIEVAL FOR IMAGE AND VIDEO DATABASES VI, 1997, 3312 : 392 - 403
  • [46] Web-Based Ordering Information System on Food Store
    Herikson, R.
    Kurniati, P. S.
    2ND INTERNATIONAL CONFERENCE ON INFORMATICS, ENGINEERING, SCIENCE, AND TECHNOLOGY (INCITEST 2019), 2019, 662
  • [47] DWINS: A dynamically configurable web-based information system
    Dong, XJ
    Du, F
    Ni, LM
    WECWIS 2000: SECOND INTERNATIONAL WORKSHOP ON ADVANCED ISSUES OF E-COMMERCE AND WEB-BASED INFORMATION SYSTEMS, PROCEEDING, 2000, : 85 - 92
  • [48] Design and Implementation of a Web-Based Faculty Information System
    Franco, Geanne Ross L.
    TENCON 2015 - 2015 IEEE REGION 10 CONFERENCE, 2015,
  • [49] SIS - a web-based Radiation Therapy Information System
    Baier, K.
    Flentje, M.
    STRAHLENTHERAPIE UND ONKOLOGIE, 2011, 187 : 12 - 12
  • [50] Web-based management information system of relay protection
    Wang, Yuansheng
    Dianli Xitong Zidonghua/Automation of Electric Power Systems, 2001, 25 (05) : 64 - 66