Temporal-Textual Retrieval: Time and Keyword Search in Web Documents

Cited by: 0
Authors
Khodaei, Ali [1]
Shahabi, Cyrus [1, 3, 4]
Khodaei, Amir [2]
Affiliations
[1] Univ Southern Calif, Dept Comp Sci, Los Angeles, CA 90007 USA
[2] Univ Calif Berkeley, Elect Engn & Comp Sci Dept, Berkeley, CA 94720 USA
[3] Univ Southern Calif, Comp Sci & Elect Engn, Los Angeles, CA USA
[4] Univ Southern Calif, NSFs Integrated Media Syst Ctr IMSC, Los Angeles, CA USA
Keywords
Web Search; Time-aware ranking; Indexing; Temporal information retrieval;
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods];
Subject classification code
081202;
Abstract
As the web ages, many web documents become relevant only to certain time periods, such as web pages containing news and events or those documenting natural phenomena. Hence, to retrieve the most relevant pages, in addition to providing the relevant keywords, one may wish to specify the relevant time period(s) as well, e.g., "Barack Obama 1980-1985". Unfortunately, not much work has been done by industry or academia to support this type of search. To the best of our knowledge, the only way that some search engines exploit the time information in the user query is to filter out those resulting web pages whose publication/modification times are not within the queried time interval. In this paper, we propose a new indexing and ranking framework for temporal-textual retrieval. The framework leverages the classical vector space model and provides a complete scheme for indexing, query processing, and ranking of temporal-textual queries. We propose a variety of approaches to exploit popular keyword and temporal index structures. We present a novel hybrid index structure which indexes both the temporal and the textual aspects of the documents in a unified, integrated manner. We also study how to rank documents by seamlessly combining their temporal and textual features. We develop a new scoring scheme called temporal tf-idf to compute the temporal relevance of a document to a query, and we combine this score with the textual relevance to compute the overall relevance score of the document to the query. We present both a cost model analysis and an extensive set of experiments over real-world datasets (New York Times Annotated Corpus and Freebase) to evaluate the proposed framework and demonstrate its efficiency and effectiveness.
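The abstract does not give the formulas behind temporal tf-idf or the way the two scores are combined, so the following Python sketch is only an illustration of the general idea of ranking by a blend of textual and temporal relevance. Everything in it is an assumption made for the example and not the paper's method: the `Doc` class, the interval-overlap measure in `temporal_score`, the log-scaled tf-idf in `textual_score`, and the linear weight `alpha` are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Interval = Tuple[int, int]  # inclusive (start_year, end_year); an assumed representation


@dataclass
class Doc:
    doc_id: str
    terms: List[str]            # tokenized text of the document
    intervals: List[Interval]   # time periods the document is about


def overlap_years(a: Interval, b: Interval) -> int:
    """Number of years shared by two inclusive intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def textual_score(doc: Doc, query_terms: List[str], docs: List[Doc]) -> float:
    """A common log-scaled tf-idf variant (assumed, not taken from the paper)."""
    n = len(docs)
    score = 0.0
    for term in query_terms:
        tf = doc.terms.count(term)
        df = sum(1 for d in docs if term in d.terms)
        if tf and df:
            score += (1 + math.log(tf)) * math.log(1 + n / df)
    return score


def temporal_score(doc: Doc, query_interval: Interval) -> float:
    """Fraction of the query interval covered by the document (a stand-in measure)."""
    q_len = query_interval[1] - query_interval[0] + 1
    covered = sum(overlap_years(iv, query_interval) for iv in doc.intervals)
    return min(1.0, covered / q_len)


def rank(docs: List[Doc], query_terms: List[str],
         query_interval: Interval, alpha: float = 0.5) -> List[Tuple[str, float]]:
    """Rank by alpha * normalized textual score + (1 - alpha) * temporal score."""
    if not docs:
        return []
    raw = {d.doc_id: textual_score(d, query_terms, docs) for d in docs}
    max_text = max(raw.values()) or 1.0  # avoid dividing by zero when nothing matches
    scored = [
        (d.doc_id,
         alpha * raw[d.doc_id] / max_text
         + (1 - alpha) * temporal_score(d, query_interval))
        for d in docs
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


docs = [
    Doc("a", ["obama", "obama", "chicago"], [(1980, 1985)]),
    Doc("b", ["obama", "senate"], [(2005, 2008)]),
    Doc("c", ["weather"], [(1980, 1985)]),
]
ranking = rank(docs, ["obama"], (1980, 1985))
```

Unlike a hard filter on publication time, a blended score of this kind lets a document that matches the keywords but lies outside the queried period still appear, only lower in the ranking.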
Pages: 288 / +
Number of pages: 25