Automatic search from streaming data

Cited by: 0
Authors
Anni R. Coden
Eric W. Brown
Affiliation
[1] IBM, T.J. Watson Research Center
Source
Information Retrieval | 2006 / Vol. 9
Keywords
Speech retrieval; Text mining; Information retrieval
DOI
Not available
Abstract
Streaming data poses a variety of new and interesting challenges for information retrieval and text analysis. Unlike static document collections, which are typically analyzed and indexed off-line to support ad-hoc queries, streaming data often must be analyzed on the fly and acted on as the data passes through the analysis system. Speech is one example of streaming data that is a challenge to exploit, yet has significant potential to provide value in a knowledge management system. We are specifically interested in techniques that analyze streaming data and automatically find collateral information, or information that clarifies, expands, and generally enhances the value of the streaming data. We present a system that analyzes a data stream and automatically finds documents related to the current topic of discussion in the data stream. Experimental results show that the system generates result lists with an average precision at 10 hits of better than 60%. We also present a hit-list re-ranking technique based on named entity analysis and automatic text categorization that can improve the search results by 6%–12%.
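As an illustration of the metric quoted in the abstract, precision at 10 hits is the fraction of the top 10 returned documents that are judged relevant. The following is a minimal Python sketch of that calculation, not the authors' evaluation code; the function name `precision_at_k`, the document IDs, and the relevance judgments are all hypothetical.

```python
from typing import Sequence, Set


def precision_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int = 10) -> float:
    """Fraction of the top-k ranked documents that are relevant.

    Divides by k (not by the number of hits returned), so a hit list
    shorter than k is penalized for the missing positions.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_ids)
    return hits / k


# Hypothetical hit list and relevance judgments, for illustration only.
ranked = [f"doc{i}" for i in range(1, 11)]
relevant = {"doc1", "doc2", "doc4", "doc5", "doc7", "doc9", "doc12"}
score = precision_at_k(ranked, relevant, k=10)
print(f"P@10 = {score:.2f}")
```

Here six of the ten returned documents are in the relevant set, so the sketch reports a precision at 10 of 0.60. Averaging this value over a set of queries would give a figure comparable in kind to the one reported in the abstract.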
Pages: 95-109
Page count: 14