Iterative query selection for opaque search engines with pseudo relevance feedback

Cited by: 0
Authors
Reuben, Maor [1 ,2 ]
Elyashar, Aviad [1 ,3 ]
Puzis, Rami [1 ,2 ]
Affiliations
[1] Telekom Innovat Labs, Beer Sheva, Israel
[2] Ben Gurion Univ Negev, Dept Software & Informat Syst Engn, Ben Gurion, Israel
[3] Sami Shamoon Coll Engn, Dept Comp Sci, Beer Sheva, Israel
Keywords
Query selection; Opaque search engine; Pseudo relevance feedback; Fake news;
DOI
10.1016/j.eswa.2022.117027
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Retrieving information from an online search engine is the first and most important step in many data mining tasks, such as fake news detection. Most of the search engines currently available on the web, including all social media platforms, are black boxes (i.e., opaque) that support only short keyword queries. In these settings, it is challenging to retrieve all posts and comments discussing a particular news item automatically and on a large scale. In this paper, we propose a method for generating short keyword queries given a prototype document. The proposed iterative query selection (IQS) algorithm interacts with the opaque search engine to iteratively improve the query by maximizing the number of relevant results retrieved. Our evaluation of IQS was performed on the Twitter TREC Microblog 2012 and TREC-COVID 2019 datasets and demonstrated the algorithm's superior performance compared to state-of-the-art methods. In addition, we implemented the IQS algorithm to automatically collect a large-scale dataset of about 70K true and fake news items for the fake news detection task. The dataset, which we have made publicly available to the research community, includes over 22M accounts and 61M tweets. We demonstrate the usefulness of the dataset for the fake news detection task, achieving state-of-the-art performance.
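The loop described in the abstract (propose a short keyword query, send it to the opaque engine, estimate how many results are relevant, keep the better query) can be illustrated with a minimal sketch. This is not the authors' IQS implementation. The random one-word swap, the Jaccard-overlap relevance threshold, the toy corpus, and all names (`select_query`, `count_relevant`, `toy_search`) are assumptions made for illustration only.

```python
import random
import re
from typing import Callable, List, Set, Tuple

# Hypothetical black-box interface: keywords in, retrieved texts out.
SearchFn = Callable[[List[str]], List[str]]


def tokens(text: str) -> Set[str]:
    """Lowercased alphanumeric word set of a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def count_relevant(results: List[str], proto: Set[str], threshold: float) -> int:
    """Pseudo relevance estimate (an assumption of this sketch): a result
    counts as relevant when its Jaccard overlap with the prototype
    document reaches the threshold."""
    relevant = 0
    for text in results:
        words = tokens(text)
        union = proto | words
        if union and len(proto & words) / len(union) >= threshold:
            relevant += 1
    return relevant


def select_query(prototype: str, search: SearchFn, max_len: int = 3,
                 iterations: int = 100, threshold: float = 0.3,
                 seed: int = 0) -> Tuple[List[str], int]:
    """Hill climbing over short keyword queries drawn from the prototype."""
    rng = random.Random(seed)
    proto = tokens(prototype)
    vocab = sorted(proto)
    if not vocab:
        return [], 0
    best = rng.sample(vocab, min(max_len, len(vocab)))
    best_score = count_relevant(search(best), proto, threshold)
    for _ in range(iterations):
        candidate = list(best)
        # Swap one keyword for a random word of the prototype.
        candidate[rng.randrange(len(candidate))] = rng.choice(vocab)
        # Drop duplicate keywords, so the query may get shorter.
        candidate = list(dict.fromkeys(candidate))
        score = count_relevant(search(candidate), proto, threshold)
        if score > best_score:  # keep only strict improvements
            best, best_score = candidate, score
    return best, best_score


# Toy stand-in for an opaque engine: returns documents containing all keywords.
CORPUS = [
    "vaccine causes autism claim debunked by study",
    "new study debunked the vaccine autism claim",
    "football match ends in a draw",
    "study finds coffee improves focus",
]


def toy_search(keywords: List[str]) -> List[str]:
    return [doc for doc in CORPUS if all(k in tokens(doc) for k in keywords)]


if __name__ == "__main__":
    prototype = "Study debunked claim that vaccine causes autism"
    print(select_query(prototype, toy_search))
```

In this sketch the engine is only ever called through `search`, so the same loop would work against any keyword-only interface. A real deployment would also need rate limiting and a stronger relevance estimate than word overlap.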
Pages: 11
Related papers
50 in total
  • [21] Semantics-aware query expansion using pseudo-relevance feedback
    Singh, Pankaj
    Bhowmick, Plaban Kumar
    JOURNAL OF INFORMATION SCIENCE, 2023,
  • [22] Pseudo-relevance feedback based query expansion using boosting algorithm
    Rasheed, Imran
    Banka, Haider
    Khan, Hamaid Mahmood
    ARTIFICIAL INTELLIGENCE REVIEW, 2021, 54 (08) : 6101 - 6124
  • [23] Improving Pseudo-Relevance Feedback via Tweet Selection
    Miyanishi, Taiki
    Seki, Kazuhiro
    Uehara, Kuniaki
    PROCEEDINGS OF THE 22ND ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT (CIKM'13), 2013, : 439 - 448
  • [24] Query recommendation using query logs in search engines
    Baeza-Yates, R
    Hurtado, C
    Mendoza, M
    CURRENT TRENDS IN DATABASE TECHNOLOGY - EDBT 2004 WORKSHOPS, PROCEEDINGS, 2004, 3268 : 588 - 596
  • [25] Query recommendation using query logs in search engines
    Baeza-Yates, Ricardo
    Hurtado, Carlos
    Mendoza, Marcelo
    De Chile, Universidad
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2004, 3268 : 588 - 596
  • [26] Evaluation of Pseudo Relevance Feedback Techniques for Cross Vertical Aggregated Search
    Ziak, Hermann
    Kern, Roman
    EXPERIMENTAL IR MEETS MULTILINGUALITY, MULTIMODALITY, AND INTERACTION, 2015, 9283 : 92 - 103
  • [27] APRF-Net: Attentive Pseudo-Relevance Feedback Network for Query Categorization
    Ahmadvand, Ali
    Zahiri, Sayyed M.
    Hughes, Simon
    Al Jadda, Khalifa
    Kallumadi, Surya
    Agichtein, Eugene
    SIGIR '21 - PROCEEDINGS OF THE 44TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2021, : 1603 - 1607
  • [28] Improving search engines by query clustering
    Baeza-Yates, Ricardo
    Hurtado, Carlos
    Mendoza, Marcelo
    JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, 2007, 58 (12) : 1793 - 1804
  • [29] Finding a Good Query-Related Topic for Boosting Pseudo-Relevance Feedback
    Ye, Zheng
    Huang, Jimmy Xiangji
    Lin, Hongfei
    JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, 2011, 62 (04) : 748 - 760
  • [30] Rising relevance in search engines
    Notess, GR
    ONLINE, 1999, 23 (03) : 84 - 86