Crowdsourced Top-k Algorithms: An Experimental Evaluation

Cited by: 31

Authors:

Zhang, Xiaohang [1]

Li, Guoliang [1]

Feng, Jianhua [1]

Affiliations:

[1] Tsinghua Univ, Dept Comp Sci, Tsinghua Natl Lab Informat Sci & Technol TNList, Beijing, Peoples R China

Source:

PROCEEDINGS OF THE VLDB ENDOWMENT | 2016 / Vol. 9 / No. 08

Keywords:

DOI:

10.14778/2921558.2921559

Chinese Library Classification:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Crowdsourced top-k computation has attracted significant attention recently, thanks to emerging crowdsourcing platforms, e.g., Amazon Mechanical Turk and CrowdFlower. Crowdsourced top-k algorithms ask the crowd to compare objects and infer the top-k objects from the crowdsourced comparison results. The crowd may return incorrect answers, but traditional top-k algorithms cannot tolerate errors from the crowd. To address this problem, the database and machine-learning communities have independently studied the crowdsourced top-k problem. The database community proposes heuristic-based solutions, while the machine-learning community proposes learning-based methods (e.g., maximum likelihood estimation). However, these two types of techniques have not been compared systematically under the same experimental framework, so it is difficult for a practitioner to decide which algorithm to adopt. Furthermore, the experimental evaluation in existing studies has several weaknesses: some methods assume the crowd returns high-quality results, and some algorithms are tested only in simulated experiments. To alleviate these limitations, in this paper we present a comprehensive comparison of crowdsourced top-k algorithms. Using various synthetic and real datasets, we evaluate each algorithm in terms of result quality and efficiency on real crowdsourcing platforms. We reveal the characteristics of different techniques and provide guidelines on selecting appropriate algorithms for various scenarios.

Pages: 612-623

Page count: 12
