Evaluation with Confusable Ground Truth

Cited by: 0

Authors:

Li, Jiyi [1]

Yoshikawa, Masatoshi [1]

Affiliations:

[1] Kyoto Univ, Grad Sch Informat, Sakyo Ku, Yoshida Honmachi, Kyoto 6068501, Japan

Source:

INFORMATION RETRIEVAL TECHNOLOGY, AIRS 2016 | 2016 / Vol. 9994

Keywords:

Human rating; Ground truth; Evaluation; Confusability

DOI:

10.1007/978-3-319-48051-0_32

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Subjective judgment through human rating has been an important way of constructing ground truth for evaluation in research areas including information retrieval. Researchers aggregate the ratings of an instance into a single score, using statistical measures or label aggregation methods, to evaluate proposed approaches and baselines. However, the rating distributions of instances are diverse even if the aggregated scores are the same. We define the term confusability, which represents how confused the reviewers are about the instances. Through an exploratory study, we find that confusability has a prominent influence on evaluation results. We thus propose a novel evaluation solution with several effective confusability measures and confusability-aware evaluation methods. They can be used as a supplement to existing rating aggregation methods and evaluation methods.
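The point that instances with the same aggregated score can have very different rating distributions can be illustrated with a small sketch. The abstract does not name the confusability measures the authors propose, so the two used here (population standard deviation and Shannon entropy of the rating distribution) are assumptions chosen only for illustration, as are the function name `rating_entropy` and the sample ratings.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def rating_entropy(ratings):
    """Shannon entropy (in bits) of the empirical rating distribution.

    Used here as a hypothetical stand-in for a confusability measure;
    the record does not say which measures the paper actually uses.
    """
    counts = Counter(ratings)
    total = len(ratings)
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


# Two made-up instances, each rated by four reviewers on a 1-5 scale.
unanimous = [3, 3, 3, 3]  # every reviewer agrees
split = [1, 5, 1, 5]      # reviewers disagree strongly

for name, ratings in (("unanimous", unanimous), ("split", split)):
    print(
        f"{name}: mean={mean(ratings)}, "
        f"std={pstdev(ratings)}, entropy={rating_entropy(ratings)}"
    )
```

Both instances aggregate to a mean of 3, yet the split instance has a standard deviation of 2 and an entropy of 1 bit, while the unanimous one has zero for both. An evaluation that looks only at the aggregated score would treat the two instances identically.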

Pages: 363-369

Page count: 7
