Evaluation with Confusable Ground Truth

Authors
Li, Jiyi [1]
Yoshikawa, Masatoshi [1]
Affiliations
[1] Kyoto Univ, Grad Sch Informat, Sakyo Ku, Yoshida Honmachi, Kyoto 6068501, Japan
Keywords
Human rating; Ground truth; Evaluation; Confusability
DOI
10.1007/978-3-319-48051-0_32
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Subjective judgment through human rating has been an important way of constructing ground truth for evaluation in research areas including information retrieval. Researchers aggregate the ratings of an instance into a single score, using statistical measures or label aggregation methods, to evaluate proposed approaches and baselines. However, the rating distributions of instances are diverse even when the aggregated scores are the same. We define the term confusability, which represents how confused the reviewers are about the instances. Through an exploratory study, we find that confusability has a prominent influence on the evaluation results. We thus propose a novel evaluation solution with several effective confusability measures and confusability-aware evaluation methods. They can be used as a supplement to existing rating aggregation methods and evaluation methods.
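The observation that instances with the same aggregated score can have very different rating distributions can be illustrated with a small sketch. The abstract does not specify the paper's confusability measures, so the population standard deviation and Shannon entropy used here are stand-in assumptions, and the ratings, the names `instance_a` and `instance_b`, and the helper `rating_entropy` are hypothetical.

```python
from collections import Counter
from math import log2, sqrt
from statistics import mean, pstdev


def rating_entropy(ratings):
    """Shannon entropy (in bits) of the empirical rating distribution.

    Assumption: entropy is only one plausible proxy for disagreement
    among reviewers; it is not taken from the paper.
    """
    counts = Counter(ratings)
    total = len(ratings)
    # Sum p * log2(1 / p) over the observed rating values.
    return sum((c / total) * log2(total / c) for c in counts.values())


# Hypothetical ratings on a 1-5 scale from five reviewers each.
instance_a = [3, 3, 3, 3, 3]  # full agreement
instance_b = [1, 2, 3, 4, 5]  # maximal spread

for name, ratings in (("instance_a", instance_a), ("instance_b", instance_b)):
    print(
        name,
        "mean:", mean(ratings),
        "std:", round(pstdev(ratings), 3),
        "entropy:", round(rating_entropy(ratings), 3),
    )
```

Both hypothetical instances aggregate to the same mean rating, yet the spread and entropy differ sharply, which is the kind of gap a confusability measure is meant to expose.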
Pages: 363-369
Page count: 7