Evaluation with Confusable Ground Truth

Authors
Li, Jiyi [1]
Yoshikawa, Masatoshi [1]
Affiliations
[1] Kyoto Univ, Grad Sch Informat, Sakyo Ku, Yoshida Honmachi, Kyoto 6068501, Japan
Keywords
Human rating; Ground truth; Evaluation; Confusability
DOI
10.1007/978-3-319-48051-0_32
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Subjective judgment with human rating has been an important way of constructing ground truth for evaluation in research areas including information retrieval. Researchers aggregate the ratings of an instance into a single score, using statistical measures or label aggregation methods, to evaluate proposed approaches and baselines. However, the rating distributions of instances can differ even when their aggregated scores are the same. We define the term confusability, which represents how confused the reviewers are about an instance. Through an exploratory study, we find that confusability has a prominent influence on evaluation results. We therefore propose a novel evaluation solution with several effective confusability measures and confusability-aware evaluation methods. They can be used as a supplement to existing rating aggregation and evaluation methods.
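To illustrate the abstract's central observation, the following is a minimal sketch of how two instances can share an aggregated score while their rating distributions differ. The ratings are made-up values, and Shannon entropy is used only as an assumed stand-in for a confusability measure; the abstract does not say which measures the paper proposes, and `rating_entropy` is a hypothetical helper name.

```python
import math
from collections import Counter
from statistics import mean


def rating_entropy(ratings):
    """Shannon entropy (in bits) of the empirical rating distribution.

    Assumption: entropy is an illustrative proxy for confusability,
    not a measure taken from the paper.
    """
    counts = Counter(ratings)
    n = len(ratings)
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())


# Hypothetical ratings on a 1-5 scale from four reviewers per instance.
agreed = [3, 3, 3, 3]  # reviewers agree
split = [1, 5, 1, 5]   # reviewers are divided

for name, ratings in [("agreed", agreed), ("split", split)]:
    print(name, "mean:", mean(ratings), "entropy:", rating_entropy(ratings))
```

Both instances have the same mean rating, yet the entropy of the second is higher, so an evaluation that only looks at the aggregated score would treat them identically.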
Pages: 363-369
Number of pages: 7