Do We Need More Training Data?

Cited by: 0
Authors
Xiangxin Zhu
Carl Vondrick
Charless C. Fowlkes
Deva Ramanan
Affiliations
[1] UC Irvine, Department of Computer Science
[2] MIT, CSAIL
Source
International Journal of Computer Vision, 2016, 119 (01) : 76 - 92
Keywords
Object detection; Mixture models; Part models
DOI
Not available
Abstract
Datasets for training object recognition systems are steadily increasing in size. This paper investigates the question of whether existing detectors will continue to improve as data grows, or saturate in performance due to limited model complexity and the Bayes risk associated with the feature spaces in which they operate. We focus on the popular paradigm of discriminatively trained templates defined on oriented gradient features. We investigate the performance of mixtures of templates as the number of mixture components and the amount of training data grows. Surprisingly, even with proper treatment of regularization and “outliers”, the performance of classic mixture models appears to saturate quickly (∼10 templates and ∼100 positive training examples per template). This is not a limitation of the feature space as compositional mixtures that share template parameters via parts and that can synthesize new templates not encountered during training yield significantly better performance. Based on our analysis, we conjecture that the greatest gains in detection performance will continue to derive from improved representations and learning algorithms that can make efficient use of large datasets.
Pages: 76 - 92
Number of pages: 16
Related papers
  • [1] Do We Need More Training Data?
    Zhu, Xiangxin
    Vondrick, Carl
    Fowlkes, Charless C.
    Ramanan, Deva
    [J]. INTERNATIONAL JOURNAL OF COMPUTER VISION, 2016, 119 (01) : 76 - 92
  • [2] DO WE REALLY NEED MORE TRAINING DATA FOR OBJECT LOCALIZATION
    Li, Hongyang
    Liu, Yu
    Zhang, Xin
    An, Zhecheng
    Wang, Jingjing
    Chen, Yibo
    Tong, Jihong
    [J]. 2017 24TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2017, : 775 - 779
  • [3] Do We Need More Training Data or Better Models for Object Detection?
    Zhu, Xiangxin
    Vondrick, Carl
    Ramanan, Deva
    Fowlkes, Charless C.
    [J]. PROCEEDINGS OF THE BRITISH MACHINE VISION CONFERENCE 2012, 2012,
  • [4] We Don't Need Replication, but We Do Need More Data
    Francis, Gregory
    [J]. EUROPEAN JOURNAL OF PERSONALITY, 2013, 27 (02) : 125 - 126
  • [5] HOW MUCH MORE DATA DO WE NEED
    WELSH, RS
    [J]. AMERICAN PSYCHOLOGIST, 1976, 31 (07) : 538 - 538
  • [6] Do We Need More Training Samples For Text Classification?
    Zheng, Wanwan
    Jin, Mingzhe
    [J]. PROCEEDINGS OF 2018 ARTIFICIAL INTELLIGENCE AND CLOUD COMPUTING CONFERENCE (AICCC 2018), 2018, : 121 - 128
  • [7] SEDATION IN THE ENDOSCOPY DEPARTMENT - DO WE NEED MORE TRAINING?
    Mohanaruban, A.
    Bryce, K.
    Radhakrishnan, A.
    Gallaher, J.
    Trembling, P.
    Johnson, G.
    [J]. GUT, 2014, 63 : A47 - A47
  • [8] But We Need to Do More ...
    Dimick, Justin B.
    [J]. ANNALS OF SURGERY, 2020, 272 (04) : E263 - E263
  • [9] Communicating with patients' families and relatives: Do we need more training?
    Razai, Mohammad S.
    [J]. MEDICAL TEACHER, 2018, 40 (08) : 870 - 870
  • [10] Do we need more engineers?
    Sharp, John
    [J]. MATERIALS WORLD, 2009, 17 (11) : 3 - 3