Tradeoffs in Accuracy and Efficiency in Supervised Learning Methods

Cited by: 38
Authors
Collingwood, Loren [1]
Wilkerson, John [1]
Affiliation
[1] Univ Washington, Dept Polit Sci, Box 353530, 101 Gowen Hall, Seattle, WA 98195 USA
Keywords
Machine learning; supervised learning; text classification
DOI
10.1080/19331681.2012.669191
Chinese Library Classification
G2 [Information and Knowledge Dissemination]
Discipline classification code
05; 0503
Abstract
Words are an increasingly important source of data for social science research. Automated classification methodologies hold the promise of substantially lowering the costs of analyzing large amounts of text. In this article, we consider a number of questions of interest to prospective users of supervised learning methods, which are used to automatically classify events based on a pre-existing classification system. Although information scientists devote considerable attention to assessing the performance of different supervised learning algorithms and feature representations, the questions asked are often less directly relevant to the more practical concerns of social scientists. The first question prospective social science users are likely to ask is, How well do such methods work? The second is, How much human labeling effort is required? The third is, How do we assess whether virgin cases have been automatically classified with sufficient accuracy? We address these questions in the context of a particular dataset, the Congressional Bills Project, which includes more than 400,000 bill titles that humans have classified into 20 policy topics. This corpus offers an unusual opportunity to assess the performance of different algorithms, the impact of sample size, and the benefits of ensemble learning as a means for estimating classification accuracy.
Pages: 298-318
Number of pages: 21