Tradeoffs in Accuracy and Efficiency in Supervised Learning Methods

Cited by: 38

Authors:

Collingwood, Loren [1]

Wilkerson, John [1]

Affiliation:

[1] Univ Washington, Dept Polit Sci, Box 353530, 101 Gowen Hall, Seattle, WA 98195 USA

Source:

JOURNAL OF INFORMATION TECHNOLOGY & POLITICS | 2012, Vol. 9, No. 3

Keywords:

Machine learning; supervised learning; text classification

DOI:

10.1080/19331681.2012.669191

Chinese Library Classification (CLC) number:

G2 [Information and Knowledge Dissemination]

Discipline classification code:

05; 0503

Abstract:

Words are an increasingly important source of data for social science research. Automated classification methodologies hold the promise of substantially lowering the costs of analyzing large amounts of text. In this article, we consider a number of questions of interest to prospective users of supervised learning methods, which are used to automatically classify events based on a pre-existing classification system. Although information scientists devote considerable attention to assessing the performance of different supervised learning algorithms and feature representations, the questions asked are often less directly relevant to the more practical concerns of social scientists. The first question prospective social science users are likely to ask is, How well do such methods work? The second is, How much human labeling effort is required? The third is, How do we assess whether virgin cases have been automatically classified with sufficient accuracy? We address these questions in the context of a particular dataset, the Congressional Bills Project, which includes more than 400,000 bill titles that humans have classified into 20 policy topics. This corpus offers an unusual opportunity to assess the performance of different algorithms, the impact of sample size, and the benefits of ensemble learning as a means for estimating classification accuracy.

Pages: 298-318

Page count: 21
