GRASP: a goodness-of-fit test for classification learning

Cited by: 0
Authors
Javanmard, Adel [1, 2]
Mehrabi, Mohammad [1]
Affiliations
[1] Univ Southern Calif, Data Sci & Operat Dept, Los Angeles, CA USA
[2] Univ Southern Calif, Data Sci & Operat Dept, 300 Bridge Hall,3670 Trousdale Pkwy, Los Angeles, CA 90089 USA
Keywords
classification; goodness-of-fit; hypothesis testing; model-X; false discovery rate; logistic regression; models; inference; error
DOI
10.1093/jrsssb/qkad106
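Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract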
Performance of classifiers is often measured in terms of average accuracy on test data. Despite being a standard measure, average accuracy fails to characterise the fit of the model to the underlying conditional law of labels given the feature vector (Y | X), e.g. due to model misspecification, overfitting, and high-dimensionality. In this paper, we consider the fundamental problem of assessing the goodness-of-fit for a general binary classifier. Our framework does not make any parametric assumption on the conditional law Y | X and treats it as a black-box oracle model which can be accessed only through queries. We formulate the goodness-of-fit assessment problem as a tolerance hypothesis test of the form $H_0 : \mathbb{E}\big[D_f\big(\mathrm{Bern}(\eta(X)) \,\|\, \mathrm{Bern}(\hat{\eta}(X))\big)\big] \le \tau$, where $D_f$ represents an f-divergence function, and $\eta(x)$, $\hat{\eta}(x)$, respectively, denote the true and the estimated likelihood that a feature vector x admits a positive label. We propose a novel test, called Goodness-of-fit with Randomisation and Scoring Procedure (GRASP), for testing $H_0$, which works in finite sample settings regardless of the feature distribution (distribution-free). We also propose model-X GRASP, designed for model-X settings where the joint distribution of the feature vector is known. Model-X GRASP uses this distributional information to achieve better power. We evaluate the performance of our tests through extensive numerical experiments.
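As an illustration only, the quantity bounded in the null hypothesis can be sketched numerically. The snippet below is not the GRASP test and is not taken from the paper: it assumes the KL divergence as the choice of f-divergence, and it assumes the true likelihoods are known, which the abstract says is not the case in practice (the conditional law is reachable only through queries). The function names `bernoulli_kl` and `mean_divergence` are hypothetical.

```python
import math


def bernoulli_kl(p, q, eps=1e-12):
    """KL divergence between Bern(p) and Bern(q).

    KL is one example of an f-divergence (f(t) = t log t); choosing it
    here is an assumption made for illustration.
    """
    p = min(max(p, 0.0), 1.0)
    # Clip q away from 0 and 1 so the logarithms stay finite.
    q = min(max(q, eps), 1.0 - eps)
    total = 0.0
    if p > 0.0:
        total += p * math.log(p / q)
    if p < 1.0:
        total += (1.0 - p) * math.log((1.0 - p) / (1.0 - q))
    return total


def mean_divergence(eta, eta_hat):
    """Sample average of the per-point divergence.

    A plug-in stand-in for the expectation over X in the null hypothesis,
    assuming (unrealistically) that the true likelihoods eta are known.
    """
    if len(eta) != len(eta_hat) or len(eta) == 0:
        raise ValueError("eta and eta_hat must be non-empty and equal length")
    pairs = zip(eta, eta_hat)
    return sum(bernoulli_kl(p, q) for p, q in pairs) / len(eta)


# Hypothetical true and estimated likelihoods for three feature vectors.
eta = [0.5, 0.9, 0.2]
eta_hat = [0.5, 0.8, 0.3]
print(mean_divergence(eta, eta_hat))
```

Under the null, this average would not exceed the tolerance $\tau$. Because the true likelihoods are unobserved, a plug-in computation like this one is not available in practice, which is the gap the proposed test addresses.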
Pages: 215-245
Page count: 31