A multiple testing protocol for exploratory data analysis and the local misclassification rate

Cited by: 1
Authors
Watts, David D. [1]
Habiger, Joshua D. [1]
Affiliations
[1] Oklahoma State Univ, Dept Stat, Stillwater, OK 74078 USA
Keywords
Classification; False discovery rate; Local false discovery rate; Local misclassification rate; Statistical significance; P-values; Hypothesis
DOI
10.1080/03610926.2017.1361982
Chinese Library Classification (CLC)
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract
A false discovery rate (FDR) procedure is often employed in exploratory data analysis to determine which among thousands or millions of attributes are worthy of follow-up analysis. However, such procedures tend to discover the most statistically significant attributes, which need not be the most worthy of further exploration. This article provides a new FDR-controlling method that allows the nature of the exploratory analysis to be considered when determining which attributes are discovered. To illustrate, a study in which the objective is to classify discoveries into one of several clusters is considered, and a new FDR method that minimizes the misclassification rate is developed. It is shown analytically and with simulation that the proposed method performs better than competing methods.
Pages: 3588-3604
Number of pages: 17