Maximally selected chi-square statistics for ordinal variables

Cited by: 23
Author
Boulesteix, AL [1]
Affiliation
[1] Univ Munich, Dept Stat, D-80799 Munich, Germany
Keywords
association test; contingency table; exact distribution; variable selection; selection bias
DOI
10.1002/bimj.200510161
Chinese Library Classification number
Q [Biological Sciences]
Subject classification code
07; 0710; 09
Abstract
The association between a binary variable Y and a variable X having an at least ordinal measurement scale might be examined by selecting a cutpoint in the range of X and then performing an association test for the resulting 2 × 2 contingency table using the chi-square statistic. The distribution of the maximally selected chi-square statistic (i.e., the maximal chi-square statistic over all possible cutpoints) under the null hypothesis of no association between X and Y differs from the standard chi-square distribution. In recent decades, this topic has been studied extensively for continuous X variables, but not for non-continuous variables of at least ordinal measurement scale (which include, e.g., classical ordinal or discretized continuous variables). In this paper, we suggest an exact method to determine the finite-sample distribution of maximally selected chi-square statistics in this context. This novel approach can be seen as a method to measure the association between a binary variable and variables of different types having an at least ordinal scale (ordinal, discretized continuous, etc.). As an illustration, the method is applied to a new data set describing pregnancy and birth for 811 babies.
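To make the statistic described in the abstract concrete, the following is a minimal Python sketch. It computes the Pearson chi-square statistic for the 2 × 2 table at every cutpoint of an ordinal X and keeps the maximum. The function names (`chi2_2x2`, `max_selected_chi2`, `permutation_p_value`) are hypothetical and chosen for illustration only. Note that the p-value here is a Monte Carlo permutation approximation, swapped in as a simple stand-in: it is not the exact finite-sample method proposed in the paper, whose details the abstract does not give.

```python
import random


def chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A zero margin means the table carries no information about association.
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def max_selected_chi2(x, y):
    """Maximum chi-square over all cutpoints that split x into {x <= cut} and {x > cut}.

    x: ordinal values coded as comparable numbers; y: binary labels coded 0/1.
    """
    levels = sorted(set(x))
    best = 0.0
    # The largest level is skipped: cutting there would leave the upper group empty.
    for cut in levels[:-1]:
        a = sum(1 for xi, yi in zip(x, y) if xi <= cut and yi == 1)
        b = sum(1 for xi, yi in zip(x, y) if xi <= cut and yi == 0)
        c = sum(1 for xi, yi in zip(x, y) if xi > cut and yi == 1)
        d = sum(1 for xi, yi in zip(x, y) if xi > cut and yi == 0)
        best = max(best, chi2_2x2(a, b, c, d))
    return best


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Monte Carlo approximation of the null distribution (an assumption, not the paper's exact method).

    Shuffling y breaks any association with x while keeping both margins fixed.
    """
    rng = random.Random(seed)
    observed = max_selected_chi2(x, y)
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        # Small tolerance guards against floating-point ties.
        if max_selected_chi2(x, y_perm) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Made-up toy data: a four-level ordinal X and a binary Y.
    x = [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4]
    y = [0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1]
    print("max chi-square:", max_selected_chi2(x, y))
    print("permutation p-value:", permutation_p_value(x, y))
```

Comparing the maximum against an ordinary chi-square distribution with one degree of freedom would overstate significance, because the maximum is taken over several correlated tests; calibrating against the distribution of the maximum itself (here by permutation) avoids that.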
Pages: 451-462
Number of pages: 12