KERNEL DENSITY ESTIMATION;
BANDWIDTH SELECTION;
CROSS-VALIDATION;
CHARACTERISTIC FUNCTION;
PLUG-IN METHOD;
DOI:
10.1214/aos/1176348376
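Chinese Library Classification (CLC) number:
O21 [Probability Theory and Mathematical Statistics];
C8 [Statistics];
Discipline classification code: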
020208 ;
070103 ;
0714 ;
Abstract:
The problem of automatic bandwidth selection for a kernel density estimator is considered. It is well recognized that the bandwidth estimate selected by least squares cross-validation is subject to large sample variation. This difficulty limits the application of the cross-validation estimate. Based on characteristic functions, an important expression for the cross-validation bandwidth estimate is obtained. The expression clearly points out the source of variation. To stabilize the variation, a simple bandwidth selection procedure is proposed. It is shown that the stabilized bandwidth selector gives a strongly consistent estimate of the optimal bandwidth. Under commonly used smoothness conditions, the stabilized bandwidth estimate converges faster than the cross-validation estimate. For sufficiently smooth density functions, it is shown that the stabilized bandwidth estimate is asymptotically normal with a relative convergence rate of $n^{-1/2}$, instead of the rate $n^{-1/10}$ of the cross-validation estimate. A plug-in estimate and an adjusted plug-in estimate are also proposed, and their asymptotic distributions are obtained. It is noted that the plug-in estimate is asymptotically efficient. The adjusted plug-in bandwidth estimate and the stabilized bandwidth estimate are shown to be asymptotically equivalent. The simulation results verify that the proposed procedures perform much better than cross-validation for finite samples.
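For orientation, the least squares cross-validation baseline that the abstract refers to can be sketched as follows. This is only an illustrative sketch, not the stabilized or plug-in procedure proposed in the paper. It assumes a Gaussian kernel, for which the integral of the squared estimate has a closed form, and a plain grid search over candidate bandwidths. The function names, the simulated sample, and the grid are all hypothetical choices made for this example.

```python
import math
import random


def gaussian_pdf(u, s):
    """Density of N(0, s^2) evaluated at u."""
    return math.exp(-0.5 * (u / s) ** 2) / (s * math.sqrt(2.0 * math.pi))


def lscv_score(x, h):
    """Least squares cross-validation criterion for a Gaussian-kernel KDE.

    Computes  integral(f_hat^2) - (2/n) * sum_i f_hat_{-i}(x_i),
    where f_hat_{-i} is the leave-one-out estimate. For a Gaussian kernel
    the integral term is a double sum of N(0, 2 h^2) densities.
    """
    n = len(x)
    int_f2 = 0.0  # accumulates the closed-form integral of f_hat squared
    loo = 0.0     # accumulates the leave-one-out sum over pairs i != j
    for i in range(n):
        for j in range(n):
            d = x[i] - x[j]
            int_f2 += gaussian_pdf(d, h * math.sqrt(2.0))
            if i != j:
                loo += gaussian_pdf(d, h)
    return int_f2 / n**2 - 2.0 * loo / (n * (n - 1))


def lscv_bandwidth(x, grid):
    """Return the grid bandwidth that minimizes the LSCV criterion."""
    scores = [lscv_score(x, h) for h in grid]
    best = min(range(len(grid)), key=scores.__getitem__)
    return grid[best], scores


# Hypothetical usage: a simulated standard normal sample and an assumed grid.
rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(100)]
grid = [0.05 * k for k in range(1, 41)]
h_cv, scores = lscv_bandwidth(sample, grid)
print("LSCV bandwidth on the grid:", h_cv)
```

Rerunning the usage lines with different seeds gives a rough sense of how much the selected bandwidth moves from sample to sample, which is the variability the abstract describes.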