In data-driven system identification it is assumed that the dynamics of the system can be obtained from a low-rank structure of certain large-size matrices, e.g., an output covariance Hankel matrix. A classical approach to retrieving relevant dynamics is to truncate the singular value decomposition of the considered matrix by retaining only the most significant singular values corresponding to the true order of the noise-free matrix. The elusive true order of the system model is generally unknown; its selection, however, is a fundamental task in application of subspace identification methods for estimation of modal parameters and their uncertainties, often for the purpose of damage diagnosis. In this paper, a statistical methodology to approximate the model order of the output covariance Hankel matrices is derived and applied in the structural damage detection context. It is shown that the model order can be retrieved by analyzing the statistical distribution of the sensitivity of the output covariance Hankel matrix against a random perturbation of its eigenvalues. It is shown that this distribution follows a dual framework, which depends on the covariance of the inputs (loads) acting on the mechanical system. The proposed methodology is applied to the low-rank approximation of output covariance Hankel matrices, extracted as part of the subspace damage detection of a large-scale bridge. It is demonstrated that the robustness of damage diagnosis is enhanced by reducing the number of false alarms.