To improve the software quality and reduce the maintenance cost, cross-project fault prediction (CPFP) identifies faulty software components in a particular project (aka target project) using the historical fault data of other projects (aka source/reference projects). Although several diverse approaches/models have been proposed in the past, there exists room for improvement in the prediction performance. Further, they did not consider effort-based evaluation metrics (EBEMs), which are important to ensure the model’s application in the industry, undertaking a realistic constraint of having a limited inspection effort. Besides, they validated their respective approaches using a limited number of datasets. Addressing these issues, we propose an improved CPFP model with its validation on a large corpus of data containing 62 datasets in terms of EBEMs (PIM@20%, Cost-effectiveness@20%, and IFA) and other machine learning-based evaluation metrics (MLBEMs) like PF, G-measure, and MCC. The reference data and the target data are first normalized to reduce the distribution divergence between them and then the relevant training data is selected from the reference data using the KNN algorithm. Seeing the experimental and statistical test results, we claim the efficacy of our proposed model over state-of-the-art CPFP models namely the Turhan-Filter and Cruz model comprehensively. Thus, the proposed CPFP model provides an effective solution for predicting faulty software components, enabling practitioners in developing quality software with lesser maintenance cost.