Software fault prediction (SFP) is an active research topic in software engineering (SE), and machine learning (ML) has been successfully applied to SFP classification problems. However, one challenge in building software fault prediction models (SFPMs) is handling high-dimensional datasets that contain many irrelevant and redundant features. Feature selection (FS) techniques, mainly wrapper methods and filter methods, are used to address this issue. In this paper, we report an empirical study that proposes a novel approach to selecting features for SFP. First, we propose a two-stage feature selection method based on correlation-based feature subset selection (CFS). In stage 1, we use classical CFS to select features. In stage 2, we propose a method that calculates the similarity of feature occurrence frequency to further remove features of little use. Second, to validate the novel FS approach, we compare our method with three other FS techniques. For the comparison, 38 releases of 10 Java open-source projects collected from the PROMISE repository are used in our proposed method. In addition, 10 releases of the 10 projects, a total of 10 different software fault datasets, are randomly selected. The data subsets produced by each FS approach are then applied to five typical ML classifiers. The prediction performance results suggest that our proposed method outperforms the other three FS methods in most cases, which indicates that the novel feature selection approach is feasible. In summary, the method can be used to remove irrelevant and redundant features, yielding useful data subsets for constructing well-performing SFPMs. The results of SFP can provide useful guidance for other SE activities, such as software testing and software quality assurance. Although the current method is effective, it still has some limitations. In future work, we will test the statistical significance of the classification results to further support the feasibility of the idea.
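As an illustration of stage 1 only, the following is a minimal sketch of classical CFS with greedy forward search. It is not the implementation used in this study: the function names `cfs_merit` and `cfs_forward`, the toy data, and the use of absolute Pearson correlation as the correlation measure are assumptions made for the example (CFS implementations often use symmetrical uncertainty on discretized features instead). The stage 2 similarity calculation is not sketched here. The merit of a subset S of k features is k·r_cf / sqrt(k + k(k−1)·r_ff), where r_cf is the mean feature-class correlation and r_ff is the mean feature-feature correlation.

```python
import numpy as np

def cfs_merit(X, y, subset):
    """CFS merit of a feature subset (assumption: |Pearson r| as correlation)."""
    k = len(subset)
    # Mean absolute correlation between each selected feature and the class.
    r_cf = np.mean([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in subset])
    if k == 1:
        return r_cf
    # Mean absolute correlation over all pairs of selected features.
    r_ff = np.mean([abs(np.corrcoef(X[:, a], X[:, b])[0, 1])
                    for i, a in enumerate(subset) for b in subset[i + 1:]])
    return k * r_cf / np.sqrt(k + k * (k - 1) * r_ff)

def cfs_forward(X, y, tol=1e-12):
    """Greedy forward search: add the feature that most improves the merit."""
    selected, best = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        score, j = max((cfs_merit(X, y, selected + [c]), c) for c in remaining)
        if score <= best + tol:  # stop when no candidate improves the merit
            break
        selected.append(j)
        remaining.remove(j)
        best = score
    return selected

# Toy data (hypothetical): 8 modules, faulty (1) or not (0).
y = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
X = np.column_stack([
    y,                                            # f0: perfectly relevant
    np.array([0, 1, 0, 1, 0, 1, 0, 1], float),    # f1: irrelevant to y
    np.array([0, 0, 0, 1, 1, 1, 1, 1], float),    # f2: noisy, redundant copy of f0
])
selected = cfs_forward(X, y)
print(selected)
```

On this toy data the search keeps the relevant feature and rejects both the irrelevant one and the redundant copy, because adding either lowers the merit. In practice, a discretization step and a different correlation measure may be preferable for software metrics.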