Multi-label feature selection is an indispensable technique for preprocessing high-dimensional multi-label data. Approaches based on information theory and sparse models hold promise in this domain and demonstrate strong performance. Although an extensive literature uses the $\ell_1$- and $\ell_{2,1}$-norms to identify label-specific features and common features in the feature space, these methods all ignore the interference of redundant information that arises when the two kinds of features are learned simultaneously. Considering that features and labels in multi-label data are rarely linearly correlated, we present the MFS-MFR approach, which uses a mutual information estimator to build a representation of the nonlinear correlation between features and labels. MFS-MFR then detects specific and common features in this feature-label mutual information space through two coefficient matrices constrained by the $\ell_1$- and $\ell_{2,1}$-norms, respectively. In particular, we define a nonzero correlation constraint that effectively minimizes the redundant correlation between the two matrices. Moreover, a manifold regularization term is devised to preserve the local information of the mutual information space. To solve the resulting optimization model with its nonlinear binary regularization term, we employ a novel solution approach called S-FISTA. Extensive experiments on 15 multi-label benchmark datasets against 11 top-performing multi-label feature selection methods demonstrate the superior performance of MFS-MFR.
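
Below is a minimal sketch, not the authors' MFS-MFR implementation, of the first step the abstract describes: estimating a nonlinear feature-label correlation representation via mutual information. The synthetic data shapes, the use of scikit-learn's `mutual_info_classif` estimator, and the naive aggregate ranking at the end are illustrative assumptions only; the actual method instead learns $\ell_1$- and $\ell_{2,1}$-constrained coefficient matrices over this representation with a nonzero correlation constraint and a manifold regularization term, solved by S-FISTA.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif

rng = np.random.default_rng(0)
n_samples, n_features, n_labels = 200, 50, 5

# Synthetic multi-label data (assumed shapes for illustration).
X = rng.normal(size=(n_samples, n_features))                      # feature matrix
Y = (rng.random(size=(n_samples, n_labels)) > 0.5).astype(int)    # binary label matrix

# Feature-label mutual information representation M (n_features x n_labels):
# M[i, j] estimates the (possibly nonlinear) dependence between feature i and label j.
M = np.column_stack(
    [mutual_info_classif(X, Y[:, j], random_state=0) for j in range(n_labels)]
)

# Crude stand-in for the sparse selection step: rank features by their total
# mutual information across all labels and keep the top 10.
scores = M.sum(axis=1)
top_k = np.argsort(scores)[::-1][:10]
print("Top-10 feature indices by aggregate mutual information:", top_k)
```

The mutual information matrix `M` plays the role of the feature-label space in which MFS-MFR subsequently separates label-specific and common features; the simple score-and-rank step here is only a placeholder for that learning stage.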