Label Distribution Learning (LDL) is a fine-grained learning paradigm that addresses label ambiguity, yet it suffers from the curse of dimensionality. Feature selection is an effective method for dimensionality reduction, and several feature selection algorithms have been proposed for LDL that tackle the problem from different perspectives. In this paper, we propose a novel feature selection method for LDL. First, an effective LDL model is trained with a classical LDL loss function composed of the maximum entropy model and the KL divergence. Then, to select common and label-specific features, their weights are enhanced by the $l_{21}$-norm and label correlation, respectively. Since directly constraining the parameters with label correlation would force the label-specific features of relevant labels to become nearly identical, we instead constrain the output of the maximum entropy model. Finally, the proposed method introduces Mutual Information (MI) into the optimization model for LDL feature selection for the first time; MI distinguishes similar features and thus reduces the influence of redundant features. Experimental results on twelve datasets validate the effectiveness of the proposed method.
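The building blocks named above can be sketched in a few lines. The following is a minimal illustration, not the paper's exact algorithm: it assumes a linear maximum entropy (softmax) output model, a KL-divergence fit term, a row-wise $l_{21}$-norm regularizer whose row norms rank features, and a histogram-based mutual information estimate for spotting redundant feature pairs. All function names and the top-$k$ ranking heuristic are illustrative.

```python
import numpy as np

def maxent_predict(X, W):
    """Maximum entropy output model: softmax over per-label scores XW."""
    Z = X @ W                              # (n_samples, n_labels)
    Z -= Z.max(axis=1, keepdims=True)      # numerical stability
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def kl_loss(D, P, eps=1e-12):
    """KL divergence between true label distributions D and predictions P."""
    return np.sum(D * np.log((D + eps) / (P + eps)))

def l21_norm(W):
    """l21-norm: sum of the l2 norms of each feature's weight row."""
    return np.sum(np.sqrt(np.sum(W ** 2, axis=1)))

def objective(X, D, W, lam):
    """KL fit term plus l21-norm regularization (label-correlation and
    MI terms of the full model are omitted in this sketch)."""
    return kl_loss(D, maxent_predict(X, W)) + lam * l21_norm(W)

def select_features(W, k):
    """Rank features by the l2 norm of their weight rows; keep the top-k."""
    scores = np.sqrt(np.sum(W ** 2, axis=1))
    return np.argsort(-scores)[:k]

def feature_mi(x, y, bins=8):
    """Empirical mutual information between two features via a 2-D histogram;
    near zero for independent features, large for (near-)duplicates."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy /= pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz]))
```

In this sketch, a high `feature_mi` between two selected features would flag one of them as redundant; the full method instead folds MI directly into the optimization objective.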