Short-term residential load forecasting is important for smart grid applications. Deep learning techniques, especially recurrent neural networks, can greatly improve the performance of prediction models. However, deep neural networks usually have low interpretability, which makes it difficult for customers to understand the prediction results and respond quickly. In addition, existing deep learning prediction methods rely heavily on centralized training with massive amounts of data, and transmitting that data from the client to the server threatens customers' data security. In this work, we propose an interpretable deep learning framework with federated learning for short-term residential load forecasting. Specifically, we propose a new automatic relevance determination network for feature interpretation and combine it with an encoder-decoder architecture to achieve interpretable multi-step load prediction. In the edge computing network, the training scheme based on federated learning does not share the raw data, which can effectively protect data privacy. Introducing the iterative federated clustering algorithm can alleviate the problem of non-independent and identically distributed (non-IID) data across households. We use two real-world datasets to verify the feasibility and performance of the proposed method. Finally, we discuss in detail the feature interpretation results for these two datasets.