Facial Expression Recognition (FER) plays a pivotal role in numerous fields, with applications in e-learning, healthcare, marketing, and psychology, among others. Several research studies have been conducted on FER, and many reviews are available. Existing FER reviews have focused on presenting a standard pipeline for predicting basic expressions. However, previous studies have paid inadequate attention to FER datasets and their influence on FER system performance. In this systematic review, 105 papers published from 2002 to 2023 were retrieved from IEEE, ACM, ScienceDirect, Scopus, Web of Science, and Springer, following systematic review guidelines. A review protocol and research questions were also developed to guide the analysis of the study results. The review found that the accuracy of FER systems is affected by the type of dataset, controlled or spontaneous, along with other challenges such as illumination, pose, and scale variation. Furthermore, this paper compares FER models based on machine learning and deep learning techniques, covering face detection, pre-processing, handcrafted feature extraction, and emotion classifiers. In addition, we discuss unresolved issues in FER and suggest solutions to further enhance FER system performance. Future work should develop multimodal FER systems for real-time scenarios, taking computational efficiency into account when integrating more than one model and dataset to achieve promising accuracy and reduce error rates. © The Author(s), under exclusive licence to Springer Nature Singapore Pte Ltd. 2024.