With the growing demand for user recognition in many recent applications, biometric identification has become a widely recommended technology in application development. However, single biometric modalities such as the face or fingerprint have proven insufficient to meet the high-security requirements of sensitive military and government applications deployed at critical access points. Multimodal systems have therefore attracted increasing attention as a way to overcome the limitations that affect the reliability and performance of unimodal biometric systems. In this paper, we propose a robust multimodal biometric recognition system based on the fusion of the face and both iris modalities. The proposed system uses YOLOv4-tiny to detect regions of interest and a new, effective deep learning model inspired by the pre-trained Xception model to extract features. To retain the most stable features, we apply Principal Component Analysis (PCA), and for classification we use a linear support vector classifier (LinearSVC). In addition, we explore the performance of different fusion approaches, including image-level fusion, feature-level fusion, and two score-level fusion methods. To demonstrate the robustness and effectiveness of the proposed multimodal biometric recognition system, we evaluate it using a two-fold cross-validation protocol. Remarkably, the system achieves a recognition accuracy of 100% on the CASIA-ORL and SDUMLA-HMT multimodal databases, indicating its exceptional performance and reliability.
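
The following is a minimal sketch of the classification stage described above (PCA followed by LinearSVC, evaluated with two-fold cross-validation), using scikit-learn. The feature matrix here is a random placeholder standing in for the fused face/iris embeddings produced by the Xception-inspired extractor; the PCA variance threshold, the SVC regularization parameter, and the dataset dimensions are illustrative assumptions, not values taken from the paper.

```python
# Sketch: PCA dimensionality reduction + LinearSVC classification,
# scored with a two-fold cross-validation protocol.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import StratifiedKFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Placeholder fused feature vectors: 50 subjects x 8 samples, 2048-dim each.
rng = np.random.default_rng(0)
n_subjects, samples_per_subject, feat_dim = 50, 8, 2048
X = rng.normal(size=(n_subjects * samples_per_subject, feat_dim))
y = np.repeat(np.arange(n_subjects), samples_per_subject)

clf = make_pipeline(
    StandardScaler(),
    PCA(n_components=0.95),        # keep components explaining 95% of variance (assumed threshold)
    LinearSVC(C=1.0, max_iter=10000),
)

# Two-fold cross-validation, stratified by subject identity.
cv = StratifiedKFold(n_splits=2, shuffle=True, random_state=0)
scores = []
for train_idx, test_idx in cv.split(X, y):
    clf.fit(X[train_idx], y[train_idx])
    scores.append(clf.score(X[test_idx], y[test_idx]))
print(f"Mean two-fold accuracy: {np.mean(scores):.3f}")
```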