Machine learning offers researchers in speech processing and bioacoustics numerous advanced, non-invasive techniques for investigating animal vocalizations. In this work, Hidden Markov Models (HMMs) were developed and implemented for automatic gender recognition and speaker identification of rhesus macaques (Macaca mulatta) using traditional spectral and temporal features, namely Mel-Frequency Cepstral Coefficients (MFCCs) together with delta (velocity) and delta-delta (acceleration) coefficients. The combined features were extracted from frames of the vocalizations using a 4 ms frame size and a 2 ms step size and modeled with 4-state, left-to-right HMMs. Both tasks were performed on a database of 7285 vocalizations of the coo call type from 8 animals (4 males, 4 females). Gender recognition achieved an accuracy of 84.45% (1233/1460 correct recognitions). Speaker identification achieved 91.08% among the 4 males (633/695 correct identifications), 83.27% among the 4 females (637/765 correct identifications), and 81.85% across all 8 animals (1195/1460 correct identifications). Based on this performance, the novel contribution of the framework, the automated application of HMMs to gender recognition and speaker identification of rhesus macaques (M. mulatta), could be extended to other mammals for automatic classification and recognition. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.
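The feature and model structure described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The 4 ms frame size, 2 ms step size, and 4-state left-to-right topology come from the abstract; the function names, the regression window `width=2`, the self-transition probability `stay_prob=0.5`, and the edge-padding choice are assumptions made only for illustration. MFCC extraction itself is omitted, so any per-frame feature matrix can stand in for the static coefficients.

```python
import numpy as np


def frame_signal(signal, sample_rate, frame_ms=4.0, step_ms=2.0):
    """Split a 1-D signal into overlapping frames.

    Defaults follow the abstract (4 ms frames, 2 ms step); sample_rate is
    supplied by the caller because the abstract does not state it.
    """
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    step_len = int(round(sample_rate * step_ms / 1000.0))
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (len(signal) - frame_len) // step_len
    # Row i holds samples [i * step_len, i * step_len + frame_len).
    idx = np.arange(frame_len)[None, :] + step_len * np.arange(n_frames)[:, None]
    return signal[idx]


def delta(features, width=2):
    """Regression-based delta coefficients along the time (row) axis.

    width=2 is an assumed window; edge frames are padded by repetition.
    """
    n_frames = features.shape[0]
    padded = np.pad(features, ((width, width), (0, 0)), mode="edge")
    denom = 2.0 * sum(k * k for k in range(1, width + 1))
    out = np.zeros(features.shape, dtype=float)
    for k in range(1, width + 1):
        ahead = padded[width + k : width + k + n_frames]    # frame t + k
        behind = padded[width - k : width - k + n_frames]   # frame t - k
        out += k * (ahead - behind)
    return out / denom


def combine_features(static):
    """Stack static, delta (velocity) and delta-delta (acceleration) features."""
    d = delta(static)
    dd = delta(d)
    return np.hstack([static, d, dd])


def left_to_right_transitions(n_states=4, stay_prob=0.5):
    """Transition matrix in which each state either stays or advances by one.

    stay_prob is an assumed initial value; training would re-estimate it.
    """
    A = np.zeros((n_states, n_states))
    for i in range(n_states - 1):
        A[i, i] = stay_prob
        A[i, i + 1] = 1.0 - stay_prob
    A[-1, -1] = 1.0  # final state is absorbing
    return A
```

In a setup like this, one HMM per class (per gender, or per animal) would typically be trained on the stacked features, and a test call assigned to the class whose model gives the highest likelihood. That decision rule is the standard HMM classification scheme and is assumed here rather than taken from the abstract.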