Speaker identification is a biometric mechanism that determines which person, from a set of known speakers, is speaking. It has vital applications in areas such as security, surveillance, and forensic investigation. Speaker identification systems achieve good accuracy on clean speech; however, their performance degrades under noisy and mismatched conditions. Recently, hybrid networks combining convolutional neural networks (CNNs) with enhanced recurrent neural network (RNN) variants have performed well in speech recognition, image classification, and other pattern recognition tasks. Moreover, cochleogram features have yielded better accuracy in speech and speaker recognition under noisy conditions. However, no prior work has applied hybrid CNNs and enhanced RNN variants to cochleogram input to improve speaker recognition accuracy in noisy environments. This study proposes a speaker identification system for noisy conditions using a hybrid CNN and bidirectional gated recurrent unit (BiGRU) network on cochleogram input. The models were evaluated on the VoxCeleb1 speech dataset with real-world noise, with white Gaussian noise (WGN), and without additive noise. Real-world noise and WGN were added to the dataset at signal-to-noise ratios (SNRs) from -5 dB to 20 dB in 5 dB steps. The proposed model attained accuracies of 93.15%, 97.55%, and 98.60% on the dataset with real-world noise at SNRs of -5 dB, 10 dB, and 20 dB, respectively, and showed approximately the same performance on WGN at the same SNR levels. On the dataset without additive noise, the model achieved 98.85% accuracy. The evaluation accuracy and the comparison with previous works indicate that our model achieves better accuracy.
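The noise-mixing procedure described above (adding noise to clean speech at a target SNR from -5 dB to 20 dB in 5 dB steps) can be sketched as follows. This is a minimal illustration, not the paper's actual preprocessing code; the function name `add_noise_at_snr` and the use of NumPy arrays as stand-ins for waveform samples are assumptions for the example.

```python
import numpy as np

def add_noise_at_snr(speech, noise, snr_db):
    """Mix `noise` into `speech` so the result has the target SNR in dB.

    Hypothetical helper for illustration; assumes 1-D float sample arrays.
    """
    # Tile or trim the noise so it covers the whole utterance.
    if len(noise) < len(speech):
        reps = int(np.ceil(len(speech) / len(noise)))
        noise = np.tile(noise, reps)
    noise = noise[:len(speech)]

    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    # Scale the noise so that 10*log10(speech_power / scaled_noise_power) == snr_db.
    scale = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return speech + scale * noise

# Example: white Gaussian noise at the SNR levels used in the evaluation.
rng = np.random.default_rng(0)
clean = rng.standard_normal(16000)   # stand-in for a 1 s utterance at 16 kHz
wgn = rng.standard_normal(16000)
noisy_versions = {snr: add_noise_at_snr(clean, wgn, snr)
                  for snr in range(-5, 25, 5)}   # -5 dB ... 20 dB, 5 dB steps
```

At lower SNR values the scaled noise dominates the mixture, which is what drives the accuracy drop the abstract reports at -5 dB relative to 20 dB.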