Reference evapotranspiration (ET0) plays an important role in irrigation management. Accurate estimation of ET0 is therefore necessary to avoid over- or under-irrigation, increase agricultural productivity, and manage water resources effectively. Because climate datasets are often limited in developing countries, estimating ET0 remains a major challenge. This study presents two hybrid deep neural network models for estimating reference evapotranspiration: Convolutional Long Short-Term Memory (Conv-LSTM), which performs the convolution operation inside the LSTM cells, and Convolutional Neural Network-LSTM (CNN-LSTM), which uses convolution layers to extract features from the input data and then feeds the extracted features to LSTM layers. The study also addresses climate data scarcity: different input combinations of climate parameters are evaluated to identify the minimum set of parameters required to model the ET0 process. Climate data from two stations in India, Ludhiana and Amritsar, are used to develop the proposed models. The dataset includes daily maximum temperature (Tmax), minimum temperature (Tmin), wind speed measured at a height of 2 m (U2), solar radiation (Rs), relative humidity (Rh), vapor pressure (Vp), and sunshine hours (Ssh), covering 2003 to 2015 for the Ludhiana station and 2000 to 2016 for the Amritsar station. Several performance measures are used to assess the precision of the models and to perform a sensitivity analysis. Temperature and radiation are found to be the primary inputs required to estimate ET0. The proposed hybrid models are then compared with existing temperature- and radiation-based empirical models, namely Hargreaves, Makkink, and Ritchie. The comparison shows that both CNN-LSTM and Conv-LSTM outperform these empirical models, and that Conv-LSTM performs best overall for the estimation of ET0.
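
To illustrate the kind of temperature-based empirical baseline the hybrid models are compared against, the Hargreaves model can be sketched in a few lines. This is a minimal sketch using the commonly cited Hargreaves-Samani form, ET0 = 0.0023 * Ra * (Tmean + 17.8) * sqrt(Tmax - Tmin), with Ra the extraterrestrial radiation expressed as equivalent evaporation. The function name `hargreaves_et0`, the 0.408 factor for converting MJ m^-2 day^-1 to mm/day, and the sample inputs are assumptions for illustration and are not taken from the study; the exact variant used there may differ.

```python
import math


def hargreaves_et0(tmax_c, tmin_c, ra_mj_m2_day):
    """Estimate daily ET0 (mm/day) with the Hargreaves-Samani form.

    tmax_c, tmin_c : daily maximum / minimum air temperature in deg C
    ra_mj_m2_day   : extraterrestrial radiation in MJ m^-2 day^-1
                     (assumed to be supplied by the caller)
    """
    if tmax_c < tmin_c:
        raise ValueError("tmax_c must be >= tmin_c")

    # Mean temperature approximated as the midpoint of Tmax and Tmin.
    tmean_c = (tmax_c + tmin_c) / 2.0

    # Assumption: convert radiation to equivalent evaporation (mm/day)
    # with the usual 0.408 factor (inverse latent heat of vaporization).
    ra_mm_day = 0.408 * ra_mj_m2_day

    return 0.0023 * ra_mm_day * (tmean_c + 17.8) * math.sqrt(tmax_c - tmin_c)


# Hypothetical sample values, not data from the Ludhiana or Amritsar stations.
et0 = hargreaves_et0(tmax_c=35.0, tmin_c=20.0, ra_mj_m2_day=40.0)
print(f"Estimated ET0: {et0:.2f} mm/day")
```

The function needs only Tmax, Tmin, and a radiation term that can be computed from latitude and day of year, which is why models of this type are a natural reference point under the data-scarce conditions the study targets.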