Buildings in today's cities, particularly in developed countries, consume a huge amount of energy. This makes building energy use one of the world's important issues and creates a need for evaluation methods that lead to an optimal predictive tool for using energy efficiently in buildings. The Internet of Things (IoT) is currently among the most popular research areas for real-life applications, and machine learning (ML) techniques have significantly improved the ability of IoT systems to control energy consumption. To this end, this study first evaluated the performance of five ML models in predicting IoT-oriented energy consumption, splitting the studied dataset into 80% for training and 20% for testing. The models were Adaptive Boosting, Histogram-based Gradient Boosting Machine (HistGBM), K-Nearest Neighbors, Light Gradient Boosting Machine, and Extreme Gradient Boosting (XGBoost). A comparison of the evaluation metrics showed that, before optimization, the HistGBM model performed best, with the highest R² and the lowest RMSE. For further investigation, we tuned the hyperparameters of these models with the Bat optimization algorithm (BOA) for IoT-based energy consumption forecasting in city buildings. The results were then examined for the hyperparameters selected by the optimizer, yielding the most accurate and reliable hybrid model. The results confirm that the proposed hybrid BOA-XGBoost approach significantly improves the forecasting performance of the ML methods. In particular, it achieved the highest R² values, 0.9999 and 0.9979, and the lowest RMSE values, 0.34 and 4.70, on the training and testing datasets, respectively, showing that the hybrid BOA-XGBoost model outperforms the other models.
The testing time of OP-XGBoost is the lowest, at 0.0033, which makes it the most time-efficient hybrid model. Overall, the results support the general efficacy of the selected optimizer in improving the accuracy of the predictions.
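For readers unfamiliar with the evaluation setup summarized above, the following is a minimal sketch of an 80/20 train/test split and of the two reported metrics, R² and RMSE, using their standard definitions. It is an illustration only, not the study's code: the function names `split_80_20`, `rmse`, and `r2` and the fixed random seed are assumptions introduced here, and no model training or Bat optimization is shown.

```python
import math
import random


def split_80_20(rows, seed=0):
    """Shuffle a copy of rows and return (train, test) in an 80/20 ratio.

    The seed is a hypothetical choice made only so the split is repeatable.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 4 // 5  # first 80% of the shuffled rows
    return shuffled[:cut], shuffled[cut:]


def rmse(y_true, y_pred):
    """Root mean squared error: sqrt(mean((y - y_hat)^2))."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

In such a setup, each model would be fitted on the training portion and both metrics computed separately on the training and testing portions, which is how a pair of values per metric (one for each portion) arises.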