Diabetes mellitus is a chronic metabolic disease characterized mainly by insufficient insulin secretion or impaired insulin action, which results in elevated blood glucose. According to the World Health Organization (WHO), the number of people with diabetes worldwide has been rising in recent years, and the disease has become an important global public health problem. In this paper, we first use a Random Forest-based feature importance screening method to retain the variables with larger feature weights. We then perform Spearman correlation analysis and select the top 10 variables with low mutual correlation, and we use information entropy theory and correlation analysis to test the representativeness and independence of the selected variables. The main variables finally screened out are platelet volume distribution width, HDL cholesterol, albumin/globulin ratio, plateletcrit, platelet count, red blood cell count, lymphocyte %, albumin, neutrophil %, and leukocyte count. Blood glucose prediction models are then established with data mining techniques. Five machine learning models are selected for prediction: Extreme Gradient Boosting (XGBoost), Random Forest regression, Support Vector Regression (SVR), LightGBM, and Gradient Boosted Decision Tree (GBDT). Each model is trained on the training set and then evaluated on the test set using three metrics: root mean squared error (MSE), Mean Absolute Error (MAE), and Maximum Absolute Error (MAS). Comparing the five models, SVR has the highest overall accuracy.
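The screening step described above (rank variables by importance, then keep only those that are weakly correlated with each other) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the Random Forest importance scores have already been computed and are passed in as a dictionary. The function names, the toy feature names, and the threshold `max_abs_rho=0.8` are hypothetical choices made for illustration only.

```python
from typing import Dict, List, Sequence


def ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def screen_features(
    columns: Dict[str, Sequence[float]],
    importance: Dict[str, float],
    top_k: int = 10,
    max_abs_rho: float = 0.8,  # hypothetical threshold, not from the paper
) -> List[str]:
    """Greedily keep the most important features whose |rho| with every
    already-kept feature stays below max_abs_rho."""
    kept: List[str] = []
    for name in sorted(importance, key=importance.get, reverse=True):
        if all(
            abs(spearman(columns[name], columns[other])) < max_abs_rho
            for other in kept
        ):
            kept.append(name)
        if len(kept) == top_k:
            break
    return kept


# Toy data (hypothetical): feat_b is a monotone transform of feat_a.
toy_columns = {
    "feat_a": [1, 2, 3, 4, 5, 6],
    "feat_b": [1, 4, 9, 16, 25, 36],
    "feat_c": [3, 1, 6, 2, 5, 4],
}
toy_importance = {"feat_a": 0.5, "feat_b": 0.3, "feat_c": 0.2}
print(screen_features(toy_columns, toy_importance, top_k=2))
```

In the toy data, `feat_b` is dropped because its Spearman correlation with the more important `feat_a` is 1, so the screening keeps `feat_a` and `feat_c`. A greedy pass in importance order is only one possible way to combine the two criteria.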
We then establish an SVR blood glucose prediction model based on Bayesian optimization. The sample data are normalized, the parameters are initially corrected using Bayesian principles, and the support vector machine estimation algorithm is selected to initialize the model. The parameters are then inferred using the Bayesian evidence framework, and the optimal model is obtained after several iterations. The SVR model trained with the optimal hyperparameters obtained from Bayesian optimization improves on all three evaluation metrics.
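As a rough illustration of tuning SVR hyperparameters by Bayesian optimization, the sketch below uses a Gaussian process surrogate with an expected-improvement acquisition function. This is a common formulation, assumed here for illustration, and it is not necessarily the procedure used in the paper. The search ranges for `C` and `gamma`, the min-max normalization, the 3-fold cross-validation, the synthetic data, and all function names are assumptions.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVR


def cv_mse(log_params, X, y):
    """Cross-validated MSE of an SVR with C = 10**log_params[0] and
    gamma = 10**log_params[1]; inputs are min-max normalized (assumption)."""
    log_c, log_gamma = log_params
    model = make_pipeline(
        MinMaxScaler(), SVR(C=10.0 ** log_c, gamma=10.0 ** log_gamma)
    )
    scores = cross_val_score(
        model, X, y, cv=3, scoring="neg_mean_squared_error"
    )
    return -scores.mean()


def expected_improvement(candidates, gp, best_loss):
    """Expected improvement for minimization under the GP surrogate."""
    mean, std = gp.predict(candidates, return_std=True)
    std = np.maximum(std, 1e-9)  # avoid division by zero
    z = (best_loss - mean) / std
    return (best_loss - mean) * norm.cdf(z) + std * norm.pdf(z)


def bayes_opt_svr(X, y, bounds, n_init=5, n_iter=10, seed=0):
    """Search log10(C) and log10(gamma) within bounds; return best params."""
    rng = np.random.default_rng(seed)
    low, high = np.array(bounds, dtype=float).T
    # Start from a few random points in the search box.
    tried = rng.uniform(low, high, size=(n_init, len(bounds)))
    losses = np.array([cv_mse(p, X, y) for p in tried])
    for _ in range(n_iter):
        # Refit the surrogate on every (hyperparameters, loss) pair so far.
        gp = GaussianProcessRegressor(
            kernel=Matern(nu=2.5), normalize_y=True, alpha=1e-6,
            random_state=seed,
        )
        gp.fit(tried, losses)
        # Score random candidates and evaluate the most promising one.
        candidates = rng.uniform(low, high, size=(500, len(bounds)))
        ei = expected_improvement(candidates, gp, losses.min())
        next_point = candidates[np.argmax(ei)]
        tried = np.vstack([tried, next_point])
        losses = np.append(losses, cv_mse(next_point, X, y))
    best = tried[np.argmin(losses)]
    return {"C": 10.0 ** best[0], "gamma": 10.0 ** best[1]}, float(losses.min())


# Synthetic regression data standing in for the screened blood indicators.
data_rng = np.random.default_rng(42)
X_demo = data_rng.normal(size=(120, 3))
y_demo = np.sin(X_demo[:, 0]) + 0.5 * X_demo[:, 1] + 0.1 * data_rng.normal(size=120)

search_bounds = [(-1.0, 3.0), (-4.0, 1.0)]  # log10(C), log10(gamma); assumed
best_params, best_loss = bayes_opt_svr(
    X_demo, y_demo, search_bounds, n_init=4, n_iter=4
)
print(best_params, best_loss)
```

Searching in log space is a design choice: `C` and `gamma` typically matter across orders of magnitude, so a uniform search over exponents covers the range more evenly than a uniform search over raw values.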