Existing short-term load forecasting models fail to mine the key information for load prediction effectively, do not adequately extract time series patterns, and have insufficient prediction accuracy. To address these issues, a short-term electric load forecasting model is proposed that combines a Bidirectional Gated Recurrent Unit (BiGRU), Temporal Pattern Attention (TPA), and the Crisscross Whale Optimization Algorithm (CSWOA). First, BiGRU extracts time series features from the raw data and captures the changing patterns of the key feature vectors. Second, TPA adaptively weights the state vectors output by BiGRU to further mine the hidden relationships among different variables at different time steps, which avoids the arbitrariness of manually selecting the factors that influence load forecasting. Finally, CSWOA optimizes the weight coefficients and bias parameters of the fully connected layer in the TPA-BiGRU model, addressing the tendency of neural networks trained by gradient descent to become trapped in local optima when updating parameters. The model was tested on electric load data provided by the 9th National University Student Electricity and Mathematics Modeling Competition. The experimental results showed that the model achieved a lower mean absolute percentage error (MAPE) and root mean square error (RMSE) and a higher coefficient of determination (R2).
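
For readers unfamiliar with the three evaluation metrics named above, the following is a minimal sketch of their standard textbook definitions. The function names and the sample load values are hypothetical and chosen only for illustration; the text does not state the exact formulas or data used in the experiments, so this should not be read as the authors' evaluation code.

```python
import math

def mape(actual, predicted):
    """Mean absolute percentage error, in percent.
    Assumes no actual value is zero."""
    n = len(actual)
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n

def rmse(actual, predicted):
    """Root mean square error, in the same unit as the load values."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def r2(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Hypothetical load values, not taken from the competition data set.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]

print(f"MAPE = {mape(actual, predicted):.3f} %")
print(f"RMSE = {rmse(actual, predicted):.3f}")
print(f"R2   = {r2(actual, predicted):.4f}")
```

Lower MAPE and RMSE indicate smaller forecast errors, while an R2 closer to 1 indicates that the forecast explains more of the variance in the observed load.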