Near-surface air temperature is a crucial weather parameter that significantly impacts human health and is widely used in numerical weather forecasting and climate prediction studies. However, the most common sources, ground-based meteorological station observations and radar observations, are often limited by geographic and natural constraints. With its global coverage and high spatiotemporal resolution, satellite remote sensing has become a valuable means of overcoming the scarcity of ground-based station and radar data under complex geographic and natural conditions. Although remote sensing only indirectly reflects atmospheric variables (e.g., near-surface air temperature), accurately estimating these variables from satellite observations remains a significant challenge. This paper introduces a Transformer-based deep neural network (TaNet) for near-surface air temperature estimation. TaNet automatically extracts information from imagery captured by China's new-generation geostationary meteorological satellite FengYun-4A and generates gridded near-surface air temperature data in near real time. Extensive experiments using the state-of-the-art operational reanalysis product ERA5 and meteorological station observations as benchmarks demonstrate the effectiveness and superiority of TaNet. It achieves a Pearson's correlation coefficient (CC) of 0.990 with ERA5 and 0.959 with station observations, outperforming other products and methods, such as CFSv2, CRA, and U-Net, on both root mean square error (RMSE) and CC. With station observations as the benchmark, TaNet lowers the RMSE relative to CFSv2, CRA, and U-Net by 10.551% (2.594 °C vs. 2.900 °C), 2.261% (2.594 °C vs. 2.654 °C), and 5.535% (2.594 °C vs. 2.746 °C), respectively.
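The evaluation metrics named above can be illustrated with a minimal sketch. It assumes the standard definitions of RMSE and Pearson's CC, and it assumes that each quoted percentage is the relative RMSE reduction, (baseline - TaNet) / baseline. The function names `rmse`, `pearson_cc`, and `relative_reduction` are hypothetical and are not taken from the paper; only the RMSE values come from the abstract.

```python
import math


def rmse(pred, obs):
    """Root mean square error between two equal-length sequences."""
    if len(pred) != len(obs) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(pred))


def pearson_cc(pred, obs):
    """Pearson's correlation coefficient between two sequences."""
    n = len(pred)
    mean_p = sum(pred) / n
    mean_o = sum(obs) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(pred, obs))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    return cov / math.sqrt(var_p * var_o)


def relative_reduction(baseline_rmse, model_rmse):
    """Percentage by which model_rmse is lower than baseline_rmse (assumed definition)."""
    return 100.0 * (baseline_rmse - model_rmse) / baseline_rmse


# RMSE values in degrees C against station observations, as quoted in the abstract.
tanet_rmse = 2.594
baseline_rmse = {"CFSv2": 2.900, "CRA": 2.654, "U-Net": 2.746}

reductions = {
    name: relative_reduction(value, tanet_rmse)
    for name, value in baseline_rmse.items()
}

for name, value in reductions.items():
    print(f"{name}: {value:.2f}% lower RMSE with TaNet")
```

Recomputing from the rounded RMSE values reproduces the quoted percentages to about two decimal places. Any difference in the third decimal presumably comes from the rounding of the inputs.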