Forests are an irreplaceable carbon sink. Carbon sink capacity, however, differs markedly among tree species, so accurate identification of forest vegetation is key to achieving the dual carbon goals. Because trees are irregularly distributed, crown geometry varies, and tree species are difficult to label, traditional methods represent complex spatial-spectral structures poorly. Quickly and accurately extracting the key, subtle features needed for fine-grained tree species identification therefore remains an urgent open problem. To address these issues, we propose a texture-aware self-attention model (TASAM) that improves spatial contrast and mitigates spectral variance to classify tree species accurately in hyperspectral images (HSIs). First, a nested spatial pyramid module extracts multiview, multiscale features that highlight the distinction between tree species and the surrounding background. Second, a cross-spectral-spatial attention module captures joint spatial-spectral features over the entire image domain. Gabor features are introduced as auxiliary guidance so that self-attention focuses on texture features in the latent space, extracts more relevant and accurate information, and further enhances the distinction between target and background. Experiments on three tree species hyperspectral datasets show that the proposed method yields finer and more accurate tree species classification when labeled samples are limited. The method handles tree species classification in complex forest structures effectively and meets the application requirements of HSI-based tree species diversity monitoring, forestry resource investigation, and forestry carbon sink analysis.
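
To make the texture cue concrete, the following is a minimal sketch of a Gabor filter bank applied to a single image band. It is not the TASAM implementation: the abstract does not specify the Gabor parameters or how the responses are fused into self-attention, so the function names (`gabor_kernel`, `filter_response`, `gabor_texture_features`) and all parameter values (kernel size, wavelength, sigma, number of orientations) are assumptions chosen for illustration. Pure Python is used only to keep the sketch self-contained; an array library would be the usual choice in practice.

```python
import math


def gabor_kernel(size, wavelength, theta, sigma, gamma=0.5, psi=0.0):
    """Real part of a 2-D Gabor kernel, returned as a list of rows.

    `size` should be odd so that the kernel has a central pixel.
    All parameter values used in this sketch are illustrative assumptions.
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the sinusoidal carrier varies along theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            # Gaussian envelope (gamma sets its aspect ratio) times cosine carrier.
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def filter_response(band, kernel):
    """'Valid'-mode cross-correlation of one band with a square kernel.

    With psi = 0 the kernel is symmetric about its center, so this is
    equivalent to convolution.
    """
    k = len(kernel)
    rows, cols = len(band), len(band[0])
    out = []
    for i in range(rows - k + 1):
        out_row = []
        for j in range(cols - k + 1):
            acc = 0.0
            for u in range(k):
                for v in range(k):
                    acc += band[i + u][j + v] * kernel[u][v]
            out_row.append(acc)
        out.append(out_row)
    return out


def gabor_texture_features(band, orientations=4, size=7, wavelength=4.0, sigma=2.0):
    """One response-magnitude map per orientation (hypothetical helper)."""
    features = []
    for n in range(orientations):
        theta = n * math.pi / orientations
        response = filter_response(band, gabor_kernel(size, wavelength, theta, sigma))
        features.append([[abs(value) for value in row] for row in response])
    return features


def mean_energy(feature_map):
    """Average magnitude over a feature map."""
    values = [value for row in feature_map for value in row]
    return sum(values) / len(values)


# Toy band: vertical stripes with a period of 4 pixels (not real HSI data).
band = [[1.0 if (c % 4) < 2 else 0.0 for c in range(16)] for _ in range(16)]
features = gabor_texture_features(band)
```

On this toy band, the filter whose carrier runs across the stripes (`theta = 0`) responds far more strongly than the one whose carrier runs along them (`theta = pi/2`), which is the orientation selectivity that makes Gabor responses a plausible texture cue. How such maps would guide attention weights is left open here, since the abstract does not describe it.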