The decoding performance of electroencephalogram (EEG)-based motor imagery (MI) limits the practical application of brain-computer interfaces (BCIs). In this paper, we propose a decoding approach for MI classification tasks based on a multi-branch convolutional neural network (MBCNN) and a temporal convolutional fusion network with an efficient channel attention mechanism (EATCFN). In MBCNN-EATCFNet, multi-branch and multi-scale structures are combined to capture spatiotemporal features at different scales. Additionally, to extract more discriminative temporal features from EEG signals, an efficient channel attention (ECA) module is integrated into the temporal convolutional network (TCN), so that the module jointly captures bidirectional cross-channel interactions and long-term dependencies in the temporal sequence. Finally, to improve the adaptability of the model, a novel adaptive feature fusion method is proposed to weigh the importance of bidirectional features. Our proposed model achieves classification accuracies of 81.34% (subject-dependent) and 69.46% (subject-independent) on BCI Competition IV dataset 2a, and 87.45% (subject-dependent) and 83.63% (subject-independent) on BCI Competition IV dataset 2b. On dataset 2a, compared to eight baseline models, our approach achieves average improvements of 10.15% (subject-dependent) and 4.34% (subject-independent); on dataset 2b, it achieves average improvements of 2.76% (subject-dependent) and 1.55% (subject-independent). Furthermore, ablation experiments were conducted to validate the effectiveness of each module. The proposed model shows significant potential for clinical and practical applications of MI-based BCI systems, thereby promoting the further development of BCI technology.
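To make the attention step concrete, the following is a minimal NumPy sketch of the efficient channel attention (ECA) idea referenced above: per-channel global average pooling, a small 1-D convolution across the channel dimension to model local cross-channel interactions, and a sigmoid gate that rescales each channel. The kernel size, the uniform convolution weights, and the (channels, time) input shape are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def eca_attention(x, k=3):
    """Apply an ECA-style gate to a (channels, time) feature map.

    Sketch only: in a trained network the 1-D convolution weights are
    learned; here they are uniform (a moving average) for illustration.
    """
    c, _ = x.shape
    desc = x.mean(axis=1)                      # (C,) channel descriptors via GAP
    pad = k // 2
    padded = np.pad(desc, pad, mode="edge")    # local cross-channel neighborhood
    w = np.full(k, 1.0 / k)                    # stand-in for learned conv weights
    conv = np.array([np.dot(padded[i:i + k], w) for i in range(c)])
    gate = 1.0 / (1.0 + np.exp(-conv))         # sigmoid attention weights in (0, 1)
    return x * gate[:, None]                   # rescale each channel

rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 125))           # e.g. 8 EEG channels, 125 time steps
out = eca_attention(feat)
print(out.shape)                               # same shape as the input
```

Because the gate lies in (0, 1), the operation only rescales channel amplitudes; it never changes the sign or shape of the feature map, which is why it can be inserted into a TCN branch without altering its temporal structure.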