Long-term time series forecasting is widely applied in fields such as power dispatching, traffic control, and weather forecasting. Recently, Transformer-based models have dominated time series forecasting. However, their designs do not explicitly handle the correlation of multi-scale information or the interaction of information between variables. This paper proposes MDWConv, a convolutional neural network based on a multi-scale dilated pyramid and depthwise separable convolution. To capture and integrate multi-scale information, the multi-scale dilated pyramid extracts features at several scales, and convolution operations fuse information across scales, improving the model's ability to exploit the rich scale-specific information in a sequence. The depthwise separable convolution network adopts a grouping strategy: depthwise convolution extracts long-term dependencies, while pointwise convolution handles inter-variable information interaction and hidden information extraction. This reduces computational complexity while improving predictive accuracy through enhanced feature representation. We also propose a novel segmented polynomial activation function (TCP), which approximates the GELU function with piecewise cubic Hermite polynomials over different intervals, significantly reducing computational complexity and achieving a faster loss reduction rate. Experiments on various real-world datasets demonstrate that MDWConv outperforms other methods. Although it relies solely on convolutional neural networks, MDWConv remains strongly competitive.
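To make the idea of approximating GELU with piecewise cubic Hermite polynomials concrete, the following is a minimal sketch and not the paper's TCP definition. The knot positions (unit spacing on [-4, 4]), the clamping to the asymptotes 0 and x outside that range, and the names `hermite_gelu`, `KNOTS`, `VALUES`, and `SLOPES` are all assumptions chosen for illustration; the actual intervals and coefficients used by TCP are not given here.

```python
import bisect
import math


def gelu(x):
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_grad(x):
    """Derivative of GELU: Phi(x) + x * phi(x)."""
    cdf = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    return cdf + x * pdf


# Hypothetical knots: unit spacing on [-4, 4] (an assumption, not from the paper).
KNOTS = [float(k) for k in range(-4, 5)]
VALUES = [gelu(k) for k in KNOTS]
SLOPES = [gelu_grad(k) for k in KNOTS]

# Pin the end knots to the asymptotes (0 on the left, x on the right) so the
# approximation joins the outer linear pieces with matching value and slope.
VALUES[0], SLOPES[0] = 0.0, 0.0
VALUES[-1], SLOPES[-1] = KNOTS[-1], 1.0


def hermite_gelu(x):
    """Piecewise cubic Hermite approximation of GELU (illustrative sketch)."""
    if x <= KNOTS[0]:
        return 0.0  # GELU(x) -> 0 as x -> -inf
    if x >= KNOTS[-1]:
        return x    # GELU(x) -> x as x -> +inf

    # Locate the segment [KNOTS[i], KNOTS[i + 1]] that contains x.
    i = bisect.bisect_right(KNOTS, x) - 1
    h = KNOTS[i + 1] - KNOTS[i]
    t = (x - KNOTS[i]) / h

    # Standard cubic Hermite basis polynomials on t in [0, 1].
    h00 = 2 * t**3 - 3 * t**2 + 1
    h10 = t**3 - 2 * t**2 + t
    h01 = -2 * t**3 + 3 * t**2
    h11 = t**3 - t**2

    return (h00 * VALUES[i] + h10 * h * SLOPES[i]
            + h01 * VALUES[i + 1] + h11 * h * SLOPES[i + 1])


def max_abs_error(lo=-6.0, hi=6.0, steps=1200):
    """Largest absolute gap between the sketch and exact GELU on a grid."""
    xs = [lo + (hi - lo) * k / steps for k in range(steps + 1)]
    return max(abs(hermite_gelu(x) - gelu(x)) for x in xs)
```

Once the knot values and slopes are precomputed, each evaluation needs only a segment lookup and a cubic polynomial, with no call to `erf` or `tanh`, which is the kind of saving the abstract attributes to TCP. Calling `max_abs_error()` gives a quick check of how closely this particular knot layout tracks the exact function.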