With the rapid development of Industry 4.0, massive numbers of distributed intelligent industrial devices are interconnected through industrial wireless networks and generate large volumes of heterogeneous industrial tasks with different delay sensitivities and computing loads during smart manufacturing. Real-time, efficient processing of these industrial tasks is a key factor in the safety and efficiency of industrial manufacturing. However, the limited local computing capacity of industrial devices cannot support such processing, and the common industrial cloud computing paradigm introduces uncertain communication delays and additional network security issues. An effective alternative is to offload industrial tasks through industrial wireless networks to Multi-access Edge Computing (MEC) servers deployed in base stations, access points, and other network-edge infrastructure. Nevertheless, the limited time-frequency resources of industrial wireless networks cannot support highly concurrent computing offloading of industrial tasks. Given the difficulty of modeling such highly concurrent offloading, this paper proposes a Deep Reinforcement Learning-based Concurrent Access with Dynamic Priority (CADP-DRL) algorithm. First, industrial devices are assigned dynamic priorities according to the delay sensitivity and computing load of their tasks, and their access offloading probabilities change dynamically with these priorities. Then, the dynamic-priority concurrent computing offloading problem is formulated as a Markov decision process. Because both the dynamic priorities and the concurrent offloading of massive numbers of industrial devices cause the state space to explode, deep reinforcement learning is used to learn a mapping from states to actions in this high-dimensional state space, and the long-term cumulative reward is maximized to obtain an effective dynamic-priority concurrent offloading policy. In particular, to address the multi-objective decision coupling dynamic priority with concurrent offloading, a novel compound reward function is designed that combines a priority reward, which ensures reliable offloading for high-priority industrial devices, with an offloading reward, which minimizes offloading conflicts. To keep the training data independent and identically distributed while accelerating the convergence of CADP-DRL, an experience replay with experience weights is designed: experiences are classified as high-weight or low-weight according to their weights and stored in separate experience memories, and they are randomly sampled as training data with sampling probabilities that vary dynamically across the two memories, breaking the time correlation among experiences while speeding up convergence. The expensive training overhead of CADP-DRL is incurred in an offline training phase, after which the trained model makes effective computing offloading decisions in real time during online execution. The slotted-ALOHA algorithm is chosen as the benchmark from the communications field, and the DQN, DDQN, and D3QN algorithms are chosen as benchmarks from the field of deep reinforcement learning.
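To make the described mechanisms concrete, the following is a minimal Python sketch, not the authors' implementation: the priority weights, reward coefficients, weight threshold, and sampling-probability schedule are illustrative assumptions, since the abstract does not specify their exact forms.

```python
# Sketch of three ingredients described in the abstract: dynamic priority ->
# access probability, the compound (priority + offloading) reward, and the
# dual-memory weighted experience replay. All numeric choices are assumptions.
import random
from collections import deque

def dynamic_priority(delay_sensitivity: float, computing_load: float,
                     w_delay: float = 0.7, w_load: float = 0.3) -> float:
    """Assumed priority score: more delay-sensitive / heavier tasks rank higher."""
    return w_delay * delay_sensitivity + w_load * computing_load

def access_probability(priority: float, max_priority: float) -> float:
    """Higher-priority devices get a larger chance to attempt offloading."""
    return priority / max_priority

def compound_reward(priority: float, offload_success: bool, conflict: bool,
                    alpha: float = 1.0, beta: float = 1.0) -> float:
    """Joint reward: a priority term that favors serving high-priority devices,
    plus an offloading term that penalizes access conflicts (assumed form)."""
    priority_reward = alpha * priority if offload_success else 0.0
    offloading_reward = -beta if conflict else (beta if offload_success else 0.0)
    return priority_reward + offloading_reward

class WeightedReplay:
    """Two experience memories: experiences at or above a weight threshold go
    to the high-weight memory. The probability of sampling from the high-weight
    memory is annealed over training (assumed schedule), so sampling
    probabilities vary dynamically across the two memories."""
    def __init__(self, capacity=10_000, threshold=1.0,
                 p_high_start=0.8, p_high_end=0.5, anneal_steps=50_000):
        self.high = deque(maxlen=capacity)
        self.low = deque(maxlen=capacity)
        self.threshold = threshold
        self.p_high_start, self.p_high_end = p_high_start, p_high_end
        self.anneal_steps = anneal_steps
        self.step = 0

    def store(self, experience, weight: float) -> None:
        (self.high if weight >= self.threshold else self.low).append(experience)

    def _p_high(self) -> float:
        frac = min(self.step / self.anneal_steps, 1.0)
        return self.p_high_start + frac * (self.p_high_end - self.p_high_start)

    def sample(self, batch_size: int) -> list:
        assert self.high or self.low, "replay buffer is empty"
        self.step += 1
        p_high = self._p_high()
        batch = []
        for _ in range(batch_size):
            use_high = self.high and (not self.low or random.random() < p_high)
            pool = self.high if use_high else self.low
            batch.append(random.choice(pool))
        return batch
```

Mixing random draws from two separate memories is one plausible way to realize the abstract's description of dynamically varying sampling probabilities while still breaking the time correlation among consecutive experiences; a fixed split between the two memories would also be consistent with the text.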
Extensive experiments show that, compared with these benchmark algorithms, CADP-DRL converges quickly and generalizes well, while always guaranteeing the highest successful offloading probabilities for high-priority industrial devices with the fewest offloading conflicts. © 2021, Science Press. All rights reserved.