With the rapid development of IoT and artificial intelligence technologies, vehicular edge computing has attracted increasing attention. Effectively utilizing the various communication, computational, and caching resources in the vicinity of vehicles, and employing edge computing system models to migrate computational tasks closer to the vehicles, has become a hotspot in current Internet of Vehicles research. Because the computational resources of in-vehicle devices are limited, the computational demands of vehicle users cannot be met without making full use of the computational resources available near the vehicles. Aiming to minimize the computational latency of vehicular tasks, this paper investigated a collaborative offloading mechanism for computational tasks in vehicular edge computing. First, a three-layer architecture for collaborative task offloading was designed that exploits the computational resources of parked vehicles near the road as well as those of roadside units; it comprises three tiers: a cloud server layer, a roadside unit collaboration cluster layer, and a parked vehicle collaboration cluster layer. Through collaborative offloading between the roadside unit collaboration clusters and the parked vehicle collaboration clusters, the idle computational resources of the system were fully leveraged, further improving resource utilization. Then, a roadside unit collaboration cluster partitioning algorithm based on k-means clustering was proposed to partition roadside units into collaboration clusters. A distributed iterative optimization approach based on block-coordinate upper-bound minimization was used to design a collaborative task offloading algorithm for offloading the computational tasks of terminal vehicle users.
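To make the cluster partitioning step concrete, the following is a minimal sketch of how roadside units (RSUs) could be grouped into collaboration clusters with k-means, as the abstract describes. The RSU coordinates, the number of clusters `k`, and the function name `kmeans_rsu_clusters` are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch: partition RSUs into collaboration clusters via k-means.
# Positions, k, and helper names are assumptions for illustration only.
import numpy as np

def kmeans_rsu_clusters(positions, k, iters=100, seed=0):
    """Group RSU (x, y) positions into k collaboration clusters."""
    rng = np.random.default_rng(seed)
    # Initialize centroids by sampling k distinct RSUs.
    centroids = positions[rng.choice(len(positions), size=k, replace=False)]
    for _ in range(iters):
        # Assign each RSU to its nearest centroid (Euclidean distance).
        dists = np.linalg.norm(positions[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each centroid as the mean of its assigned RSUs;
        # keep the old centroid if a cluster happens to be empty.
        new_centroids = np.array([
            positions[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Example: 12 RSUs spaced along a road segment, partitioned into 3 clusters.
rsus = np.array([[x, 0.0] for x in range(12)], dtype=float)
labels, cents = kmeans_rsu_clusters(rsus, k=3)
```

In practice the clustering criterion would likely also account for RSU load and inter-RSU link quality rather than geographic distance alone; the sketch above shows only the basic k-means partitioning mechanism.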
Finally, the proposed algorithm was compared with other algorithm schemes through experiments. The simulation results show that it achieves better performance in terms of system latency and system throughput: system latency was reduced by 23% and system throughput was increased by 28%. © 2024 Editorial Department of Journal of Sichuan University. All rights reserved.