Considerable progress has been made in semantic scene understanding of road scenes with monocular cameras. It is, however, mainly focused on certain specific classes such as cars, bicyclists and pedestrians. This work investigates traffic cones, an object category crucial for traffic control in the context of autonomous vehicles. 3D object detection using images from a monocular camera is intrinsically an ill-posed problem. In this work, we exploit the unique structure of traffic cones and propose a pipelined approach to solve this problem. Specifically, we first detect cones in images with a modified 2D object detector. The keypoints on each traffic cone are then recognized with the help of our deep structural regression network; here, the fact that the cross-ratio is invariant under projection is leveraged for network regularization. Finally, the 3D position of cones is recovered via the classical Perspective n-Point algorithm using correspondences obtained from the keypoint regression. Extensive experiments show that our approach can accurately detect traffic cones and estimate their position in the 3D world in real time. The proposed method is also deployed on a real-time, autonomous system. It runs efficiently on the low-power Jetson TX2, providing accurate 3D position estimates and allowing a race car to map and drive autonomously on an unseen track indicated by traffic cones. With the help of robust and accurate perception, our race car won both Formula Student Competitions held in Italy and Germany in 2018, cruising at a top speed of 54 km/h on our driverless platform "gotthard driverless". Visualization of the complete pipeline, mapping and navigation can be found on our project page.
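The projective invariance of the cross-ratio mentioned above can be checked numerically. The following is a minimal sketch, not the implementation used in this work: the point coordinates, the homography `H`, and the helper names `cross_ratio`, `apply_homography` and `cross_ratio_penalty` are all illustrative assumptions. It computes the cross-ratio of four collinear points before and after a projective transformation, and shows one way a squared-error penalty on the cross-ratio could serve as a regularization term.

```python
import math


def distance(p, q):
    """Euclidean distance between two 2D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def cross_ratio(p1, p2, p3, p4):
    """Unsigned cross-ratio of four collinear 2D points, from pairwise distances."""
    return (distance(p1, p3) * distance(p2, p4)) / (
        distance(p2, p3) * distance(p1, p4)
    )


def apply_homography(H, p):
    """Map a 2D point through a 3x3 homography (homogeneous divide included)."""
    x, y = p
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)


def cross_ratio_penalty(predicted_points, target_cross_ratio):
    """Hypothetical regularizer: squared deviation from a known cross-ratio."""
    return (cross_ratio(*predicted_points) - target_cross_ratio) ** 2


# Four collinear points on the line y = 2x + 1 (illustrative values).
points = [(t, 2.0 * t + 1.0) for t in (0.0, 1.0, 3.0, 7.0)]

# An arbitrary invertible homography standing in for a camera projection.
H = [
    [1.2, 0.1, 5.0],
    [-0.3, 0.9, 2.0],
    [0.001, 0.002, 1.0],
]

projected = [apply_homography(H, p) for p in points]

before = cross_ratio(*points)
after = cross_ratio(*projected)
penalty = cross_ratio_penalty(projected, before)

print(f"cross-ratio before projection: {before:.6f}")
print(f"cross-ratio after projection:  {after:.6f}")
print(f"penalty for exact keypoints:   {penalty:.3e}")
```

Because the projected points keep the same cross-ratio up to floating-point error, the penalty is near zero for geometrically consistent keypoints and grows when predicted keypoints violate the known cone geometry.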