Emergencies such as terrorist attacks, which cause large numbers of casualties, have spread worldwide and have long been a global issue. Previous researchers employed traditional two-dimensional (2D) models to simulate the crowd dynamics between terrorists and civilians. However, these 2D models simplify real situations and have yet to consider individual heights and vision. Therefore, more accurate models are needed. In this work, we extend the 2D model and propose a three-dimensional (3D) model whose core is to bring individual heights into the decision-making process of 3D agents, for both terrorists and civilians. We first build the 3D environment. For the mechanism of crowd dynamics, under the perception-decision-behavior framework, our 3D model includes individualized heights for all agents. Comparing the 2D and 3D models, we find that individual heights and vision greatly shape the outcomes. Height heterogeneity has significant effects on attack deaths and slight effects on stampede deaths, because smaller heights slow the moving speed of the crowd and higher height heterogeneity impairs the visibility of civilians. The effects of height heterogeneity on deaths become more pronounced when the group size of civilians exceeds 1900. We identify a phase-transition threshold of 3030, beyond which stampede deaths exceed attack deaths. Moreover, the size effect of heroes follows the law of diminishing marginal returns. We also find that the number of heroes should be twice that of terrorists, which can guide better allocation of police forces and other public resources for emergency response. To strengthen the counter-force, the heterogeneity effects of civilian heights should be controlled and self-motivated heroes should be encouraged, both of which are critical for public safety worldwide.