While artificial intelligence (AI) based algorithms have become the standard for video quality assessment (VQA) in streaming services, the true power of AI is yet to be harnessed for situation-aware streaming in power-constrained wireless networks. In this work, we propose an architecture for situation-aware streaming that identifies important events in live feeds from surveillance cameras and allocates transmit power so as to reduce long-term power consumption. Real-time video surveillance is a crucial technology for smart cities and requires the deployment of a large number of cameras, both road-side and aerial. The proposed architecture is designed to (i) reduce the total power consumption of all surveillance cameras, thereby reducing greenhouse gas emissions, (ii) extend the flight duration of aerial cameras (e.g., drones), (iii) reduce manual searching for desired events/objects, and (iv) improve the overall quality of experience (QoE). Typically, only important events (e.g., a car running a red light) are of interest to law enforcement officers; hence, if the important sections of the video are received at high quality, the long-term QoE increases. The architecture has two modules, viz., a tiny neural network (with a small number of hidden layers) at the source, which incurs a small computational cost, albeit at the price of low accuracy, and a deep neural network (DNN) (with many hidden layers) at the destination that detects events with high accuracy. We show that there exists an optimal number of transmitted frames that maximizes QoE while reducing the required transmit power of the source compared to streaming without situation awareness.
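The following is a minimal sketch, not the paper's implementation, of the two-module pipeline described above: a cheap, noisy source-side filter decides which frames to transmit at high power, and an accurate destination-side model confirms events. The names (tiny_score, deep_score, POWER_HIGH, POWER_LOW, TINY_THRESHOLD) and the synthetic frame stream are all illustrative assumptions, not artifacts of the proposed architecture.

```python
# Sketch of situation-aware power allocation with a two-stage detector cascade.
# Both "networks" are stubs: tiny_score mimics a small, noisy on-camera model,
# deep_score mimics an accurate destination DNN. Power values are assumed units.
import numpy as np

rng = np.random.default_rng(0)

POWER_HIGH = 1.0      # transmit power for frames flagged as important (assumed)
POWER_LOW = 0.1       # transmit power for background frames (assumed)
TINY_THRESHOLD = 0.5  # source-side decision threshold (assumed)

def tiny_score(frame):
    """Stand-in for the tiny source-side network: cheap but noisy."""
    return frame["importance"] + rng.normal(0.0, 0.2)

def deep_score(frame):
    """Stand-in for the destination DNN: accurate event confirmation."""
    return frame["importance"]

# Simulate a stream in which roughly 5% of frames contain an event.
frames = [{"importance": float(rng.random() < 0.05)} for _ in range(1000)]

power_aware = 0.0
confirmed_events = 0
for frame in frames:
    if tiny_score(frame) > TINY_THRESHOLD:
        # Tiny net flags the frame: transmit at high quality/power and
        # let the destination DNN confirm whether an event occurred.
        power_aware += POWER_HIGH
        if deep_score(frame) > 0.5:
            confirmed_events += 1
    else:
        # Background frame: transmit at low quality/power.
        power_aware += POWER_LOW

power_naive = POWER_HIGH * len(frames)  # baseline: always transmit at high power
print(f"confirmed events:        {confirmed_events}")
print(f"power (situation-aware): {power_aware:.1f}")
print(f"power (always high):     {power_naive:.1f}")
```

Raising TINY_THRESHOLD lowers power consumption but risks missing events, while lowering it approaches the always-high-power baseline; this tension is one way to read the trade-off behind the optimal number of transmitted frames mentioned above.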