With the rise of 5G networks and the rapid growth in user equipment, a gap has opened between the stringent requirements of emerging applications and the actual capabilities of the Internet. In particular, transmitting data over long network links is costly, a problem that edge caching (EC) can address. EC stores content at the edge server to avoid the high cost of backhaul link communication. However, existing EC work commonly assumes either that content popularity is known or that caching proceeds in two phases, with content popularity predicted before the caching action is taken. The former assumption is rarely feasible, and the latter increases the cost of real-world deployment. To cope with this problem, this paper proposes a caching strategy that can be deployed end to end and incurs a lower caching cost. Specifically, we first investigate the system cost, which comprises the network communication cost, the cache over-storage cost, and the cache replacement cost. We then model the EC problem as a Markov Decision Process (MDP) and apply the Double Deep Recurrent Q Network (DDRQN) algorithm to solve it. Finally, compared with other intelligent caching strategies, the proposed strategy improves the system reward by up to 24% and the cache hit rate by up to 22% under certain conditions.
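As a rough illustration of the "double" component of DDRQN, the sketch below computes a standard double Q-learning target: the online network selects the next action and the target network evaluates it. This is not the paper's implementation. The function name `double_q_target`, the discount factor, and all numeric values are assumptions made for illustration, and the recurrent network is replaced by plain lists of Q-values.

```python
def double_q_target(reward, gamma, q_online_next, q_target_next, done=False):
    """Double Q-learning target for one transition (hypothetical helper).

    q_online_next / q_target_next: Q-values of the next state under the
    online and target networks, given here as plain lists (assumption).
    """
    if done:
        # Terminal transition: no bootstrapped future value.
        return float(reward)
    # Action selection uses the online network...
    best_action = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    # ...while action evaluation uses the target network.
    return float(reward + gamma * q_target_next[best_action])


# Illustrative numbers only; each index stands for one caching action.
q_online_next = [1.0, 3.0, 2.0]
q_target_next = [0.5, 1.5, 4.0]
y = double_q_target(reward=1.0, gamma=0.9,
                    q_online_next=q_online_next,
                    q_target_next=q_target_next)
```

With these values, the online network selects action 1, so the target bootstraps from the target network's estimate for that action rather than from its own maximum (action 2). Decoupling selection from evaluation in this way is what reduces the overestimation of a single-network maximum.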