Mainstream image manipulation localization methods usually fuse the inconsistent features of different streams through simple operations, which leads to feature redundancy and to misdetected pixels in tampered regions. We therefore propose a novel network for image manipulation localization that combines a dual-stream enhancement encoder with an attention optimization decoder. First, the dual-stream enhancement encoder module self-reinforces the extracted dual-stream multi-scale features and lets the two streams interact, so that the different kinds of tampering information complement one another and more attention is paid to tampering features. Next, a multi-scale receptive field strategy is introduced to explore multi-scale context information, and an adjacent-level feature aggregation module is designed to fuse multi-scale features from adjacent levels. Finally, to enhance localization through the cooperation of the tampered region and the genuine region, an attention optimization decoder module is designed to eliminate wrongly predicted edge pixels in the initial tampered-region prediction, refining the localization step by step. Extensive experiments are conducted on four mainstream public datasets (NIST16, Coverage, Columbia, and CASIA) and two realistic, challenging datasets (IMD20 and Wild) to compare against mainstream manipulation localization methods. The proposed method achieves superior performance on all six datasets, both without fine-tuning and with fine-tuning, which demonstrates that it makes full use of diverse forgery clues to achieve higher localization accuracy and stronger robustness.
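
To make the idea of fusing multi-scale features from adjacent levels concrete, the following is a minimal illustrative sketch and not the module proposed here. It assumes each level is a single-channel 2-D map whose height and width are exactly half those of the next finer level, and it uses a simple element-wise mean as the fusion rule. The function names (`upsample_nearest_2x`, `fuse_adjacent`, `aggregate_pyramid`) and the fusion rule are hypothetical choices made only for illustration.

```python
def upsample_nearest_2x(fmap):
    """Double the height and width of a 2-D map by repeating each value."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out


def fuse_adjacent(fine, coarse):
    """Fuse a fine map with the next-coarser map.

    Assumption: the fusion rule is an element-wise mean; the actual
    module may use a learned or attention-based fusion instead.
    """
    up = upsample_nearest_2x(coarse)
    if len(up) != len(fine) or len(up[0]) != len(fine[0]):
        raise ValueError("coarse map must be exactly half the size of the fine map")
    return [[(f + u) / 2.0 for f, u in zip(f_row, u_row)]
            for f_row, u_row in zip(fine, up)]


def aggregate_pyramid(levels):
    """Top-down pass: fold the coarsest level into each finer neighbour in turn.

    `levels` is ordered from finest to coarsest.
    """
    fused = levels[-1]
    for fine in reversed(levels[:-1]):
        fused = fuse_adjacent(fine, fused)
    return fused


# Toy three-level pyramid: 4x4, 2x2, and 1x1 maps with constant values.
levels = [
    [[1.0] * 4 for _ in range(4)],
    [[3.0] * 2 for _ in range(2)],
    [[5.0]],
]
fused = aggregate_pyramid(levels)
```

In this toy pyramid the coarsest value is propagated upward one level at a time, so each level is only ever combined with its immediate neighbour, which is the property the adjacent-level design is meant to illustrate.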