Cross-correlation is the core similarity operation in Siamese-based trackers and generally produces response maps with high values at the target center. This operation overlooks global context, including the boundary and surrounding background of the target, which is conducive to target localization and bounding box regression. In this work, we propose a Frequency and Spatial domain Filter Network (FSFNet) for visual object tracking, which exploits abundant global context in the frequency domain and enhances the target representation in the spatial domain. First, frequency filters generated from the template and search patches are applied to the target, capturing and enhancing valuable frequency components. These enhanced frequency components describe the global regions of interest in the spatial domain. Second, spatial domain convolutions are adopted to highlight local details of the target. Compared with mechanisms such as depth-wise correlation, pixel-wise correlation, and transformers, our method provides more accurate tracking results. Experiments on five benchmarks show that our tracker obtains competitive results. For example, our tracker achieves an AUC score of 81.2% on TrackingNet, outperforming the state-of-the-art two-stream tracker TrDiMP by 2.8% while running at 50 FPS.
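To illustrate why filtering in the frequency domain carries global context, the following is a minimal NumPy sketch of element-wise frequency filtering on a feature map. It is not the FSFNet implementation: the function name `apply_frequency_filter`, the tensor shapes, and the hand-built filters are assumptions chosen for illustration, whereas the filters described above are generated from the template and search patches.

```python
import numpy as np


def apply_frequency_filter(features, freq_filter):
    """Re-weight the frequency components of a (C, H, W) feature map.

    Hypothetical helper for illustration only. freq_filter must broadcast
    against the rFFT spectrum of shape (C, H, W // 2 + 1).
    """
    _, h, w = features.shape
    # 2-D real FFT over the spatial axes: each bin summarises the whole map.
    spectrum = np.fft.rfft2(features, axes=(-2, -1))
    # Element-wise product: enhance or suppress individual frequencies.
    filtered = spectrum * freq_filter
    # Back to the spatial domain with the original height and width.
    return np.fft.irfft2(filtered, s=(h, w), axes=(-2, -1))


rng = np.random.default_rng(0)
features = rng.standard_normal((4, 8, 8))  # assumed toy feature map

# An all-ones filter leaves the features unchanged.
identity = np.ones((4, 8, 5))
same = apply_frequency_filter(features, identity)

# Keeping only the zero-frequency bin changes every spatial location at once:
# each channel collapses to its spatial mean.
dc_only = np.zeros((4, 8, 5))
dc_only[:, 0, 0] = 1.0
smoothed = apply_frequency_filter(features, dc_only)
```

Because every frequency bin depends on all spatial positions, scaling a single bin alters the entire map, which is the sense in which a frequency filter acts globally while a small spatial convolution acts locally.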