Cyber-physical systems (CPS) capture vast numbers of images that are used in data fusion and data mining to support decision-making. Content-based image retrieval (CBIR) platforms were developed to process such large volumes of image data; however, because they operate on plaintext, they raise a range of security issues. Secure CPS CBIR (SCPS-CBIR) systems have emerged to address these threats. SCPS-CBIR faces three main challenges: retrieval accuracy, computing power, and storage capacity. High-level semantic features based on artificial intelligence, such as convolutional neural network descriptors, can improve retrieval accuracy. However, a considerable proportion of CPS devices lack the computing power to handle the resulting computational load. Moreover, although secure outsourced computing can solve the security problems, it significantly increases computing overhead. Meanwhile, most existing secure image retrieval methods do not consider devices with weak computing power, so they are ineffective for SCPS-CBIR. Accordingly, in this study, we propose a novel tripartite delayed homomorphic secret sharing (TD-HSS) protocol based on modular confusion, which provides efficient secure computing. We then apply this protocol to the SCPS-CBIR system to extract secure VGG-VLAD features and store them securely, thereby reducing the computational and storage pressures on CPS. The experimental results demonstrate the efficiency and superiority of the proposed method.