Most deep neural network (DNN)-based image watermarking models employ an encoder-noise-decoder structure, in which the watermark is simply duplicated for expansion and then directly fused with image features to produce the encoded image. However, simple duplication introduces watermark over-redundancy, and communication between the cover image and the watermark, which reside in different domains, is lacking during image feature extraction and direct fusion, which degrades watermarking performance. To address these drawbacks, this paper proposes a Transformer-based soft fusion model for robust image watermarking, named WFormer. Specifically, to expand the watermark effectively, a watermark preprocessing module (WPM) is designed with Transformers to extract valid, expanded watermark features by computing self-attention. Then, to replace direct fusion, a soft fusion module (SFM) integrates Transformers into the fusion of the image and the watermark by mining their long-range correlations: self-attention is computed to extract the latent features of each modality, while cross-attention is learned to bridge the gap between them so that the watermark is embedded effectively. In addition, a feature enhancement module (FEM) builds communication between the cover image and the watermark by capturing their cross-feature dependencies, tuning image features in accordance with watermark features for better fusion. Experimental results show that the proposed WFormer outperforms existing state-of-the-art watermarking models in terms of invisibility, robustness, and embedding capacity. Furthermore, ablation results confirm the effectiveness of the WPM, the FEM, and the SFM.
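To make the soft-fusion idea concrete, the following is a minimal PyTorch sketch of a block that combines per-modality self-attention with image-to-watermark cross-attention. The module name, dimensions, and layer composition are assumptions for illustration, not the paper's actual WFormer architecture.

```python
# Illustrative sketch only; the real SFM layout, dimensions, and
# residual/normalization scheme may differ from the paper.
import torch
import torch.nn as nn


class SoftFusionBlock(nn.Module):
    """Fuses image tokens with watermark tokens via self- and cross-attention."""

    def __init__(self, dim: int = 256, num_heads: int = 8):
        super().__init__()
        # Self-attention extracts latent features within each modality.
        self.img_self_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.wm_self_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # Cross-attention bridges the gap: image tokens query watermark tokens.
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm_img = nn.LayerNorm(dim)
        self.norm_wm = nn.LayerNorm(dim)
        self.norm_fused = nn.LayerNorm(dim)

    def forward(self, img_tokens: torch.Tensor, wm_tokens: torch.Tensor) -> torch.Tensor:
        # img_tokens: (B, N_img, dim); wm_tokens: (B, N_wm, dim)
        img, _ = self.img_self_attn(img_tokens, img_tokens, img_tokens)
        img = self.norm_img(img_tokens + img)  # residual connection + norm
        wm, _ = self.wm_self_attn(wm_tokens, wm_tokens, wm_tokens)
        wm = self.norm_wm(wm_tokens + wm)
        # Image features attend to watermark features ("soft" embedding
        # rather than direct concatenation or addition).
        fused, _ = self.cross_attn(img, wm, wm)
        return self.norm_fused(img + fused)


if __name__ == "__main__":
    # Example: fuse 64x64 image patch tokens with 256 expanded watermark tokens.
    block = SoftFusionBlock(dim=256, num_heads=8)
    img_tokens = torch.randn(2, 4096, 256)  # (batch, patches, channels)
    wm_tokens = torch.randn(2, 256, 256)    # (batch, watermark tokens, channels)
    out = block(img_tokens, wm_tokens)
    print(out.shape)  # torch.Size([2, 4096, 256])
```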