WFormer: A Transformer-Based Soft Fusion Model for Robust Image Watermarking

Citations: 0
Authors
Luo T. [1 ]
Wu J. [2 ]
He Z. [1 ]
Xu H. [2 ]
Jiang G. [2 ]
Chang C. [3 ]
Affiliations
[1] College of Science and Technology, Ningbo University, Ningbo
[2] Faculty of Information Science and Engineering, Ningbo University, Ningbo
[3] Department of Information Engineering and Computer Science, Feng Chia University, Taichung
Keywords
convolution; cross-attention; decoding; feature extraction; noise; robustness; soft fusion; Transformer; watermarking
DOI: 10.1109/TETCI.2024.3386916
Abstract
Most deep neural network (DNN) based image watermarking models employ the encoder-noise-decoder structure, in which the watermark is simply duplicated for expansion and then directly fused with image features to produce the encoded image. However, simple duplication generates watermark over-redundancy, and communication between the cover image and the watermark, which lie in different domains, is lacking during image feature extraction and direct fusion, which degrades watermarking performance. To address these drawbacks, this paper proposes a Transformer-based soft fusion model for robust image watermarking, namely WFormer. Specifically, to expand the watermark effectively, a watermark preprocess module (WPM) is designed with Transformers to extract valid and expanded watermark features by computing self-attention. Then, to replace direct fusion, a soft fusion module (SFM) is deployed to integrate Transformers into the fusion of the image with the watermark by mining their long-range correlations. In particular, self-attention is computed to extract their own latent features, while cross-attention is learned to bridge the gap between them and embed the watermark effectively. In addition, a feature enhancement module (FEM) builds communication between the cover image and the watermark by capturing their cross-feature dependencies, which tunes image features in accordance with watermark features for better fusion. Experimental results show that the proposed WFormer outperforms existing state-of-the-art watermarking models in terms of invisibility, robustness, and embedding capacity. Furthermore, ablation results prove the effectiveness of the WPM, the FEM, and the SFM.
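The soft-fusion idea described in the abstract, in which image features attend to watermark features via cross-attention rather than being concatenated directly, can be illustrated with a minimal sketch. This is not the authors' implementation: the token counts, feature dimension, and the residual fusion step are assumptions made purely for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys_values, d_k=None):
    """Queries (e.g. image tokens) attend to keys/values (e.g. watermark tokens)."""
    d_k = d_k if d_k is not None else queries.shape[-1]
    scores = queries @ keys_values.T / np.sqrt(d_k)  # (N_q, N_kv) similarity map
    weights = softmax(scores, axis=-1)               # attention over watermark tokens
    return weights @ keys_values                     # (N_q, d) aggregated watermark info

rng = np.random.default_rng(0)
img_tokens = rng.standard_normal((64, 32))  # 64 image tokens, 32-dim (assumed sizes)
wm_tokens = rng.standard_normal((16, 32))   # 16 expanded watermark tokens
fused = img_tokens + cross_attention(img_tokens, wm_tokens)  # residual "soft" fusion
print(fused.shape)
```

Each image token aggregates a learned-weighted mixture of watermark features instead of a fixed duplicated copy, which is the essential difference from direct fusion that the SFM exploits.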
Pages: 1-18