MGQFormer: Mask-Guided Query-Based Transformer for Image Manipulation Localization

Authors
Zeng, Kunlun [1]
Cheng, Ri [1]
Tan, Weimin [1]
Yan, Bo [1]
Affiliations
[1] Fudan Univ, Shanghai Key Lab Intelligent Informat Proc, Sch Comp Sci, Shanghai, Peoples R China
Funding
Natural Science Foundation of Shanghai
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Deep learning-based models have made great progress in image tampering localization, which aims to distinguish between manipulated and authentic regions. However, these models suffer from inefficient training. This is because they use ground-truth mask labels mainly through the cross-entropy loss, which prioritizes per-pixel precision but disregards the spatial location and shape details of manipulated regions. To address this problem, we propose a Mask-Guided Query-based Transformer Framework (MGQFormer), which uses ground-truth masks to guide the learnable query token (LQT) in identifying the forged regions. Specifically, we extract feature embeddings of ground-truth masks as the guiding query token (GQT) and feed GQT and LQT separately into MGQFormer to estimate fake regions. We then make MGQFormer learn the position and shape information in ground-truth mask labels by proposing a mask-guided loss that reduces the feature distance between GQT and LQT. We also observe that this mask-guided training strategy has a significant impact on the convergence speed of MGQFormer training. Extensive experiments on multiple benchmarks show that our method significantly improves over state-of-the-art methods.
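The training objective described in the abstract can be illustrated with a minimal sketch: a per-pixel cross-entropy term plus a term that pulls the LQT toward the GQT. This is not the authors' implementation. The abstract only says "feature distance", so the mean squared error used here, the `weight` parameter, the token shapes, and all function names are assumptions made for illustration.

```python
import numpy as np


def pixel_cross_entropy(pred_prob, mask, eps=1e-7):
    """Mean binary cross-entropy between predicted forgery probabilities
    and the ground-truth mask (both arrays of the same shape, values in [0, 1])."""
    p = np.clip(pred_prob, eps, 1.0 - eps)  # avoid log(0)
    return float(np.mean(-(mask * np.log(p) + (1.0 - mask) * np.log(1.0 - p))))


def mask_guided_loss(lqt, gqt):
    """Distance between the learnable query token (LQT) and the guiding
    query token (GQT). Assumption: mean squared error stands in for the
    unspecified 'feature distance'."""
    return float(np.mean((lqt - gqt) ** 2))


def total_loss(pred_prob, mask, lqt, gqt, weight=1.0):
    """Hypothetical combined objective: cross-entropy plus a weighted
    mask-guided term. The weighting scheme is an assumption."""
    return pixel_cross_entropy(pred_prob, mask) + weight * mask_guided_loss(lqt, gqt)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mask = (rng.random((8, 8)) > 0.7).astype(float)  # toy ground-truth mask
    pred = rng.random((8, 8))                        # toy predicted probabilities
    lqt = rng.normal(size=(1, 16))                   # toy learnable query token
    gqt = rng.normal(size=(1, 16))                   # toy guiding query token
    print("total loss:", total_loss(pred, mask, lqt, gqt))
```

In this sketch the second term is zero exactly when the two tokens coincide, so minimizing it moves the LQT toward the mask-derived GQT while the cross-entropy term still supervises the per-pixel prediction.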
Pages: 6944-6952
Page count: 9