UnionFormer: Unified-Learning Transformer with Multi-View Representation for Image Manipulation Detection and Localization

Cited by: 2
Authors
Li, Shuaibo [1, 2]
Ma, Wei [1]
Guo, Jianwei [2]
Xu, Shibiao [3]
Li, Benchong [1]
Zhan, Xiaopeng [2]
Affiliations
[1] Beijing Univ Technol, Beijing, Peoples R China
[2] Chinese Acad Sci, Inst Automat, MAIS, Beijing, Peoples R China
[3] Beijing Univ Posts & Telecommun, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
NETWORKS;
DOI
10.1109/CVPR52733.2024.01190
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We present UnionFormer, a novel framework that integrates tampering clues across three views by unified learning for image manipulation detection and localization. Specifically, we construct a BSFI-Net to extract tampering features from RGB and noise views, achieving enhanced responsiveness to boundary artifacts while modulating spatial consistency at different scales. Additionally, to explore the inconsistency between objects as a new view of clues, we combine object consistency modeling with tampering detection and localization into a three-task unified learning process, allowing them to promote and improve mutually. Therefore, we acquire a unified manipulation discriminative representation under multi-scale supervision that consolidates information from three views. This integration facilitates highly effective concurrent detection and localization of tampering. We perform extensive experiments on diverse datasets, and the results show that the proposed approach outperforms state-of-the-art methods in tampering detection and localization.
Pages: 12523 - 12533
Number of pages: 11
Related papers
50 in total
  • [21] Combining multi-view learning and consistent representation for face forgery detection
    Zhang J.
    Yu M.
    Yang J.
    Guofang Keji Daxue Xuebao/Journal of National University of Defense Technology, 2023, 45 (04) : 28 - 36
  • [22] UniFusion: Unified Multi-view Fusion Transformer for Spatial-Temporal Representation in Bird's-Eye-View
    Qin, Zequn
    Chen, Jingyu
    Chen, Chao
    Chen, Xiaozhi
    Li, Xi
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 8656 - 8665
  • [23] Multi-view Geometry and Deep Learning Based Drone Detection and Localization
    Shinde, Chinmay
    Lima, Rolif
    Das, Kaushik
    2019 FIFTH INDIAN CONTROL CONFERENCE (ICC), 2019, : 289 - 294
  • [24] A Multi-View Unified Feature Learning Network for EEG Epileptic Seizure Detection
    Liu, Yu
    Li, Yang
    2019 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (IEEE SSCI 2019), 2019, : 2608 - 2612
  • [25] Rethinking Depth Estimation for Multi-View Stereo: A Unified Representation
    Peng, Rui
    Wang, Rongjie
    Wang, Zhenyu
    Lai, Yawen
    Wang, Ronggang
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022, : 8635 - 8644
  • [26] DEEP MULTI-VIEW ROBUST REPRESENTATION LEARNING
    Jiao, Zhenyu
    Xu, Chao
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 2851 - 2855
  • [27] Multi-View Concept Learning for Data Representation
    Guan, Ziyu
    Zhang, Lijun
    Peng, Jinye
    Fan, Jianping
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2015, 27 (11) : 3016 - 3028
  • [28] A survey on representation learning for multi-view data
    Qin, Yalan
    Zhang, Xinpeng
    Yu, Shui
    Feng, Guorui
    NEURAL NETWORKS, 2025, 181
  • [29] Unsupervised Multi-View Gaze Representation Learning
    Gideon, John
    Su, Shan
    Stent, Simon
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2022, 2022, : 4997 - 5005
  • [30] Tensorized Multi-view Subspace Representation Learning
    Zhang, Changqing
    Fu, Huazhu
    Wang, Jing
    Li, Wen
    Cao, Xiaochun
    Hu, Qinghua
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2020, 128 (8-9) : 2344 - 2361