ViXNet: Vision Transformer with Xception Network for deepfakes based video and image forgery detection

Cited by: 25
Authors
Ganguly, Shreyan [1]
Ganguly, Aditya [2]
Mohiuddin, Sk [3]
Malakar, Samir [3]
Sarkar, Ram [2]
Affiliations
[1] Jadavpur Univ, Dept Construct Engn, Kolkata, India
[2] Jadavpur Univ, Dept Comp Sci & Engn, Kolkata, India
[3] Asutosh Coll, Dept Comp Sci, Kolkata, India
Keywords
Deepfakes; FaceSwap; Soft attention; Vision transformer; Forgery detection; Xception model
DOI
10.1016/j.eswa.2022.118423
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
With the advent of image generative technologies, facial manipulation techniques have proliferated, allowing people to easily modify media such as videos and images by replacing the identity or facial expression of a target person with that of another person. Colloquially, these manipulated videos and images are termed "deepfakes". As a result, every piece of content in digital media comes with a question: is this authentic? Hence, there is an unprecedented need for a competent deepfakes detection method. The rapid changes in forging methods make this a very challenging task, so generalization of the detection methods is of utmost importance. However, the generalization strength of the prevailing deepfakes detection methods is not satisfactory: these models perform well when trained and tested on the same dataset but fail to perform satisfactorily when trained on one dataset and tested on another. Most recent deep-learning-aided deepfakes detection techniques look for a consistent pattern among the leftover artifacts in specific facial regions of the target face rather than the entire face. To this end, we propose a Vision Transformer with Xception Network (ViXNet) to learn the consistency of these almost imperceptible artifacts left by deepfaking methods over the entire facial region. ViXNet comprises two branches: one learns inconsistencies among local face regions by combining a patch-wise self-attention module with a vision transformer, and the other generates global spatial features using a deep convolutional neural network. To assess the performance of ViXNet, we evaluate it under two experimental setups, intra-dataset and inter-dataset, on three standard deepfakes datasets: two video datasets, FaceForensics++ and Celeb-DF (V2), and one image dataset called Deepfakes.
We attain AUC scores of 98.57% (83.60%), 99.26% (74.78%), and 98.93% (75.13%) in the intra-dataset (inter-dataset) setups on the FaceForensics++, Celeb-DF (V2), and Deepfakes datasets, respectively. Additionally, we evaluate ViXNet on the Deepfake Detection Challenge (DFDC) dataset and obtain an AUC score of 86.32% and an F1-score of 79.06%. The performance of the proposed model is comparable to that of state-of-the-art methods, and the results support its robustness and generalization ability.
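The intra-dataset versus inter-dataset protocol and the two reported metrics can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names `auc_score`, `f1_score`, and `evaluate`, the 0.5 decision threshold, and the toy scores are all assumptions made for illustration, and the detector is replaced by precomputed fake-probabilities.

```python
def auc_score(labels, scores):
    """ROC AUC as the fraction of (fake, real) pairs ranked correctly.

    labels: 1 = fake, 0 = real. Tied scores count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one fake and one real sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def f1_score(labels, scores, threshold=0.5):
    """F1 of the 'fake' class; the threshold is an assumed value."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def evaluate(train_name, test_sets):
    """Label each test set 'intra' (same as training set) or 'inter'.

    test_sets maps a dataset name to (labels, scores), where scores stand in
    for the outputs of a detector trained on train_name (hypothetical data).
    """
    results = {}
    for test_name, (labels, scores) in test_sets.items():
        setup = "intra" if test_name == train_name else "inter"
        results[(setup, test_name)] = auc_score(labels, scores)
    return results


# Toy numbers only; they are not results from the paper.
toy_sets = {
    "FaceForensics++": ([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1]),
    "Celeb-DF (V2)": ([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]),
}
toy_results = evaluate("FaceForensics++", toy_sets)
```

In this toy run the detector is treated as trained on FaceForensics++, so that test set is scored under the intra-dataset setup and Celeb-DF (V2) under the inter-dataset setup, mirroring the paired figures quoted in the abstract.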
Pages: 15