HCiT: Deepfake Video Detection Using a Hybrid Model of CNN features and Vision Transformer

Cited by: 7
Authors
Kaddar, Bachir [1]
Fezza, Sid Ahmed [2]
Hamidouche, Wassim [3]
Akhtar, Zahid [4]
Hadid, Abdenour [5]
Affiliations
[1] Univ Ibn Khaldoun, Dept Nat Sci & Life, Tiaret, Algeria
[2] Natl Inst Telecommun & ICT, Oran, Algeria
[3] Univ Rennes, INSA Rennes, CNRS, IETR UMR 6164, Rennes, France
[4] State Univ New York Polytech Inst, Utica, NY USA
[5] Univ Polytech Hauts de France, Univ Lille, CNRS, Cent Lille, UMR 8520, IEMN, Valenciennes, France
Keywords
DeepFake video; detection; convolutional neural network; vision transformer; hybrid;
DOI
10.1109/VCIP53242.2021.9675402
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The number of new falsified video contents is dramatically increasing, making the need to develop effective deepfake detection methods more urgent than ever. Even though many existing deepfake detection approaches show promising results, the majority of them still suffer from a number of critical limitations. In particular, they generalize poorly to unseen or new deepfake generation methods. Consequently, in this paper, we propose a deepfake detection method called HCiT, which combines a Convolutional Neural Network (CNN) with a Vision Transformer (ViT). The HCiT hybrid architecture combines the CNN's strength at extracting local information with the ViT's self-attention mechanism to improve detection accuracy. In this hybrid architecture, the feature maps extracted by the CNN are fed into a ViT model that determines whether a given video is fake or real. Experiments were performed on the FaceForensics++ and DeepFake Detection Challenge preview datasets, and the results show that the proposed method significantly outperforms state-of-the-art methods. In addition, the HCiT method shows a great capacity for generalization on datasets covering various deepfake generation techniques. The source code is available at: https://github.com/KADDAR-Bachir/HCiT
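The data flow described in the abstract (CNN feature maps handed to a self-attention model that outputs a real/fake decision) can be illustrated with a toy sketch in plain Python. This is not the authors' implementation, which is available at the repository above. Every name here (`feature_map_to_tokens`, `self_attention`, `frame_score`, `video_is_fake`) is hypothetical, the attention uses identity projections instead of learned weights, mean pooling stands in for a ViT class token, and averaging frame scores into a video-level decision is an assumption that the abstract does not specify.

```python
import math

def feature_map_to_tokens(fmap):
    """Flatten a C x H x W feature map into H*W tokens, each of dimension C."""
    channels, height, width = len(fmap), len(fmap[0]), len(fmap[0][0])
    return [[fmap[c][y][x] for c in range(channels)]
            for y in range(height) for x in range(width)]

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys, and values are the tokens themselves
    (identity projections); a real ViT learns separate projections.
    """
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    out = []
    for q in tokens:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[d] for w, v in zip(weights, tokens))
                    for d in range(dim)])
    return out

def frame_score(fmap, head_weights, bias=0.0):
    """Score in (0, 1) that one frame is fake, using a toy linear head."""
    tokens = self_attention(feature_map_to_tokens(fmap))
    dim = len(tokens[0])
    # Assumption: mean pooling replaces the class token of a real ViT.
    pooled = [sum(t[d] for t in tokens) / len(tokens) for d in range(dim)]
    logit = sum(w * p for w, p in zip(head_weights, pooled)) + bias
    return 1.0 / (1.0 + math.exp(-logit))

def video_is_fake(frame_maps, head_weights, threshold=0.5):
    """Average per-frame scores into a video-level decision (assumed rule)."""
    scores = [frame_score(f, head_weights) for f in frame_maps]
    return sum(scores) / len(scores) > threshold

if __name__ == "__main__":
    # Two hypothetical 2-channel, 2x2 feature maps standing in for CNN outputs.
    frames = [
        [[[0.1, 0.2], [0.3, 0.4]], [[0.5, 0.6], [0.7, 0.8]]],
        [[[0.9, 0.8], [0.7, 0.6]], [[0.5, 0.4], [0.3, 0.2]]],
    ]
    print(video_is_fake(frames, head_weights=[1.0, -1.0]))
```

The point of the sketch is the shape handling: each spatial location of a feature map becomes one token, so self-attention can relate distant regions of a face that a convolution only sees locally.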
Pages: 5