FST-Net: Exploiting Frequency Spatial Temporal Information for Low-Quality Fake Video Detection

Cited by: 1
Authors
Zhang, Min [1, 2]
Liu, Xiaohan [3]
Liu, Chenyu [3]
Zhang, Xueqi [1, 2]
Xie, Haiyong [2, 4]
Affiliations
[1] Univ Sci & Technol China, Hefei 230026, Peoples R China
[2] Minist Culture & Tourism, Key Lab Cyberculture Content Cognit & Detect, Hefei 230027, Anhui, Peoples R China
[3] Natl Engn Lab Publ Safety Risk Percept & Control, Beijing 100040, Peoples R China
[4] Capital Med Univ, Adv Innovat Ctr Human Brain Protect, Beijing 100054, Peoples R China
Keywords
forgery detection; fake video detection; spectrum decomposition; spatial-temporal features across frames; attention mechanism
DOI
10.1109/ICTAI52525.2021.00087
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification code
081104; 0812; 0835; 1405
Abstract
Recently, state-of-the-art face manipulation algorithms have made significant progress in forging images and videos that can deceive human eyes and even detection algorithms, which brings new challenges to forgery detection. In particular, detection algorithms perform worse on forged videos than on forged images, and detection becomes dramatically harder for low-quality videos. To address this challenge, we propose a novel dual-stream architecture, referred to as FST-Net, for jointly mining forgery features in the frequency, spatial, and temporal domains. Specifically, we extract the spectral information of different frequency bands to expose intra-frame artifacts, and use a separable 3D CNN (S3D) to extract spatio-temporal features across groups of video frames. Moreover, to make the model focus on the tampered area, we add an attention layer to both backbone networks. Comprehensive experiments show that our model outperforms existing methods in video detection on the challenging FaceForensics++ dataset, especially on low-quality videos.
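The abstract does not say how the frequency bands are obtained, so the following is only a minimal sketch of one common way to split a frame into frequency bands. It assumes radial masks on the centered 2D FFT of a grayscale frame; the function name `split_frequency_bands` and the cutoff values are hypothetical and are not taken from the paper.

```python
import numpy as np

def split_frequency_bands(frame, cutoffs=(0.1, 0.3)):
    """Split a 2D grayscale frame into low/mid/high frequency components.

    Assumption: radial band masks on the centered 2D FFT. The cutoffs are
    illustrative fractions of the half-size of the spectrum, not paper values.
    """
    h, w = frame.shape
    # Centered spectrum: the zero-frequency bin sits at (h // 2, w // 2).
    spectrum = np.fft.fftshift(np.fft.fft2(frame))
    # Normalized distance of every frequency bin from the spectrum center.
    yy, xx = np.mgrid[0:h, 0:w]
    radius = np.hypot((yy - h // 2) / (h / 2), (xx - w // 2) / (w / 2))
    edges = (0.0, *cutoffs, np.inf)
    bands = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Keep only the bins whose radius falls inside this band.
        mask = (radius >= lo) & (radius < hi)
        component = np.fft.ifft2(np.fft.ifftshift(spectrum * mask)).real
        bands.append(component)
    return bands
```

Because the masks partition the spectrum, the returned components sum back to the input frame. In a pipeline like the one described, each component could be fed to a frame-level network as a separate channel, though that wiring is also an assumption here.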
Pages: 536-543
Number of pages: 8