CNN-Based Multi-Modal Camera Model Identification on Video Sequences

Cited by: 8
Authors
Dal Cortivo, Davide [1]
Mandelli, Sara [1]
Bestagini, Paolo [1]
Tubaro, Stefano [1]
Affiliations
[1] Politecn Milan, Dipartimento Elettron Informaz & Bioingn, I-20133 Milan, Italy
Keywords
camera model identification; video forensics; audio forensics; convolutional neural networks
DOI
10.3390/jimaging7080135
Chinese Library Classification number
TB8 [Photographic technology]
Discipline classification code
0804
Abstract
Identifying the source camera of images and videos has gained significant importance in multimedia forensics. It allows tracing data back to their creator, which helps resolve copyright infringement cases and expose the authors of heinous crimes. In this paper, we focus on camera model identification for video sequences: given a video under analysis, detect the camera model used to acquire it. To this end, we develop two different CNN-based camera model identification methods that work in a novel multi-modal scenario. Unlike mono-modal methods, which use only the visual or only the audio information from the investigated video, the proposed multi-modal methods jointly exploit audio and visual information. We test the proposed methods on the well-known Vision dataset, which collects almost 2000 video sequences from different devices. Experiments consider both native videos, taken directly from their acquisition devices, and videos uploaded to social media platforms such as YouTube and WhatsApp. The results show that the proposed multi-modal approaches significantly outperform their mono-modal counterparts, making them a valuable strategy for the problem and opening future research to even more challenging scenarios.
Pages: 20