DEEPFAKE SPEECH DETECTION THROUGH EMOTION RECOGNITION: A SEMANTIC APPROACH

Cited by: 22
Authors
Conti, Emanuele [1]
Salvi, Davide [1]
Borrelli, Clara [1]
Hosler, Brian [2]
Bestagini, Paolo [1]
Antonacci, Fabio [1]
Sarti, Augusto [1]
Stamm, Matthew C. [2]
Tubaro, Stefano [1]
Affiliations
[1] Politecnico di Milano, Dipartimento di Elettronica, Informazione e Bioingegneria, Milan, Italy
[2] Drexel University, Department of Electrical and Computer Engineering, Philadelphia, PA 19104 USA
Keywords
deepfake; audio forensics; deep learning
DOI
10.1109/ICASSP43922.2022.9747186
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In recent years, audio and video deepfake technology has advanced relentlessly, severely impacting people's reputation and reliability. Several factors have facilitated the growing deepfake threat. On the one hand, the hyper-connected society of social and mass media enables the spread of multimedia content worldwide in real time, facilitating the dissemination of counterfeit material. On the other hand, neural network-based techniques have made deepfakes easier to produce and harder to detect, showing that the analysis of low-level features is no longer sufficient for the task. This situation makes it crucial to design systems that can detect deepfakes at both the video and audio levels. In this paper, we propose a new audio spoofing detection system leveraging emotional features. The rationale behind the proposed method is that audio deepfake techniques cannot correctly synthesize natural emotional behavior. Therefore, we feed our deepfake detector with high-level features obtained from a state-of-the-art Speech Emotion Recognition (SER) system. Because the descriptors used capture semantic audio information, the proposed system proves robust in cross-dataset scenarios, outperforming the considered baseline on multiple datasets.
Pages: 8962-8966
Number of pages: 5