DEEPFAKE SPEECH DETECTION THROUGH EMOTION RECOGNITION: A SEMANTIC APPROACH

Cited by: 22
Authors
Conti, Emanuele [1]
Salvi, Davide [1]
Borrelli, Clara [1]
Hosler, Brian [2]
Bestagini, Paolo [1]
Antonacci, Fabio [1]
Sarti, Augusto [1]
Stamm, Matthew C. [2]
Tubaro, Stefano [1]
Affiliations
[1] Politecn Milan, Dipartimento Elettron Informaz & Bioingn, Milan, Italy
[2] Drexel Univ, Dept Elect & Comp Engn, Philadelphia, PA 19104 USA
Source
2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2022
Keywords
deepfake; audio forensics; deep learning
DOI
10.1109/ICASSP43922.2022.9747186
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In recent years, audio and video deepfake technology has advanced relentlessly, severely impacting people's reputation and reliability. Several factors have facilitated the growing deepfake threat. On the one hand, the hyper-connected society of social and mass media enables the spread of multimedia content worldwide in real time, facilitating the dissemination of counterfeit material. On the other hand, neural network-based techniques have made deepfakes easier to produce and harder to detect, showing that the analysis of low-level features is no longer sufficient for the task. This situation makes it crucial to design systems that can detect deepfakes at both video and audio levels. In this paper, we propose a new audio spoofing detection system leveraging emotional features. The rationale behind the proposed method is that audio deepfake techniques cannot correctly synthesize natural emotional behavior. Therefore, we feed our deepfake detector with high-level features obtained from a state-of-the-art Speech Emotion Recognition (SER) system. Because the descriptors used capture semantic audio information, the proposed system proves robust in cross-dataset scenarios, outperforming the considered baseline on multiple datasets.
Pages: 8962-8966
Page count: 5
Related papers
50 in total
  • [31] Distinctive Approach for Speech Emotion Recognition Using Machine Learning
    Singh, Yogyata
    Neetu
    Rani, Shikha
    MACHINE LEARNING, IMAGE PROCESSING, NETWORK SECURITY AND DATA SCIENCES, MIND 2022, PT I, 2022, 1762 : 39 - 51
  • [32] A Hierarchical Approach with Feature Selection for Emotion Recognition from Speech
    Giannoulis, Panagiotis
    Potamianos, Gerasimos
    LREC 2012 - EIGHTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2012, : 1203 - 1206
  • [33] Deep Learning Approach towards Emotion Recognition Based on Speech
    Butala, Padmanabh
    Pawar, Rajendra
    Jadhav, Nagesh
    Kalangan, Manas
    Dhumal, Aniket
    Kakad, Sahil
    JOURNAL OF ADVANCED APPLIED SCIENTIFIC RESEARCH, 2024, 6 (03): : 16 - 24
  • [34] Bimodal Approach in Emotion Recognition using Speech and Facial Expressions
    Emerich, Simina
    Lupu, Eugen
    Apatean, Anca
    ISSCS 2009: INTERNATIONAL SYMPOSIUM ON SIGNALS, CIRCUITS AND SYSTEMS, VOLS 1 AND 2, PROCEEDINGS, 2009, : 297 - 300
  • [35] VALENCE-AROUSAL APPROACH FOR SPEECH EMOTION RECOGNITION SYSTEM
    Kamaruddin, Norhaslinda
    Rahman, Abdul Wahab Abdul
    2013 INTERNATIONAL CONFERENCE ON ELECTRONICS, COMPUTER AND COMPUTATION (ICECCO), 2013, : 184 - 187
  • [36] A Data Augmentation Approach for Improving the Performance of Speech Emotion Recognition
    Paraskevopoulou, Georgia
    Spyrou, Evaggelos
    Perantonis, Stavros
    SIGMAP: PROCEEDINGS OF THE 19TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND MULTIMEDIA APPLICATIONS, 2022, : 61 - 69
  • [37] Emotion Recognition from Speech - an LSTM approach with the Tess Dataset
    Pandiammal, Sankara K.
    Karishma, S.
    Sakthe, Harine K.
    Manimaran, V
    Kalaiselvi, S.
    Anitha, V
    2024 5TH INTERNATIONAL CONFERENCE ON INNOVATIVE TRENDS IN INFORMATION TECHNOLOGY, ICITIIT 2024, 2024
  • [38] SpoofCeleb: Speech Deepfake Detection and SASV in the Wild
    Jung, Jee-weon
    Wu, Yihan
    Wang, Xin
    Kim, Ji-Hoon
    Maiti, Soumi
    Matsunaga, Yuta
    Shim, Hye-jin
    Tian, Jinchuan
    Evans, Nicholas
    Chung, Joon Son
    Zhang, Wangyou
    Um, Seyun
    Takamichi, Shinnosuke
    Watanabe, Shinji
    IEEE OPEN JOURNAL OF SIGNAL PROCESSING, 2025, 6 : 68 - 77
  • [39] Multimodal Approach for DeepFake Detection
    Lomnitz, Michael
    Hampel-Arias, Zigfried
    Sandesara, Vishal
    Hu, Simon
    2020 IEEE APPLIED IMAGERY PATTERN RECOGNITION WORKSHOP (AIPR): TRUSTED COMPUTING, PRIVACY, AND SECURING MULTIMEDIA, 2020
  • [40] Novel Multimodel Approach for Marathi Speech Emotion Detection
    Yerigeri, Vaijanath V.
    Ragha, L. K.
    INTELLIGENT COMPUTING AND COMMUNICATION, ICICC 2019, 2020, 1034 : 195 - 207