Human detection of political speech deepfakes across transcripts, audio, and video

Cited by: 4
Authors
Groh, Matthew [1]
Sankaranarayanan, Aruna [2, 3]
Singh, Nikhil [2]
Kim, Dong Young [2]
Lippman, Andrew [2]
Picard, Rosalind [2]
Affiliations
[1] Northwestern Univ, Kellogg Sch Management, Evanston, IL 60208 USA
[2] MIT, Media Lab, Cambridge, MA USA
[3] MIT, CSAIL, Cambridge, MA USA
Keywords
SOCIAL MEDIA; NEWS; MISINFORMATION; DISINFORMATION; ATTENTION; KNOWLEDGE; SCIENCE; PHOTOS; IMPACT;
DOI
10.1038/s41467-024-51998-z
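Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09
Abstract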
Recent advances in technology for hyper-realistic visual and audio effects provoke the concern that deepfake videos of political speeches will soon be indistinguishable from authentic video. We conduct 5 pre-registered randomized experiments with N = 2215 participants to evaluate how accurately humans distinguish real political speeches from fabrications across base rates of misinformation, audio sources, question framings with and without priming, and media modalities. We do not find that base rates of misinformation have statistically significant effects on discernment. We find that deepfakes with audio produced by state-of-the-art text-to-speech algorithms are harder to discern than the same deepfakes with voice actor audio. Moreover, across all experiments and question framings, we find that audio and visual information enables more accurate discernment than text alone: human discernment relies more on how something is said (the audio-visual cues) than on what is said (the speech content).

With advances in generative AI, political speech deepfakes are becoming more realistic. Here, the authors show that people's ability to distinguish between real and fake speeches relies on audio and visual information more than the speech content.
Pages: 16
Related papers
50 in total
  • [31] Synthetic Speech Detection through Audio Folding
    Salvi, Davide
    Bestagini, Paolo
    Tubaro, Stefano
    PROCEEDINGS OF THE 2ND ACM INTERNATIONAL WORKSHOP ON MULTIMEDIA AI AGAINST DISCRIMINATION, MAD 2023, 2023, : 3 - 9
  • [32] An automatic multimodal speech recognition system with audio and video information
    Karpov, A. A.
    AUTOMATION AND REMOTE CONTROL, 2014, 75 (12) : 2190 - 2200
  • [34] DeepFakes detection across generations: Analysis of facial regions, fusion, and performance evaluation
    Tolosana, Ruben
    Romero-Tapiador, Sergio
    Vera-Rodriguez, Ruben
    Gonzalez-Sosa, Ester
    Fierrez, Julian
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2022, 110
  • [35] Penalty Detection in Football Video on Audio and Shot
    Nie, Yanliu
    Fan, Jiande
    PROCEEDINGS OF THE 2016 INTERNATIONAL CONFERENCE ON EDUCATION, MANAGEMENT AND COMPUTER SCIENCE (ICEMC 2016), 2016, 129 : 963 - 969
  • [36] Combining Audio and Video for Detection of Spontaneous Emotions
    Gajsek, Rok
    Struc, Vitomir
    Dobrisek, Simon
    Zibert, Janez
    Mihelic, France
    Pavesic, Nikola
    BIOMETRIC ID MANAGEMENT AND MULTIMODAL COMMUNICATION, PROCEEDINGS, 2009, 5707 : 114 - 121
  • [37] Scene change detection by audio and video clues
    Chen, SC
    Shyu, ML
    Liao, W
    Zhang, CC
    IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOL I AND II, PROCEEDINGS, 2002, : A365 - A368
  • [38] Audio-Visual Overlapped Speech Detection for Spontaneous Distant Speech
    Kyoung, Minyoung
    Jeon, Hyungbae
    Park, Kiyoung
    IEEE ACCESS, 2023, 11 : 27426 - 27432
  • [39] Robust Audio-Visual Speech Recognition Under Noisy Audio-Video Conditions
    Stewart, Darryl
    Seymour, Rowan
    Pass, Adrian
    Ming, Ji
    IEEE TRANSACTIONS ON CYBERNETICS, 2014, 44 (02) : 175 - 184
  • [40] Diverse misinformation: impacts of human biases on detection of deepfakes on networks
    Juniper Lovato
    Jonathan St-Onge
    Randall Harp
    Gabriela Salazar Lopez
    Sean P. Rogers
    Ijaz Ul Haq
    Laurent Hébert-Dufresne
    Jeremiah Onaolapo
    npj Complexity, 1 (1):