Human detection of political speech deepfakes across transcripts, audio, and video

Cited by: 4
Authors
Groh, Matthew [1]
Sankaranarayanan, Aruna [2, 3]
Singh, Nikhil [2]
Kim, Dong Young [2]
Lippman, Andrew [2]
Picard, Rosalind [2]
Affiliations
[1] Northwestern Univ, Kellogg Sch Management, Evanston, IL 60208 USA
[2] MIT, Media Lab, Cambridge, MA USA
[3] MIT, CSAIL, Cambridge, MA USA
Keywords
SOCIAL MEDIA; NEWS; MISINFORMATION; DISINFORMATION; ATTENTION; KNOWLEDGE; SCIENCE; PHOTOS; IMPACT
DOI
10.1038/s41467-024-51998-z
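Chinese Library Classification (CLC)
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09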
Abstract
Recent advances in technology for hyper-realistic visual and audio effects provoke the concern that deepfake videos of political speeches will soon be indistinguishable from authentic video. We conduct 5 pre-registered randomized experiments with N = 2215 participants to evaluate how accurately humans distinguish real political speeches from fabrications across base rates of misinformation, audio sources, question framings with and without priming, and media modalities. We do not find that base rates of misinformation have statistically significant effects on discernment. We find that deepfakes with audio produced by state-of-the-art text-to-speech algorithms are harder to discern than the same deepfakes with voice actor audio. Moreover, across all experiments and question framings, we find that audio and visual information enables more accurate discernment than text alone: human discernment relies more on how something is said (the audio-visual cues) than on what is said (the speech content).
With advances in generative AI, political speech deepfakes are becoming more realistic. Here, the authors show that people's ability to distinguish between real and fake speeches relies on audio and visual information more than the speech content.
Pages: 16