LISTENING BETWEEN THE LINES: SYNTHETIC SPEECH DETECTION DISREGARDING VERBAL CONTENT

Cited by: 0

Authors:

Salvi, Davide [1]

Balcha, Temesgen Semu [1]

Bestagini, Paolo [1]

Tubaro, Stefano [1]

Affiliations:

[1] Politecn Milan, Dipartimento Elettron Informaz & Bioingn, Milan, Italy

Source:

2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING WORKSHOPS, ICASSPW 2024 | 2024

Keywords:

Audio Forensics; Synthetic Speech; Background Noise; Explainability

DOI:

10.1109/ICASSPW62465.2024.10669901

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

Recent advancements in synthetic speech generation have led to the creation of forged audio data that are almost indistinguishable from real speech. This phenomenon poses a new challenge for the multimedia forensics community, as the misuse of synthetic media can potentially cause adverse consequences. Several methods have been proposed in the literature to mitigate potential risks and detect synthetic speech, mainly focusing on the analysis of the speech itself. However, recent studies have revealed that the most crucial frequency bands for detection lie in the highest ranges (above 6000 Hz), which do not include any speech content. In this work, we extensively explore this aspect and investigate whether synthetic speech detection can be performed by focusing only on the background component of the signal while disregarding its verbal content. Our findings indicate that the speech component is not the predominant factor in performing synthetic speech detection. These insights provide valuable guidance for the development of new synthetic speech detectors and their interpretability, together with some considerations on the existing work in the audio forensics field.


Pages: 883-887

Page count: 5

