An Open Dataset of Synthetic Speech

Cited by: 1

Authors:

Yaroshchuk, Artem [1]

Papastergiopoulos, Christoforos [2]

Cuccovillo, Luca [1]

Aichroth, Patrick [1]

Votis, Konstantinos [2]

Tzovaras, Dimitrios [2]

Affiliations:

[1] Fraunhofer Inst Digital Media Technol, Ilmenau, Germany

[2] Ctr Res & Technol Hellas, Thessaloniki, Greece

Source:

2023 IEEE INTERNATIONAL WORKSHOP ON INFORMATION FORENSICS AND SECURITY, WIFS | 2023

Keywords:

datasets; neural networks; speech synthesis

DOI:

10.1109/WIFS58808.2023.10374863

Chinese Library Classification:

TP [Automation Technology, Computer Technology]

Subject classification code:

0812

Abstract:

This paper introduces a multilingual, multispeaker dataset composed of synthetic and natural speech, designed to foster research and benchmarking in synthetic speech detection. The dataset encompasses 18,993 audio utterances synthesized from text, alongside their corresponding natural equivalents, representing approximately 17 hours of synthetic audio data. The dataset features synthetic speech generated by 156 voices spanning three languages, namely English, German, and Spanish, with a balanced gender representation. It targets state-of-the-art synthesis methods and has been released under a license allowing seamless extension and redistribution by the research community.


Pages: 6

