An Open Dataset of Synthetic Speech

Cited by: 1

Authors:

Yaroshchuk, Artem [1]

Papastergiopoulos, Christoforos [2]

Cuccovillo, Luca [1]

Aichroth, Patrick [1]

Votis, Konstantinos [2]

Tzovaras, Dimitrios [2]

Affiliations:

[1] Fraunhofer Inst Digital Media Technol, Ilmenau, Germany

[2] Ctr Res & Technol Hellas, Thessaloniki, Greece

Source:

2023 IEEE INTERNATIONAL WORKSHOP ON INFORMATION FORENSICS AND SECURITY, WIFS | 2023

Keywords:

datasets; neural networks; speech synthesis;

DOI:

10.1109/WIFS58808.2023.10374863

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Subject classification code:

0812;

Abstract:

This paper introduces a multilingual, multispeaker dataset composed of synthetic and natural speech, designed to foster research and benchmarking in synthetic speech detection. The dataset comprises 18,993 audio utterances synthesized from text, alongside their corresponding natural equivalents, representing approximately 17 hours of synthetic audio data. The synthetic speech was generated by 156 voices spanning three languages (English, German, and Spanish), with a balanced gender representation. The dataset targets state-of-the-art synthesis methods and has been released under a license allowing seamless extension and redistribution by the research community.

Pages: 6

50 items in total

[21] ROLE OF SYNTHETIC SPEECH IN SPEECH RESEARCH
LAWRENCE, W
JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1964, 36 (05): 1022 - &
[22] SYNTHESIZING SPEECH + SYNTHETIC SPEECH MUSIC
DODGE, C
MUSIC JOURNAL, 1976, 34 (02): 14 - &
[23] Weapon Violence Dataset 2.0: A synthetic dataset for violence detection
Nadeem, Muhammad Shahroz
Kurugollu, Fatih
Atlam, Hany F.
Franqueira, Virginia N. L.
DATA IN BRIEF, 2024, 54
[24] CONTINUOUS SPEECH SEPARATION: DATASET AND ANALYSIS
Chen, Zhuo
Yoshioka, Takuya
Lu, Liang
Zhou, Tianyan
Meng, Zhong
Luo, Yi
Wu, Jian
Xiao, Xiong
Li, Jinyu
2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020: 7284 - 7288
[25] Golos: Russian Dataset for Speech Research
Karpov, Nikolay
Denisenko, Alexander
Minkin, Fedor
INTERSPEECH 2021, 2021: 1419 - 1423
[26] Semantic Coherence Dataset: Speech transcripts
Colla, Davide
Delsanto, Matteo
Radicioni, Daniele P.
DATA IN BRIEF, 2023, 46
[27] Dataset of Speech Production in intracranial Electroencephalography
Maxime Verwoert
Maarten C. Ottenhoff
Sophocles Goulis
Albert J. Colon
Louis Wagner
Simon Tousseyn
Johannes P. van Dijk
Pieter L. Kubben
Christian Herff
Scientific Data, 9
[28] EasyCall corpus: a dysarthric speech dataset
Turrisi, Rosanna
Braccia, Arianna
Emanuele, Marco
Giulietti, Simone
Pugliatti, Maura
Sensi, Mariachiara
Fadiga, Luciano
Badino, Leonardo
INTERSPEECH 2021, 2021: 41 - 45
[29] A Canadian French Emotional Speech Dataset
Gournay, Philippe
Lahaie, Olivier
Lefebvre, Roch
PROCEEDINGS OF THE 9TH ACM MULTIMEDIA SYSTEMS CONFERENCE (MMSYS'18), 2018: 399 - 402
[30] CMU WILDERNESS MULTILINGUAL SPEECH DATASET
Black, Alan W.
2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019: 5971 - 5975
