An Open Dataset of Synthetic Speech

Cited by: 1
Authors
Yaroshchuk, Artem [1]
Papastergiopoulos, Christoforos [2]
Cuccovillo, Luca [1]
Aichroth, Patrick [1]
Votis, Konstantinos [2]
Tzovaras, Dimitrios [2]
Affiliations
[1] Fraunhofer Inst Digital Media Technol, Ilmenau, Germany
[2] Ctr Res & Technol Hellas, Thessaloniki, Greece
Keywords
datasets; neural networks; speech synthesis
DOI
10.1109/WIFS58808.2023.10374863
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper introduces a multilingual, multispeaker dataset composed of synthetic and natural speech, designed to foster research and benchmarking in synthetic speech detection. The dataset encompasses 18,993 audio utterances synthesized from text, along with their corresponding natural equivalents, representing approximately 17 hours of synthetic audio data. The dataset features synthetic speech generated by 156 voices spanning three languages (English, German, and Spanish), with a balanced gender representation. It targets state-of-the-art synthesis methods and has been released under a license allowing seamless extension and redistribution by the research community.
Pages: 6
Related papers
50 in total
  • [1] FoR: A Dataset for Synthetic Speech Detection
    Reimao, Ricardo
    Tzerpos, Vassilios
    2019 10TH INTERNATIONAL CONFERENCE ON SPEECH TECHNOLOGY AND HUMAN-COMPUTER DIALOGUE (SPED), 2019
  • [2] DISPEECH: A Synthetic Toy Dataset for Speech Disentangling
    Zhang, Olivier
    Gengembre, Nicolas
    Le Blouch, Olivier
    Lolive, Damien
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 8557 - 8561
  • [3] Open Challenges in Synthetic Speech Detection
    Cuccovillo, Luca
    Papastergiopoulos, Christoforos
    Vafeiadis, Anastasios
    Yaroshchuk, Artem
    Aichroth, Patrick
    Votis, Konstantinos
    Tzovaras, Dimitrios
    2022 IEEE INTERNATIONAL WORKSHOP ON INFORMATION FORENSICS AND SECURITY (WIFS), 2022
  • [4] FMFCC-A: A Challenging Mandarin Dataset for Synthetic Speech Detection
    Zhang, Zhenyu
    Gu, Yewei
    Yi, Xiaowei
    Zhao, Xianfeng
    DIGITAL FORENSICS AND WATERMARKING, IWDW 2021, 2022, 13180 : 117 - 131
  • [5] KazakhTTS: An Open-Source Kazakh Text-to-Speech Synthesis Dataset
    Mussakhojayeva, Saida
    Janaliyeva, Aigerim
    Mirzakhmetov, Almas
    Khassanov, Yerbolat
    Varol, Huseyin Atakan
    INTERSPEECH 2021, 2021, : 2786 - 2790
  • [6] Artie Bias Corpus: An Open Dataset for Detecting Demographic Bias in Speech Applications
    Meyer, Josh
    Rauchenstein, Lynn
    Eisenberg, Joshua D.
    Howell, Nicholas
    PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020), 2020, : 6462 - 6468
  • [7] TIMIT-TTS: A Text-to-Speech Dataset for Multimodal Synthetic Media Detection
    Salvi, Davide
    Hosler, Brian
    Bestagini, Paolo
    Stamm, Matthew C.
    Tubaro, Stefano
    IEEE ACCESS, 2023, 11 : 50851 - 50866
  • [8] OLKAVS: An Open Large-Scale Korean Audio-Visual Speech Dataset
    Park, Jeongkyun
    Hwang, Jung-Wook
    Choi, Kwanghee
    Lee, Seung-Hyeon
    Ahn, Jun Hwan
    Park, Rae-Hong
    Park, Hyung-Min
    2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, ICASSP 2024, 2024, : 6385 - 6389
  • [9] An Open Dataset of Connected Speech in Aphasia with Consensus Ratings of Auditory-Perceptual Features
    Ezzes, Zoe
    Schneck, Sarah M.
    Casilio, Marianne
    Fromm, Davida
    Mefferd, Antje S.
    de Riesthal, Michael
    Wilson, Stephen M.
    DATA, 2022, 7 (11)
  • [10] SOMOS: The Samsung Open MOS Dataset for the Evaluation of Neural Text-to-Speech Synthesis
    Maniati, Georgia
    Vioni, Alexandra
    Ellinas, Nikolaos
    Nikitaras, Karolos
    Klapsas, Konstantinos
    Sung, June Sig
    Jho, Gunu
    Chalamandaris, Aimilios
    Tsiakoulis, Pirros
    INTERSPEECH 2022, 2022, : 2388 - 2392