Script Selection Using Convolutional Auto-encoder for TTS Speech Corpus

Cited by: 0
Authors
Shamsi, Meysam [1]
Lolive, Damien [1]
Barbot, Nelly [1]
Chevelu, Jonathan [1]
Affiliations
[1] Univ Rennes, CNRS, IRISA, Rennes, France
Keywords
Corpus design; Deep neural networks; Embedding space; Clustering; Text-to-speech synthesis;
DOI
10.1007/978-3-030-26061-3_43
Chinese Library Classification (CLC) number
O42 [Acoustics];
Subject classification codes
070206 ; 082403 ;
Abstract
In this study, we propose an approach to script selection for designing TTS speech corpora. A Deep Convolutional Neural Network (DCNN) is used to project linguistic information into an embedding space. The embedded representation of the corpus is then fed to a selection process that extracts a subset of utterances offering good linguistic coverage while tending to limit the repetition of linguistic units. We present two selection processes: a clustering approach based on utterance distance, and a method that tends to reach a target distribution of linguistic events. We compare the synthetic signal quality of the proposed methods to that of state-of-the-art methods, both objectively and subjectively. The subjective measure confirms that the proposed methods yield speech corpora with better synthetic speech quality.
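The clustering-based selection mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm, so plain k-means is used here purely as an assumption, and the utterance embeddings are assumed to have already been produced by the auto-encoder. The function name `kmeans_select` and its parameters are hypothetical and do not come from the paper.

```python
import math
import random


def kmeans_select(embeddings, k, iterations=20, seed=0):
    """Pick k utterance indices, one nearest to each k-means centroid.

    Assumption: plain k-means stands in for the (unspecified) clustering
    step; `embeddings` is a list of equal-length numeric vectors.
    """
    rng = random.Random(seed)
    # Initialise centroids from k distinct utterance embeddings.
    centroids = [list(embeddings[i]) for i in rng.sample(range(len(embeddings)), k)]

    for _ in range(iterations):
        # Assignment step: attach every embedding to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for vec in embeddings:
            nearest = min(range(k), key=lambda c: math.dist(vec, centroids[c]))
            clusters[nearest].append(vec)
        # Update step: move each centroid to the mean of its members.
        for c, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]

    # Selection step: keep the not-yet-chosen utterance closest to each centroid.
    selected = []
    for centroid in centroids:
        candidates = [i for i in range(len(embeddings)) if i not in selected]
        selected.append(min(candidates, key=lambda i: math.dist(embeddings[i], centroid)))
    return sorted(selected)


# Toy usage: six 2-D "embeddings" forming two well-separated groups.
toy_embeddings = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
print(kmeans_select(toy_embeddings, k=2))
```

Selecting one representative per cluster is one simple way to favour coverage of the embedding space while limiting near-duplicate utterances; the paper's actual selection criterion may differ.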
Pages: 423-432
Page count: 10
Related papers
50 in total
  • [1] Convolutional Auto-Encoder and Adversarial Domain Adaptation for Cross-Corpus Speech Emotion Recognition
    Wang, Yang
    Fu, Hongliang
    Tao, Huawei
    Yang, Jing
    Ge, Hongyi
    Xie, Yue
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2022, E105D (10) : 1803 - 1806
  • [2] Corpus Design using Convolutional Auto-Encoder Embeddings for Audio-Book Synthesis
    Shamsi, Meysam
    Lolive, Damien
    Barbot, Nelly
    Chevelu, Jonathan
    [J]. INTERSPEECH 2019, 2019, : 1531 - 1535
  • [3] An Ensemble Net of Convolutional Auto-Encoder and Graph Auto-Encoder for Auto-Diagnosis
    Li, Jianqiang
    Ji, Changping
    Yan, Guokai
    You, Linlin
    Chen, Jie
    [J]. IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS, 2021, 13 (01) : 189 - 199
  • [4] Ultrasound-Based Silent Speech Interface using Sequential Convolutional Auto-encoder
    Xu, Kele
    Wu, Yuxiang
    Gao, Zhifeng
    [J]. PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA (MM'19), 2019, : 2194 - 2195
  • [5] Underwater image reconstruction using convolutional auto-encoder
    Yasukawa, Shinsuke
    Raghura, Sreeraman Srinivasa
    Nishida, Yuya
    Ishii, Kazuo
    [J]. PROCEEDINGS OF THE 2021 INTERNATIONAL CONFERENCE ON ARTIFICIAL LIFE AND ROBOTICS (ICAROB 2021), 2021, : P86 - P86
  • [6] An FPGA Implementation of a Convolutional Auto-Encoder
    Zhao, Wei
    Jia, Zuchen
    Wei, Xiaosong
    Wang, Hai
    [J]. APPLIED SCIENCES-BASEL, 2018, 8 (04):
  • [7] Underwater image reconstruction using convolutional auto-encoder
    Yasukawa, Shinsuke
    Raghura, Sreeraman Srinivasa
    Nishida, Yuya
    Ishii, Kazuo
    [J]. PROCEEDINGS OF THE 2021 INTERNATIONAL CONFERENCE ON ARTIFICIAL LIFE AND ROBOTICS (ICAROB 2021), 2021, : 262 - 265
  • [8] HRTF Representation with Convolutional Auto-encoder
    Chen, Wei
    Hu, Ruimin
    Wang, Xiaochen
    Li, Dengshi
    [J]. MULTIMEDIA MODELING (MMM 2020), PT I, 2020, 11961 : 605 - 616
  • [9] A DEEP CONVOLUTIONAL AUTO-ENCODER WITH EMBEDDED CLUSTERING
    Alqahtani, A.
    Xie, X.
    Deng, J.
    Jones, M. W.
    [J]. 2018 25TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2018, : 4058 - 4062
  • [10] Feature Selection Guided Auto-Encoder
    Wang, Shuyang
    Ding, Zhengming
    Fu, Yun
    [J]. THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 2725 - 2731