50 items in total
- [1] Multi-Speaker Modeling and Speaker Adaptation for DNN-Based TTS Synthesis [J]. 2015 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2015: 4475-4479
- [2] Phoneme Dependent Speaker Embedding and Model Factorization for Multi-Speaker Speech Synthesis and Adaptation [J]. 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019: 6930-6934
- [3] Cross-lingual, Multi-speaker Text-To-Speech Synthesis Using Neural Speaker Embedding [J]. INTERSPEECH 2019, 2019: 2105-2109
- [5] Human-in-the-loop Speaker Adaptation for DNN-based Multi-speaker TTS [J]. INTERSPEECH 2022, 2022: 2968-2972
- [6] Speaker Clustering with Penalty Distance for Speaker Verification with Multi-Speaker Speech [J]. 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2019: 1630-1635
- [7] Phoneme Duration Modeling Using Speech Rhythm-Based Speaker Embeddings for Multi-Speaker Speech Synthesis [J]. INTERSPEECH 2021, 2021: 3141-3145
- [8] Training Speaker Embedding Extractors Using Multi-Speaker Audio with Unknown Speaker Boundaries [J]. INTERSPEECH 2022, 2022: 605-609
- [10] Autoregressive multi-speaker model in Chinese speech synthesis based on variational autoencoder [J]. Shengxue Xuebao/Acta Acustica, 2022, 47(03): 405-416