50 entries in total
- [1] Phoneme Duration Modeling Using Speech Rhythm-Based Speaker Embeddings for Multi-Speaker Speech Synthesis [J]. INTERSPEECH 2021, 2021: 3141-3145
- [2] PHONEME DEPENDENT SPEAKER EMBEDDING AND MODEL FACTORIZATION FOR MULTI-SPEAKER SPEECH SYNTHESIS AND ADAPTATION [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019: 6930-6934
- [3] Unsupervised Discovery of Phoneme Boundaries in Multi-Speaker Continuous Speech [J]. 2011 IEEE INTERNATIONAL CONFERENCE ON DEVELOPMENT AND LEARNING (ICDL), 2011
- [5] DNN based multi-speaker speech synthesis with temporal auxiliary speaker ID embedding [J]. 2019 INTERNATIONAL CONFERENCE ON ELECTRONICS, INFORMATION, AND COMMUNICATION (ICEIC), 2019: 61-64
- [6] End-to-End Multi-Speaker Speech Recognition using Speaker Embeddings and Transfer Learning [J]. INTERSPEECH 2019, 2019: 4425-4429
- [7] Speaker Clustering with Penalty Distance for Speaker Verification with Multi-Speaker Speech [J]. 2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2019: 1630-1635
- [8] An Unsupervised Method to Select a Speaker Subset from Large Multi-Speaker Speech Synthesis Datasets [J]. INTERSPEECH 2020, 2020: 1758-1762
- [9] MultiSpeech: Multi-Speaker Text to Speech with Transformer [J]. INTERSPEECH 2020, 2020: 4024-4028
- [10] ZERO-SHOT MULTI-SPEAKER TEXT-TO-SPEECH WITH STATE-OF-THE-ART NEURAL SPEAKER EMBEDDINGS [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020: 6184-6188