Cross-Lingual Speaker Discrimination Using Natural and Synthetic Speech

被引:0
|
作者
Wester, Mirjam [1 ]
Liang, Hui [1 ]
机构
[1] Univ Edinburgh, Ctr Speech Technol Res, Edinburgh EH8 9YL, Midlothian, Scotland
关键词
speaker discrimination; speaker adaptation; HMM-based speech synthesis; ADAPTATION;
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
This paper describes speaker discrimination experiments in which native English listeners were presented with natural speech stimuli in English and Mandarin, synthetic speech stimuli in English and Mandarin, or natural Mandarin speech and synthetic English speech stimuli. In each experiment, listeners were asked to judge whether the sentences in a pair were spoken by the same person or not. We found that the results of Mandarin/English speaker discrimination were very similar to those found in previous work on German/English and Finnish/English speaker discrimination. We conclude from this and previous work that listeners are able to discriminate between speakers across languages or across speech types, but the combination of these two factors leads to a speaker discrimination task that is too difficult for listeners to perform successfully, given the fact that the quality of across-language speaker adapted speech synthesis at present still needs to be improved.
引用
下载
收藏
页码:2492 / 2495
页数:4
相关论文
共 50 条
  • [1] Cross-Lingual Speaker Adaptation for Statistical Speech Synthesis Using Limited Data
    Saffjoo, Seyyed Saeed
    Demiroglu, Cenk
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 317 - 321
  • [2] Cross-lingual, Multi-speaker Text-To-Speech Synthesis Using Neural Speaker Embedding
    Chen, Mengnan
    Chen, Minchuan
    Liang, Shuang
    Ma, Jun
    Chen, Lei
    Wang, Shaojun
    Xiao, Jing
    INTERSPEECH 2019, 2019, : 2105 - 2109
  • [3] Language Agnostic Speaker Embedding for Cross-Lingual Personalized Speech Generation
    Zhou, Yi
    Tian, Xiaohai
    Li, Haizhou
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, 29 : 3427 - 3439
  • [4] CROSS-LINGUAL SPEAKER ADAPTATION FOR HMM-BASED SPEECH SYNTHESIS
    Wu, Yi-Jian
    King, Simon
    Tokuda, Keiichi
    2008 6TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, 2008, : 9 - 12
  • [5] Cross-lingual Speaker Adaptation using Domain Adaptation and Speaker Consistency Loss for Text-To-Speech Synthesis
    Xin, Detai
    Saito, Yuki
    Takamichi, Shinnosuke
    Koriyama, Tomoki
    Saruwatari, Hiroshi
    INTERSPEECH 2021, 2021, : 1614 - 1618
  • [6] Cross-lingual speaker adaptation using domain adaptation and speaker consistency loss for text-to-speech synthesis
    Xin, Detai
    Saito, Yuki
    Takamichi, Shinnosuke
    Koriyama, Tomoki
    Saruwatari, Hiroshi
    Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2021, 5 : 3376 - 3380
  • [7] Cross-lingual talker discrimination
    Wester, Mirjam
    11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 1253 - 1256
  • [8] Using speaker adaptive training to realize Mandarin-Tibetan cross-lingual speech synthesis
    Yang, Hongwu
    Oura, Keiichiro
    Wang, Haiyan
    Gan, Zhenye
    Tokuda, Keiichi
    MULTIMEDIA TOOLS AND APPLICATIONS, 2015, 74 (22) : 9927 - 9942
  • [9] Using speaker adaptive training to realize Mandarin-Tibetan cross-lingual speech synthesis
    Hongwu Yang
    Keiichiro Oura
    Haiyan Wang
    Zhenye Gan
    Keiichi Tokuda
    Multimedia Tools and Applications, 2015, 74 : 9927 - 9942
  • [10] Estimation of Perceptual Spaces for Speaker Identities Based on the Cross-Lingual Discrimination Task
    Tsuzaki, Minoru
    Tokuda, Keiichi
    Kawai, Hisashi
    Ni, Jinfu
    12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 164 - +