Phoneme Segmentation using Deep Learning for Speech Synthesis

Cited by: 2
Authors
Lee, Young Han [1]
Yang, Jong-Yeol [1]
Cho, Choongsang [1]
Jung, Hyedong [1]
Affiliations
[1] Korea Elect Technol Inst, Artificial Intelligent Res Ctr, Seongnam, South Korea
Keywords
Phoneme segmentation; Speech synthesis; Deep learning
DOI
10.1145/3264746.3264801
Chinese Library Classification (CLC) number
TP31 [Computer software]
Subject classification codes
081202; 0835
Abstract
In this paper, we propose a deep-learning-based phoneme segmentation method; phoneme segmentation is one of the basic modules of unit-selection-based speech synthesis. To improve segmentation, we add an additional cross-entropy loss to a Deep Speech-based speech recognition architecture. With this approach, we obtain more accurate phoneme boundaries. In our experiments, the proposed method achieves a boundary accuracy of 20.91%, which is higher than that of conventional phoneme segmentation.
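The abstract does not give the exact form of the training objective or the evaluation protocol, so the following is only a minimal sketch under stated assumptions. It assumes the recognizer's base loss (Deep Speech is trained with a CTC loss) is available as a number, that an aligned phoneme label exists for every frame, and that the additional term is a frame-level cross-entropy added with a weight. The function names, the `weight` parameter, the 10 ms frame shift, and the 20 ms tolerance are all hypothetical choices for illustration, not values taken from the paper.

```python
import math


def frame_cross_entropy(frame_probs, frame_labels):
    """Mean negative log-probability of the aligned phoneme label per frame."""
    total = 0.0
    for probs, label in zip(frame_probs, frame_labels):
        total += -math.log(probs[label])
    return total / len(frame_labels)


def combined_loss(base_loss, frame_probs, frame_labels, weight=1.0):
    """Base recognizer loss plus a weighted frame-level cross-entropy term.

    `weight` is a hypothetical balancing factor, not a value from the paper.
    """
    return base_loss + weight * frame_cross_entropy(frame_probs, frame_labels)


def boundaries_from_frames(frame_labels, frame_shift_ms=10.0):
    """Times (ms) at which the per-frame phoneme label changes.

    The 10 ms frame shift is an assumption.
    """
    return [
        i * frame_shift_ms
        for i in range(1, len(frame_labels))
        if frame_labels[i] != frame_labels[i - 1]
    ]


def boundary_accuracy(predicted, reference, tolerance_ms=20.0):
    """Fraction of reference boundaries matched by a prediction within tolerance.

    The 20 ms tolerance is an assumption; the paper's criterion may differ.
    """
    if not reference:
        return 0.0
    hits = sum(
        1 for r in reference
        if any(abs(r - p) <= tolerance_ms for p in predicted)
    )
    return hits / len(reference)
```

For example, per-frame labels `[0, 0, 1, 1, 1, 2]` yield predicted boundaries at the third and sixth frames, which `boundary_accuracy` then compares against hand-labeled reference boundaries.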
Pages: 59-61
Page count: 3