Phoneme Segmentation using Deep Learning for Speech Synthesis

Cited by: 2
Authors
Lee, Young Han [1]
Yang, Jong-Yeol [1]
Cho, Choongsang [1]
Jung, Hyedong [1]
Affiliation
[1] Korea Elect Technol Inst, Artificial Intelligent Res Ctr, Seongnam, South Korea
Keywords
Phoneme segmentation; Speech synthesis; Deep learning;
DOI
10.1145/3264746.3264801
Chinese Library Classification
TP31 [Computer Software];
Discipline classification codes
081202; 0835;
Abstract
In this paper, we propose a phoneme segmentation method based on deep learning. Phoneme segmentation is one of the basic modules of unit-selection-based speech synthesis. To improve it, we add a cross-entropy loss to a Deep Speech-based speech recognition architecture. This approach yields more accurate phoneme boundaries. In our experiments, the proposed method achieves a boundary accuracy of 20.91 %, which is higher than that of conventional phoneme segmentation.
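The abstract does not say how the extra cross-entropy term is combined with the recognizer's objective or how boundary accuracy is scored, so the following is only a minimal sketch under stated assumptions. It assumes a CTC-style base loss (Deep Speech models are typically trained with CTC), a hypothetical weight `ce_weight`, a 10 ms frame shift, and a 20 ms tolerance for counting a boundary as correct. All function names are illustrative and are not taken from the paper.

```python
import math


def frame_cross_entropy(posteriors, labels):
    """Mean negative log-probability of the reference phoneme at each frame.

    posteriors: list of per-frame probability lists (one entry per phoneme).
    labels: list of reference phoneme indices, one per frame.
    """
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for probs, label in zip(posteriors, labels):
        total += -math.log(max(probs[label], eps))
    return total / len(labels)


def combined_loss(ctc_loss, ce_loss, ce_weight=0.5):
    """Assumed combination: base CTC loss plus a weighted frame-level CE term."""
    return ctc_loss + ce_weight * ce_loss


def boundaries_from_frames(frame_labels, frame_shift_ms=10.0):
    """Times (ms) at which the frame-level phoneme label changes."""
    return [
        i * frame_shift_ms
        for i in range(1, len(frame_labels))
        if frame_labels[i] != frame_labels[i - 1]
    ]


def boundary_accuracy(predicted, reference, tolerance_ms=20.0):
    """Fraction of reference boundaries matched by a prediction within tolerance."""
    if not reference:
        return 0.0
    hits = sum(
        1 for r in reference
        if any(abs(r - p) <= tolerance_ms for p in predicted)
    )
    return hits / len(reference)
```

For instance, frame labels `[0, 0, 1, 1, 1, 2]` change at frames 2 and 5, so the extracted boundaries fall at 20 ms and 50 ms under the assumed frame shift.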
Pages: 59-61
Number of pages: 3