Phoneme Segmentation using Deep Learning for Speech Synthesis

Cited by: 2
Authors
Lee, Young Han [1]
Yang, Jong-Yeol [1]
Cho, Choongsang [1]
Jung, Hyedong [1]
Affiliations
[1] Korea Elect Technol Inst, Artificial Intelligent Res Ctr, Seongnam, South Korea
Keywords
Phoneme segmentation; Speech synthesis; Deep learning
DOI
10.1145/3264746.3264801
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
In this paper, we propose a deep-learning-based phoneme segmentation method. Phoneme segmentation is one of the basic modules of unit-selection-based speech synthesis. To improve it, we add a cross-entropy loss to a Deep Speech-based speech recognition architecture. This approach yields more accurate phoneme boundaries. In our experiments, the proposed method achieves a boundary accuracy of 20.91%, which is higher than that of conventional phoneme segmentation.
Pages: 59-61
Number of pages: 3
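
The loss combination described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the additional cross-entropy term is computed per frame against frame-level phoneme labels and is added to the recognizer's main loss with a scalar weight, and that boundaries are read off as the frames where the predicted label changes. The record specifies none of these details, and all function names and the `weight` parameter are hypothetical.

```python
import math


def frame_cross_entropy(logits, labels):
    """Mean cross-entropy over frames.

    logits[t] is a list of unnormalized per-phoneme scores for frame t;
    labels[t] is the index of the reference phoneme for frame t.
    """
    total = 0.0
    for scores, label in zip(logits, labels):
        # Numerically stable log-sum-exp for the softmax normalizer.
        m = max(scores)
        log_z = m + math.log(sum(math.exp(s - m) for s in scores))
        total += log_z - scores[label]
    return total / len(labels)


def combined_loss(recognition_loss, logits, labels, weight=1.0):
    """Assumed form: main recognition loss plus a weighted frame-level term."""
    return recognition_loss + weight * frame_cross_entropy(logits, labels)


def boundaries(frame_labels):
    """Frame indices at which the phoneme label differs from the previous frame."""
    return [t for t in range(1, len(frame_labels))
            if frame_labels[t] != frame_labels[t - 1]]
```

For example, `boundaries([0, 0, 1, 1, 1, 2])` reports a boundary at frames 2 and 5. The `recognition_loss` argument stands in for whatever loss the recognizer already minimizes, passed as a precomputed number.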