Phoneme Segmentation using Deep Learning for Speech Synthesis

Cited by: 2

Authors:

Lee, Young Han [1]

Yang, Jong-Yeol [1]

Cho, Choongsang [1]

Jung, Hyedong [1]

Affiliation:

[1] Korea Electronics Technology Institute, Artificial Intelligence Research Center, Seongnam, South Korea

Source:

PROCEEDINGS OF THE 2018 CONFERENCE ON RESEARCH IN ADAPTIVE AND CONVERGENT SYSTEMS (RACS 2018) | 2018

Keywords:

Phoneme segmentation; Speech synthesis; Deep learning

DOI:

10.1145/3264746.3264801

Chinese Library Classification (CLC) number:

TP31 [Computer Software]

Discipline classification code:

081202; 0835

Abstract:

In this paper, we propose a deep-learning-based method for phoneme segmentation, which is one of the basic modules of unit-selection-based speech synthesis. To improve segmentation, we add a cross-entropy loss to a Deep Speech-based speech recognition architecture. With this approach, we obtain more accurate phoneme boundaries. In our experiments, the proposed method achieves a boundary accuracy of 20.91%, which is higher than that of conventional phoneme segmentation.
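
The abstract does not say how the added cross-entropy term is combined with the recognizer's own loss, nor how boundaries are read out or scored. The following is a minimal sketch under stated assumptions, not the authors' implementation: the sequence-level loss (Deep Speech-style recognizers are typically trained with CTC) is passed in as a plain number, the frame-level labels are assumed to come from some existing alignment, and the names `ce_weight`, `frame_shift_s`, and `tolerance_s` are hypothetical parameters chosen for illustration.

```python
import math


def frame_cross_entropy(posteriors, labels):
    """Mean negative log-probability of the aligned phoneme label at each frame.

    posteriors: list of per-frame probability lists (one entry per phoneme class).
    labels: list of per-frame phoneme indices (assumed to come from an alignment).
    """
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for probs, label in zip(posteriors, labels):
        total += -math.log(max(probs[label], eps))
    return total / len(labels)


def combined_loss(sequence_loss, posteriors, labels, ce_weight=1.0):
    """Assumed combination: sequence-level loss plus a weighted frame-level term."""
    return sequence_loss + ce_weight * frame_cross_entropy(posteriors, labels)


def boundaries_from_frames(frame_labels, frame_shift_s=0.01):
    """Times (in seconds) at which the per-frame phoneme label changes."""
    return [
        i * frame_shift_s
        for i in range(1, len(frame_labels))
        if frame_labels[i] != frame_labels[i - 1]
    ]


def boundary_accuracy(predicted, reference, tolerance_s=0.02):
    """Fraction of reference boundaries with a prediction within the tolerance."""
    if not reference:
        return 0.0
    hits = sum(
        1 for r in reference if any(abs(p - r) <= tolerance_s for p in predicted)
    )
    return hits / len(reference)
```

In this sketch, `boundaries_from_frames` would be applied to the per-frame argmax of the network's phoneme posteriors, and `boundary_accuracy` compares the result against hand-labeled boundaries. The tolerance window is an assumption; the abstract does not state the scoring criterion behind the reported figure.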

Pages: 59-61

Page count: 3
