A quantitative method for modeling context in concatenative synthesis using large speech database

被引:0
|
作者
Hamza, W
Rashwan, M
Afify, M
机构
关键词
D O I
暂无
中图分类号
O42 [声学];
学科分类号
070206 ; 082403 ;
摘要
Modeling phonetic context is one of the key points to get natural sounding in concatenative speech synthesis. In this paper, a new quantitative method to model context has been proposed. In the proposed method, the context is measured as the distance between leafs of the top-down likelihood-based decision trees that have been grown during the construction of acoustic inventory. Unlike other context modeling methods, this method allows the unit selection algorithm to borrow unit occurrences from other contexts when their context distances are close. This is done by incorporating the measured distance as an element in the unit selection cost function. The motivation behind this method is that it reduces the required speech modification by using better unit occurrences from near context. This method also makes it easy to use long synthesis units, e.g. syllables or words, in the same unit selection framework.
引用
收藏
页码:789 / 792
页数:4
相关论文
共 50 条