On a modified cepstral pitch control technique for the high quality text-to-speech type system

Authors
Kim, J [1]
Bae, M [1]
Affiliation
[1] Soongsil Univ, Dept Telecommun Engn, Seoul 156743, South Korea
DOI
10.1109/MWSCAS.1998.759568
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In speech synthesis, waveform coding methods are mainly used to maintain the intelligibility and naturalness of synthetic speech. However, these methods are difficult to apply to synthesis by rule, since they do not separate the excitation information from the vocal tract information in a speech signal. This paper proposes a modified pitch alteration method that reduces spectrum distortion by reconstructing the pitch-altered speech signal from both the formant component in the quefrency domain and the phase component in the time domain. The method yields a spectrum distortion of only 1.18% for a 50% pitch change.
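The separation of vocal tract and excitation information in the quefrency domain can be illustrated with a generic cepstral liftering sketch. This is not the authors' algorithm: the synthetic frame, the cutoff of 30 samples, the 100 Hz pitch, and the helper names `real_cepstrum` and `split_cepstrum` are all assumptions chosen for illustration, and the phase handling in the time domain described in the abstract is not reproduced here.

```python
import numpy as np

def real_cepstrum(frame):
    """Real cepstrum: inverse FFT of the log magnitude spectrum."""
    spectrum = np.fft.fft(frame)
    log_mag = np.log(np.abs(spectrum) + 1e-12)  # small offset avoids log(0)
    return np.fft.ifft(log_mag).real

def split_cepstrum(cepstrum, cutoff):
    """Split a cepstrum into low-quefrency (formant envelope) and
    high-quefrency (excitation) parts with a rectangular lifter."""
    n = len(cepstrum)
    lifter = np.zeros(n)
    lifter[:cutoff] = 1.0
    lifter[n - cutoff + 1:] = 1.0  # keep the mirrored negative quefrencies
    low = cepstrum * lifter
    high = cepstrum - low
    return low, high

# Assumed synthetic voiced-like frame: an impulse train (period 80 samples,
# i.e. 100 Hz at 8 kHz) passed through a single decaying resonance.
fs = 8000
n = 512
excitation = np.zeros(n)
excitation[::80] = 1.0
t = np.arange(64) / fs
impulse_response = np.exp(-t * 600) * np.sin(2 * np.pi * 700 * t)
frame = np.convolve(excitation, impulse_response)[:n] * np.hamming(n)

cep = real_cepstrum(frame)
envelope_cep, excitation_cep = split_cepstrum(cep, cutoff=30)

# Smoothed log spectral envelope recovered from the low-quefrency part.
log_envelope = np.fft.fft(envelope_cep).real

# Pitch period estimate: strongest high-quefrency peak above the cutoff region.
pitch_lag = 40 + int(np.argmax(excitation_cep[40:n // 2]))
```

In a pitch alteration scheme of this general kind, the low-quefrency part would be kept fixed to preserve the formants while the excitation is resynthesized at the new pitch period.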
Pages: 616-619
Number of pages: 4