ARTICULATORY FEATURES FOR EXPRESSIVE SPEECH SYNTHESIS

Cited by: 0
Authors
Black, Alan W. [1 ]
Bunnell, H. Timothy [2 ]
Dou, Ying [3 ]
Muthukumar, Prasanna Kumar [1 ]
Metze, Florian [1 ]
Perry, Daniel [4 ]
Polzehl, Tim [5 ]
Prahallad, Kishore [6 ]
Steidl, Stefan [7 ]
Vaughn, Callie [8 ]
Affiliations
[1] Carnegie Mellon Univ, Language Technol Inst, Pittsburgh, PA 15213 USA
[2] Nemours Biomed Res, Wilmington, DE USA
[3] Johns Hopkins Univ, Baltimore, MD 21218 USA
[4] Univ Calif Los Angeles, Los Angeles, CA 90024 USA
[5] Tech Univ Berlin, Deutsche Telekom Lab, Berlin, Germany
[6] Int Inst Informat Technol, Hyderabad, Andhra Pradesh, India
[7] Int Comp Sci Inst, Berkeley, CA USA
[8] Oberlin Coll, Oberlin, OH 44074 USA
Keywords
speech synthesis; articulatory features; emotional speech; meta-data extraction; evaluation;
DOI
Not available
Chinese Library Classification (CLC)
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper describes some of the results from the project entitled "New Parameterization for Emotional Speech Synthesis," held at the Summer 2011 JHU CLSP workshop. We describe experiments on using articulatory features as a meaningful intermediate representation for speech synthesis. This parameterization not only allows us to reproduce natural-sounding speech but also to generate stylistically varying speech. We show methods for deriving articulatory features from speech, predicting articulatory features from text, and reconstructing natural-sounding speech from the predicted articulatory features. The methods were tested on clean speech databases in English and German, as well as on databases of emotionally and personality-varying speech. The resulting speech was evaluated both objectively, using techniques normally used for emotion identification, and subjectively, using crowdsourcing.
Pages: 4005-4008
Page count: 4