Measuring the naturalness of synthetic speech

Authors
Howard C. Nusbaum
Alexander L. Francis
Anne S. Henly
Affiliation
[1] The University of Chicago, Center for Computational Psychology, Committee on Cognition and Communication
Keywords
synthetic speech; naturalness; intelligibility; perception
DOI
10.1007/BF02215800
Abstract
Even the highest quality synthetic speech generated by rule sounds unlike human speech. As the intelligibility of rule-based synthetic speech improves, and the number of applications for synthetic speech increases, the naturalness of synthetic speech will become an important factor in determining its use. To improve this aspect of the quality of synthetic speech, it is necessary to have diagnostic tests that can measure naturalness. Currently, none of the available metrics for evaluating the acceptability of synthetic speech distinguishes sufficiently between measuring overall acceptability (including naturalness) and simply measuring the ability of listeners to extract intelligible information from the signal. In this paper we propose a new methodology for measuring the naturalness of particular aspects of synthesized speech, independent of the intelligibility of the speech. Although naturalness is a multidimensional, subjective quality of speech, this methodology makes it possible to assess the separate contributions of prosodic, segmental, and source characteristics of the utterance. In two experiments, listeners reliably differentiated the naturalness of speech produced by two male talkers and two text-to-speech systems. Furthermore, they reliably differentiated between the two text-to-speech systems. The results of these experiments demonstrate that perception of naturalness is affected by information contained within the smallest part of speech, the glottal pulse, and by information contained within the prosodic structure of a syllable. These results show that this new methodology provides a solid basis for measuring and diagnosing the naturalness of synthetic speech.
Pages: 7-19
Number of pages: 12