Measuring the naturalness of synthetic speech

Authors
Howard C. Nusbaum
Alexander L. Francis
Anne S. Henly
Affiliation
[1] The University of Chicago, Center for Computational Psychology, Committee on Cognition and Communication
Keywords
synthetic speech; naturalness; intelligibility; perception
DOI
10.1007/BF02215800
Abstract
Even the highest quality synthetic speech generated by rule sounds unlike human speech. As the intelligibility of rule-based synthetic speech improves, and the number of applications for synthetic speech increases, the naturalness of synthetic speech will become an important factor in determining its use. In order to improve this aspect of the quality of synthetic speech it is necessary to have diagnostic tests that can measure naturalness. Currently, all of the available metrics for evaluating the acceptability of synthetic speech do not distinguish sufficiently between measuring overall acceptability (including naturalness) and simply measuring the ability of listeners to extract intelligible information from the signal. In this paper we propose a new methodology for measuring the naturalness of particular aspects of synthesized speech, independent of the intelligibility of the speech. Although naturalness is a multidimensional, subjective quality of speech, this methodology makes it possible to assess the separate contributions of prosodic, segmental, and source characteristics of the utterance. In two experiments, listeners reliably differentiated the naturalness of speech produced by two male talkers and two text-to-speech systems. Furthermore, they reliably differentiated between the two text-to-speech systems. The results of these experiments demonstrate that perception of naturalness is affected by information contained within the smallest part of speech, the glottal pulse, and by information contained within the prosodic structure of a syllable. These results show that this new methodology does provide a solid basis for measuring and diagnosing the naturalness of synthetic speech.
Pages: 7-19
Page count: 12