An auditory-based measure for improved phone segment concatenation

Authors
Chappell, DT
Hansen, JHL
Chinese Library Classification (CLC) number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
This paper describes a new auditory-based distance measure intended for use in a concatenated synthesis technique wherein time- and frequency-domain characteristics are used to perform natural-sounding speaker synthesis. Whereas most concatenation systems use large databases (often more than 100,000 units), we begin from a small, limited database (approximately 400 units) and use a new spectral distortion measure to aid in the selection of phones for optimal concatenation. At the transition between speech segments, the new auditory-based distance metric assesses perceived discontinuities in the frequency domain. The distortion measure, which employs the Carney auditory model, is used to select phones that minimize the perceived distortion between concatenated segments. Moreover, time- and frequency-domain methods can shape the prosodic and spectral characteristics of each speech segment. The final results demonstrate improved performance over standard concatenation methods applied to small databases.
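The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Carney auditory model is not reproduced here, and a plain Euclidean distance between boundary feature vectors stands in for the auditory-based distance. The function names `join_cost` and `select_units`, and the representation of each candidate phone as a `(name, start_vec, end_vec)` tuple, are assumptions made only for illustration.

```python
import math


def join_cost(left_end, right_start):
    """Placeholder distance at a segment boundary.

    Assumption: Euclidean distance between feature vectors; the paper
    uses a distance derived from the Carney auditory model instead.
    """
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(left_end, right_start)))


def select_units(candidates):
    """Pick one candidate per target phone, minimizing the summed join cost.

    candidates: one list per target phone, each holding hypothetical
    (name, start_vec, end_vec) tuples.
    Returns (total_cost, [chosen names]) via dynamic programming.
    """
    # best[j] = (cost so far, path) for sequences ending at candidate j
    best = [(0.0, [unit[0]]) for unit in candidates[0]]
    for pos in range(1, len(candidates)):
        new_best = []
        for name, start_vec, _ in candidates[pos]:
            # Cheapest way to reach this candidate from any previous one
            options = [
                (cost + join_cost(prev[2], start_vec), path)
                for (cost, path), prev in zip(best, candidates[pos - 1])
            ]
            cost, path = min(options, key=lambda o: o[0])
            new_best.append((cost, path + [name]))
        best = new_best
    return min(best, key=lambda b: b[0])


# Two target phones, two hypothetical candidates each
cands = [
    [("a1", [0.0, 0.0], [1.0, 0.0]), ("a2", [0.0, 0.0], [5.0, 5.0])],
    [("b1", [1.0, 0.1], [2.0, 2.0]), ("b2", [4.0, 4.0], [0.0, 0.0])],
]
total, chosen = select_units(cands)
print(chosen, round(total, 3))
```

With a small database such as the roughly 400 units mentioned above, an exhaustive search of this kind over the candidates at each boundary stays cheap; replacing `join_cost` with a perceptually motivated measure changes which joins are preferred, not the search itself.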
Pages: 1639-1642
Page count: 4