An auditory-based measure for improved phone segment concatenation

Cited by: 0
Authors
Chappell, DT
Hansen, JHL
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper describes a new auditory-based distance measure intended for use in a concatenated synthesis technique wherein time- and frequency-domain characteristics are used to perform natural-sounding speaker synthesis. Whereas most concatenation systems use large databases (often more than 100,000 units), we begin from a small, limited database (approximately 400 units) and use a new spectral distortion measure to aid in the selection of phones for optimal concatenation. At the transition between speech segments, the new auditory-based distance metric assesses perceived discontinuities in the frequency domain. The distortion measure, which employs the Carney auditory model, is used to select phones that minimize the perceived distortion between concatenated segments. Moreover, time- and frequency-domain methods can shape the prosodic and spectral characteristics of each speech segment. The final results demonstrate improved performance over standard concatenation methods applied to small databases.
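
The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the paper's measure relies on the Carney auditory model, whereas this example substitutes a plain log-spectral distance as a stand-in. The function names, the candidate labels, and the four-bin magnitude spectra are all hypothetical and exist only for illustration.

```python
import math


def log_spectral_distance(frame_a, frame_b, floor=1e-10):
    """RMS difference (in dB) between two magnitude spectra.

    Assumption: a simple stand-in for the paper's auditory-based
    measure, which uses the Carney auditory model instead.
    """
    if len(frame_a) != len(frame_b):
        raise ValueError("frames must have the same number of bins")
    diffs = [
        20.0 * math.log10(max(a, floor)) - 20.0 * math.log10(max(b, floor))
        for a, b in zip(frame_a, frame_b)
    ]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def select_unit(left_end_frame, candidates):
    """Pick the candidate whose first frame best matches the end of
    the preceding segment. Returns (name, distance)."""
    scored = (
        (name, log_spectral_distance(left_end_frame, first_frame))
        for name, first_frame in candidates.items()
    )
    return min(scored, key=lambda pair: pair[1])


# Hypothetical data: last frame of the left segment and the first
# frame of each candidate phone (magnitude spectra, 4 bins each).
left_end = [1.0, 0.5, 0.25, 0.125]
candidates = {
    "a_1": [1.0, 0.5, 0.25, 0.125],
    "a_2": [0.1, 0.9, 0.3, 0.05],
    "a_3": [0.9, 0.55, 0.2, 0.1],
}

best_name, best_dist = select_unit(left_end, candidates)
print(best_name, best_dist)
```

With a small database, each phone has only a few candidates, so an exhaustive comparison at every join, as sketched here, stays cheap. A perceptually motivated measure would replace `log_spectral_distance` without changing the selection logic.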
Pages: 1639-1642
Number of pages: 4