AUDITORY DISTORTION MEASURE FOR SPEECH CODER EVALUATION - HIDDEN MARKOVIAN APPROACH

Cited by: 2
Authors
DE, A [1]
KABAL, P [1]
Affiliation
[1] UNIV QUEBEC, INRS TELECOMMUN, VERDUN, PQ H3H 1H6, CANADA
Keywords
AUDITORY (COCHLEAR) MODEL; NEURAL FIRING MECHANISM; HIDDEN MARKOV MODEL; CODED SPEECH QUALITY; DISTORTION MEASURE;
DOI
10.1016/0167-6393(95)00016-H
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
This article introduces a methodology for quantifying the distortion introduced by a low or medium bit-rate speech coder. Because human perceptual acuity determines the precision with which speech data must be processed, the speech signal is transformed into a perceptual domain (PD). The transformation uses Lyon's cochlear (auditory) model, whose output gives the probability of firing in each neural channel at successive clock times. In the present approach, a hidden Markov model describes the basic firing/non-firing process operating in the auditory pathway. Each neural channel is assigned a two-state, fully connected, first-order model whose states correspond to the firing and non-firing events. Assuming the models are stationary over a fixed duration, the model parameters are estimated from the PD observations of the original signal. The PD representations of the coded speech are then passed through the respective models, and the corresponding likelihoods are calculated. These likelihood scores define a cochlear hidden Markovian (CHM) distortion measure. The methodology takes into account the temporal ordering of the neural firing patterns. The CHM measure, which uses the contextual information present in the firing pattern, is robust to coder delays.
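The pipeline described in the abstract (fit one two-state model per neural channel on the original signal, then score the coded signal against it) can be illustrated with a minimal Python sketch. Several details here are assumptions, since the abstract does not give them: the firing probability at each clock time is treated as soft evidence for the firing state, only the transition matrix and initial distribution are estimated (by Baum-Welch re-estimation), and the per-channel score is the absolute per-sample log-likelihood gap averaged over channels. The names `fit_channel` and `chm_distortion` are hypothetical, and the paper's actual definition of the CHM measure may differ.

```python
import numpy as np

EPS = 1e-6  # keeps probabilities away from 0 and 1


def forward_backward(p, A, pi):
    """Scaled forward-backward pass for states 0 (non-firing) and 1 (firing).

    Assumption: p[t] is the firing probability from the cochlear model and
    acts as soft evidence: weight (1 - p[t]) for state 0, p[t] for state 1.
    """
    p = np.clip(np.asarray(p, dtype=float), EPS, 1.0 - EPS)
    B = np.stack([1.0 - p, p], axis=1)  # (T, 2) evidence weights
    T = len(p)
    alpha = np.zeros((T, 2))
    beta = np.ones((T, 2))
    c = np.zeros(T)  # per-step scaling factors
    alpha[0] = pi * B[0]
    c[0] = alpha[0].sum()
    alpha[0] /= c[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[t]
        c[t] = alpha[t].sum()
        alpha[t] /= c[t]
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[t + 1] * beta[t + 1]) / c[t + 1]
    return alpha, beta, B, c


def log_likelihood(p, A, pi):
    """Log-likelihood of one channel's sequence: sum of log scaling factors."""
    return float(np.log(forward_backward(p, A, pi)[3]).sum())


def fit_channel(p, n_iter=20):
    """Estimate the transition matrix A and initial distribution pi for one
    channel over one window that is assumed stationary."""
    A = np.full((2, 2), 0.5)
    pi = np.full(2, 0.5)
    for _ in range(n_iter):
        alpha, beta, B, c = forward_backward(p, A, pi)
        # xi[t, i, j]: posterior probability of the transition i -> j at time t
        xi = (alpha[:-1, :, None] * A[None, :, :]
              * (B[1:] * beta[1:])[:, None, :] / c[1:, None, None])
        A = xi.sum(axis=0) + EPS
        A /= A.sum(axis=1, keepdims=True)
        pi = alpha[0] * beta[0]  # posterior over the initial state
    return A, pi


def chm_distortion(pd_original, pd_coded, n_iter=20):
    """Hypothetical CHM-style score for arrays of shape (channels, time).

    Assumption: the score is the mean absolute per-sample log-likelihood gap
    between original and coded signals under models fit on the original.
    """
    scores = []
    for p_orig, p_coded in zip(pd_original, pd_coded):
        A, pi = fit_channel(p_orig, n_iter)
        gap = log_likelihood(p_orig, A, pi) - log_likelihood(p_coded, A, pi)
        scores.append(abs(gap) / len(p_orig))
    return float(np.mean(scores))
```

In this sketch each row of the input arrays stands for one neural channel over a single stationary window; a longer signal would be split into windows and scored window by window.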
Pages: 39-57
Number of pages: 19
Related papers
50 in total
  • [1] AUDITORY DISTORTION MEASURE FOR SPEECH CODER EVALUATION - DISCRIMINATION INFORMATION APPROACH
    DE, A
    KABAL, P
    [J]. SPEECH COMMUNICATION, 1994, 14 (03) : 205 - 229
  • [2] Speech distortion measure based on auditory properties
    CHEN Guo
    HU Xiulin
    ZHANG Yunyu
    ZHU Yaoting (Department of Electronics and Information Engineering
    [J]. Chinese Journal of Acoustics, 2000, (04) : 339 - 345
  • [3] A modified Itakura speech distortion measure based on auditory properties
    Chen, G
    Hu, XL
    Zhang, YY
    Zhu, YT
    [J]. APPLIED ACOUSTICS, 2001, 62 (05) : 545 - 553
  • [4] An auditory-based distortion measure with application to concatenative speech synthesis
    Hansen, JHL
    Chappell, DT
    [J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1998, 6 (05) : 489 - 495
  • [5] Auditory-based distortion measure with application to concatenative speech synthesis
    Duke Univ, Durham, United States
    [J]. IEEE Trans Speech Audio Process, 5 (489-495):
  • [6] Evaluation of speech quality based on wavelet spectrum distortion measure
    School of Electronics and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China
    Unknown
    [J]. Binggong Xuebao/Acta Armamentarii, 2008, 29 (01) : 33 - 36
  • [7] A speech spectrum distortion measure with interframe memory
    Nordén, F
    Eriksson, T
    [J]. 2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS: VOL I: SPEECH PROCESSING 1; VOL II: SPEECH PROCESSING 2 IND TECHNOL TRACK DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS NEURALNETWORKS FOR SIGNAL PROCESSING; VOL III: IMAGE & MULTIDIMENSIONAL SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING - VOL IV: SIGNAL PROCESSING FOR COMMUNICATIONS; VOL V: SIGNAL PROCESSING EDUCATION SENSOR ARRAY & MULTICHANNEL SIGNAL PROCESSING AUDIO & ELECTROACOUSTICS; VOL VI: SIGNAL PROCESSING THEORY & METHODS STUDENT FORUM, 2001, : 717 - 720
  • [8] A novel approach to vocoders: Scaled Speech Coder (SSC)
    Erogul, O
    Ilk, HG
    [J]. PROCEEDINGS OF THE IEEE-EURASIP WORKSHOP ON NONLINEAR SIGNAL AND IMAGE PROCESSING (NSIP'99), 1999, : 296 - 300
  • [9] RATE-DISTORTION SPEECH CODING WITH A MINIMUM DISCRIMINATION INFORMATION DISTORTION MEASURE
    GRAY, RM
    GRAY, AH
    REBOLLEDO, G
    SHORE, JE
    [J]. IEEE TRANSACTIONS ON INFORMATION THEORY, 1981, 27 (06) : 708 - 721
  • [10] Intelligibility evaluation of GSM coder for Mandarin speech using CDRT
    McLoughlin, I
    Ding, ZQ
    Tan, EC
    [J]. SPEECH COMMUNICATION, 2002, 38 (1-2) : 161 - 165