Sinusoidal speech coding at 2.4 kbps using an improved phase matching algorithm

Cited by: 0
Authors
Ahmadi, S [1]
Spanias, AS [1]
Affiliations
[1] Nokia Mobile Phones Inc, San Diego, CA 92121 USA
Keywords
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
This paper addresses the design, development, evaluation, and implementation of efficient low bit rate speech coding algorithms based on the sinusoidal model. A series of algorithms have been developed for pitch frequency determination and voicing detection, simultaneous modeling of the sinusoidal amplitudes and phases, and mid-frame interpolation. An improved sinusoidal phase matching algorithm is presented, where short-time sinusoidal phases are approximated using an elaborate combination of linear prediction, spectral sampling, delay compensation, and phase correction techniques. A voicing-dependent perceptual split vector quantization scheme is used to encode the sinusoidal amplitudes. The perceptual properties of the human auditory system are effectively exploited in the developed algorithms. The algorithms have been successfully integrated into a 2.4 kbps sinusoidal coder. The performance of the 2.4 kbps coder has been evaluated in terms of subjective tests such as the mean opinion score and the diagnostic rhyme test, as well as some perceptually-motivated objective distortion measures. Performance analysis on a large speech database indicates that the use of the proposed algorithms resulted in considerable improvement in temporal and spectral signal matching, as well as improved subjective quality of the reproduced speech.
Pages: 1075-1079
Page count: 5
Related papers
50 items in total
  • [21] An algorithm of optimal phase estimation in harmonic sinusoidal speech model
    Ying, Na
    Zhao, Xiao-Hui
    Dong, Jing
    Fang, Xin
    Tien Tzu Hsueh Pao/Acta Electronica Sinica, 2009, 37 (04): : 860 - 863
  • [22] Audio and Speech Compression using Sinusoidal Modeling and Wavelet Residuum Coding
    Nagy, Martin Turi
    Vargic, Radoslav
    PROCEEDINGS ELMAR-2012, 2012, : 207 - 210
  • [23] Speech analysis and coding using a multi-resolution sinusoidal transform
    Anderson, DV
    1996 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, CONFERENCE PROCEEDINGS, VOLS 1-6, 1996, : 1037 - 1040
  • [24] An improved 4 kbit/s CELP speech coding algorithm
    Bai, YN
    Bao, CC
    2004 INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, 2004, : 153 - 156
  • [25] Sinusoidal modeling of audio and speech using psychoacoustic-adaptive matching pursuits
    Heusdens, R
    Vafin, R
    Kleijn, WB
    2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS: VOL I: SPEECH PROCESSING 1; VOL II: SPEECH PROCESSING 2 IND TECHNOL TRACK DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS NEURALNETWORKS FOR SIGNAL PROCESSING; VOL III: IMAGE & MULTIDIMENSIONAL SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING, 2001, : 3281 - 3284
  • [26] Phase modelling of speech excitation for low bit-rate sinusoidal transform coding
    Sun, XQ
    Plante, F
    Cheetham, BMG
    Wong, KWT
    1997 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I - V: VOL I: PLENARY, EXPERT SUMMARIES, SPECIAL, AUDIO, UNDERWATER ACOUSTICS, VLSI; VOL II: SPEECH PROCESSING; VOL III: SPEECH PROCESSING, DIGITAL SIGNAL PROCESSING; VOL IV: MULTIDIMENSIONAL SIGNAL PROCESSING, NEURAL NETWORKS - VOL V: STATISTICAL SIGNAL AND ARRAY PROCESSING, APPLICATIONS, 1997, : 1691 - 1694
  • [27] Automatic fingerprint matching using improved vector matching algorithm
    Wang Yuan
    Zhou Fuqiang
    Yao Lixiu
    ICEMI 2007: PROCEEDINGS OF 2007 8TH INTERNATIONAL CONFERENCE ON ELECTRONIC MEASUREMENT & INSTRUMENTS, VOL II, 2007, : 408 - +
  • [28] An improved residual-domain phase/amplitude model for sinusoidal coding of speech at very low bit rates: A variable rate scheme
    Ahmadi, S
    ICASSP '99: 1999 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS VOLS I-VI, 1999, : 2291 - 2294
  • [29] IMPROVED MULTIPULSE ALGORITHM FOR SPEECH CODING BY MEANS OF ADAPTIVE BOLTZMANN ANNEALING
    MUMOLO, E
    REBELLI, A
    RICCARDI, G
    EUROPEAN TRANSACTIONS ON TELECOMMUNICATIONS, 1994, 5 (06): : 739 - 746
  • [30] IMPROVED SINGLE-CHANNEL SPEECH SEPARATION USING SINUSOIDAL MODELING
    Mowlaee, Pejman
    Christensen, Mads Graesboll
    Jensen, Soren Holdt
    2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 21 - 24