Sinusoidal speech coding at 2.4 kbps using an improved phase matching algorithm

Authors
Ahmadi, S [1]
Spanias, AS [1]
Affiliation
[1] Nokia Mobile Phones Inc, San Diego, CA 92121 USA
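DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract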
This paper addresses the design, development, evaluation, and implementation of efficient low bit rate speech coding algorithms based on the sinusoidal model. A series of algorithms has been developed for pitch frequency determination and voicing detection, simultaneous modeling of the sinusoidal amplitudes and phases, and mid-frame interpolation. An improved sinusoidal phase matching algorithm is presented, where short-time sinusoidal phases are approximated using an elaborate combination of linear prediction, spectral sampling, delay compensation, and phase correction techniques. A voicing-dependent perceptual split vector quantization scheme is used to encode the sinusoidal amplitudes. The perceptual properties of the human auditory system are effectively exploited in the developed algorithms. The algorithms have been successfully integrated into a 2.4 kbps sinusoidal coder. The performance of the 2.4 kbps coder has been evaluated in terms of subjective tests such as the mean opinion score and the diagnostic rhyme test, as well as some perceptually motivated objective distortion measures. Performance analysis on a large speech database indicates that the use of the proposed algorithms resulted in considerable improvement in temporal and spectral signal matching, as well as improved subjective quality of the reproduced speech.
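As a rough illustration of the idea named in the abstract (phases approximated from linear prediction, spectral sampling, and delay compensation), the sketch below samples an all-pole LP envelope at the pitch harmonics, takes its phase, and adds a linear phase term for a time delay. This is a generic textbook-style sketch, not the authors' algorithm: the function names, the `delay_samples` parameter, and the example values are all assumptions, and the phase correction step mentioned in the abstract is not modeled.

```python
import numpy as np

def harmonic_params_from_lp(lp_coeffs, f0_hz, fs_hz, delay_samples=0.0):
    """Sample the all-pole envelope 1/A(z) at the pitch harmonics.

    lp_coeffs is [1, a1, ..., ap] for A(z) = 1 + a1 z^-1 + ... + ap z^-p.
    Hypothetical helper for illustration only.
    """
    a = np.asarray(lp_coeffs, dtype=float)
    # Keep only harmonics strictly below the Nyquist frequency.
    n_harm = int(np.ceil((fs_hz / 2.0) / f0_hz)) - 1
    k = np.arange(1, n_harm + 1)
    omega = 2.0 * np.pi * k * f0_hz / fs_hz  # rad/sample
    # Evaluate A(e^{jw}) = sum_m a[m] e^{-j w m} at each harmonic.
    m = np.arange(len(a))
    A = np.exp(-1j * np.outer(omega, m)) @ a
    H = 1.0 / A
    amps = np.abs(H)
    # System phase of the LP filter plus a linear phase term
    # (assumed form of delay compensation).
    phases = np.angle(H) - omega * delay_samples
    return omega, amps, phases

def synthesize_frame(omega, amps, phases, n_samples):
    """Sum of sinusoids: s[n] = sum_k A_k cos(w_k n + phi_k)."""
    n = np.arange(n_samples)
    return (amps[:, None] * np.cos(np.outer(omega, n) + phases[:, None])).sum(axis=0)

# Example with assumed values: 8 kHz sampling, 100 Hz pitch, one-pole envelope.
omega, amps, phases = harmonic_params_from_lp([1.0, -0.9], 100.0, 8000.0, delay_samples=5.0)
frame = synthesize_frame(omega, amps, phases, 160)
```

With a flat envelope (`lp_coeffs=[1.0]`) the model phases reduce to the linear term alone, so all harmonics add coherently at sample index `delay_samples`; that makes the delay term easy to check in isolation.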
Pages: 1075-1079
Page count: 5