Anthropomorphic coding of speech and audio: A model inversion approach

Cited by: 10
Authors
Feldbauer, C [1]
Kubin, G
Kleijn, WB
Affiliations
[1] Graz Univ Technol, Signal Proc & Speech Commun Lab, A-8010 Graz, Austria
[2] Royal Inst Technol, KTH, Dept Signal Sensors & Syst, S-10044 Stockholm, Sweden
Keywords
speech and audio coding; auditory representation; auditory model inversion; auditory synthesis; perceptual domain coding; multiple description coding
DOI
10.1155/ASP.2005.1334
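Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification codes
0808; 0809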
Abstract
Auditory modeling is a well-established methodology that provides insight into human perception and that facilitates the extraction of signal features that are most relevant to the listener. The aim of this paper is to provide a tutorial on perceptual speech and audio coding using an invertible auditory model. In this approach, the audio signal is converted into an auditory representation using an invertible auditory model. The auditory representation is quantized and coded. Upon decoding, it is then transformed back into the acoustic domain. This transformation converts a complex distortion criterion into a simple one, thus facilitating quantization with low complexity. We briefly review past work on auditory models and describe in more detail the components of our invertible model and its inversion procedure, that is, the method to reconstruct the signal from the output of the auditory model. We summarize attempts to use the auditory representation for low-bit-rate coding. Our approach also allows the exploitation of the inherent redundancy of the human auditory system for the purpose of multiple description (joint source-channel) coding.
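The encode/quantize/decode/invert chain described in the abstract can be illustrated with a toy sketch. This is not the authors' auditory model: a mu-law compander stands in as a hypothetical invertible "perceptual" mapping, and the names forward, inverse, quantize, dequantize, and code_signal, as well as the constant MU = 255, are assumptions made for illustration only. The point is merely that, once the signal is in the transformed domain, a plain uniform quantizer suffices, and the decoder recovers the acoustic signal by applying the inverse mapping.

```python
import math

MU = 255.0  # companding constant; an assumption borrowed from mu-law, not from the paper


def forward(x):
    """Toy invertible mapping (mu-law compression) for a sample x in [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def inverse(y):
    """Exact inverse of forward(): maps the representation back to the signal domain."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def quantize(y, bits):
    """Uniform quantizer on [-1, 1]; returns an integer index."""
    levels = 2 ** (bits - 1) - 1
    return round(y * levels)


def dequantize(index, bits):
    """Map an integer index back to a value in [-1, 1]."""
    levels = 2 ** (bits - 1) - 1
    return index / levels


def code_signal(signal, bits=8):
    """Encode in the transformed domain, then decode and invert."""
    indices = [quantize(forward(x), bits) for x in signal]      # encoder side
    return [inverse(dequantize(i, bits)) for i in indices]      # decoder side


# Hypothetical test input: 10 ms of a 440 Hz tone sampled at 8 kHz.
signal = [0.5 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(80)]
decoded = code_signal(signal, bits=8)
max_err = max(abs(a - b) for a, b in zip(signal, decoded))
```

In this sketch the quantizer only ever sees a simple uniform-error criterion; the shape of the resulting error in the signal domain is determined entirely by the mapping, which is the role the invertible auditory model plays in the approach the abstract describes.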
Pages: 1334-1349
Number of pages: 16