Anthropomorphic coding of speech and audio: A model inversion approach

Cited by: 10

Authors:

Feldbauer, C [1]

Kubin, G

Kleijn, WB

Affiliations:

[1] Graz Univ Technol, Signal Proc & Speech Commun Lab, A-8010 Graz, Austria

[2] Royal Inst Technol, KTH, Dept Signal Sensors & Syst, S-10044 Stockholm, Sweden

Source:

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING | Year 2005 / Volume 2005 / Issue 09

Keywords:

speech and audio coding; auditory representation; auditory model inversion; auditory synthesis; perceptual domain coding; multiple description coding;

DOI:

10.1155/ASP.2005.1334

Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

Auditory modeling is a well-established methodology that provides insight into human perception and that facilitates the extraction of signal features that are most relevant to the listener. The aim of this paper is to provide a tutorial on perceptual speech and audio coding using an invertible auditory model. In this approach, the audio signal is converted into an auditory representation using an invertible auditory model. The auditory representation is quantized and coded. Upon decoding, it is then transformed back into the acoustic domain. This transformation converts a complex distortion criterion into a simple one, thus facilitating quantization with low complexity. We briefly review past work on auditory models and describe in more detail the components of our invertible model and its inversion procedure, that is, the method to reconstruct the signal from the output of the auditory model. We summarize attempts to use the auditory representation for low-bit-rate coding. Our approach also allows the exploitation of the inherent redundancy of the human auditory system for the purpose of multiple description (joint source-channel) coding.

Pages: 1334-1349

Number of pages: 16
