Speech and audio coding using temporal masking

Cited by: 0
Authors
Gunawan, TS [1]
Ambikairajah, E [1]
Senn, D [1]
Affiliations
[1] Univ New S Wales, Sch Elect Engn & Telecommun, Sydney, NSW 2052, Australia
Keywords
temporal masking model; simultaneous masking model; Gammatone filters; wavelet packet; PESQ; subjective listening test
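DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract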
This paper presents a comparison of three auditory temporal masking models for speech and audio coding applications. The first model was developed from existing forward masking psychoacoustic data, with an assumption of an approximately 200 ms duration. The model's dynamic parameters were derived from this data. The previously developed second model was based upon the principle of an exponential decay following higher energy stimuli, where the masking effects have a relatively short duration. The existing third model best matches the previously reported forward masking data using an exponential curve, but the effects of forward masking are restricted to 100-200 ms. Objective assessments employing the PESQ measure reveal that these three temporal models have potential for removing perceptually redundant information in speech and audio coding applications. Results show that incorporating temporal masking along with simultaneous masking into a speech/audio coding algorithm yields a further bit rate reduction of approximately 17% compared with simultaneous masking alone, while preserving perceptual quality.
Pages: 31 - 42
Number of pages: 12
Related papers
50 in total
  • [1] Wavelet packet based audio coding using temporal masking
    Sinaga, F
    Gunawan, TS
    Ambikairajah, E
    ICICS-PCM 2003, VOLS 1-3, PROCEEDINGS, 2003, : 1380 - 1383
  • [2] High efficiency audio coding using auditory masking
    Miyasaka, Eiichi
    Kyokai Joho Imeji Zasshi/Journal of the Institute of Image Information and Television Engineers, 2001, 55 (12):
  • [3] TEMPORAL MASKING OF SPEECH
    CHARAN, KK
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1971, 50 (01): : 147 - &
  • [4] Embedded coding using a mixed speech and audio coding paradigm
    Ramprashad S.A.
    International Journal of Speech Technology, 1999, 2 (4) : 359 - 372
  • [5] Audio watermarking using m-sequences and temporal masking
    Cvejic, N
    Keskinarkaus, A
    Seppanen, T
    PROCEEDINGS OF THE 2001 IEEE WORKSHOP ON THE APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS, 2001, : 227 - 230
  • [6] Improving perceptual coding of wideband audio signal when taking into consideration of temporal masking
    Zakharenko, A
    Kowalguin, Y
    ARCHITECTURAL ACOUSTICS AND SOUND REINFORCEMENT, 2002, : 235 - 239
  • [7] Single channel speech enhancement using temporal masking
    Gunawan, TS
    Ambikairajah, E
    2004 9TH IEEE SINGAPORE INTERNATIONAL CONFERENCE ON COMMUNICATION SYSTEMS (ICCS), 2004, : 250 - 254
  • [8] Using Visual Speech Information in Masking Methods for Audio Speaker Separation
    Khan, Faheem Ullah
    Milner, Ben P.
    Le Cornu, Thomas
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2018, 26 (10) : 1742 - 1754
  • [9] Perceptual speech coding using time and frequency masking constraints
    Carnero, B
    Drygajlo, A
    1997 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I - V: VOL I: PLENARY, EXPERT SUMMARIES, SPECIAL, AUDIO, UNDERWATER ACOUSTICS, VLSI; VOL II: SPEECH PROCESSING; VOL III: SPEECH PROCESSING, DIGITAL SIGNAL PROCESSING; VOL IV: MULTIDIMENSIONAL SIGNAL PROCESSING, NEURAL NETWORKS - VOL V: STATISTICAL SIGNAL AND ARRAY PROCESSING, APPLICATIONS, 1997, : 1363 - 1366
  • [10] Technologies for Speech and Audio Coding
    Moriya, Takehiro
    ISCE: 2009 IEEE 13TH INTERNATIONAL SYMPOSIUM ON CONSUMER ELECTRONICS, VOLS 1 AND 2, 2009, : 20 - 21