Mixed wideband speech and music coding using a speech/music discriminator

Cited by: 0
Author
Qiao, RY [1]
Affiliation
[1] CSIRO, Epping, NSW 2121, Australia
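Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code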
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In multimedia applications such as videoconferencing, users are demanding higher-quality speech/audio transmission than POTS can offer. 7 kHz wideband speech/audio offers a good compromise between bandwidth and sound quality. It improves the intelligibility and naturalness of speech and adds a feeling of transparent communication. Currently the only existing international standard for coding such signals is the G.722 wideband speech/audio coder. While its coding quality is satisfactory, its bit rate leaves much to be desired. The CELP-based approach has been very successful in telephone-bandwidth speech coding, but is not suitable for coding non-speech signals because of the assumed signal production model. This paper proposes an alternative approach to mixed speech/music coding, which uses a discriminator to separate music signals from speech and codes them with the G.722 coder and a G.723.1-based speech coder, respectively. Simulations show very promising results.
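The routing idea in the abstract (classify each block, then hand it to one of two coders) can be illustrated with a minimal sketch. This is not the paper's discriminator: the record does not say which features it uses, so the sketch swaps in a simple low-energy-ratio feature as a stand-in. The names `classify` and `route`, the 20 ms sub-frame, the 1 s decision window, and the 0.3 threshold are all assumptions made for illustration, and the two coders are placeholder callables rather than real G.722 or G.723.1 implementations.

```python
import math
from typing import Callable, List, Sequence, Tuple

SAMPLE_RATE = 16000   # 7 kHz wideband audio is conventionally sampled at 16 kHz
SUBFRAME = 320        # 20 ms sub-frame (assumed, not from the paper)
WINDOW = 50           # sub-frames per decision window, i.e. 1 s (assumed)


def subframe_energies(samples: Sequence[float]) -> List[float]:
    """Mean-square energy of each complete sub-frame."""
    return [
        sum(x * x for x in samples[i:i + SUBFRAME]) / SUBFRAME
        for i in range(0, len(samples) - SUBFRAME + 1, SUBFRAME)
    ]


def low_energy_ratio(energies: Sequence[float]) -> float:
    """Fraction of sub-frames whose energy is below half the window mean."""
    if not energies:
        return 0.0
    mean = sum(energies) / len(energies)
    return sum(1 for e in energies if e < 0.5 * mean) / len(energies)


def classify(samples: Sequence[float], threshold: float = 0.3) -> str:
    """Toy discriminator: many quiet sub-frames (pauses) suggests speech."""
    ratio = low_energy_ratio(subframe_energies(samples))
    return "speech" if ratio > threshold else "music"


def route(
    samples: Sequence[float],
    speech_coder: Callable[[Sequence[float]], object],
    music_coder: Callable[[Sequence[float]], object],
) -> List[Tuple[str, object]]:
    """Classify each 1 s block and pass it to the matching (placeholder) coder."""
    block_len = SUBFRAME * WINDOW
    out = []
    for start in range(0, len(samples) - block_len + 1, block_len):
        block = samples[start:start + block_len]
        label = classify(block)
        coder = speech_coder if label == "speech" else music_coder
        out.append((label, coder(block)))
    return out


if __name__ == "__main__":
    # Synthetic inputs: a steady tone and the same tone gated on and off.
    tone = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]
    gated = [x if (n // (5 * SUBFRAME)) % 2 == 0 else 0.0 for n, x in enumerate(tone)]
    for label, _ in route(tone + gated, lambda b: len(b), lambda b: len(b)):
        print(label)
```

A real system would also need to smooth decisions across blocks and signal the chosen coder to the decoder; the sketch leaves both out.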
Pages: 605-608
Page count: 4