Sound analysis using MPEG compressed audio

Cited by: 0
Authors
Tzanetakis, G [1]
Cook, P [1]
Affiliations
[1] Princeton Univ, Dept Comp Sci, Princeton, NJ 08544 USA
Keywords
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
A huge amount of the available audio data is compressed using the MPEG audio compression standard. Sound analysis is based on the computation of short-time feature vectors that describe the instantaneous spectral content of the sound. An interesting possibility is the calculation of features directly from compressed data. Since the bulk of the feature calculation is performed during the encoding stage, this process has a significant performance advantage if the available data is compressed. Combining decoding and analysis in one stage is also very important for audio streaming applications. In this paper, we describe the calculation of features directly from MPEG audio compressed data. Two of the basic processes of analyzing sound are segmentation and classification. To illustrate the effectiveness of the calculated features, we have implemented two case studies: a general audio segmentation algorithm and a music/speech classifier. Experimental data is provided to show that the results obtained are comparable with those of sound analysis algorithms working directly with audio samples.
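The abstract does not list the specific features, so the following is only an illustrative sketch of what short-time features computed from subband data might look like. It assumes that a partial decoder (not shown) already yields one vector of 32 non-negative subband magnitudes per analysis frame, matching the 32-band filterbank of MPEG-1 audio. The function names, the 0.85 rolloff fraction, and the choice of centroid, rolloff, flux, and RMS are assumptions made for illustration, not the authors' stated method.

```python
import math

# Assumption: each frame is a list of 32 non-negative subband magnitudes,
# produced by a partial MPEG audio decoder that is not shown here.
NUM_SUBBANDS = 32


def centroid(mags):
    """Magnitude-weighted mean subband index (1-based)."""
    total = sum(mags)
    if total == 0:
        return 0.0
    return sum((k + 1) * m for k, m in enumerate(mags)) / total


def rolloff(mags, fraction=0.85):
    """Smallest 1-based subband index below which `fraction` of the
    total magnitude lies. The 0.85 default is an illustrative choice."""
    total = sum(mags)
    if total == 0:
        return 0
    threshold = fraction * total
    running = 0.0
    for k, m in enumerate(mags):
        running += m
        if running >= threshold:
            return k + 1
    return len(mags)


def _normalize(mags):
    """Scale a magnitude vector so that it sums to 1 (if possible)."""
    total = sum(mags)
    return [m / total for m in mags] if total else list(mags)


def flux(prev_mags, cur_mags):
    """Squared difference between successive normalized magnitude vectors."""
    p, c = _normalize(prev_mags), _normalize(cur_mags)
    return sum((a - b) ** 2 for a, b in zip(p, c))


def rms(mags):
    """Root mean square of the subband magnitudes."""
    return math.sqrt(sum(m * m for m in mags) / len(mags))


def frame_features(frames):
    """Return one (centroid, rolloff, flux, rms) tuple per frame.
    The flux of the first frame is defined as 0.0."""
    features = []
    prev = None
    for cur in frames:
        f = flux(prev, cur) if prev is not None else 0.0
        features.append((centroid(cur), rolloff(cur), f, rms(cur)))
        prev = cur
    return features
```

Feature vectors of this kind could then feed a segmentation step (for example, by looking for abrupt changes in the feature sequence) or a music/speech classifier; both downstream stages are outside the scope of this sketch.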
Pages: 761-764
Page count: 4