Sound analysis using MPEG compressed audio

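Authors
Tzanetakis, G [1]
Cook, P [1]
Affiliation
[1] Princeton Univ, Dept Comp Sci, Princeton, NJ 08544 USA
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification codes
070206; 082403
Abstract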
A huge amount of available audio data is compressed using the MPEG audio compression standard. Sound analysis is based on the computation of short-time feature vectors that describe the instantaneous spectral content of the sound. An interesting possibility is to calculate features directly from the compressed data. Since the bulk of the feature calculation is performed during the encoding stage, this process has a significant performance advantage when the available data is already compressed. Combining decoding and analysis in one stage is also very important for audio streaming applications. In this paper, we describe the calculation of features directly from MPEG audio compressed data. Two of the basic processes of sound analysis are segmentation and classification. To illustrate the effectiveness of the calculated features, we have implemented two case studies: a general audio segmentation algorithm and a music/speech classifier. Experimental data is provided to show that the results obtained are comparable with those of sound analysis algorithms working directly with audio samples.
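As a rough illustration of what a short-time feature vector computed from compressed-domain data might look like, the sketch below derives two simple spectral descriptors from per-frame subband magnitudes. It is not taken from the paper: the input layout (one list of 32 non-negative magnitudes per frame, as a decoder might expose after dequantizing the subband samples of the MPEG-1 polyphase filterbank) and the function names `subband_centroid`, `subband_flux`, and `frame_features` are assumptions made for this example, and the abstract does not state which features the authors actually use.

```python
from typing import List, Sequence

# MPEG-1 audio splits the signal into 32 subbands with a polyphase filterbank.
NUM_SUBBANDS = 32


def subband_centroid(magnitudes: Sequence[float]) -> float:
    """Magnitude-weighted mean subband index (an analogue of the spectral centroid)."""
    total = sum(magnitudes)
    if total == 0:
        return 0.0  # silent frame: no meaningful centroid
    return sum(i * m for i, m in enumerate(magnitudes)) / total


def subband_flux(prev: Sequence[float], curr: Sequence[float]) -> float:
    """Sum of squared differences between two sum-normalized subband vectors."""
    def normalize(v: Sequence[float]) -> List[float]:
        s = sum(v)
        return [x / s for x in v] if s else [0.0] * len(v)

    p, c = normalize(prev), normalize(curr)
    return sum((a - b) ** 2 for a, b in zip(p, c))


def frame_features(frames: Sequence[Sequence[float]]) -> List[List[float]]:
    """Return one [centroid, flux] vector per frame (flux of the first frame is 0)."""
    features = []
    prev = None
    for frame in frames:
        if len(frame) != NUM_SUBBANDS:
            raise ValueError("expected 32 subband magnitudes per frame")
        flux = subband_flux(prev, frame) if prev is not None else 0.0
        features.append([subband_centroid(frame), flux])
        prev = frame
    return features


# Hypothetical input: two frames with all energy in subband 2, then in subband 5.
low = [0.0] * NUM_SUBBANDS
low[2] = 1.0
high = [0.0] * NUM_SUBBANDS
high[5] = 1.0
features = frame_features([low, high])
```

Vectors of this kind could then feed a segmentation or music/speech classification stage without first reconstructing the PCM samples, which is the general idea the abstract describes.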
Pages: 761-764
Page count: 4