Sound analysis using MPEG compressed audio

Cited by: 0

Authors:

Tzanetakis, G [1]

Cook, P [1]

Affiliation:

[1] Princeton Univ, Dept Comp Sci, Princeton, NJ 08544 USA

Source:

2000 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS, VOLS I-VI | 2000

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206; 082403;

Abstract:

A large amount of the audio data available today is compressed using the MPEG audio compression standard. Sound analysis is based on the computation of short-time feature vectors that describe the instantaneous spectral content of the sound. An interesting possibility is to calculate these features directly from the compressed data. Since the bulk of the feature calculation is performed during the encoding stage, this approach has a significant performance advantage when the available data is already compressed. Combining decoding and analysis in one stage is also very important for audio streaming applications. In this paper, we describe the calculation of features directly from MPEG audio compressed data. Two of the basic processes of analyzing sound are segmentation and classification. To illustrate the effectiveness of the calculated features, we have implemented two case studies: a general audio segmentation algorithm and a Music/Speech classifier. Experimental data is provided to show that the results obtained are comparable with those of sound analysis algorithms working directly with audio samples.
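
The idea of computing short-time features from compressed data can be illustrated with a minimal sketch. It assumes that a partial MPEG decoder has already produced, for each analysis frame, a list of non-negative subband magnitudes (for example, one value per subband of the encoder's filterbank). The abstract does not name the specific features used, so the three below (a centroid, a rolloff, and a flux measure over subband indices) are illustrative choices, and all function names are hypothetical.

```python
def subband_centroid(mags):
    """Magnitude-weighted mean subband index (a spectral-centroid analogue)."""
    total = sum(mags)
    if total == 0:
        return 0.0
    return sum(i * m for i, m in enumerate(mags)) / total


def subband_rolloff(mags, fraction=0.85):
    """Smallest subband index at which the cumulative magnitude
    reaches `fraction` of the frame total (0.85 is an assumed default)."""
    total = sum(mags)
    if total == 0:
        return 0
    running = 0.0
    for i, m in enumerate(mags):
        running += m
        if running >= fraction * total:
            return i
    return len(mags) - 1


def subband_flux(prev, cur):
    """Squared Euclidean distance between two sum-normalized frames."""
    def normalize(v):
        s = sum(v)
        return [x / s for x in v] if s else list(v)
    return sum((a - b) ** 2 for a, b in zip(normalize(prev), normalize(cur)))


def frame_features(frames):
    """Return one (centroid, rolloff, flux) tuple per frame.
    Flux is defined as 0.0 for the first frame, which has no predecessor."""
    feats = []
    prev = None
    for cur in frames:
        flux = subband_flux(prev, cur) if prev is not None else 0.0
        feats.append((subband_centroid(cur), subband_rolloff(cur), flux))
        prev = cur
    return feats
```

Feature tuples of this kind could then feed a segmentation step (for instance, flagging frames where the features change abruptly) or a classifier, which are the two case studies the abstract mentions. The extraction of subband magnitudes from an actual MPEG bitstream is outside the scope of this sketch.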


Pages: 761-764

Page count: 4
