Audio Classification and Retrieval Using Wavelets and Gaussian Mixture Models

Author
Chuan, Ching-Hua [1]
Affiliation
[1] Univ North Florida, Sch Comp, Coll Comp Engn & Construct, Jacksonville, FL 32224 USA
Keywords
Audio Classification; Compact Vector Representation; Gaussian Mixture Models; Retrieval; Wavelets
DOI
10.4018/jmdem.2013010101
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
This paper presents an audio classification and retrieval system that uses wavelets to extract low-level acoustic features. The author performs multiple-level decomposition with the discrete wavelet transform to extract acoustic features from audio recordings at different scales and times. The extracted features are then translated into a compact vector representation. Gaussian mixture models, trained with the expectation-maximization algorithm, are used to build models for audio classes and for individual audio examples. The system is evaluated on three audio classification tasks: speech/music, male/female speech, and music genre. The author also shows how wavelets and Gaussian mixture models are used for class-based audio retrieval in two approaches: indexing using only wavelets versus indexing by Gaussian components. Evaluating the system through 10-fold cross-validation, the author shows the promising capability of wavelets and Gaussian mixture models for audio classification and retrieval. The author also compares how parameters including frame size, wavelet level, number of Gaussian components, and sampling size affect the performance of the Gaussian models.
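The pipeline summarized in the abstract (multi-level wavelet decomposition, a compact per-frame vector, one Gaussian mixture model per class fitted by expectation maximization, classification by likelihood) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the author's implementation: the Haar wavelet, the log subband energy used as the compact vector, the diagonal covariances, the variance floor, the synthetic "tone" and "noise" signals, and all function names are hypothetical choices made for the example.

```python
import numpy as np

def haar_dwt_levels(frame, levels):
    """Multi-level Haar DWT: detail bands for each level, then the final approximation."""
    approx = np.asarray(frame, dtype=float)
    bands = []
    for _ in range(levels):
        even, odd = approx[0::2], approx[1::2]
        bands.append((even - odd) / np.sqrt(2.0))  # detail coefficients at this scale
        approx = (even + odd) / np.sqrt(2.0)       # coarser approximation
    bands.append(approx)
    return bands

def frame_features(signal, frame_size=256, levels=4):
    """Assumed compact vector: log energy of each wavelet subband, one row per frame."""
    n_frames = len(signal) // frame_size
    feats = []
    for i in range(n_frames):
        frame = signal[i * frame_size:(i + 1) * frame_size]
        feats.append([np.log(np.mean(b ** 2) + 1e-10)
                      for b in haar_dwt_levels(frame, levels)])
    return np.array(feats)

def logsumexp(a, axis):
    m = a.max(axis=axis, keepdims=True)
    return (m + np.log(np.sum(np.exp(a - m), axis=axis, keepdims=True))).squeeze(axis)

def log_joint(X, weights, means, variances):
    """log(w_k * N(x | mu_k, diag(var_k))) for every sample and component, shape (n, k)."""
    diff = X[:, None, :] - means[None, :, :]
    log_pdf = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                      + np.sum(np.log(2 * np.pi * variances), axis=1))
    return np.log(weights) + log_pdf

def fit_gmm(X, k=2, iters=50, var_floor=1e-3, seed=0):
    """Diagonal-covariance GMM fitted with plain EM (assumed simplification)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    means = X[rng.choice(n, size=k, replace=False)]
    variances = np.tile(X.var(axis=0) + var_floor, (k, 1))
    weights = np.full(k, 1.0 / k)
    for _ in range(iters):
        lj = log_joint(X, weights, means, variances)
        resp = np.exp(lj - logsumexp(lj, axis=1)[:, None])   # E-step: responsibilities
        nk = resp.sum(axis=0) + 1e-10                        # M-step: re-estimate parameters
        weights = nk / nk.sum()
        means = (resp.T @ X) / nk[:, None]
        variances = np.maximum((resp.T @ X ** 2) / nk[:, None] - means ** 2, var_floor)
    return weights, means, variances

def avg_log_likelihood(X, gmm):
    return logsumexp(log_joint(X, *gmm), axis=1).mean()

def classify(signal, class_models):
    """Pick the class whose GMM gives the highest average frame log-likelihood."""
    X = frame_features(signal)
    return max(class_models, key=lambda c: avg_log_likelihood(X, class_models[c]))

# Synthetic stand-ins for two audio classes (hypothetical data, not from the paper).
rng = np.random.default_rng(1)
t = np.arange(16384)

def tone():
    return np.sin(2 * np.pi * 0.01 * t + rng.uniform(0, 2 * np.pi)) \
        + 0.05 * rng.standard_normal(t.size)

def noise():
    return rng.standard_normal(t.size)

models = {"tone": fit_gmm(frame_features(tone())),
          "noise": fit_gmm(frame_features(noise()))}
predicted = classify(noise(), models)
```

A real system along these lines would more likely use an established wavelet library and a tested GMM implementation; the hand-written versions here only keep the sketch self-contained. The frame size, wavelet level, and number of Gaussian components appear as arguments because the abstract names them as the parameters whose effect on performance is compared.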
Pages: 1-20
Page count: 20