A new approach for classification of generic audio data

Cited by: 4
Authors
Lin, RS [1]
Chen, LH [1]
Affiliation
[1] Natl Chiao Tung Univ, Dept Comp & Informat Sci, Hsinchu 30050, Taiwan
Keywords
audio classification; spectrogram; Bayesian decision function; multivariable Gaussian distribution
DOI
10.1142/S0218001405003958
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Existing audio retrieval systems fall into one of two categories: single-domain systems that accept data of only a single type (e.g. speech), and multiple-domain systems that offer content-based retrieval for multiple types of audio data. Since a single-domain system has limited applications, a multiple-domain system is more useful. However, different types of audio data have different properties, which makes a multiple-domain system harder to develop. Classifying the audio information in advance addresses this problem. In this paper, we propose a real-time classification method that classifies audio signals into several basic audio types: pure speech, music, song, speech with music background, and speech with environmental noise background. To make the proposed method robust across a variety of audio sources, we use a Bayesian decision function for the multivariable Gaussian distribution instead of manually adjusting a threshold for each discriminator. The proposed approach can be applied to content-based audio/video retrieval. In the experiments, the efficiency and effectiveness of the method are demonstrated by an accuracy rate of more than 96% for general audio data classification.
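For readers unfamiliar with the decision rule named in the abstract, the following is the standard textbook form of a Bayesian decision function under a multivariate Gaussian model. It is only an illustrative sketch and not necessarily the exact formulation used in the paper. All symbols are assumptions introduced here: x is a d-dimensional feature vector, ω_i is the i-th audio class, μ_i and Σ_i are that class's mean vector and covariance matrix estimated from training data, and P(ω_i) is its prior probability.

```latex
g_i(\mathbf{x}) = -\tfrac{1}{2}\,(\mathbf{x}-\boldsymbol{\mu}_i)^{\top}\boldsymbol{\Sigma}_i^{-1}(\mathbf{x}-\boldsymbol{\mu}_i)
                  \;-\; \tfrac{1}{2}\ln\lvert\boldsymbol{\Sigma}_i\rvert
                  \;+\; \ln P(\omega_i),
\qquad
\hat{\omega}(\mathbf{x}) = \arg\max_{i}\; g_i(\mathbf{x})
```

The class-independent constant -(d/2) ln 2π is dropped because it does not affect the arg max. Since the boundaries between classes follow from the estimated μ_i, Σ_i, and P(ω_i), no per-discriminator threshold has to be tuned by hand, which matches the motivation stated in the abstract.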
Pages: 63-78
Page count: 16