A new approach for classification of generic audio data

Cited by: 4
Authors
Lin, RS [1]
Chen, LH [1]
Affiliation
[1] Natl Chiao Tung Univ, Dept Comp & Informat Sci, Hsinchu 30050, Taiwan
Keywords
audio classification; spectrogram; Bayesian decision function; multivariable Gaussian distribution
DOI
10.1142/S0218001405003958
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Existing audio retrieval systems fall into one of two categories: single-domain systems that accept data of only a single type (e.g. speech), and multiple-domain systems that offer content-based retrieval for multiple types of audio data. Since a single-domain system has limited applications, a multiple-domain system is more useful. However, different types of audio data have different properties, which makes a multiple-domain system harder to develop. If audio information can be classified in advance, this problem can be solved. In this paper, we propose a real-time classification method that classifies audio signals into several basic audio types such as pure speech, music, song, speech with music background, and speech with environmental noise background. To make the proposed method robust for a variety of audio sources, we use a Bayesian decision function for the multivariable Gaussian distribution instead of manually adjusting a threshold for each discriminator. The proposed approach can be applied to content-based audio/video retrieval. In the experiments, the efficiency and effectiveness of the method are shown by an accuracy rate of more than 96% for general audio data classification.
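The idea of replacing hand-tuned thresholds with a Bayesian decision function over Gaussian class models can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the features, so the two-dimensional feature vectors, the class labels, the equal priors, and the helper names `fit_gaussian`, `discriminant`, and `classify` are all assumptions made for illustration. Each class is modelled by a mean vector and covariance matrix estimated from training data, and a sample is assigned to the class with the largest log-posterior score.

```python
import numpy as np


def fit_gaussian(features):
    """Estimate the mean vector and covariance matrix from rows of feature vectors."""
    features = np.asarray(features, dtype=float)
    mean = features.mean(axis=0)
    cov = np.cov(features, rowvar=False)  # columns are feature dimensions
    return mean, cov


def discriminant(x, mean, cov, prior):
    """Return ln p(x | class) + ln P(class) for a multivariate Gaussian class model."""
    x = np.asarray(x, dtype=float)
    diff = x - mean
    d = mean.shape[0]
    _, logdet = np.linalg.slogdet(cov)          # log-determinant, numerically stable
    maha = diff @ np.linalg.solve(cov, diff)    # squared Mahalanobis distance
    return -0.5 * maha - 0.5 * logdet - 0.5 * d * np.log(2 * np.pi) + np.log(prior)


def classify(x, models):
    """Pick the label whose discriminant is largest; models maps label -> (mean, cov, prior)."""
    return max(models, key=lambda label: discriminant(x, *models[label]))


# Hypothetical toy data standing in for real audio features (an assumption).
rng = np.random.default_rng(0)
speech = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(200, 2))
music = rng.normal(loc=[5.0, 5.0], scale=1.0, size=(200, 2))

models = {
    "speech": (*fit_gaussian(speech), 0.5),
    "music": (*fit_gaussian(music), 0.5),
}

label = classify([0.2, -0.1], models)
```

Because the decision boundary follows from the estimated means, covariances, and priors, adding a new audio source only requires re-estimating those parameters rather than retuning a threshold per discriminator.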
Pages: 63-78
Number of pages: 16