Speech and Music Classification Using Hybrid Form of Spectrogram and Fourier Transformation

Authors
Neammalai, Piyawat [1]
Phimoltares, Suphakant [1]
Lursinsap, Chidchanok [1]
Affiliations
[1] Chulalongkorn Univ, Fac Sci, Dept Math & Comp Sci, Adv Virtual & Intelligent Comp AVIC Ctr, Bangkok, Thailand
Keywords
Speech music classification; Spectrogram; Fourier Transform; Audio classification; Segmentation; Features
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
This paper presents a feature extraction technique for classifying audio data as speech or music by combining image processing and signal processing. The method has three main steps. First, the audio data is segmented and each segment is transformed into a spectrogram image, to which image processing methods are applied to find its salient characteristics. Second, the salient spectrogram image is transformed with a 2-dimensional Fourier Transform, and the signal energy at specific frequencies is calculated to form the feature vector. Third, a Support Vector Machine is used as a binary classifier. The method is tested on an audio database of 510 instances, each 1.5 seconds long. The experimental results show that the proposed technique achieves acceptable classification accuracy.
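The first two steps of the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify the image processing operations or which frequencies are used, so the percentile threshold in `salient_image`, the radial frequency bands in `fft2_energy_features`, the frame sizes, and the 16 kHz sampling rate are all assumptions made for the example. Only the 1.5 second segment length comes from the abstract.

```python
import numpy as np

def spectrogram(segment, frame_len=256, hop=128):
    """Magnitude spectrogram from a Hann-windowed short-time Fourier transform."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(segment) - frame_len) // hop
    frames = np.stack([segment[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    # Rows are frequency bins, columns are time frames.
    return np.abs(np.fft.rfft(frames, axis=1)).T

def salient_image(spec, percentile=75.0):
    """Assumed stand-in for the image processing step:
    log-compress the spectrogram and keep only its brightest pixels."""
    log_spec = np.log1p(spec)
    threshold = np.percentile(log_spec, percentile)
    return np.where(log_spec >= threshold, log_spec, 0.0)

def fft2_energy_features(image, n_bands=8):
    """Energy of the 2-D Fourier transform, summed over concentric
    radial frequency bands (the band layout is an assumption)."""
    power = np.abs(np.fft.fftshift(np.fft.fft2(image))) ** 2
    rows, cols = power.shape
    y, x = np.indices(power.shape)
    # Normalized distance of each coefficient from the zero-frequency center.
    radius = np.hypot((y - rows // 2) / rows, (x - cols // 2) / cols)
    edges = np.linspace(0.0, radius.max() + 1e-12, n_bands + 1)
    band = np.digitize(radius, edges[1:-1])  # band index in 0..n_bands-1
    energy = np.bincount(band.ravel(), weights=power.ravel(),
                         minlength=n_bands)
    total = energy.sum()
    return energy / total if total > 0 else energy

# Demo on a synthetic 1.5 s tone (sampling rate is an assumption).
sample_rate = 16000
t = np.arange(int(1.5 * sample_rate)) / sample_rate
tone = np.sin(2 * np.pi * 440.0 * t)

spec = spectrogram(tone)
features = fft2_energy_features(salient_image(spec))
```

Feature vectors of this kind, computed for each labeled segment, would then be passed to a Support Vector Machine for the third step; any standard SVM implementation could fill that role.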
Pages: 6