Semi-supervised Single-Channel Speech-Music Separation for Automatic Speech Recognition

Authors
Demir, Cemil [1, 3]
Cemgil, A. Taylan [2]
Saraclar, Murat [3]
Affiliations
[1] TUBITAK BILGEM, Kocaeli, Turkey
[2] Bogazici Univ, Dept Comp Engn, Istanbul, Turkey
[3] Bogazici Univ, Dept Elect & Elect Engn, Istanbul, Turkey
Keywords
speech-music separation; semi-supervised; speech recognition
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this study, we propose a semi-supervised speech-music separation method that uses the speech, music, and speech-music segments of a segmented audio signal to separate the speech and music signals in the mixed speech-music segments. We assume that the background music of the mixed signal is partially composed of repetitions of the music segment in the audio, and therefore represent the music signal with a mixture model. The speech signal is modeled using a Non-negative Matrix Factorization (NMF) model. The prior model of the NMF template matrix is estimated from the speech segment and updated using the mixed segment. The separation performance of the proposed method is evaluated on an automatic speech recognition task.
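The general idea of NMF-based separation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the mixture model for music and the prior on the speech template matrix are replaced here by two plain NMF dictionaries that are learned from the speech-only and music-only segments and then held fixed. The function names (`nmf_kl`, `learn_templates`, `separate`), the KL-divergence cost, the dictionary sizes, and the random stand-in spectrograms are all assumptions made for illustration.

```python
import numpy as np


def nmf_kl(V, W, H, n_iter=200, update_W=True, eps=1e-12):
    """Multiplicative updates for KL-divergence NMF, V ~= W @ H.

    W and H are updated in place. With update_W=False only the
    activations H are fitted and the templates W stay fixed.
    """
    for _ in range(n_iter):
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + eps)
        if update_W:
            WH = W @ H + eps
            W *= ((V / WH) @ H.T) / (H.sum(axis=1)[None, :] + eps)
    return W, H


def learn_templates(V, k, n_iter=200, seed=0):
    """Learn k spectral templates from a single-source magnitude spectrogram."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], k)) + 0.1
    H = rng.random((k, V.shape[1])) + 0.1
    W, _ = nmf_kl(V, W, H, n_iter)
    return W


def separate(V_mix, W_speech, W_music, n_iter=200, seed=0):
    """Split a mixed spectrogram using fixed speech and music templates."""
    rng = np.random.default_rng(seed)
    W = np.hstack([W_speech, W_music])
    H = rng.random((W.shape[1], V_mix.shape[1])) + 0.1
    _, H = nmf_kl(V_mix, W, H, n_iter, update_W=False)
    k = W_speech.shape[1]
    S = W_speech @ H[:k]   # speech part of the reconstruction
    M = W_music @ H[k:]    # music part of the reconstruction
    mask = S / (S + M + 1e-12)  # soft (Wiener-like) mask in [0, 1]
    return mask * V_mix, (1.0 - mask) * V_mix


if __name__ == "__main__":
    # Random non-negative matrices stand in for magnitude spectrograms
    # (frequency bins x frames); real use would start from an STFT.
    rng = np.random.default_rng(42)
    V_speech = rng.random((64, 100))
    V_music = rng.random((64, 100))
    V_mix = rng.random((64, 50))
    W_s = learn_templates(V_speech, k=8)
    W_m = learn_templates(V_music, k=8, seed=1)
    speech_est, music_est = separate(V_mix, W_s, W_m)
    print(speech_est.shape, music_est.shape)
```

Because the two estimates come from a soft mask, they are non-negative and sum to the mixed spectrogram. The method in the abstract differs in that the speech templates are also updated using the mixed segment rather than held fixed.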
Pages: 688 / +
Page count: 2