Semi-supervised Single-Channel Speech-Music Separation for Automatic Speech Recognition

Cited by: 0
Authors
Demir, Cemil [1, 3]
Cemgil, A. Taylan [2]
Saraclar, Murat [3]
Affiliations
[1] TUBITAK BILGEM, Kocaeli, Turkey
[2] Bogazici Univ, Dept Comp Engn, Istanbul, Turkey
[3] Bogazici Univ, Dept Elect & Elect Engn, Istanbul, Turkey
Keywords
speech-music separation; semi-supervised; speech recognition;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this study, we propose a semi-supervised speech-music separation method that uses the speech, music, and speech-music segments of a given segmented audio signal to separate the speech and music signals in the mixed speech-music segments. We assume that the background music of the mixed signal is partially composed of repetitions of the music segment in the audio, and we therefore use a mixture model to represent the music signal. The speech signal is modeled using a Non-negative Matrix Factorization (NMF) model. The prior model of the NMF template matrix is estimated from the speech segment and updated using the mixed segment of the audio. The separation performance of the proposed method is evaluated on an automatic speech recognition task.
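As background for the NMF speech model mentioned in the abstract, the following is a minimal sketch of plain NMF with multiplicative updates for the generalized Kullback-Leibler divergence, applied to a non-negative matrix standing in for a magnitude spectrogram. It is not the authors' method: the prior on the template matrix, the mixture model for music, and the semi-supervised update are not reproduced here. The function names `nmf_kl` and `kl_divergence`, the rank, the iteration count, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def kl_divergence(V, V_hat, eps=1e-10):
    """Generalized KL divergence between non-negative matrices V and V_hat."""
    return float(np.sum(V * np.log((V + eps) / (V_hat + eps)) - V + V_hat))

def nmf_kl(V, rank, n_iter=200, seed=0, eps=1e-10):
    """Factor V (freq x frames) as W @ H with W, H >= 0.

    W plays the role of the template (basis) matrix and H holds the
    per-frame activations. Uses the standard multiplicative updates
    for the generalized KL divergence (hypothetical helper, not from the paper).
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    for _ in range(n_iter):
        # Update activations with the templates held fixed.
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + eps)
        # Update templates with the activations held fixed.
        WH = W @ H + eps
        W *= ((V / WH) @ H.T) / (H.sum(axis=1)[None, :] + eps)
    return W, H

# Synthetic non-negative data of exact rank 4, standing in for a spectrogram.
data_rng = np.random.default_rng(1)
V = data_rng.random((20, 4)) @ data_rng.random((4, 30))

W, H = nmf_kl(V, rank=4)
print("KL divergence after fitting:", kl_divergence(V, W @ H))
```

In a separation setting, templates learned from a clean speech segment would typically be held fixed or used as a prior while the activations are estimated on the mixed segment; the sketch above covers only the basic factorization step.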
Pages: 688 / +
Page count: 2