Independent Deeply Learned Tensor Analysis for Determined Audio Source Separation

Cited by: 0
Authors
Narisawa, Naoki [1]
Ikeshita, Rintaro [2]
Takamune, Norihiro [1]
Kitamura, Daichi [3]
Nakamura, Tomohiko [1]
Saruwatari, Hiroshi [1]
Nakatani, Tomohiro [2]
Affiliations
[1] Univ Tokyo, Grad Sch Informat Sci & Technol, Tokyo, Japan
[2] NTT Corp, NTT Commun Sci Labs, Kyoto, Japan
[3] Kagawa Coll, Natl Inst Technol, Takamatsu, Kagawa, Japan
Keywords
audio source separation; independent component analysis; deep neural networks; inter-frequency correlation; ICA
DOI
10.23919/EUSIPCO54536.2021.9616300
Chinese Library Classification number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
We address the determined audio source separation problem in the time-frequency domain. In independent deeply learned matrix analysis (IDLMA), it is assumed that the inter-frequency correlation of each source spectrum is zero, which is inappropriate for modeling nonstationary signals such as music signals. To account for the correlation between frequencies, independent positive semidefinite tensor analysis has been proposed. This unsupervised (blind) method, however, severely restricts the structure of frequency covariance matrices (FCMs) to reduce the number of model parameters. As an extension of these conventional approaches, we here propose a supervised method that models FCMs using deep neural networks (DNNs). Because it is difficult to infer FCMs directly using DNNs, we also propose a new FCM model represented as a convex combination of a diagonal FCM and a rank-1 FCM. Our FCM model is flexible enough not only to consider inter-frequency correlation, but also to capture the dynamics of time-varying FCMs of nonstationary signals. We infer the proposed FCMs using two DNNs: one for power spectrum estimation and one for time-domain signal estimation. An experiment on separating music signals shows that the proposed method provides higher separation performance than IDLMA.
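The FCM structure named in the abstract, a convex combination of a diagonal matrix and a rank-1 matrix, can be illustrated with a small sketch. The abstract does not give the exact parameterization, so everything below is an assumption made for illustration: the function names `convex_fcm` and `quadratic_form`, the mixing weight `alpha`, the use of a power vector for the diagonal part, and the use of a complex vector `s` for the rank-1 part are all hypothetical and are not taken from the paper. The sketch only shows that such a combination is Hermitian and positive semidefinite while allowing nonzero off-diagonal (inter-frequency) terms.

```python
import random


def convex_fcm(power, s, alpha):
    """Return alpha * diag(power) + (1 - alpha) * s s^H as a nested list.

    power: nonnegative per-frequency powers (assumed diagonal part).
    s:     complex per-frequency vector (assumed rank-1 part).
    alpha: mixing weight in [0, 1] (hypothetical name).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    n = len(power)
    # Rank-1 part: outer product s s^H scaled by (1 - alpha).
    fcm = [[(1.0 - alpha) * s[i] * s[j].conjugate() for j in range(n)]
           for i in range(n)]
    # Diagonal part: add alpha * power on the main diagonal.
    for i in range(n):
        fcm[i][i] += alpha * power[i]
    return fcm


def quadratic_form(fcm, x):
    """Compute x^H R x; real and nonnegative when R is Hermitian PSD."""
    n = len(x)
    return sum(x[i].conjugate() * fcm[i][j] * x[j]
               for i in range(n) for j in range(n))


# Toy data with a fixed seed; values are illustrative only.
rng = random.Random(0)
n_freq = 4
power = [rng.uniform(0.1, 1.0) for _ in range(n_freq)]
s = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n_freq)]
fcm = convex_fcm(power, s, alpha=0.3)
```

With `alpha=1.0` the result reduces to a purely diagonal matrix, which corresponds to the zero inter-frequency correlation assumption the abstract attributes to IDLMA; smaller values of `alpha` let the rank-1 term introduce correlation between frequencies.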
Pages: 326-330
Number of pages: 5
Related papers
50 in total
  • [1] Independent Deeply Learned Matrix Analysis for Determined Audio Source Separation
    Makishima, Naoki
    Mogami, Shinichi
    Takamune, Norihiro
    Kitamura, Daichi
    Sumino, Hayato
    Takamichi, Shinnosuke
    Saruwatari, Hiroshi
    Ono, Nobutaka
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2019, 27 (10) : 1601 - 1615
  • [2] Independent Deeply Learned Matrix Analysis for Multichannel Audio Source Separation
    Mogami, Shinichi
    Sumino, Hayato
    Kitamura, Daichi
    Takamune, Norihiro
    Takamichi, Shinnosuke
    Saruwatari, Hiroshi
    Ono, Nobutaka
    [J]. 2018 26TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2018, : 1557 - 1561
  • [3] Empirical Bayesian Independent Deeply Learned Matrix Analysis For Multichannel Audio Source Separation
    Hasumi, Takuya
    Nakamura, Tomohiko
    Takamune, Norihiro
    Saruwatari, Hiroshi
    Kitamura, Daichi
    Takahashi, Yu
    Kondo, Kazunobu
    [J]. 29TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2021), 2021, : 331 - 335
  • [4] Multichannel Audio Source Separation with Independent Deeply Learned Matrix Analysis Using Product of Source Models
    Hasumi, Takuya
    Nakamura, Tomohiko
    Takamune, Norihiro
    Saruwatari, Hiroshi
    Kitamura, Daichi
    Takahashi, Yu
    Kondo, Kazunobu
    [J]. 2021 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2021, : 1226 - 1233
  • [5] Independent Low-Rank Tensor Analysis for Audio Source Separation
    Yoshii, Kazuyoshi
    Kitamura, Koichi
    Bando, Yoshiaki
    Nakamura, Eita
    Kawahara, Tatsuya
    [J]. 2018 26TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2018, : 1657 - 1661
  • [6] Audio source separation based on independent component analysis
    Makino, S
    Araki, S
    Mukai, R
    Sawada, H
    [J]. 2004 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, VOL 5, PROCEEDINGS, 2004, : 668 - 671
  • [7] CORRELATED TENSOR FACTORIZATION FOR AUDIO SOURCE SEPARATION
    Yoshii, Kazuyoshi
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 731 - 735
  • [8] PoP-IDLMA: Product-of-Prior Independent Deeply Learned Matrix Analysis for Multichannel Music Source Separation
    Hasumi, Takuya
    Nakamura, Tomohiko
    Takamune, Norihiro
    Saruwatari, Hiroshi
    Kitamura, Daichi
    Takahashi, Yu
    Kondo, Kazunobu
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2023, 31 : 2680 - 2694
  • [9] Blind audio source separation based on independent component analysis
    Makino, Shoji
    Sawada, Hiroshi
    Araki, Shoko
    [J]. INDEPENDENT COMPONENT ANALYSIS AND SIGNAL SEPARATION, PROCEEDINGS, 2007, 4666 : 843 - 843
  • [10] Independent Positive Semidefinite Tensor Analysis in Blind Source Separation
    Ikeshita, Rintaro
    [J]. 2018 26TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2018, : 1652 - 1656