Multichannel Loss Function for Supervised Speech Source Separation by Mask-based Beamforming

Cited by: 3
Authors
Masuyama, Yoshiki [1 ,2 ]
Togami, Masahito [2 ]
Komatsu, Tatsuya [2 ]
Affiliations
[1] Waseda Univ, Dept Intermedia Art & Sci, Tokyo, Japan
[2] LINE Corp, Tokyo, Japan
Source
Keywords
Speaker-independent multi-talker separation; neural beamformer; multichannel Itakura-Saito divergence
DOI
10.21437/Interspeech.2019-1289
Chinese Library Classification number
R36 [Pathology]; R76 [Otorhinolaryngology];
Discipline classification code
100104 ; 100213 ;
Abstract
In this paper, we propose two mask-based beamforming methods using a deep neural network (DNN) trained with multichannel loss functions. Beamforming techniques using time-frequency (TF) masks estimated by a DNN have been applied in many applications, where the TF masks are used to estimate spatial covariance matrices. To train a DNN for mask-based beamforming, loss functions designed for monaural speech enhancement/separation have typically been employed. Although such a training criterion is simple, it does not directly correspond to the performance of mask-based beamforming. To overcome this problem, we use multichannel loss functions that evaluate the estimated spatial covariance matrices based on the multichannel Itakura-Saito divergence. DNNs trained with the multichannel loss functions can be used to construct several beamformers. Experimental results confirmed their effectiveness and their robustness to microphone configurations.
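The abstract's two ingredients, mask-weighted spatial covariance matrices and a multichannel Itakura-Saito divergence between them, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, array shapes, mask normalization, and diagonal loading below are assumptions made for illustration, and the divergence uses the standard form tr(R R̂⁻¹) − log det(R R̂⁻¹) − M, which may differ from the exact loss defined in the paper.

```python
import numpy as np

def masked_scm(X, mask, eps=1e-8):
    """Mask-weighted spatial covariance matrix per frequency bin.

    X:    complex STFT of the mixture, shape (F, T, M)  [assumed layout]
    mask: TF mask with values in [0, 1], shape (F, T)
    Returns an array of shape (F, M, M).
    """
    # Sum over frames of mask(f, t) * x(f, t) x(f, t)^H
    num = np.einsum("ft,ftm,ftn->fmn", mask, X, X.conj())
    # Normalize by the total mask weight in each frequency bin
    den = mask.sum(axis=1)[:, None, None] + eps
    return num / den

def multichannel_is_divergence(R, R_hat, eps=1e-6):
    """Standard multichannel Itakura-Saito divergence per frequency bin.

    R, R_hat: Hermitian positive (semi)definite matrices, shape (F, M, M).
    Returns a real array of shape (F,).
    """
    M = R.shape[-1]
    load = eps * np.eye(M)  # diagonal loading for invertibility (assumption)
    A = (R + load) @ np.linalg.inv(R_hat + load)
    trace = np.trace(A, axis1=-2, axis2=-1).real
    _, logdet = np.linalg.slogdet(A)
    return trace - logdet - M
```

In a training setup along the lines the abstract describes, `R` would come from a reference (oracle) mask or the clean source, `R_hat` from the DNN-estimated mask, and the divergence would be averaged over frequency bins; an automatic-differentiation framework would replace NumPy in practice.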
Pages: 2708-2712
Number of pages: 5