Robust speaker diarization in a multi-speaker environment using autocorrelation-based noise subtraction

Authors
Mirrezaie, S. M. [1]
Ahadi, S. M. [1]
Kashi, A. [1]
Affiliations
[1] Amir Kabir Univ Technol, Dept Elect Engn, Tehran 15914, Iran
Keywords
robust speaker diarization; speaker segmentation and clustering; meetings indexing; noisy speech
DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
This paper presents research on speaker diarization in multi-speaker environments. It examines the algorithms and implementation of an off-line speaker segmentation and indexing system for recorded speech data in which more than one speaker is usually present. Speaker diarization is the task of assigning a unique label to all speech segments in an audio stream that were spoken by the same speaker. It is a well-studied topic in the domain of broadcast news recordings. Most of the proposed systems involve hierarchical clustering of the data, where the number of speakers and their identities are known a priori. There are two key challenges: processing speed and robustness in the presence of noise. In this paper we address the robustness issue by using a method that has already proved successful in speech recognition applications. We apply ANS (Autocorrelation-Based Noise Subtraction) to robust genetic algorithm-based speaker diarization and compare the results with the baseline MFCC-based system in clean and noisy conditions.
Pages: 962-967
Number of pages: 6