Robust speaker diarization in a multi-speaker environment using autocorrelation-based noise subtraction

Authors
Mirrezaie, S. M. [1]
Ahadi, S. M. [1]
Kashi, A. [1]
Affiliations
[1] Amir Kabir Univ Technol, Dept Elect Engn, Tehran 15914, Iran
Keywords
robust speaker diarization; speaker segmentation and clustering; meetings indexing; noisy speech
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper presents research on speaker diarization in multi-speaker environments. It examines the algorithms and the implementation of an off-line speaker segmentation and indexing system for recorded speech data in which more than one speaker is usually present. Speaker diarization is the task of assigning the same unique label to all speech segments in an audio stream that are spoken by the same speaker. It is a well-studied topic in the domain of broadcast news recordings. Most of the proposed systems involve hierarchical clustering of the data, where the number of speakers and their identities are not known a priori. There are two key challenges: processing speed and robustness in the presence of noise. In this paper we address the robustness issue by using a method that has already proved successful in speech recognition applications. Using ANS (Autocorrelation-Based Noise Subtraction) for robust genetic algorithm-based speaker diarization, we compare the results with those of the baseline MFCC-based system in clean and noisy conditions.
Pages: 962-967
Page count: 6