Multi-speaker Beamforming for Voice Activity Classification

Cited by: 0
Authors
Tran, Thuy N. [1]
Cowley, William [1]
Pollok, Andre [1]
Affiliations
[1] Univ S Australia, Inst Telecommun Res, Adelaide, SA 5001, Australia
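Keywords
DOI
Not available
Chinese Library Classification number
TN [Electronic technology, communication technology];
Discipline classification code
0809;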
Abstract
In a multi-speaker environment, voice activity classification (VAC) attempts to identify the active speaker(s) in each recording period. Using a beamformer-output-ratio (BOR) from a multi-beamforming system, an efficient solution for VAC is available by comparing the calculated BOR with pre-specified thresholds. Considering two speakers, this paper derives theoretical results on BOR statistics, including the probability distribution function and the cumulative distribution function (c.d.f.) of the BOR, under the assumption that the narrow-band signal power in the frequency domain is Gamma distributed. Using the c.d.f. of the BOR, the thresholds for VAC can be calculated automatically via a closed-form expression for given acceptable mis-detection rates. The method is tested with simulated recording setups for a non-reverberant environment and an environment with a 0.3-second reverberation time. Both simulations show high classification accuracy.
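The thresholding idea in the abstract can be illustrated with a minimal sketch. It is not the paper's derivation: the closed-form c.d.f. is not given in this record, so the thresholds below are estimated by Monte Carlo from simulated Gamma-distributed powers. The function names, the three class labels, the independence of the two beamformer output powers, and all shape and scale values are assumptions made for illustration only.

```python
import random


def classify_frame(bor, t_low, t_high):
    """Label one frame from its BOR (hypothetical three-way decision rule)."""
    if bor >= t_high:
        return "speaker_1"
    if bor <= t_low:
        return "speaker_2"
    return "both"


def empirical_threshold(samples, rate):
    """Return the sample at the empirical quantile `rate` (0 <= rate < 1)."""
    ordered = sorted(samples)
    index = min(int(rate * len(ordered)), len(ordered) - 1)
    return ordered[index]


def simulate_bor(n, shape, scale_num, scale_den, rng):
    """Draw n BOR samples as a ratio of two independent Gamma powers.

    Independence and the parameter values are simplifying assumptions,
    not results taken from the paper.
    """
    return [
        rng.gammavariate(shape, scale_num) / rng.gammavariate(shape, scale_den)
        for _ in range(n)
    ]


def estimate_thresholds(mis_rate, rng, n=20000):
    """Estimate (t_low, t_high) for a target mis-detection rate."""
    # Speaker 1 only: beam 1 is assumed to carry about 10x the power of beam 2.
    bor_s1 = simulate_bor(n, shape=2.0, scale_num=10.0, scale_den=1.0, rng=rng)
    # Speaker 2 only: the assumed power ratio is reversed.
    bor_s2 = simulate_bor(n, shape=2.0, scale_num=1.0, scale_den=10.0, rng=rng)
    # A speaker-1-only frame is missed when its BOR falls below t_high.
    t_high = empirical_threshold(bor_s1, mis_rate)
    # A speaker-2-only frame is missed when its BOR rises above t_low.
    t_low = empirical_threshold(bor_s2, 1.0 - mis_rate)
    return t_low, t_high


if __name__ == "__main__":
    rng = random.Random(0)
    t_low, t_high = estimate_thresholds(0.05, rng)
    for bor in (0.05, 1.0, 20.0):
        print(bor, classify_frame(bor, t_low, t_high))
```

With a closed-form c.d.f. such as the one the abstract describes, the two quantile estimates would be replaced by direct evaluation of the inverse c.d.f. at the chosen mis-detection rates, with no simulation needed.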
Pages: 116-121
Number of pages: 6