A REAL-TIME SPEAKER DIARIZATION SYSTEM BASED ON SPATIAL SPECTRUM

Cited by: 13
Authors
Zheng, Siqi [1]
Huang, Weilong [1]
Wang, Xianliang [1]
Suo, Hongbin [1]
Feng, Jinwei [1]
Yan, Zhijie [1]
Affiliations
[1] Alibaba Group, Speech Lab, Hangzhou, Zhejiang, People's Republic of China
Keywords
Speaker diarization; speaker localization; microphone array; SPEECH
DOI
10.1109/ICASSP39728.2021.9413544
Chinese Library Classification number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
In this paper, we describe a speaker diarization system that enables localization and identification of all speakers present in a conversation or meeting. We propose a novel systematic approach to tackle several long-standing challenges in speaker diarization tasks: (1) to segment and separate overlapping speech from two speakers; (2) to estimate the number of speakers when participants may enter or leave the conversation at any time; (3) to provide accurate speaker identification on short text-independent utterances; (4) to track speakers' movement during the conversation; (5) to detect speaker changes in real time. First, a differential directional microphone array-based approach is exploited to capture the target speakers' voices in adverse far-field environments. Second, an online speaker-location joint clustering approach is proposed to keep track of speaker locations. Third, an instant speaker number detector is developed to trigger the mechanism that separates overlapped speech. The results suggest that our system effectively incorporates spatial information and achieves significant gains.
Pages: 7208-7212
Number of pages: 5