HIDDEN MARKOV MODEL DIARISATION WITH SPEAKER LOCATION INFORMATION

Cited by: 3
Authors
Wong, Jeremy H. M. [1]
Xiao, Xiong [1]
Gong, Yifan [1]
Affiliation
[1] Microsoft Speech & Language Grp, Singapore, Singapore
Keywords
Speaker location; sound source localisation; hidden Markov model; diarisation; meeting transcription; DIARIZATION
DOI
10.1109/ICASSP39728.2021.9413761
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Speaker diarisation methods often rely on speaker embeddings to cluster together the segments of audio uttered by the same speaker. When the audio is captured using a microphone array, it is possible to estimate the locations from which the sounds originate. This location information may be complementary to the speaker embeddings in the diarisation process. This report proposes to extend the Hidden Markov Model (HMM) clustering method to enable the use of speaker location information. When the speaker location is represented as a discrete posterior distribution over the possible locations from which the sound may have originated, the HMM observation log-likelihood for the speaker location can take the form of a KL-divergence. Experimental results on a Microsoft rich meeting transcription task show that using speaker location information with the proposed HMM modification can yield performance improvements over using speaker embeddings alone.
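The KL-divergence form mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact formula, so the direction of the divergence, the omission of any scaling or constant terms, and the names `location_log_likelihood`, `observed`, `state_dist`, and `eps` are all assumptions made here for illustration. The sketch scores an observed location posterior against a per-speaker location distribution as the negative KL-divergence, so a closer match gives a higher (less negative) score.

```python
import math


def location_log_likelihood(observed, state_dist, eps=1e-12):
    """Return -KL(observed || state_dist) for two discrete distributions.

    Assumption (not from the paper): the location log-likelihood of an HMM
    speaker state is taken to be the negative KL-divergence, up to terms
    that do not depend on the state.
    """
    if len(observed) != len(state_dist):
        raise ValueError("distributions must cover the same locations")
    kl = 0.0
    for p, q in zip(observed, state_dist):
        if p > 0.0:
            # eps guards against log(0) when the state assigns zero mass.
            kl += p * math.log(p / max(q, eps))
    return -kl


# Hypothetical example: three candidate locations, two speaker states.
observed = [0.7, 0.2, 0.1]
speaker_states = {
    "A": [0.8, 0.1, 0.1],
    "B": [0.1, 0.1, 0.8],
}
scores = {
    name: location_log_likelihood(observed, dist)
    for name, dist in speaker_states.items()
}
best_speaker = max(scores, key=scores.get)
```

In a full system, a score of this kind would be combined with the speaker-embedding log-likelihood inside the HMM; how the two terms are weighted is not stated in the abstract and is left out of the sketch.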
Pages: 7158-7162
Page count: 5