Model complexity selection and cross-validation EM training for robust speaker diarization

Cited by: 0
Authors
Anguera, Xavier [1, 2]
Shinozaki, Takahiro [3, 4]
Wooters, Chuck
Hernando, Javier [2 ]
Affiliations
[1] Int Comp Sci Inst, Berkeley, CA 94704 USA
[2] Tech Univ Catalonia UPC, Barcelona 08034, Spain
[3] Univ Washington, Dept Elect Engn, Seattle, WA 98195 USA
[4] Kyoto Univ, Kyoto 6068501, Japan
Keywords
speaker diarization; speaker segmentation and clustering; complexity selection; cross-validation EM training;
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Accurate modeling of speaker clusters is important in the task of speaker diarization. Creating accurate models involves both selecting the model complexity and training optimally given the data. Using models of fixed complexity trained with the standard EM algorithm poses a risk of overfitting, which can reduce diarization performance. In this paper, a technique proposed by the author to estimate the complexity of a model is combined with a novel training algorithm called "Cross-Validation EM" to control the number of training iterations. This combination leads to more robust speaker modeling and results in an increase in speaker diarization performance. Tests on the NIST RT (MDM) datasets for meetings show a relative improvement of 10.6% on the test set.
Pages: 273 / +
Page count: 2
Related papers
50 in total
  • [1] Cross-validation EM training for robust parameter estimation
    Shinozaki, T.
    Ostendorf, M.
    [J]. 2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007, : 437 - +
  • [2] Cross-validation and aggregated EM training for robust parameter estimation
    Shinozaki, Takahiro
    Ostendorf, Mari
    [J]. COMPUTER SPEECH AND LANGUAGE, 2008, 22 (02): : 185 - 195
  • [3] Robust linear model selection by cross-validation
    Ronchetti, E
    Field, C
    Blanchard, W
    [J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1997, 92 (439) : 1017 - 1023
  • [4] Automatic cluster complexity and quantity selection: Towards robust speaker diarization
    Anguera, Xavier
    Wooters, Chuck
    Hernando, Javier
    [J]. MACHINE LEARNING FOR MULTIMODAL INTERACTION, 2006, 4299 : 248 - +
  • [5] Improving speaker diarization by cross EM refinement
    Ning, Huazhong
    Xu, Wei
    Gong, Yihong
    Huang, Thomas
    [J]. 2006 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO - ICME 2006, VOLS 1-5, PROCEEDINGS, 2006, : 1901 - 1904
  • [6] Linear model selection by cross-validation
    Rao, CR
    Wu, Y
    [J]. JOURNAL OF STATISTICAL PLANNING AND INFERENCE, 2005, 128 (01) : 231 - 240
  • [7] A survey of cross-validation procedures for model selection
    Arlot, Sylvain
    Celisse, Alain
    [J]. STATISTICS SURVEYS, 2010, 4 : 40 - 79
  • [8] MODEL-STRUCTURE SELECTION BY CROSS-VALIDATION
    STOICA, P
    EYKHOFF, P
    JANSSEN, P
    SODERSTROM, T
    [J]. INTERNATIONAL JOURNAL OF CONTROL, 1986, 43 (06) : 1841 - 1878
  • [9] On Estimating Model in Feature Selection With Cross-Validation
    Qi, Chunxia
    Diao, Jiandong
    Qiu, Like
    [J]. IEEE ACCESS, 2019, 7 : 33454 - 33463
  • [10] MODEL SELECTION VIA MULTIFOLD CROSS-VALIDATION
    ZHANG, P
    [J]. ANNALS OF STATISTICS, 1993, 21 (01): : 299 - 313