Randomization Effect on Iterative-Based Speaker Diarization System for Telephone Conversations

Authors
Furmanov, Tal [1]
Aminov, Lidiya [2]
Moyal, Ami [2]
Lapidot, Itshak [2]
Affiliations
[1] Appl Mat Inc, Rehovot, Israel
[2] Afeka Tel Aviv Acad Coll Engn, ACLP Afeka Ctr Language Proc, Tel Aviv, Israel
Keywords
hidden-distortion model (HDM); self-organizing maps (SOM); K-means; initialization; speaker diarization;
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
The primary objective of a speaker diarization system is to assign each speech segment to one of the K speakers in the conversation. We use a hidden-distortion-model (HDM)-based system. HDM allows different emission models to be used as speaker models. We investigate the effect of randomization at two levels: stochastic versus deterministic training, and random model initialization versus preserving the initialization from the previous iteration. The emission models were codebooks (CBs) trained using the K-means algorithm, in both batch and stochastic versions, as well as a self-organizing map (SOM) in its stochastic version. The evaluation was performed on 108 telephone conversations from the LDC CallHome corpus. We show that randomization always outperforms deterministic training. Stochastic training demonstrated a relative improvement of 3.5%. Random initialization achieved a relative improvement of 7.28% compared to preserving the initialization from the previous iteration.
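The difference between batch and stochastic codebook training, and between random and preserved initialization, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the squared Euclidean distortion, the `1 / count` step size, and the toy usage are all assumptions made for illustration, and the HDM segmentation step itself is not modeled.

```python
import random


def nearest(codebook, x):
    """Return the index of the codeword closest to x (squared Euclidean)."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((c - v) ** 2 for c, v in zip(codebook[k], x)),
    )


def batch_kmeans(data, codebook, n_iters=10):
    """Deterministic batch K-means: move each codeword to the mean of its cell."""
    codebook = [list(c) for c in codebook]
    for _ in range(n_iters):
        cells = [[] for _ in codebook]
        for x in data:
            cells[nearest(codebook, x)].append(x)
        for k, cell in enumerate(cells):
            if cell:  # an empty cell keeps its previous codeword
                codebook[k] = [sum(col) / len(cell) for col in zip(*cell)]
    return codebook


def stochastic_kmeans(data, codebook, n_epochs=10, seed=0):
    """Stochastic (online) K-means: visit samples in random order and
    nudge only the winning codeword toward each sample."""
    rng = random.Random(seed)
    codebook = [list(c) for c in codebook]
    counts = [0] * len(codebook)
    for _ in range(n_epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            k = nearest(codebook, x)
            counts[k] += 1
            step = 1.0 / counts[k]  # assumed decaying step size
            codebook[k] = [c + step * (v - c) for c, v in zip(codebook[k], x)]
    return codebook


def random_init(data, n_codewords, rng):
    """Random initialization: draw distinct samples as starting codewords."""
    return [list(x) for x in rng.sample(data, n_codewords)]
```

In an iterative diarization loop, "preserving initialization" would correspond to passing the previous iteration's codebook as the `codebook` argument, while "random initialization" would correspond to calling `random_init` again before each retraining.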
Pages: 5