Phone Adaptive Training for Speaker Diarization

Cited by: 0
Authors
Bozonnet, Simon [1]
Vipperla, Ravichander [1]
Evans, Nicholas [1]
Affiliations
[1] EURECOM, F-06904 Sophia Antipolis, France
Keywords
Speaker Diarization; Phone Adaptive Training; Speaker Discrimination
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The linguistic content of a speech signal is a source of unwanted variation which can degrade speaker diarization performance. This paper presents our latest work to reduce its impact. The new approach, referred to as Phone Adaptive Training (PAT), is analogous to speaker adaptive training used in automatic speech recognition. We report an oracle experiment which shows that PAT has the potential to deliver a 33% relative improvement in the diarization error rate over our baseline system. Practical experiments show significant improvements across two standard, independent evaluation datasets.
Pages: 494 - 497
Page count: 4
Related papers
50 in total
  • [31] Speaker-Corrupted Embeddings for Online Speaker Diarization
    Ghahabi, Omid
    Fischer, Volker
    INTERSPEECH 2019, 2019 : 386 - 390
  • [32] Online Neural Speaker Diarization With Target Speaker Tracking
    Wang, Weiqing
    Li, Ming
    IEEE/ACM Transactions on Audio Speech and Language Processing, 2024, 32 : 5078 - 5091
  • [33] Speaker Diarization and Linking of Meeting Data
    Ferras, Marc
    Madikeri, Srikanth
    Bourlard, Herve
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2016, 24 (11) : 1935 - 1945
  • [34] Speaker Diarization Using Gesture and Speech
    Gebre, Binyam Gebrekidan
    Wittenburg, Peter
    Drude, Sebastian
    Huijbregts, Marijn
    Heskes, Tom
    15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4, 2014 : 582 - 586
  • [35] Group Delay Functions for Speaker Diarization
    Yadav, Mohit
    Sao, Anil Kumar
    Dileep, A. D.
    Rajan, Padmanabhan
    2016 TWENTY SECOND NATIONAL CONFERENCE ON COMMUNICATION (NCC), 2016
  • [36] Iterative PLDA Adaptation for Speaker Diarization
    Le Lan, Gael
    Charlet, Delphine
    Larcher, Anthony
    Meignier, Sylvain
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016 : 2175 - 2179
  • [37] Multistage speaker diarization of broadcast news
    Barras, Claude
    Zhu, Xuan
    Meignier, Sylvain
    Gauvain, Jean-Luc
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2006, 14 (05) : 1505 - 1512
  • [38] An overview of automatic speaker diarization systems
    Tranter, Sue E.
    Reynolds, Douglas A.
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2006, 14 (05) : 1557 - 1565
  • [39] Speaker Diarization using Embedding Vectors
    Toruk, Mesut
    Bilgin, Gokhan
    Serbes, Ahmet
    2020 28TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2020
  • [40] Emotional Adaptive Training for Speaker Verification
    Bie, Fanhu
    Wang, Dong
    Zheng, Thomas Fang
    Tejedor, Javier
    Chen, Ruxin
    2013 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2013