Investigation of Different Calibration Methods for Deep Speaker Embedding Based Verification Systems

Authors
Novoselov, Sergey [1]
Lavrentyeva, Galina [1]
Volokhov, Vladimir [1, 2]
Volkova, Marina [1, 2]
Khmelev, Nikita [1, 2]
Akulov, Artem [1, 2]
Affiliations
[1] ITMO Univ, St Petersburg, Russia
[2] STC Ltd, St Petersburg, Russia
Source
Keywords
Speaker verification; Calibration; MagNetO
DOI
10.1007/978-3-031-48309-7_13
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Deep speaker embedding extractors have become the state of the art in speaker verification. However, the calibration of verification scores for such systems often receives little attention. Poor score calibration leads to serious issues, especially under unknown acoustic conditions, even when the speaker verification system is strong in terms of threshold-free metrics. This paper investigates several score calibration methods: a classical approach based on a logistic regression model; the recently presented magnitude estimation network MagNetO, which uses activations from the pooling layer of the trained deep speaker extractor; and a generalization of this approach based on separate scale and offset prediction neural networks. An additional focus of this research is to estimate the impact of score normalization on the calibration performance of the system. The results show no serious problems when in-domain development data are used for calibration tuning. Otherwise, a trade-off arises between good calibration performance and threshold-free system quality. In most cases, adaptive s-norm helps to stabilize score distributions and improve system performance.
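To make two of the techniques named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes that logistic-regression calibration is a single scale and offset applied to the raw score, fitted by plain gradient descent on a balanced trial set, and that adaptive s-norm is the common top-N cohort variant. All function names, parameters, and the synthetic scores are hypothetical illustrations.

```python
import numpy as np


def fit_linear_calibration(scores, labels, lr=0.5, n_iter=3000):
    """Fit scale a and offset b so that sigmoid(a*s + b) models P(target | s).

    Assumption: target and non-target trials are balanced, so a*s + b can be
    read as a log-likelihood ratio without an extra prior correction.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=float)
    a, b = 1.0, 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(a * scores + b)))
        err = p - labels  # gradient of the mean cross-entropy w.r.t. the logit
        a -= lr * np.mean(err * scores)
        b -= lr * np.mean(err)
    return a, b


def calibrate(scores, a, b):
    """Apply the fitted scale and offset to raw verification scores."""
    return a * np.asarray(scores, dtype=float) + b


def adaptive_s_norm(raw, enroll_cohort, test_cohort, top_n=2):
    """Symmetric normalization using only the top_n highest cohort scores.

    enroll_cohort / test_cohort: scores of the enrollment / test embedding
    against a cohort of impostor embeddings (hypothetical inputs).
    """
    e_top = np.sort(np.asarray(enroll_cohort, dtype=float))[-top_n:]
    t_top = np.sort(np.asarray(test_cohort, dtype=float))[-top_n:]
    z_enroll = (raw - e_top.mean()) / e_top.std()
    z_test = (raw - t_top.mean()) / t_top.std()
    return 0.5 * (z_enroll + z_test)


# Synthetic demo: Gaussian target and non-target score distributions.
rng = np.random.default_rng(0)
demo_scores = np.concatenate([rng.normal(2.0, 1.0, 5000), rng.normal(0.0, 1.0, 5000)])
demo_labels = np.concatenate([np.ones(5000), np.zeros(5000)])
demo_a, demo_b = fit_linear_calibration(demo_scores, demo_labels)
print("scale:", demo_a, "offset:", demo_b)
```

In practice a library solver (for example, a logistic regression routine with prior weighting) would likely replace the hand-written gradient loop; the loop is kept here only to show that the classical calibration reduces to learning one scale and one offset.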
Pages: 159-168
Page count: 10