UTD-CRSS Systems for 2016 NIST Speaker Recognition Evaluation

Cited by: 4

Authors:

Zhang, Chunlei [1]

Bahmaninezhad, Fahimeh [1]

Ranjan, Shivesh [1]

Yu, Chengzhu [1]

Shokouhi, Navid [1]

Hansen, John H. L. [1]

Affiliation:

[1] Univ Texas Dallas, CRSS, Erik Jonsson Sch Engn, Richardson, TX 75080 USA

Source:

18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION | 2017

Keywords:

NIST SRE; speaker recognition; domain mismatch; i-vector; speaker clustering

DOI:

10.21437/Interspeech.2017-555

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

This study describes systems submitted by the Center for Robust Speech Systems (CRSS) from the University of Texas at Dallas (UTD) to the 2016 National Institute of Standards and Technology (NIST) Speaker Recognition Evaluation (SRE). We developed four UBM and DNN i-vector based speaker recognition systems with alternate data sets and feature representations. Given that the emphasis of the NIST SRE 2016 is on language mismatch between training and enrollment/test data (so-called domain mismatch), in our system development we focused on: (i) utilizing unlabeled in-domain data for centralizing i-vectors to alleviate the domain mismatch; (ii) selecting the proper data sets and optimizing configurations for training LDA/PLDA; (iii) introducing a newly proposed dimension reduction technique which incorporates unlabeled in-domain data before PLDA training; (iv) unsupervised speaker clustering of unlabeled data and using them alone or with previous SREs for PLDA training; and finally (v) score calibration using unlabeled data with "pseudo" speaker labels generated from speaker clustering. NIST evaluations show that our proposed methods were very successful for the given task.
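
As an illustration of item (i) only, the following is a minimal sketch of centering i-vectors with a mean estimated from unlabeled in-domain data. It is not the authors' implementation: the function names, the toy 2-dimensional vectors, and the length-normalization step (a common companion to centering that the abstract does not mention) are all assumptions made for this example.

```python
import math
from typing import List, Sequence

Vector = List[float]


def mean_vector(vectors: Sequence[Sequence[float]]) -> Vector:
    """Average a set of i-vectors dimension by dimension."""
    if not vectors:
        raise ValueError("need at least one vector")
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


def center_and_normalize(vec: Sequence[float], mean: Sequence[float]) -> Vector:
    """Subtract the in-domain mean, then scale to unit length.

    The unit-length scaling is an assumption of this sketch, not a step
    stated in the abstract.
    """
    centered = [x - m for x, m in zip(vec, mean)]
    norm = math.sqrt(sum(x * x for x in centered))
    return [x / norm for x in centered] if norm > 0 else centered


# Toy stand-ins for i-vectors extracted from unlabeled in-domain audio.
unlabeled_in_domain = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
in_domain_mean = mean_vector(unlabeled_in_domain)

# An enrollment or test i-vector is shifted by the in-domain mean rather
# than by a mean computed on out-of-domain training data.
test_ivector = center_and_normalize([6.0, 8.0], in_domain_mean)
```

The point of the sketch is that the mean requires no speaker labels, so it can be estimated from the unlabeled in-domain set before any LDA/PLDA processing is applied.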


Pages: 1343-1347

Page count: 5
