A tutorial on text-independent speaker verification

被引:445
|
作者
Bimbot, F [1 ]
Bonastre, JF
Fredouille, C
Gravier, G
Magrin-Chagnolleau, I
Meignier, S
Merlin, T
Ortega-García, J
Petrovska-Delacrétaz, D
Reynolds, DA
机构
[1] IRISA, INRIA, F-35042 Rennes, France
[2] CNRS, F-35042 Rennes, France
[3] Univ Avignon, LIA, F-84911 Avignon 9, France
[4] CNRS, Lab Dynam Langage, F-69369 Lyon 07, France
[5] Univ Politecn Madrid, ATVS, E-28040 Madrid, Spain
[6] Univ Fribourg, Dept Informat, DIVA Lab, CH-1700 Fribourg, Switzerland
[7] MIT, Lincoln Lab, Cambridge, MA 02420 USA
关键词
speaker verification; text-independent; cepstral analysis; Gaussian mixture modeling;
D O I
10.1155/S1110865704310024
中图分类号
TM [电工技术]; TN [电子技术、通信技术];
学科分类号
0808 ; 0809 ;
摘要
This paper presents an overview of a state-of-the-art text-independent speaker verification system. First, an introduction proposes a modular scheme of the training and test phases of a speaker verification system. Then, the most commonly speech parameterization used in speaker verification, namely, cepstral analysis, is detailed. Gaussian mixture modeling, which is the speaker modeling technique used in most systems, is then explained. A few speaker modeling alternatives, namely, neural networks and support vector machines, are mentioned. Normalization of scores is then explained, as this is a very important step to deal with real-world data. The evaluation of a speaker verification system is then detailed, and the detection error trade-off (DET) curve is explained. Several extensions of speaker verification are then enumerated, including speaker tracking and segmentation by speakers. Then, some applications of speaker verification are proposed, including on-site applications, remote applications, applications relative to structuring audio information, and games. Issues concerning the forensic area are then recalled, as we believe it is very important to inform people about the actual performance and limitations of speaker verification systems. This paper concludes by giving a few research trends in speaker verification for the next couple of years.
引用
收藏
页码:430 / 451
页数:22
相关论文
共 50 条
  • [41] Maximum Likelihood Discriminant Feature for Text-Independent Speaker Verification
    Liu, Qingsong
    Dai, Beiqian
    PROCEEDINGS OF THE 2009 2ND INTERNATIONAL CONGRESS ON IMAGE AND SIGNAL PROCESSING, VOLS 1-9, 2009, : 3733 - 3736
  • [42] Text-independent speaker verification: The WCL-1 system
    Ganchev, T
    Fakotakis, N
    Kokkinakis, G
    TEXT, SPEECH AND DIALOGUE, PROCEEDINGS, 2003, 2807 : 263 - 268
  • [43] Text-independent speaker verification using predictive neural networks
    Finan, RA
    Sapeluk, AT
    Damper, RI
    FIFTH INTERNATIONAL CONFERENCE ON ARTIFICIAL NEURAL NETWORKS, 1997, (440): : 274 - 279
  • [44] Unsupervised Speaker Adaptation based on the Cosine Similarity for Text-Independent Speaker Verification
    Shum, Stephen
    Dehak, Najim
    Dehak, Reda
    Glass, James R.
    ODYSSEY 2010: THE SPEAKER AND LANGUAGE RECOGNITION WORKSHOP, 2010, : 76 - 82
  • [45] A novel text-independent speaker verification method based on the global speaker model
    Zhang, YY
    Zhang, D
    Zhu, XY
    IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART A-SYSTEMS AND HUMANS, 2000, 30 (05): : 598 - 602
  • [46] SMALL FOOTPRINT TEXT-INDEPENDENT SPEAKER VERIFICATION FOR EMBEDDED SYSTEMS
    Balian, Julien
    Tavarone, Raffaele
    Poumeyrol, Mathieu
    Coucke, Alice
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 6179 - 6183
  • [47] English-Chinese bilingual text-independent speaker verification
    Ma, B
    Meng, H
    2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL V, PROCEEDINGS: DESIGN AND IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS INDUSTRY TECHNOLOGY TRACKS MACHINE LEARNING FOR SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING SIGNAL PROCESSING FOR EDUCATION, 2004, : 293 - 296
  • [48] Score Fusion Methods for Text-Independent Speaker Verification Applications
    Rastoceanu, Florin
    Lazar, Marilena
    2011 6TH CONFERENCE ON SPEECH TECHNOLOGY AND HUMAN-COMPUTER DIALOGUE (SPED), 2011,
  • [49] Text-independent speaker verification using Support Vector Machines
    Kharroubi, J
    Chollet, G
    2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS: VOL I: SPEECH PROCESSING 1; VOL II: SPEECH PROCESSING 2 IND TECHNOL TRACK DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS NEURALNETWORKS FOR SIGNAL PROCESSING; VOL III: IMAGE & MULTIDIMENSIONAL SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING - VOL IV: SIGNAL PROCESSING FOR COMMUNICATIONS; VOL V: SIGNAL PROCESSING EDUCATION SENSOR ARRAY & MULTICHANNEL SIGNAL PROCESSING AUDIO & ELECTROACOUSTICS; VOL VI: SIGNAL PROCESSING THEORY & METHODS STUDENT FORUM, 2001, : 4017 - 4017
  • [50] Searching through a speech memory for text-independent speaker verification
    Petrovska-Delacrétaz, D
    El Hannani, A
    Chollet, G
    AUDIO-BASED AND VIDEO-BASED BIOMETRIC PERSON AUTHENTICATION, PROCEEDINGS, 2003, 2688 : 95 - 103