Text-independent speaker identification using Gaussian mixture models based on multi-space probability distribution

Cited by: 0
Authors
Miyajima, C [1]
Hattori, Y
Tokuda, K
Masuko, T
Kobayashi, T
Kitamura, T
Affiliations
[1] Nagoya Inst Technol, Dept Comp Sci, Nagoya, Aichi 4668555, Japan
[2] Tokyo Inst Technol, Interdisciplinary Grad Sch Sci & Engn, Dept Informat Proc, Yokohama, Kanagawa 2268502, Japan
Keywords
speaker identification; pitch; multi-space probability distribution; Gaussian mixture model; minimum classification error;
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper presents a new approach to modeling speech spectra and pitch for text-independent speaker identification using Gaussian mixture models based on multi-space probability distribution (MSD-GMM). The MSD-GMM models continuous pitch values of voiced frames and discrete symbols for unvoiced frames in a unified framework. Spectral and pitch features are jointly modeled by a two-stream MSD-GMM. We derive maximum likelihood (ML) estimation formulae and a minimum classification error (MCE) training procedure for the MSD-GMM parameters. The MSD-GMM speaker models are evaluated on text-independent speaker identification tasks. The experimental results show that the MSD-GMM can efficiently model the spectral and pitch features of each speaker and outperforms conventional speaker models. The results also demonstrate the utility of MCE training of the MSD-GMM parameters and the robustness of the approach to inter-session variability.
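The idea of handling voiced and unvoiced frames in one model can be illustrated with a minimal sketch of the pitch stream only. It is not the authors' implementation. It assumes a one-dimensional space of Gaussian components for voiced log-F0 values and a single zero-dimensional space for unvoiced frames, with all space weights summing to one. The function names, the dictionary layout, the speaker labels, and every numeric value are hypothetical, and the spectral stream and the ML and MCE training steps are left out.

```python
import math

def log_gauss(x, mean, var):
    """Log density of a one-dimensional Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def frame_loglik(f0, model):
    """Log-likelihood of one pitch observation; f0 is None for an unvoiced frame."""
    if f0 is None:
        # Unvoiced frame: only the weight of the zero-dimensional space contributes.
        return math.log(model["unvoiced_weight"])
    # Voiced frame: weighted sum of the Gaussians in the one-dimensional space,
    # computed with log-sum-exp for numerical stability.
    terms = [math.log(w) + log_gauss(f0, m, v)
             for w, m, v in zip(model["weights"], model["means"], model["vars"])]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

def utterance_loglik(frames, model):
    """Sum of frame log-likelihoods, treating frames as independent."""
    return sum(frame_loglik(f0, model) for f0 in frames)

def identify(frames, models):
    """Return the name of the speaker model with the highest log-likelihood."""
    return max(models, key=lambda name: utterance_loglik(frames, models[name]))

# Hypothetical toy models over log-F0; the voiced weights plus the
# unvoiced weight sum to one in each model.
models = {
    "speaker_a": {"weights": [0.4, 0.3], "means": [4.7, 4.9],
                  "vars": [0.01, 0.01], "unvoiced_weight": 0.3},
    "speaker_b": {"weights": [0.35, 0.35], "means": [5.2, 5.4],
                  "vars": [0.01, 0.01], "unvoiced_weight": 0.3},
}

# A short utterance: three voiced frames and one unvoiced frame.
frames = [4.8, None, 4.9, 4.7]
print(identify(frames, models))
```

In this sketch an unvoiced frame contributes the same amount to both speakers because their unvoiced weights are equal, so the decision rests on the voiced frames. With different unvoiced weights, the voicing pattern itself would also carry speaker information.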
Pages: 847-855
Number of pages: 9
Related papers
  • [1] Speaker identification using Gaussian mixture models based on multi-space probability distribution
    Miyajima, C
    Hattori, Y
    Tokuda, K
    Masuko, T
    Kobayashi, T
    Kitamura, T
    [J]. 2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS, 2001, : 433 - 436
  • [2] ROBUST TEXT-INDEPENDENT SPEAKER IDENTIFICATION USING GAUSSIAN MIXTURE SPEAKER MODELS
    REYNOLDS, DA
    ROSE, RC
    [J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1995, 3 (01): : 72 - 83
  • [3] Improved Text-Independent Speaker Identification and Verification with Gaussian Mixture Models
    Chakroun, Rania
    Frikha, Mondher
    [J]. KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, KSEM 2019, PT II, 2019, 11776 : 3 - 10
  • [4] Frame level likelihood normalization for text-independent speaker identification using Gaussian Mixture Models
    Markov, K
    Nakagawa, S
    [J]. ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4, 1996, : 1764 - 1767
  • [5] Dimensionality reduction for text-independent speaker identification using Gaussian Mixture Model
    El-Gamal, MA
    Abu El-Yazeed, MF
    El Ayadi, MMH
    [J]. Proceedings of the 46th IEEE International Midwest Symposium on Circuits & Systems, Vols 1-3, 2003, : 625 - 628
  • [6] Robust Text-independent Speaker recognition with Short Utterances using Gaussian Mixture Models
    Chakroun, Rania
    Frikha, Mondher
    [J]. 2020 16TH INTERNATIONAL WIRELESS COMMUNICATIONS & MOBILE COMPUTING CONFERENCE, IWCMC, 2020, : 2204 - 2209
  • [7] Fused Mel Feature sets based Text-Independent Speaker Identification using Gaussian Mixture Model
    Kumari, R. Shantha Selva
    Nidhyananthan, S. Selva
    Anand, G.
    [J]. INTERNATIONAL CONFERENCE ON COMMUNICATION TECHNOLOGY AND SYSTEM DESIGN 2011, 2012, 30 : 319 - 326
  • [8] Text-Independent Speaker Verification Using Variational Gaussian Mixture Model
    Moattar, Mohammad Hossein
    Homayounpour, Mohammad Mehdi
    [J]. ETRI JOURNAL, 2011, 33 (06) : 914 - 923
  • [9] Self-Organizing Mixture Models for Text-Independent Speaker Identification
    Bouziane, Ayoub
    Kharroubi, Jamal
    Zarghili, Arsalane
    [J]. 2014 THIRD IEEE INTERNATIONAL COLLOQUIUM IN INFORMATION SCIENCE AND TECHNOLOGY (CIST'14), 2014, : 345 - 350
  • [10] A Chain of Gaussian Mixture Model for Text-independent Speaker Recognition
    Chen, Yanxiang
    Liu, Ming
    [J]. ORIENTAL COCOSDA 2009 - INTERNATIONAL CONFERENCE ON SPEECH DATABASE AND ASSESSMENTS, 2009, : 100 - +