Application of variational Bayesian PCA for speech feature extraction

Cited by: 0
Authors
Kwon, OW [1 ]
Lee, TW [1 ]
Chan, KL [1 ]
Affiliations
[1] Univ Calif San Diego, Inst Neural Computat, La Jolla, CA 92059 USA
Keywords
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
In a standard mel-frequency cepstral coefficient-based speech recognizer, it is common to use the same feature dimension and the same number of Gaussian mixtures for all subunits. We proposed using a different transformation and a different number of mixtures for each subunit. We obtained the transformations from mel-frequency band energies using the variational Bayesian principal component analysis (PCA) method. In this method, the hyperparameters of the Gaussian mixtures and the number of mixtures are learned automatically by maximizing a lower bound on the evidence, rather than the likelihood as in the conventional maximum likelihood paradigm. By analyzing the TIMIT speech data, we revealed intrinsic structures of vowels and consonants. We demonstrated the usefulness of the method for speech recognition by performing phoneme classification of the /b/, /d/ and /g/ phonemes.
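For readers unfamiliar with the term, the "lower bound of the evidence" mentioned in the abstract can be written in its generic textbook form as below. This is a sketch only: the symbols X (observed features), \theta (all latent variables and parameters) and q (the variational distribution) are assumed notation and do not come from this record, and the paper's own bound for the PCA mixture model may be factorized differently.

```latex
% Generic variational lower bound on the log evidence (assumed notation).
\ln p(X)
  = \mathcal{L}(q) + \mathrm{KL}\!\left( q(\theta) \,\|\, p(\theta \mid X) \right)
  \;\ge\; \mathcal{L}(q),
\qquad
\mathcal{L}(q) = \int q(\theta) \, \ln \frac{p(X, \theta)}{q(\theta)} \, d\theta .
```

Because the KL term is non-negative, raising \mathcal{L}(q) tightens the bound on \ln p(X). Maximum likelihood, by contrast, maximizes \ln p(X \mid \theta) over a point estimate of \theta.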
Pages: 825-828
Number of pages: 4
Related papers
50 in total
  • [1] Speech feature analysis using variational Bayesian PCA
    Kwon, OW
    Chan, KL
    Lee, TW
    IEEE SIGNAL PROCESSING LETTERS, 2003, 10 (05) : 137 - 140
  • [2] On the use of kernel PCA for feature extraction in speech recognition
    Lima, A
    Zen, H
    Nankaku, Y
    Miyajima, C
    Tokuda, K
    Kitamura, T
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2004, E87D (12) : 2802 - 2811
  • [3] Variational Bayesian functional PCA
    van der Linde, Angelika
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2008, 53 (02) : 517 - 533
  • [4] Variational Bayesian Inference for Source Separation and Robust Feature Extraction
    Adiloglu, Kamil
    Vincent, Emmanuel
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2016, 24 (10) : 1746 - 1758
  • [5] Speech emotion analysis and recognition based on the PCA feature extraction model
    Ye, Shiping
    International Journal of Applied Mathematics and Statistics, 2013, 51 (22) : 127 - 135
  • [6] Variational Bayesian learning of speech GMMs for feature enhancement based on Algonquin
    Pettersen, Svein G.
    Johnsen, Magne H.
    Wellekens, Christian
    2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007, : 905 - +
  • [7] Proposed combination of PCA and MFCC feature extraction in speech recognition system
    Hoang Trang
    Tran Hoang Loc
    Huynh Bui Hoang Nam
    2014 INTERNATIONAL CONFERENCE ON ADVANCED TECHNOLOGIES FOR COMMUNICATIONS (ATC), 2014, : 697 - 702
  • [8] The application of optimization in feature extraction of speech recognition
    Gu, L
    Liu, RS
    ICSP '96 - 1996 3RD INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, PROCEEDINGS, VOLS I AND II, 1996, : 745 - 748
  • [9] Transfer Learning for Dynamic Feature Extraction Using Variational Bayesian Inference
    Xie, Junyao
    Huang, Biao
    Dubljevic, Stevan
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2022, 34 (11) : 5524 - 5535
  • [10] A GENERAL VARIATIONAL BAYESIAN FRAMEWORK FOR ROBUST FEATURE EXTRACTION IN MULTISOURCE RECORDINGS
    Adiloglu, Kamil
    Vincent, Emmanuel
    2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 273 - 276