Usable speech detection using a context dependent Gaussian Mixture Model classifier

Cited by: 0
Authors
Yantorno, RE [1]
Smolenski, BY [1]
Iyer, AN [1]
Shah, JK [1]
Affiliation
[1] Temple Univ, ECE Dept, Philadelphia, PA 19122 USA
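Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract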
Speech that is corrupted by nonstationary interference but still contains segments usable for applications such as speaker identification or speech recognition is referred to as "usable" speech. A common example of nonstationary interference is co-channel speech, which occurs when more than one person is talking at the same time. In general, these speech processing applications do not work in co-channel environments; however, they can work on the extracted usable segments. Unfortunately, currently available usable speech measures detect only about 75% of the total available usable speech. The first reason for this high error rate is that no single feature can accurately identify all the usable speech characteristics. This can be addressed by using a Gaussian Mixture Model (GMM) based classifier to combine several usable speech features. A second source of error is that current usable speech measures treat each frame of co-channel data independently of the decisions made on adjacent frames. This can be addressed by using a Hidden Markov Model (HMM) to incorporate context-dependent information from adjacent frames. Using this approach, we were able to obtain 84% detection of usable speech with a 16% false alarm rate.
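The second idea in the abstract, letting decisions on adjacent frames inform one another, can be illustrated with a two-state Viterbi decoder. The sketch below is not the authors' implementation: it assumes that per-frame log-likelihoods for an "unusable" and a "usable" class have already been computed (for example by one GMM per class), and the function names, the `stay_prob` value, and the example numbers are all hypothetical.

```python
import math


def framewise_decisions(frame_loglik):
    """Label each frame independently: 1 (usable) if its usable-class
    log-likelihood is higher, else 0 (unusable)."""
    return [1 if usable > unusable else 0 for unusable, usable in frame_loglik]


def viterbi_smooth(frame_loglik, stay_prob=0.9):
    """Two-state Viterbi decoding over per-frame class log-likelihoods.

    frame_loglik: list of (loglik_unusable, loglik_usable) pairs, one per frame.
    stay_prob: assumed probability of staying in the same state between
               adjacent frames (a hypothetical value, not from the paper).
    Returns a list of 0/1 labels, where 1 means usable.
    """
    if not frame_loglik:
        return []
    log_stay = math.log(stay_prob)
    log_switch = math.log(1.0 - stay_prob)

    # Uniform prior over the two states for the first frame.
    score = [math.log(0.5) + frame_loglik[0][s] for s in (0, 1)]
    backptr = []

    for loglik in frame_loglik[1:]:
        new_score = [0.0, 0.0]
        ptr = [0, 0]
        for s in (0, 1):
            # Best predecessor for state s, including the transition cost.
            cand = [score[p] + (log_stay if p == s else log_switch) for p in (0, 1)]
            best = 0 if cand[0] >= cand[1] else 1
            ptr[s] = best
            new_score[s] = cand[best] + loglik[s]
        backptr.append(ptr)
        score = new_score

    # Trace back from the best final state.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for ptr in reversed(backptr):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical log-likelihoods: six clearly usable frames around one
# ambiguous frame that slightly favours the unusable class.
frames = [(-5.0, -1.0)] * 3 + [(-1.0, -1.5)] + [(-5.0, -1.0)] * 3
independent = framewise_decisions(frames)
smoothed = viterbi_smooth(frames)
```

With these example values, the frame-by-frame rule flips the ambiguous middle frame to unusable, while the context-dependent decoder keeps it usable because two state switches cost more than the small likelihood gain. How the transition probabilities would be estimated in practice is outside what the abstract states.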
Pages: 620-623
Page count: 4
Related papers
50 in total
  • [21] Automatic shot boundary detection using Gaussian Mixture Model
    Reddy, A. Adhipathi
    Varadharajan, Sridhar
    [J]. VISAPP 2008: PROCEEDINGS OF THE THIRD INTERNATIONAL CONFERENCE ON COMPUTER VISION THEORY AND APPLICATIONS, VOL 1, 2008, : 547 - 550
  • [22] Foreground Detection of Moving Object Using Gaussian Mixture Model
    Aslam, Nazia
    Sharma, Veena
    [J]. 2017 INTERNATIONAL CONFERENCE ON COMMUNICATION AND SIGNAL PROCESSING (ICCSP), 2017, : 1071 - 1074
  • [23] A NOVEL IMAGE CLASSIFIER BASED ON GAUSSIAN MIXTURE LANGUAGE MODEL
    Wu, Wei
    Gao, Guanglai
    [J]. 2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 2016, : 1312 - 1316
  • [24] Two-microphones Speech Separation Using Generalized Gaussian Mixture Model
    Fan, Miao
    Mao, Jia-min
    Ding, Jao-gui
    Li, Wei-feng
    [J]. CURRENT TRENDS IN COMPUTER SCIENCE AND MECHANICAL AUTOMATION, VOL 1, 2017, : 362 - 370
  • [25] Quality enhancement of CELP coded speech by using a voicing gaussian mixture model
    Raza, DG
    Chan, CF
    [J]. 2002 6TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING PROCEEDINGS, VOLS I AND II, 2002, : 452 - 455
  • [26] Emotion Recognition from Speech using Gaussian Mixture Model and Vector Quantization
    Agrawal, Surabhi
    Dongaonkar, Shabda
    [J]. 2015 4TH INTERNATIONAL CONFERENCE ON RELIABILITY, INFOCOM TECHNOLOGIES AND OPTIMIZATION (ICRITO) (TRENDS AND FUTURE DIRECTIONS), 2015,
  • [27] Speech enhancement based on speech spectral complex Gaussian Mixture Model
    Ding, GH
    Wang, X
    Cao, Y
    Ding, F
    Tang, YZ
    [J]. 2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING, 2005, : 165 - 168
  • [28] Speech wideband extension based on Gaussian mixture model
    Zhang, Yong
    Hu, Ruimin
    [J]. Chinese Journal of Acoustics, 2009, 28 (04) : 362 - 377
  • [29] Mixture Gaussian envelope chirp model for speech and audio
    Mondal, B
    Sreenivas, TV
    [J]. 2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS, 2001, : 857 - 860
  • [30] Speech wideband extension based on Gaussian mixture model
    Zhang, Yong
    Hu, Ruimin
    [J]. Shengxue Xuebao/Acta Acustica, 2009, 34 (05) : 471 - 480