Subsyllable-based discriminative segmental Bayesian network for Mandarin speech keyword spotting

Cited by: 2
Authors
Wu, CH
Affiliation
[1] Institute of Information Engineering, National Cheng Kung University, Tainan
Source
Keywords
Mandarin speech keyword spotting; context-dependent subsyllable; discriminative segmental Bayesian network
DOI
10.1049/ip-vis:19971095
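Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract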
A continuous Mandarin keyword spotting system based on context-dependent subsyllables is presented. In this vocabulary-independent system, users can define their own keywords and the most frequently occurring non-keywords without retraining the system. A set of 176 monosyllables and 483 balanced words or sentences is used to extract the context-dependent subsyllables (i.e., initials or finals in Mandarin speech) for training. Each subsyllable is represented by a proposed discriminative segmental Bayesian network (DSBN). In the training process, the generalised probabilistic descent (GPD) algorithm is used for discriminative training. The most frequently occurring non-keywords are divided into keyword predecessors and keyword successors. Separate non-keyword garbage models are constructed for keyword predecessors, keyword successors and extraneous speech. In the recognition process, a final part preprocessor is used to screen out unreasonable hypotheses in order to reduce the recognition time. Using a test set of 750 conversational speech utterances from 20 speakers (ten males and ten females), a word spotting rate of 92.0% was obtained for a user-defined 20-keyword vocabulary when the vocabulary word was embedded in unconstrained extraneous speech.
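Illustrative note on the GPD training step: the abstract names GPD-based discriminative training but does not give the DSBN discriminant functions or the training settings. The sketch below is therefore only a generic illustration under stated assumptions. It uses linear discriminants as a stand-in for the DSBN scores, together with the misclassification measure and sigmoid loss commonly used with GPD in minimum classification error training. The function name `gpd_step` and the parameters `step`, `eta` and `gamma` are hypothetical and do not come from the paper.

```python
import math

def gpd_step(weights, x, label, step=0.1, eta=2.0, gamma=1.0):
    """Apply one GPD update for a single labelled sample.

    weights: one weight vector per class (assumed linear discriminants,
             a stand-in for the DSBN scores, which the abstract does not specify).
    Requires at least two classes. Returns the smoothed loss before the update.
    """
    # Discriminant score of every class for this sample.
    scores = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in weights]
    rivals = [j for j in range(len(weights)) if j != label]

    # Soft maximum over the competing classes, shifted for numerical stability.
    top = max(scores[j] for j in rivals)
    exps = {j: math.exp(eta * (scores[j] - top)) for j in rivals}
    total = sum(exps.values())

    # Misclassification measure: positive when a rival outscores the true class.
    d = -scores[label] + top + math.log(total / len(rivals)) / eta

    # Sigmoid loss, a smooth approximation of the 0/1 error count.
    loss = 1.0 / (1.0 + math.exp(-gamma * d))
    slope = gamma * loss * (1.0 - loss)

    # Gradient descent: raise the true-class score, lower the rival scores.
    for j in range(len(weights)):
        coeff = -1.0 if j == label else exps[j] / total
        weights[j] = [w_i - step * slope * coeff * x_i
                      for w_i, x_i in zip(weights[j], x)]
    return loss
```

Calling `gpd_step` repeatedly on the same sample should lower the returned loss, because each update moves the true-class score up relative to its rivals. In a real system the same update would be applied to the model parameters of each subsyllable rather than to a toy weight matrix.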
Pages: 65-71
Page count: 7