Combination of diverse subword units in spoken term detection

Cited by: 0
Authors
Lee, Shi-wook [1]
Tanaka, Kazuyo [2]
Itoh, Yoshiaki [3]
Affiliations
[1] Natl Inst Adv Ind Sci & Technol, Tokyo, Japan
[2] Univ Tsukuba, Tsukuba, Ibaraki 305, Japan
[3] Iwate Prefectural Univ, Takizawa, Iwate, Japan
Keywords
spoken term detection; keyword search; system combination; phonetic recognition; diversity
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
This paper focuses on two points. First, we try to clarify the effect of system combination from two aspects: accuracy and heterogeneity. Second, we evaluate our unique subword unit, called the Sub-Phonetic Segment (SPS), with the aim of maximizing the performance improvement obtained by combination. Combination systems usually yield higher performance than any individual system. When the systems being combined are individually accurate but also mutually heterogeneous, the improvement by combination can be maximized. Based on this consideration, we estimate heterogeneity by the correlation of false alarm errors between the combined systems, and confirm that a lower correlation between two systems yields a larger performance improvement by combination. Comparative tests of several combination approaches are carried out on subword-based spoken term detection. Since subword-based systems use constrained linguistic knowledge, it is fairly straightforward to verify the heterogeneity of the combined systems. Experimental results show that the most significant improvements are achieved by combining two different subword units, triphone and SPS, which are highly heterogeneous subword units with a low correlation of false alarm detections.
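The abstract does not state which correlation measure is used, so the following is only a minimal sketch of one plausible reading: the Pearson correlation (the phi coefficient for binary data) between the false alarm indicators of two systems over the same set of candidate detections. The function name `false_alarm_correlation`, the 0/1 encoding, and the example data are all assumptions made for illustration and are not taken from the paper.

```python
from math import sqrt


def false_alarm_correlation(fa_a, fa_b):
    """Pearson (phi) correlation between two binary false-alarm indicator lists.

    Each list holds 1 where the system raised a false alarm on a candidate
    detection and 0 otherwise. Both lists must cover the same candidates.
    This measure is an assumption; the paper's exact formula is not given here.
    """
    if len(fa_a) != len(fa_b):
        raise ValueError("indicator lists must have the same length")
    n = len(fa_a)
    if n == 0:
        raise ValueError("indicator lists must not be empty")

    mean_a = sum(fa_a) / n
    mean_b = sum(fa_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(fa_a, fa_b)) / n
    var_a = sum((a - mean_a) ** 2 for a in fa_a) / n
    var_b = sum((b - mean_b) ** 2 for b in fa_b) / n

    if var_a == 0 or var_b == 0:
        # Correlation is undefined for a constant indicator; returning 0.0
        # is a convention chosen for this sketch.
        return 0.0
    return cov / sqrt(var_a * var_b)


# Hypothetical false-alarm indicators for three systems (invented data).
system_a_fa = [1, 0, 1, 0, 0, 1, 0, 0]
system_b_fa = [0, 1, 0, 0, 1, 0, 0, 1]  # errs on different candidates than A
system_c_fa = [1, 0, 1, 0, 0, 1, 0, 1]  # errs on mostly the same candidates as A

corr_ab = false_alarm_correlation(system_a_fa, system_b_fa)
corr_ac = false_alarm_correlation(system_a_fa, system_c_fa)
print(f"A vs B: {corr_ab:.3f}")
print(f"A vs C: {corr_ac:.3f}")
```

Under this reading, the pair with the lower correlation (A and B in the invented data) would be the more heterogeneous one, and so the more promising candidate for combination according to the abstract's argument.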
Pages: 3685-3689
Number of pages: 5