Combination of diverse subword units in spoken term detection

Cited by: 0
Authors
Lee, Shi-wook [1]
Tanaka, Kazuyo [2]
Itoh, Yoshiaki [3]
Affiliations
[1] Natl Inst Adv Ind Sci & Technol, Tokyo, Japan
[2] Univ Tsukuba, Tsukuba, Ibaraki 305, Japan
[3] Iwate Prefectural Univ, Takizawa, Iwate, Japan
Keywords
spoken term detection; keyword search; system combination; phonetic recognition; diversity;
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Discipline classification codes
070206; 082403;
Abstract
This paper addresses two points. First, we clarify the effect of system combination from two aspects: accuracy and heterogeneity. Second, we evaluate our own subword unit, the Sub-Phonetic Segment (SPS), as a means of maximizing the performance improvement gained by combination. Combined systems usually yield higher performance than any individual system, and the improvement can be maximized when the systems being combined are individually accurate and mutually heterogeneous. On this basis, we estimate heterogeneity by the correlation of the false alarm errors of the combined systems and confirm that a lower correlation between two systems yields a larger improvement from combination. Comparative tests of several combination approaches are carried out on subword-based spoken term detection. Because subword-based systems use constrained linguistic knowledge, it is fairly straightforward to verify the heterogeneity of the combined systems. Experimental results show that the most significant improvements are achieved by combining two different subword units, triphone and SPS, which are highly heterogeneous and have a low correlation of false alarm detections.
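The heterogeneity estimate described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the code below assumes a Pearson correlation over binary false-alarm indicators. The function name `false_alarm_correlation` and the input layout (one 0/1 flag per candidate detection, aligned across the two systems) are hypothetical, not taken from the paper.

```python
from math import sqrt


def false_alarm_correlation(fa_a, fa_b):
    """Pearson correlation between two aligned 0/1 false-alarm indicator lists.

    fa_a[i] is 1 if system A raised a false alarm on candidate i, else 0;
    fa_b[i] is the same for system B. This layout is an assumption made
    for illustration, not the paper's definition.
    """
    if len(fa_a) != len(fa_b) or not fa_a:
        raise ValueError("indicator lists must be non-empty and equally long")
    n = len(fa_a)
    mean_a = sum(fa_a) / n
    mean_b = sum(fa_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(fa_a, fa_b))
    var_a = sum((a - mean_a) ** 2 for a in fa_a)
    var_b = sum((b - mean_b) ** 2 for b in fa_b)
    if var_a == 0 or var_b == 0:
        # Correlation is undefined for a constant list; returning 0.0 is a
        # convention chosen for this sketch only.
        return 0.0
    return cov / sqrt(var_a * var_b)


# Toy indicator lists (made-up values, not experimental data).
same_errors = false_alarm_correlation([1, 0, 1, 0], [1, 0, 1, 0])
unrelated_errors = false_alarm_correlation([1, 1, 0, 0], [1, 0, 1, 0])
```

Under this assumption, a pair of systems whose false alarms fall on the same candidates scores near 1, while a pair whose false alarms are unrelated scores near 0. The abstract associates the lower-correlation case with the larger gain from combination.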
Pages: 3685-3689
Number of pages: 5