EFFICIENT SYSTEM COMBINATION FOR SYLLABLE-CONFUSION-NETWORK-BASED CHINESE SPOKEN TERM DETECTION

Authors
Gao, Jie [1]
Zhao, Qingwei [1]
Yan, Yonghong [1]
Shao, Jian [2]
Affiliations
[1] Chinese Acad Sci, Inst Acoust, ThinkIT Speech Lab, Beijing, Peoples R China
[2] Zhejiang Univ, Hangzhou, Zhejiang, Peoples R China
Keywords
syllable confusion network; Chinese spoken term detection; system combination; speech indexing
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper examines system combination for syllable-confusion-network (SCN)-based Chinese spoken term detection (STD). System combination for STD usually improves accuracy but increases index size or complicates the index structure. This paper explores methods for efficiently combining a word-based system and a syllable-based system while keeping the indices compact. First, a composite SCN is generated using two approaches: lattice combination (the SCN is generated from a combined lattice) and confusion network combination (two SCNs are combined into one). Then a simple, compact index is constructed from this composite SCN by merging cross-system redundant information. Experimental results on a 60-hour corpus show a relative accuracy improvement of 14.7% over the baseline syllable-based system. Meanwhile, the method reduces the index size by 22.3% compared to the commonly adopted score combination method while achieving comparable accuracy.
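The idea of combining two confusion networks and merging cross-system redundant entries can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the two networks are already aligned slot by slot, represents each slot as a hypothetical `{syllable: posterior}` dictionary, and uses plain linear interpolation with an assumed weight `weight_a`. All function names and the toy data are illustrative only.

```python
from collections import defaultdict


def combine_slots(slot_a, slot_b, weight_a=0.5):
    """Merge two aligned slots; a syllable present in both is stored once."""
    merged = defaultdict(float)
    for syllable, posterior in slot_a.items():
        merged[syllable] += weight_a * posterior
    for syllable, posterior in slot_b.items():
        merged[syllable] += (1.0 - weight_a) * posterior
    return dict(merged)


def combine_networks(scn_a, scn_b, weight_a=0.5):
    """Combine two networks slot by slot (assumes equal length and alignment)."""
    if len(scn_a) != len(scn_b):
        raise ValueError("this sketch assumes pre-aligned networks")
    return [combine_slots(a, b, weight_a) for a, b in zip(scn_a, scn_b)]


def build_index(scn):
    """Inverted index: syllable -> list of (slot position, posterior)."""
    index = defaultdict(list)
    for position, slot in enumerate(scn):
        for syllable, posterior in slot.items():
            index[syllable].append((position, posterior))
    return dict(index)


def count_postings(index):
    """Total number of (position, posterior) entries stored in an index."""
    return sum(len(postings) for postings in index.values())


# Toy data (assumed): output of a word-based and a syllable-based system.
word_scn = [{"zhong": 0.9, "zong": 0.1}, {"guo": 1.0}]
syllable_scn = [{"zhong": 0.7, "chong": 0.3}, {"guo": 0.8, "guan": 0.2}]

composite_scn = combine_networks(word_scn, syllable_scn)
composite_index = build_index(composite_scn)
separate_postings = (count_postings(build_index(word_scn))
                     + count_postings(build_index(syllable_scn)))
```

In this toy case the composite index stores fewer postings than two separate per-system indices, because syllables hypothesized by both systems in the same slot are kept only once. That is the kind of redundancy the abstract refers to, though the paper's actual merging and scoring rules may differ.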
Pages: 366-369
Number of pages: 4