EFFICIENT SUBWORD LATTICE RETRIEVAL FOR GERMAN SPOKEN TERM DETECTION

Citations: 12
Authors
Mertens, Timo [1, 2]
Schneider, Daniel [2]
Affiliations
[1] NTNU, Dept Elect & Telecommun, Trondheim, Norway
[2] Fraunhofer IAIS, Schloss Birlinghoven, St Augustin 53754, Germany
Keywords
spoken term detection; spoken document retrieval; speech recognition; speech search
DOI
10.1109/ICASSP.2009.4960726
Chinese Library Classification
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
We present a lattice-based spoken term detection (STD) method for German broadcast news data and compare it to a previously proposed fuzzy search. Because the out-of-vocabulary (OOV) problem is severe in German, we evaluate suitable subword indexing units for lattice retrieval. We also investigate hybrid lattice retrieval of words and subwords, since words are a robust indexing unit. We show that efficient lattice graph and score pruning techniques increase the precision of subword retrieval by 8% absolute with only a small loss in recall. Additionally, a speed-up of up to 6 times can be observed.
Pages: 4885 / +
Number of pages: 2