N-Gram FST Indexing for Spoken Term Detection

Authors
Liu, Chao [1]
Wang, Dong [1]
Tejedor, Javier
Affiliations
[1] Tsinghua Univ, Ctr Speech & Language Technol, Beijing, Peoples R China
Keywords
spoken term indexing; finite state transducer; spoken term detection; speech recognition
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
An efficient indexing scheme is essential for spoken term detection (STD) on large databases, particularly for phone-based systems, which have been widely adopted to achieve vocabulary-independent detection. While finite state transducer (FST) composition provides a standard indexing approach, n-gram reverse indexing is more flexible in connectivity representation and confidence measurement, and may therefore perform better than searching within the original lattices or the equivalent FSTs. In this paper we present an n-gram FST indexing approach that combines the flexibility of n-gram indexing with the efficiency of FST indexing. Specifically, we employ n-gram indexing to relax connectivity in the original lattices and then formalize the indices into an FST for online search. We demonstrate this approach on a phone-based STD task in which the lattice is sparse due to strong language models. The results show that n-gram FST indexing provides not only better detection performance than lattice search, but also faster detection than both conventional n-gram indexing and FST indexing.
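The two steps named in the abstract (n-gram indexing to relax lattice connectivity, then an automaton-shaped index for lookup) can be illustrated with a toy sketch. This is not the authors' implementation. The `Arc` layout, the trie used as a stand-in for a deterministic FST, the time-overlap join rule, the minimum-score confidence, and every function name below are assumptions made for illustration only.

```python
from collections import defaultdict
from typing import NamedTuple


class Arc(NamedTuple):
    """One phone arc of a toy lattice (hypothetical layout)."""
    src: int
    dst: int
    phone: str
    t_start: float
    t_end: float
    post: float  # arc posterior


def extract_ngrams(arcs, n):
    """Yield (phones, t_start, t_end, score) for every connected n-arc path."""
    out_arcs = defaultdict(list)
    for a in arcs:
        out_arcs[a.src].append(a)

    def walk(path):
        if len(path) == n:
            score = 1.0
            for a in path:
                score *= a.post
            yield tuple(a.phone for a in path), path[0].t_start, path[-1].t_end, score
            return
        for nxt in out_arcs.get(path[-1].dst, []):
            yield from walk(path + [nxt])

    for a in arcs:
        yield from walk([a])


class TrieIndex:
    """Trie over phone n-grams; a simple stand-in for a deterministic FST."""

    def __init__(self):
        self.next = [{}]    # next[state][phone] -> state
        self.postings = {}  # final state -> list of (utt, t_start, t_end, score)

    def add(self, phones, hit):
        s = 0
        for p in phones:
            if p not in self.next[s]:
                self.next.append({})
                self.next[s][p] = len(self.next) - 1
            s = self.next[s][p]
        self.postings.setdefault(s, []).append(hit)

    def lookup(self, phones):
        s = 0
        for p in phones:
            s = self.next[s].get(p)
            if s is None:
                return []
        return self.postings.get(s, [])


def build_index(lattices, n):
    """Index every n-gram of every utterance lattice."""
    index = TrieIndex()
    for utt, arcs in lattices.items():
        for phones, t0, t1, score in extract_ngrams(arcs, n):
            index.add(phones, (utt, t0, t1, score))
    return index


def search(index, query, n):
    """Join overlapping n-gram hits by time, not by lattice connectivity."""
    grams = [tuple(query[i:i + n]) for i in range(len(query) - n + 1)]
    if not grams:
        return []
    # chain = (utt, chain_start, last_gram_start, last_gram_end, score)
    chains = [(u, t0, t0, t1, sc) for u, t0, t1, sc in index.lookup(grams[0])]
    for g in grams[1:]:
        extended = []
        for utt, c0, l0, l1, sc in chains:
            for utt2, s0, s1, sc2 in index.lookup(g):
                # Assumed join rule: next n-gram starts inside the previous
                # one and ends after it; confidence is the weakest link.
                if utt2 == utt and l0 < s0 <= l1 and s1 > l1:
                    extended.append((utt, c0, s0, s1, min(sc, sc2)))
        chains = extended
    return [(utt, c0, l1, sc) for utt, c0, _, l1, sc in chains]


# Toy lattice with two paths, "k ae p" and "g ae t"; no path spells "k ae t".
lattices = {
    "u1": [
        Arc(0, 1, "k", 0.0, 0.1, 0.9), Arc(1, 2, "ae", 0.1, 0.2, 0.8),
        Arc(2, 3, "p", 0.2, 0.3, 0.6), Arc(0, 4, "g", 0.0, 0.1, 0.1),
        Arc(4, 5, "ae", 0.1, 0.2, 0.5), Arc(5, 3, "t", 0.2, 0.3, 0.7),
    ]
}
index = build_index(lattices, 2)
print(search(index, ["k", "ae", "t"], 2))
```

In this toy lattice no connected path spells "k ae t", so a strict lattice search would miss the query, while the bigram join recovers one hit spanning 0.0 to 0.3 because the bigrams (k, ae) and (ae, t) overlap in time. That is one way to read "relaxed connectivity"; the paper's actual join rule and confidence measure are not given in the abstract.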
Pages: 2091-2094
Page count: 4