New word detection in audio-indexing

Authors
Dharanipragada, S [1]
Roukos, S [1]
Affiliations
[1] IBM Corp, Thomas J Watson Res Ctr, Yorktown Heights, NY 10598 USA
DOI
10.1109/ASRU.1997.659135
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
For an audio-indexing system built on a fixed-vocabulary speech recognizer to be practical, it must be able to detect out-of-vocabulary (new) words at query time. In this paper we present a fast, vocabulary-independent algorithm for spotting words in speech. The algorithm consists of a preprocessing stage followed by a coarse-to-detailed search for a word or phone sequence in speech. The preprocessing stage produces a phone-level representation of the speech that can be searched efficiently. The coarse search, which uses phone n-gram matching, identifies regions of speech as putative word hits. The detailed acoustic match is then run only at the putative hits identified by the coarse search, which gives the desired speed in wordspotting.
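The coarse-to-detailed strategy described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the n-gram length, and the thresholds are all hypothetical, the speech is assumed to be already reduced to a list of phone symbols, and the detailed acoustic match is replaced by a simple phone edit distance as a stand-in, since the abstract does not specify the acoustic scoring.

```python
from collections import defaultdict


def build_ngram_index(phones, n=2):
    """Preprocessing: map each phone n-gram to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(phones) - n + 1):
        index[tuple(phones[i:i + n])].append(i)
    return index


def coarse_search(index, query, n=2, min_hits=1):
    """Coarse pass: collect putative start positions that share n-grams with the query."""
    votes = defaultdict(int)
    for j in range(len(query) - n + 1):
        for pos in index.get(tuple(query[j:j + n]), []):
            start = pos - j  # where the query would begin if this n-gram aligns
            if start >= 0:
                votes[start] += 1
    return sorted(s for s, v in votes.items() if v >= min_hits)


def edit_distance(a, b):
    """Levenshtein distance between two phone sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def detailed_search(phones, query, candidates, max_dist=0):
    """Detailed pass (stand-in for the acoustic match): score only the putative hits."""
    hits = []
    for start in candidates:
        window = phones[start:start + len(query)]
        dist = edit_distance(window, query)
        if dist <= max_dist:
            hits.append((start, dist))
    return hits
```

The index is built once per recording, so a query for a new word only touches the positions that the n-gram lookup returns. The more expensive scoring step runs on that short candidate list rather than on the whole phone stream, which is the source of the speed-up the abstract refers to.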
Pages: 551-557
Number of pages: 7