New word detection in audio-indexing

Cited by: 0

Authors:

Dharanipragada, S [1]

Roukos, S [1]

Affiliation:

[1] IBM Corp, Thomas J Watson Res Ctr, Yorktown Heights, NY 10598 USA

Source:

1997 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, PROCEEDINGS | 1997

Keywords:

DOI:

10.1109/ASRU.1997.659135

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

For an audio-indexing system that uses a fixed-vocabulary speech recognizer to be practical, it must be able to detect out-of-vocabulary (new) words at query time. In this paper we present a fast, vocabulary-independent algorithm for spotting words in speech. The algorithm consists of a preprocessing stage and a coarse-to-detailed search strategy for spotting a word or phone sequence in speech. The preprocessing stage produces a phone-level representation of the speech that can be searched efficiently. The coarse search, which consists of phone n-gram matching, identifies regions of speech as putative word hits. The detailed acoustic match is then conducted only at the putative hits identified by the coarse search, which gives the desired wordspotting speed.
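The coarse-to-detailed strategy described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the preprocessing stage has already produced a flat list of phone labels, and it substitutes a plain edit distance over phone symbols for the detailed acoustic match, which the abstract does not specify. All function names, the phone labels, and the parameters `n` and `max_dist` are hypothetical.

```python
from collections import defaultdict


def build_ngram_index(phones, n=3):
    """Preprocessing stand-in: map each phone n-gram to its start positions."""
    index = defaultdict(list)
    for i in range(len(phones) - n + 1):
        index[tuple(phones[i:i + n])].append(i)
    return index


def coarse_search(index, query, n=3, min_votes=1):
    """Coarse stage: each matching n-gram votes for the implied query start."""
    votes = defaultdict(int)
    for j in range(len(query) - n + 1):
        for pos in index.get(tuple(query[j:j + n]), []):
            votes[pos - j] += 1
    # Keep only in-range starts that collected enough votes (putative hits).
    return sorted(s for s, v in votes.items() if s >= 0 and v >= min_votes)


def edit_distance(a, b):
    """Levenshtein distance between two phone sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def spot_word(phones, query, n=3, max_dist=1):
    """Run the detailed match (here: edit distance) only at putative hits."""
    index = build_ngram_index(phones, n)
    hits = []
    for start in coarse_search(index, query, n):
        window = phones[start:start + len(query)]
        dist = edit_distance(window, query)
        if dist <= max_dist:
            hits.append((start, dist))
    return hits


if __name__ == "__main__":
    # Hypothetical phone stream and query, for illustration only.
    stream = "dh ax k ae t s ae t aa n dh ax m ae t".split()
    print(spot_word(stream, "s ae t".split(), n=2, max_dist=0))
```

In this toy run the coarse stage proposes three candidate positions that share a phone bigram with the query, and the stricter second stage keeps only the exact occurrence. The speed benefit comes from the same structure as in the abstract: the expensive comparison runs at a handful of candidate positions, not at every position in the stream.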

Pages: 551-557

Page count: 7
