Improved open-vocabulary spoken content retrieval with word and subword lattices using acoustic feature similarity

Cited by: 4

Authors:

Lee, Hung-yi [1]

Chou, Po-wei [2]

Lee, Lin-shan [1]

Affiliations:

[1] Natl Taiwan Univ, Grad Inst Commun Engn, Taipei 10617, Taiwan

[2] Natl Taiwan Univ, Dept Elect Engn, Taipei 10617, Taiwan

Source:

COMPUTER SPEECH AND LANGUAGE | 2014 / Vol. 28 / Issue 05

Keywords:

Spoken content retrieval; Spoken term detection; Pseudo-relevance feedback; Random walk; SEARCH;

DOI:

10.1016/j.csl.2013.12.003

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Spoken content retrieval will be very important for retrieving and browsing multimedia content over the Internet, and spoken term detection (STD) is one of the key technologies for spoken content retrieval. In this paper, we show that acoustic feature similarity between spoken segments, used with pseudo-relevance feedback and graph-based re-ranking, can improve the performance of STD. Pseudo-relevance feedback is based on the concept that spoken segments similar in acoustic feature vector sequences to those with higher/lower relevance scores should have higher/lower scores, while graph-based re-ranking further uses a graph to consider the similarity structure among all the segments retrieved in the first pass. These approaches are formulated on both word and subword lattices, and a complete framework for using them in open-vocabulary retrieval of spoken content is presented. Significant improvements for these approaches with both in-vocabulary and out-of-vocabulary queries were observed in preliminary experiments. (C) 2014 Elsevier Ltd. All rights reserved.
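
The graph-based re-ranking idea in the abstract can be illustrated with a minimal sketch. It is not the paper's formulation: it is a generic random walk with restart (in the style of personalized PageRank), and the function name `rerank_by_random_walk`, the interpolation weight `alpha`, and the toy similarity matrix are all assumptions made for illustration. In practice the similarities would come from comparing acoustic feature vector sequences; here they are simply given as input.

```python
def rerank_by_random_walk(first_pass_scores, similarity, alpha=0.9, iterations=100):
    """Re-rank segments by propagating scores over a similarity graph.

    first_pass_scores: positive relevance scores from the first pass (hypothetical input).
    similarity: symmetric matrix, similarity[i][j] >= 0 between segments i and j
                (assumed to be precomputed from acoustic features).
    alpha: weight on scores propagated from neighbours (assumed value).
    """
    n = len(first_pass_scores)
    total = sum(first_pass_scores)
    prior = [s / total for s in first_pass_scores]  # normalised first-pass scores

    # Column-normalise the similarities so that each segment j distributes
    # its score to the other segments in proportion to their similarity to j.
    transition = [[0.0] * n for _ in range(n)]
    for j in range(n):
        outgoing = sum(similarity[i][j] for i in range(n) if i != j)
        for i in range(n):
            if i != j and outgoing > 0:
                transition[i][j] = similarity[i][j] / outgoing

    # Iterate: each new score mixes the first-pass prior with the scores
    # flowing in from acoustically similar segments.
    scores = prior[:]
    for _ in range(iterations):
        scores = [
            (1 - alpha) * prior[i]
            + alpha * sum(transition[i][j] * scores[j] for j in range(n))
            for i in range(n)
        ]
    return scores


# Toy example: segments 0 and 1 sound alike; segment 2 resembles neither.
first_pass = [0.5, 0.2, 0.3]
sim = [
    [0.0, 0.9, 0.1],
    [0.9, 0.0, 0.1],
    [0.1, 0.1, 0.0],
]
new_scores = rerank_by_random_walk(first_pass, sim)
ranking = sorted(range(len(new_scores)), key=lambda i: new_scores[i], reverse=True)
print(ranking)
```

In this toy case segment 1 starts below segment 2 but ends above it, because it is acoustically close to the top-scoring segment 0. That is the behaviour the abstract describes: segments similar to high-scoring segments are pulled up.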

Pages: 1045-1065

Number of pages: 21

19 items in total

[1] Open-Vocabulary Retrieval of Spoken Content with Shorter/Longer Queries Considering Word/Subword-based Acoustic Feature Similarity
Lee, Hung-yi
Chou, Po-wei
Lee, Lin-shan
[J]. 13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012: 2075-2078
[2] Open-Vocabulary Spoken Document Retrieval based on new subword models and subword phonetic similarity
Iwata, Kohei
Itoh, Yoshiaki
Kojima, Kazunori
Ishigame, Masaaki
Tanaka, Kazuyo
Lee, Shi-wook
[J]. INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006: 325-+
[3] Combining multiple subword representations for open-vocabulary spoken document retrieval
Lee, SW
Tanaka, K
Itoh, Y
[J]. 2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING, 2005: 505-508
[4] Open-vocabulary spoken utterance retrieval using confusion networks
Hori, Takaaki
Hetherington, I. Lee
Hazen, Timothy J.
Glass, James R.
[J]. 2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007: 73-+
[5] Subword-Based Compact Reconstruction for Open-Vocabulary Neural Word Embeddings
Sasaki, Shota
Suzuki, Jun
Inui, Kentaro
[J]. IEEE/ACM Transactions on Audio Speech and Language Processing, 2021, 29: 3551-3564
[6] Subword-Based Compact Reconstruction for Open-Vocabulary Neural Word Embeddings
Sasaki, Shota
Suzuki, Jun
Inui, Kentaro
[J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, 29: 3551-3564
[7] OPEN VOCABULARY SPOKEN DOCUMENT RETRIEVAL BY SUBWORD SEQUENCE OBTAINED FROM SPEECH RECOGNIZER
Kuriki, Go
Itoh, Yoshiaki
Kojima, Kazunori
Ishigame, Masaaki
Tanaka, Kazuyo
Lee, Shi-wook
[J]. 2008 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY: SLT 2008, PROCEEDINGS, 2008: 301-+
[8] Open-Vocabulary Spoken-Document Retrieval Based on Query Expansion Using Related Web Documents
Terao, Makoto
Koshinaka, Takafumi
Ando, Shinichi
Isotani, Ryosuke
Okumura, Akitoshi
[J]. INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008: 2171-2174
[9] IMPROVED SEMANTIC RETRIEVAL OF SPOKEN CONTENT BY LANGUAGE MODELS ENHANCED WITH ACOUSTIC SIMILARITY GRAPH
Lee, Hung-yi
Wen, Tsung-Hsien
Lee, Lin-Shan
[J]. 2012 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2012), 2012: 182-187
[10] Open Vocabulary Spoken Content Retrieval by front-ending with Spoken Term Detection
Takigami, Tomoko
Akiba, Tomoyosi
[J]. 2013 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2013