Segmented Dynamic Time Warping for Spoken Query-by-Example Search

Cited by: 5

Authors:

Proenca, Jorge [1]

Perdigao, Fernando

Affiliation:

[1] Univ Coimbra, Inst Telecomunicacoes, Coimbra, Portugal

Source:

17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES | 2016

Keywords:

Query-by-example; Spoken term detection; Dynamic Time Warping; TERM DETECTION;

DOI:

10.21437/Interspeech.2016-1276

Chinese Library Classification number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

This paper describes a low-resource approach to a Query-by-Example task, in which spoken queries must be matched against a large dataset of spoken documents, sometimes in complex or non-exact ways. Our approach tackles these complex match cases by using Dynamic Time Warping to obtain alternative paths that account for reordering of words, small extra content and small lexical variations. We also report certain advances on calibration and fusion of sub-systems that improve overall results, such as manipulating the score distribution per query and using an average posteriorgram distance matrix as an extra sub-system. Results are evaluated on the MediaEval task of Query-by-Example Search on Speech (QUESST). For this task, the language of the audio being searched is almost irrelevant, which brings the use case close to that of a very low-resource language. For that reason, we use as features the posterior probabilities obtained from five phonetic recognizers trained on five different languages.
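
To make the matching idea in the abstract concrete, here is a minimal sketch of plain subsequence Dynamic Time Warping over posteriorgrams. It is not the segmented variant the paper proposes, and it does not model word reordering, extra content or lexical variation. The function names (`distance_matrix`, `average_matrices`, `subsequence_dtw`), the local distance `-log(q . d)`, the `eps` floor and the normalisation by query length are all assumptions chosen for illustration; the abstract does not specify them. Averaging the per-recognizer distance matrices is only a guess at what an "average posteriorgram distance matrix" could look like.

```python
import math


def distance_matrix(query, doc, eps=1e-10):
    """Local distance -log(q . d) for every query/document frame pair.

    query and doc are lists of frames; each frame is a list of posterior
    probabilities. The distance and the eps floor are illustrative choices.
    """
    return [[-math.log(max(sum(a * b for a, b in zip(q, d)), eps)) for d in doc]
            for q in query]


def average_matrices(matrices):
    """Element-wise mean of equally shaped distance matrices,
    for example one matrix per phonetic recognizer."""
    count = len(matrices)
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [[sum(m[i][j] for m in matrices) / count for j in range(cols)]
            for i in range(rows)]


def subsequence_dtw(dist):
    """Find the best match of the whole query anywhere in the document.

    Returns (cost normalised by query length, start frame, end frame).
    """
    n, m = len(dist), len(dist[0])
    acc = [[0.0] * m for _ in range(n)]    # accumulated cost
    start = [[0] * m for _ in range(n)]    # document frame where each path began
    for j in range(m):                     # a path may start at any document frame
        acc[0][j] = dist[0][j]
        start[0][j] = j
    for i in range(1, n):
        acc[i][0] = acc[i - 1][0] + dist[i][0]
        for j in range(1, m):
            best_cost, best_start = min(
                (acc[i - 1][j - 1], start[i - 1][j - 1]),  # diagonal step
                (acc[i - 1][j], start[i - 1][j]),          # advance query only
                (acc[i][j - 1], start[i][j - 1]),          # advance document only
            )
            acc[i][j] = dist[i][j] + best_cost
            start[i][j] = best_start
    end = min(range(m), key=lambda j: acc[n - 1][j])  # a path may end anywhere
    return acc[n - 1][end] / n, start[n - 1][end], end


# Toy example: two phone classes, a two-frame query, a five-frame document.
query = [[1.0, 0.0], [0.0, 1.0]]
doc = [[0.0, 1.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
cost, first, last = subsequence_dtw(distance_matrix(query, doc))
```

In this toy case the query aligns with document frames 2 to 3 at zero cost. A real system would compute one distance matrix per recognizer, possibly combine them with `average_matrices`, and calibrate the resulting scores per query before fusion.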


Pages: 750-754

Page count: 5
