QUERY-BY-EXAMPLE SPOKEN TERM DETECTION USING ATTENTION-BASED MULTI-HOP NETWORKS

Authors
Ao, Chia-Wei [1]
Lee, Hung-yi [1]
Affiliations
[1] Natl Taiwan Univ, Grad Inst Commun Engn, Taipei, Taiwan
Keywords
Attention-based Multi-hop Network
DOI
Not available
Chinese Library Classification (CLC)
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Retrieving spoken content with spoken queries, or query-by-example spoken term detection (STD), is attractive because it makes possible the matching of signals directly on the acoustic level without transcribing them into text. Here, we propose an end-to-end query-by-example STD model based on an attention-based multi-hop network, whose input is a spoken query and an audio segment containing several utterances; the output states whether the audio segment includes the query. The model can be trained in either a supervised scenario using labeled data, or in an unsupervised fashion. In the supervised scenario, we find that the attention mechanism and multiple hops improve performance, and that the attention weights indicate the time span of the detected terms. In the unsupervised setting, the model mimics the behavior of DTW, and it performs as well as DTW but with a lower run-time complexity.
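For orientation, the DTW baseline that the abstract compares against can be sketched as follows. This is a minimal illustration of a common subsequence-DTW variant for query-by-example matching, not the authors' implementation: the function name `subsequence_dtw`, the Euclidean frame distance, and the toy one-dimensional features are all assumptions made for the example. The nested loop makes the O(n * m) cost per query-segment pair visible, which is the run-time cost the abstract says the proposed model reduces.

```python
import math


def subsequence_dtw(query, segment):
    """Align a query against any contiguous region of a longer segment.

    query, segment: sequences of equal-dimension feature vectors (frames).
    Returns (best_cost, end_index), where end_index is the 0-based frame of
    the segment at which the best-matching region ends.
    Hypothetical helper for illustration only.
    """
    n, m = len(query), len(segment)
    # cost[i][j]: minimal accumulated distance aligning query[:i] with a
    # region of the segment that ends at frame j - 1.
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    # The match may start at any segment frame, so row 0 is free.
    for j in range(m + 1):
        cost[0][j] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(query[i - 1], segment[j - 1])  # frame distance
            cost[i][j] = d + min(
                cost[i - 1][j],      # query advances, segment frame repeats
                cost[i][j - 1],      # segment advances, query frame repeats
                cost[i - 1][j - 1],  # both advance
            )
    # The match may also end at any segment frame.
    best_j = min(range(1, m + 1), key=lambda j: cost[n][j])
    return cost[n][best_j], best_j - 1


# Toy example: the query occurs exactly in frames 1..3 of the segment.
query = [(0.0,), (1.0,), (2.0,)]
segment = [(5.0,), (0.0,), (1.0,), (2.0,), (5.0,)]
best_cost, end_index = subsequence_dtw(query, segment)
```

A detection decision would then threshold `best_cost` (possibly after length normalization); the threshold is a tuning choice that the abstract does not specify.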
Pages: 6264-6268
Page count: 5
Related papers
  • [1] Query-by-example spoken term detection based on phonetic posteriorgram
    Song, Beili
    Zhang, Wei-Qiang
    Cai, Meng
    Liu, Jia
    Johnson, Michael T.
    PROCEEDINGS OF THE 2015 INTERNATIONAL CONFERENCE ON EDUCATION, MANAGEMENT AND COMPUTING TECHNOLOGY, 2015, 30 : 1255 - 1260
  • [2] Query-by-Example Spoken Term Detection using Attentive Pooling Networks
    Zhang, Kun
    Wu, Zhiyong
    Jia, Jia
    Meng, Helen
    Song, Binheng
    2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2019, : 1267 - 1272
  • [3] Query-by-Example Spoken Term Detection Using Bessel Features
    Vasudev, Drisya
    Gangashetty, Suryakanth V.
    Babu, Anish K. K.
    Riyas, K. S.
    2015 IEEE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, INFORMATICS, COMMUNICATION AND ENERGY SYSTEMS (SPICES), 2015,
  • [4] Query-By-Example Spoken Term Detection Using Generative Adversarial Network
    Shah, Neil
    Sreeraj, R.
    Madhavi, Maulik C.
    Shah, Nirmesh J.
    Patil, Hemant A.
    2020 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2020, : 644 - 648
  • [5] Query-By-Example Spoken Term Detection Using Phonetic Posteriorgram Templates
    Hazen, Timothy J.
    Shen, Wade
    White, Christopher
    2009 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION & UNDERSTANDING (ASRU 2009), 2009, : 421 - +
  • [6] Query-by-Example Spoken Term Detection For OOV Terms
    Parada, Carolina
    Sethy, Abhinav
    Ramabhadran, Bhuvana
    2009 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION & UNDERSTANDING (ASRU 2009), 2009, : 404 - +
  • [7] A Comparison of Query-by-Example Methods for Spoken Term Detection
    Shen, Wade
    White, Christopher M.
    Hazen, Timothy J.
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 2107 - 2110
  • [8] A query-by-example spoken term detection method based on phonetic posteriorgram
    Zhang, Weiqiang
    Song, Beili
    Cai, Meng
    Liu, Jia
    Tianjin Daxue Xuebao (Ziran Kexue yu Gongcheng Jishu Ban)/Journal of Tianjin University Science and Technology, 2015, 48 (09): : 757 - 760
  • [9] Unsupervised Query-by-example spoken term detection based on DPHMM tokenizer
    Cao Jiankai
    Zhang Lianhai
    2017 IEEE 2ND ADVANCED INFORMATION TECHNOLOGY, ELECTRONIC AND AUTOMATION CONTROL CONFERENCE (IAEAC), 2017, : 1321 - 1325
  • [10] A STAGE MATCH FOR QUERY-BY-EXAMPLE SPOKEN TERM DETECTION BASED ON STRUCTURE INFORMATION OF QUERY
    Zhan, Junyao
    He, Qianhua
    Su, Jianbin
    Li, Yanxiong
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 6833 - 6837