Castsearch - Context based spoken document retrieval

Authors
Molgaard, Lasse Lohilahti [1]
Jorgensen, Kasper Winther [1]
Hansen, Lars Kai [1]
Affiliation
[1] Tech Univ Denmark, Richard Petersens Plads, Bldg 321, DK-2800 Lyngby, Denmark
Keywords
audio retrieval; document clustering; non-negative matrix factorization; text mining;
DOI
Not available
Chinese Library Classification
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
The paper describes our work on a system for retrieving relevant stories from broadcast news. The system combines audio processing and text mining. The audio processing begins with a segmentation step that partitions the audio into speech and music. The speech is further segmented by speaker and then transcribed using an automatic speech recognition system, yielding text input for clustering with non-negative matrix factorization (NMF). The resulting semantic topics are used to evaluate topic-detection performance. Based on these topics, we show that a novel query expansion can be performed to return more intelligent search results. We also show that the query expansion helps overcome errors in the automatic transcription.
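As a rough illustration of the topic-based query expansion mentioned in the abstract, the sketch below factorizes a toy term-document matrix with NMF and adds the strongest terms of the query's dominant topic to the query. It is not the authors' implementation: the vocabulary, the count matrix, the multiplicative-update solver, and the names `nmf` and `expand_query` are all assumptions made for this example, and the abstract does not specify how the expansion is actually computed.

```python
import numpy as np


def nmf(V, k, iters=200, seed=0, eps=1e-9):
    """Factorize V (terms x docs) as W @ H with non-negative W and H.

    Uses the standard multiplicative updates for the Frobenius-norm
    objective; this solver is an assumption, not the paper's choice.
    """
    rng = np.random.default_rng(seed)
    n_terms, n_docs = V.shape
    W = rng.random((n_terms, k)) + eps
    H = rng.random((k, n_docs)) + eps
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def expand_query(query_terms, vocab, W, n_extra=2):
    """Append the top terms of the topic that best matches the query."""
    index = {term: i for i, term in enumerate(vocab)}
    known = [index[t] for t in query_terms if t in index]
    if not known:
        # No query term is in the vocabulary, so nothing can be inferred.
        return list(query_terms)
    # Sum the topic loadings of the known query terms, pick the best topic.
    topic = int(np.argmax(W[known].sum(axis=0)))
    expanded = list(query_terms)
    for i in np.argsort(-W[:, topic]):
        if len(expanded) >= len(query_terms) + n_extra:
            break
        if vocab[i] not in expanded:
            expanded.append(vocab[i])
    return expanded


# Hypothetical toy data: rows are terms, columns are transcribed stories.
vocab = ["election", "vote", "senate", "storm", "flood", "rain"]
V = np.array(
    [
        [2, 1, 0, 0],
        [1, 2, 0, 0],
        [1, 1, 0, 0],
        [0, 0, 2, 1],
        [0, 0, 1, 2],
        [0, 0, 1, 1],
    ],
    dtype=float,
)

W, H = nmf(V, k=2)
print(expand_query(["storm"], vocab, W))
```

Because the expansion draws on terms that co-occur within a topic, a story can still be matched when the recognizer has mis-transcribed the literal query word, which is the effect the abstract attributes to query expansion.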
Pages: 93 / +
Page count: 2