Exploring the Use of Significant Words Language Modeling for Spoken Document Retrieval

Cited by: 4

Authors:

Chen, Ying-Wen [1]

Chen, Kuan-Yu [2]

Wang, Hsin-Min [2]

Chen, Berlin [1]

Affiliations:

[1] National Taiwan Normal University, Taipei, Taiwan

[2] Academia Sinica, Taipei, Taiwan

Source:

18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION | 2017

Keywords:

Query Model; Significant Words; Pseudo Relevance Feedback; INFORMATION-RETRIEVAL

DOI:

10.21437/Interspeech.2017-612

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Because vast amounts of multimedia content with associated speech information are now readily accessible worldwide on the Internet, spoken document retrieval (SDR) has recently become an emerging application. Apart from the considerable effort devoted to developing robust indexing and modeling techniques for spoken documents, a recent line of research aims at enriching and reformulating query representations to enhance retrieval effectiveness. In practice, pseudo-relevance feedback is by far the most prevalent paradigm for query reformulation; it assumes that the top-ranked feedback documents obtained from the initial round of retrieval are potentially relevant and can be exploited to reformulate the original query. Continuing this line of research, the paper presents a novel modeling framework that discovers significant words occurring in the feedback documents in order to infer an enhanced query language model for SDR. Formally, the proposed framework aims at extracting the essential words that represent a common notion of relevance (i.e., the significant words, which occur in almost all of the feedback documents), so as to deduce a new query language model that captures these significant words while modulating the influence of both highly frequent words and overly specific words. Experiments conducted on a benchmark SDR task demonstrate the performance merits of the proposed framework.
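
The idea described in the abstract can be illustrated with a rough sketch. The abstract does not specify the estimation procedure, so the code below is only a simple heuristic under stated assumptions, not the authors' method: it keeps words that occur in most feedback documents (dropping overly specific words), drops words that are no more frequent than in a background collection model (dropping highly frequent general words), and interpolates the result with the original query model. The function names, the `min_doc_ratio` threshold, the background probabilities, and the interpolation weight `alpha` are all hypothetical.

```python
import math
from collections import Counter


def significant_words_model(feedback_docs, background, min_doc_ratio=0.8, top_k=10):
    """Heuristic sketch (an assumption, not the paper's estimator).

    feedback_docs: list of token lists from the top-ranked documents.
    background:    dict mapping word -> collection-level probability.
    Returns a normalized dict word -> probability over selected words.
    """
    n_docs = len(feedback_docs)
    doc_freq = Counter()   # number of feedback documents containing each word
    term_freq = Counter()  # total occurrences across all feedback documents
    total = 0
    for doc in feedback_docs:
        doc_freq.update(set(doc))
        term_freq.update(doc)
        total += len(doc)

    scores = {}
    for word, df in doc_freq.items():
        if df / n_docs < min_doc_ratio:
            continue  # appears in too few feedback documents: too specific
        p_feedback = term_freq[word] / total
        p_background = background.get(word, 1e-6)  # hypothetical floor value
        ratio = p_feedback / p_background
        if ratio <= 1.0:
            continue  # no more frequent than in the collection: general word
        scores[word] = p_feedback * math.log(ratio)

    top = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    norm = sum(score for _, score in top)
    return {word: score / norm for word, score in top} if norm > 0 else {}


def interpolate(query_model, feedback_model, alpha=0.5):
    """Linearly combine the original query model with the feedback model."""
    words = set(query_model) | set(feedback_model)
    return {
        w: alpha * query_model.get(w, 0.0) + (1 - alpha) * feedback_model.get(w, 0.0)
        for w in words
    }


if __name__ == "__main__":
    docs = [
        ["the", "taiwan", "election", "news"],
        ["the", "election", "vote", "taiwan"],
        ["the", "election", "taiwan", "weather"],
    ]
    bg = {"the": 0.5, "taiwan": 0.01, "election": 0.01}
    fb_model = significant_words_model(docs, bg)
    print(interpolate({"taiwan": 1.0}, fb_model, alpha=0.5))
```

In this toy input, "the" is discarded as a general word and "news", "vote", and "weather" are discarded as too specific, so only the words shared by all feedback documents contribute to the reformulated query model.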


Pages: 2889-2893

Page count: 5

50 items in total

[1] A NEURAL DOCUMENT LANGUAGE MODELING FRAMEWORK FOR SPOKEN DOCUMENT RETRIEVAL
Yen, Li-Phen
Wu, Zhen-Yu
Chen, Kuan-Yu
[J]. 2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 8139 - 8143
[2] Exploring an Unsupervised, Language Independent, Spoken Document Retrieval System
Caranica, Alexandru
Cucu, Horia
Buzo, Andi
[J]. 2016 14TH INTERNATIONAL WORKSHOP ON CONTENT-BASED MULTIMEDIA INDEXING (CBMI), 2016,
[3] I-VECTOR BASED LANGUAGE MODELING FOR SPOKEN DOCUMENT RETRIEVAL
Chen, Kuan-Yu
Lee, Hung-Shin
Wang, Hsin-Min
Chen, Berlin
Chen, Hsin-Hsi
[J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
[4] Exploring the use of latent topical information for statistical Chinese spoken document retrieval
Chen, B
[J]. PATTERN RECOGNITION LETTERS, 2006, 27 (01) : 9 - 18
[5] Spoken Document Retrieval With Unsupervised Query Modeling Techniques
Chen, Berlin
Chen, Kuan-Yu
Chen, Pei-Ning
Chen, Yi-Wen
[J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2012, 20 (09) : 2602 - 2612
[6] Language model expansion using webdata for spoken document retrieval
Masumura, Ryo
Hahm, Seongjun
Ito, Akinori
[J]. 12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 2144 - 2147
[7] Spoken words versus spoken language
Jerger, James
[J]. JOURNAL OF THE AMERICAN ACADEMY OF AUDIOLOGY, 2006, 17 (07)
[8] CLEF 2004 cross-language spoken document retrieval track
Federico, M
Bertoldi, N
Levow, GA
Jones, GJF
[J]. MULTILINGUAL INFORMATION ACCESS FOR TEXT, SPEECH AND IMAGES, 2005, 3491 : 816 - 820
[9] An architecture for spoken document retrieval
Terol, RM
Martínez-Barco, P
Palomar, M
[J]. TEXT, SPEECH AND DIALOGUE, PROCEEDINGS, 2004, 3206 : 505 - 511
[10] Experiments in spoken document retrieval
Sparck-Jones, K
Jones, GJF
Foote, JT
Young, SJ
[J]. INFORMATION PROCESSING & MANAGEMENT, 1996, 32 (04) : 399 - 417
