Study of Entity-Topic Models for OOV Proper Name Retrieval

Cited by: 0
Authors
Sheikh, Imran [1]
Illina, Irina
Fohr, Dominique
Affiliation
[1] Univ Lorraine, LORIA, UMR 7503, F-54506 Vandoeuvre Les Nancy, France
Keywords
proper names; OOV; topic models; LVCSR
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Retrieving Proper Names (PNs) relevant to an audio document can improve speech recognition and content-based audio-video indexing. The Latent Dirichlet Allocation (LDA) topic model has been used to retrieve Out-Of-Vocabulary (OOV) PNs relevant to an audio document with good recall rates. However, retrieval of OOV PNs using LDA is affected by two issues, which we study in this paper: (1) Word Frequency Bias (less frequent OOV PNs are ranked lower); (2) Loss of Specificity (the reduced topic space representation loses lexical context). Entity-Topic models have been proposed as extensions of LDA that specifically learn relations between words, entities (PNs) and topics. We study OOV PN retrieval with Entity-Topic models and show that they are also affected by word frequency bias and loss of specificity. We evaluate our proposed methods for rare OOV PN re-ranking and lexical context re-ranking for LDA as well as for Entity-Topic models. The results show improvements in both Recall and Mean Average Precision.
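
The word frequency bias mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common topic-model ranking score p(PN | document) = sum over topics of p(PN | topic) * p(topic | document); the abstract does not state the exact scoring formula, so this is an assumption. The function name `rank_oov_pns`, the PN labels, and all probabilities are hypothetical toy values, not data from the paper.

```python
def rank_oov_pns(doc_topics, pn_given_topic):
    """Rank candidate PNs by sum_t p(pn | t) * p(t | d), highest score first.

    doc_topics: list of p(topic | document) values (assumed already inferred).
    pn_given_topic: dict mapping each PN to its list of p(pn | topic) values.
    """
    scores = {
        pn: sum(p_t * p_pn_t for p_t, p_pn_t in zip(doc_topics, probs))
        for pn, probs in pn_given_topic.items()
    }
    ranking = sorted(scores, key=scores.get, reverse=True)
    return ranking, scores


# Toy values, invented for illustration only.
doc_topics = [0.9, 0.1]  # p(topic | document)
pn_given_topic = {
    # Both PNs have the same relative topic profile, but "rare_pn" is
    # ten times less probable under every topic (it occurs less often).
    "frequent_pn": [0.020, 0.001],
    "rare_pn": [0.002, 0.0001],
}

ranking, scores = rank_oov_pns(doc_topics, pn_given_topic)
print(ranking)
```

In this toy setting both names are equally associated with the document's dominant topic, yet the rarer one receives a lower score purely because of its smaller probability mass, which is the kind of effect a rare-PN re-ranking step would aim to correct.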
Pages: 1344-1348
Number of pages: 5