Using Latent Dirichlet Allocation to Incorporate Domain Knowledge For Topic Transition Detection

Authors
Zhu, Xiaodan [1]
He, Xuming [1]
Munteanu, Cosmin [1]
Penn, Gerald [1]
Affiliations
[1] Univ Toronto, Dept Comp Sci, Toronto, ON, Canada
Keywords
slides transition detection; boundary detection
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper studies automatic detection of topic transitions for recorded presentations. This can be achieved by matching slide content with presentation transcripts directly with some similarity metrics. Such literal matching, however, misses domain-specific knowledge and is sensitive to speech recognition errors. In this paper, we incorporate relevant written materials, e.g., textbooks for lectures, which convey semantic relationships, in particular domain-specific relationships, between words. To this end, we train latent Dirichlet allocation (LDA) models on these materials and measure the similarity between slides and transcripts in the acquired hidden-topic space. This similarity is then combined with literal matchings. Experiments show that the proposed approach reduces the errors in slide transition detection by 17-41% on manual transcripts and 27-37% on automatic transcripts.
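The combination of hidden-topic similarity with literal matching described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state which similarity measure or combination scheme is used, so cosine similarity and a linear interpolation with a hypothetical `weight` parameter are assumptions here. The topic-proportion vectors are made-up placeholders standing in for what an LDA model trained on related written material (e.g., a textbook) might infer for each slide and transcript segment.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def literal_similarity(slide_text, segment_text):
    """Bag-of-words cosine over raw tokens (the literal-matching baseline)."""
    return cosine(Counter(slide_text.lower().split()),
                  Counter(segment_text.lower().split()))


def topic_similarity(slide_topics, segment_topics):
    """Cosine between two topic-proportion vectors of equal length."""
    return cosine(dict(enumerate(slide_topics)),
                  dict(enumerate(segment_topics)))


def combined_similarity(slide_text, segment_text,
                        slide_topics, segment_topics, weight=0.5):
    """Linear interpolation of topic-space and literal similarity.

    `weight` is a hypothetical tuning parameter (assumption, not from the paper).
    """
    lit = literal_similarity(slide_text, segment_text)
    top = topic_similarity(slide_topics, segment_topics)
    return weight * top + (1.0 - weight) * lit


def best_slide(segment_text, segment_topics, slides, weight=0.5):
    """Return the index of the slide with the highest combined score."""
    scores = [combined_similarity(text, segment_text, topics,
                                  segment_topics, weight)
              for text, topics in slides]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy data: (slide text, assumed LDA topic proportions) pairs.
slides = [
    ("parse tree grammar", [0.1, 0.9]),
    ("gradient descent learning rate", [0.9, 0.1]),
]
segment_text = "we lower the step size each iteration"
segment_topics = [0.8, 0.2]  # assumed topic proportions for the transcript segment

chosen = best_slide(segment_text, segment_topics, slides)
```

In this toy case the transcript segment shares no words with either slide, so literal matching alone cannot separate them, while the assumed topic proportions favor the second slide. A slide transition would then be hypothesized wherever the best-matching slide changes between consecutive transcript segments.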
Pages: 2442-2445
Page count: 4