Using Latent Dirichlet Allocation to Incorporate Domain Knowledge For Topic Transition Detection

Authors
Zhu, Xiaodan [1]
He, Xuming [1]
Munteanu, Cosmin [1]
Penn, Gerald [1]
Affiliations
[1] Univ Toronto, Dept Comp Sci, Toronto, ON, Canada
Keywords
slides transition detection; boundary detection
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper studies automatic detection of topic transitions in recorded presentations. A direct approach matches slide content against presentation transcripts using a similarity metric. Such literal matching, however, misses domain-specific knowledge and is sensitive to speech recognition errors. In this paper, we incorporate relevant written materials, e.g., textbooks for lectures, which convey semantic relationships between words, in particular domain-specific ones. To this end, we train latent Dirichlet allocation (LDA) models on these materials and measure the similarity between slides and transcripts in the acquired hidden-topic space. This similarity is then combined with literal matching. Experiments show that the proposed approach reduces errors in slide transition detection by 17-41% on manual transcripts and 27-37% on automatic transcripts.
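The idea of combining literal matching with similarity in a hidden-topic space can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' method: the topic-word probabilities in `topic_word` are a hand-written toy stand-in for an LDA model trained on a textbook, `topic_mixture` is a crude replacement for proper LDA inference, and the linear interpolation controlled by `weight` is only one plausible way to combine the two scores, since the abstract does not say how they are combined.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def topic_mixture(tokens, topic_word):
    """Rough topic mixture for a token list.

    Assumption: instead of real LDA inference, each known word votes for
    topics in proportion to p(word | topic); unknown words are skipped.
    """
    mix = [0.0] * len(topic_word)
    for tok in tokens:
        probs = [tw.get(tok, 0.0) for tw in topic_word]
        total = sum(probs)
        if total == 0.0:
            continue  # word not covered by the (hypothetical) model
        for k, p in enumerate(probs):
            mix[k] += p / total
    return dict(enumerate(mix))


def combined_similarity(slide, transcript, topic_word, weight=0.5):
    """Interpolate literal word overlap with topic-space similarity.

    `weight` is a hypothetical mixing parameter, not taken from the paper.
    """
    literal = cosine(Counter(slide), Counter(transcript))
    topical = cosine(topic_mixture(slide, topic_word),
                     topic_mixture(transcript, topic_word))
    return weight * literal + (1.0 - weight) * topical


# Toy topic-word probabilities standing in for a trained LDA model.
topic_word = [
    {"parser": 0.5, "grammar": 0.4, "syntax": 0.1},
    {"kernel": 0.5, "margin": 0.4, "vector": 0.1},
]
slide = ["grammar", "rules"]
segment_a = ["parser", "output"]  # same topic as the slide, no shared words
segment_b = ["kernel", "output"]  # different topic, no shared words

score_a = combined_similarity(slide, segment_a, topic_word)
score_b = combined_similarity(slide, segment_b, topic_word)
```

In this toy setup neither transcript segment shares a word with the slide, so literal matching alone cannot tell them apart; the topic-space term is what ranks `segment_a` above `segment_b`.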
Pages: 2442-2445
Page count: 4