Using Latent Dirichlet Allocation to Incorporate Domain Knowledge For Topic Transition Detection

Cited by: 0
Authors
Zhu, Xiaodan [1]
He, Xuming [1]
Munteanu, Cosmin [1]
Penn, Gerald [1]
Affiliation
[1] Univ Toronto, Dept Comp Sci, Toronto, ON, Canada
Keywords
slides transition detection; boundary detection
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper studies the automatic detection of topic transitions in recorded presentations. One way to detect them is to match slide content directly against presentation transcripts using a similarity metric. Such literal matching, however, misses domain-specific knowledge and is sensitive to speech recognition errors. We therefore incorporate relevant written materials, e.g., textbooks for lectures, which convey semantic relationships between words, in particular domain-specific ones. To this end, we train latent Dirichlet allocation (LDA) models on these materials and measure the similarity between slides and transcripts in the acquired hidden-topic space. This similarity is then combined with literal matching. Experiments show that the proposed approach reduces errors in slide transition detection by 17-41% on manual transcripts and 27-37% on automatic transcripts.
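The combination step described in the abstract can be illustrated with a minimal sketch. It assumes the per-document topic distributions have already been inferred by an LDA model trained on domain text; here they are hard-coded, hypothetical values. The abstract does not say how the two similarities are combined, so the linear interpolation and its `weight` parameter are assumptions, as are the function names `cosine`, `bag_of_words`, `combined_similarity`, and `best_slide`.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def bag_of_words(text):
    """Lower-cased whitespace tokenization into word counts."""
    return Counter(text.lower().split())


def combined_similarity(slide_text, segment_text,
                        slide_topics, segment_topics, weight=0.5):
    """Interpolate topic-space and literal similarity.

    ASSUMPTION: linear interpolation; the abstract only states that
    the two similarities are combined.
    """
    literal = cosine(bag_of_words(slide_text), bag_of_words(segment_text))
    topical = cosine(dict(enumerate(slide_topics)),
                     dict(enumerate(segment_topics)))
    return weight * topical + (1.0 - weight) * literal


def best_slide(slides, segment_text, segment_topics, weight=0.5):
    """Index of the slide most similar to a transcript segment."""
    scores = [
        combined_similarity(text, segment_text, topics, segment_topics, weight)
        for text, topics in slides
    ]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical data: (slide text, topic distribution assumed to come from LDA).
slides = [
    ("bayes rule prior posterior", [0.9, 0.1]),
    ("gradient descent learning rate", [0.1, 0.9]),
]
segment_text = "we update our belief after seeing evidence"
segment_topics = [0.8, 0.2]

chosen = best_slide(slides, segment_text, segment_topics)
```

In this toy case the transcript segment shares no words with either slide, so literal matching alone scores both slides at zero, while the topic-space term still separates them. That is the failure mode of literal matching the abstract points to.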
Pages: 2442-2445
Number of pages: 4