Topic detection and tracking for conversational content by using conceptual dynamic latent Dirichlet allocation

Cited by: 38
Authors
Yeh, Jui-Feng [1]
Tan, Yi-Shan [1]
Lee, Chen-Hsien [1]
Affiliations
[1] National Chiayi University, Department of Computer Science and Information Engineering, Chiayi, Taiwan
Keywords
Latent Dirichlet allocation; Hypernym; Speech act; Spoken language understanding; Topic detection and tracking; MODEL
DOI
10.1016/j.neucom.2016.08.017
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This study proposes a conceptual dynamic latent Dirichlet allocation (CDLDA) model for topic detection and tracking in conversational content. Topic detection and tracking is vital for conversational communication, especially for spoken interactions. Because topic transitions occur frequently in conversation (i.e., a single conversation usually contains many topics), language processors must detect the different topics in conversational content. To reflect the structure of spoken dialogue, a dynamic model is employed to capture the sequence of two adjacent topics in spoken content. The proposed model uses the proportions of verbs and nouns to analyze the similarity between utterances. An agglomerative clustering algorithm, based on an ontology defined in E-HowNet, clusters conversational utterances. Because the topic structure of conversational content is fragile, E-HowNet hypernym relationships of speech acts are used to obtain robust solutions, even for sparse data. Compared with the traditional latent Dirichlet allocation (LDA) model, which detects topics only through a bag-of-words technique, the proposed model considers temporal features by introducing dynamic concepts. Experimental results showed that the proposed approach outperformed the conventional dynamic LDA (DLDA), LDA, and support vector machine models and achieved strong performance for topic detection and tracking in conversations. (C) 2016 Elsevier B.V. All rights reserved.
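The utterance-clustering step mentioned in the abstract can be illustrated with a small sketch. This is not the authors' CDLDA implementation: it assumes utterances are already part-of-speech tagged, it substitutes plain Jaccard overlap of nouns and verbs for the paper's similarity measure, and it does not use E-HowNet at all. The toy data, the `N`/`V`/`A` tag names, the function names, and the `threshold` parameter are all hypothetical.

```python
from itertools import combinations

# Hypothetical toy input: each utterance is a list of (word, POS) pairs.
# A real system would get the tags from a POS tagger; these are hand-written.
UTTERANCES = [
    [("book", "V"), ("flight", "N"), ("Taipei", "N")],
    [("flight", "N"), ("leave", "V"), ("morning", "N")],
    [("order", "V"), ("noodles", "N")],
    [("noodles", "N"), ("taste", "V"), ("spicy", "A")],
]


def content_words(utterance):
    """Keep only nouns and verbs, returned as a set of words."""
    return {word for word, tag in utterance if tag in ("N", "V")}


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_similarity(c1, c2, bags):
    """Average-linkage similarity between two clusters of utterance indices."""
    pairs = [(i, j) for i in c1 for j in c2]
    return sum(jaccard(bags[i], bags[j]) for i, j in pairs) / len(pairs)


def agglomerate(utterances, threshold=0.1):
    """Merge the most similar clusters until none reaches the threshold."""
    bags = [content_words(u) for u in utterances]
    clusters = [[i] for i in range(len(bags))]
    while len(clusters) > 1:
        # Score every pair of clusters and pick the most similar one.
        (a, b), best = max(
            (((x, y), cluster_similarity(clusters[x], clusters[y], bags))
             for x, y in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best < threshold:
            break
        # a < b always holds here, so deleting index b is safe after the merge.
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]
    return clusters


if __name__ == "__main__":
    print(agglomerate(UTTERANCES))
```

In a setup closer to the one the abstract describes, the word-overlap score would presumably be replaced by a similarity that generalizes words to their hypernyms, which is what would let sparse utterances with no shared surface words still cluster together.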
Pages: 310-318
Page count: 9