Joint dynamic topic model for recognition of lead-lag relationship in two text corpora

Cited by: 0
Authors
Yandi Zhu
Xiaoling Lu
Jingya Hong
Feifei Wang
Affiliations
[1] Peking University, School of Economics
[2] Renmin University of China, Center for Applied Statistics
[3] Renmin University of China, School of Statistics
Source
Data Mining and Knowledge Discovery, 2022, 36 (06)
Keywords
Dynamic topic models; Embedding; Lead-lag relationship; Topic evolution; Variational Bayesian
DOI
Not available
Abstract
Topic evolution modeling has received significant attention in recent decades. Although various topic evolution models have been proposed, most studies focus on a single document corpus. In practice, however, data are often available from multiple sources, and relationships between those sources can be observed. It is therefore of great interest to recognize the relationship between multiple text corpora and to use it to improve topic modeling. In this work, we focus on a special type of relationship between two text corpora, which we define as the "lead-lag relationship". This relationship characterizes the phenomenon that one text corpus influences the topics discussed in the other text corpus in the future. To discover the lead-lag relationship, we propose a joint dynamic topic model and develop an embedding extension to address the modeling of large-scale text corpora. With the recognized lead-lag relationship, the similarities between the two text corpora can be identified and the quality of topic learning in both corpora can be improved. We numerically investigate the performance of the joint dynamic topic modeling approach using synthetic data. Finally, we apply the proposed model to two text corpora consisting of statistical papers and graduation theses. The results show that the proposed model recognizes the lead-lag relationship between the two corpora well, and that it also discovers the specific and shared topic patterns in the two corpora.
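To make the notion of a lead-lag relationship concrete, the sketch below estimates by how many time steps one topic-proportion series trails another. It is not the joint dynamic topic model described in the abstract. It is a simple lagged-correlation heuristic, and the function names (`pearson`, `best_lag`) and the yearly topic proportions are hypothetical values chosen only for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for a constant series
    return cov / sqrt(var_x * var_y)


def best_lag(leader, follower, max_lag):
    """Return (lag, corr) maximizing corr(leader[t], follower[t + lag])
    over lag = 0..max_lag. Illustrative heuristic only."""
    scores = {}
    for lag in range(max_lag + 1):
        x = leader[:len(leader) - lag]  # leader values at time t
        y = follower[lag:]              # follower values at time t + lag
        scores[lag] = pearson(x, y)
    lag = max(scores, key=scores.get)
    return lag, scores[lag]


# Hypothetical yearly proportions of one topic in two corpora.
# The thesis series repeats the paper series two years later.
papers = [0.05, 0.10, 0.20, 0.35, 0.30, 0.15, 0.10, 0.08]
theses = [0.04, 0.04, 0.05, 0.10, 0.20, 0.35, 0.30, 0.15]

lag, corr = best_lag(papers, theses, max_lag=3)
print(lag)  # 2, by construction of the toy data
```

A model-based approach such as the one summarized above would infer the lead-lag structure jointly with the topics themselves, whereas this sketch assumes the topic proportions are already known.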
Pages: 2272-2298
Page count: 26
Citation
Zhu, Yandi; Lu, Xiaoling; Hong, Jingya; Wang, Feifei. Joint dynamic topic model for recognition of lead-lag relationship in two text corpora [J]. Data Mining and Knowledge Discovery, 2022, 36 (06): 2272-2298.