Analyzing Bayesian Crosslingual Transfer in Topic Models

Authors
Hao, Shudong [1]
Paul, Michael J. [1]
Affiliations
[1] Univ Colorado Boulder, Informat Sci, Boulder, CO 80309 USA
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We introduce a theoretical analysis of crosslingual transfer in probabilistic topic models. By formulating posterior inference through Gibbs sampling as a process of language transfer, we propose a new measure that quantifies the loss of knowledge across languages during this process. This measure enables us to derive a PAC-Bayesian bound that elucidates the factors affecting model quality, both during training and in downstream applications. We provide experimental validation of the analysis on a diverse set of five languages, and discuss best practices for data collection and model design based on our analysis.
Pages: 1551-1565
Number of pages: 15
Related papers
50 in total
  • [1] An Empirical Study on Crosslingual Transfer in Probabilistic Topic Models
    Hao, Shudong
    Paul, Michael J.
    [J]. COMPUTATIONAL LINGUISTICS, 2020, 46 (01) : 95 - 134
  • [2] Crosslingual Topic Modeling with WikiPDA
    Piccardi, Tiziano
    West, Robert
    [J]. PROCEEDINGS OF THE WORLD WIDE WEB CONFERENCE 2021 (WWW 2021), 2021, : 3032 - 3041
  • [3] Bayesian Bridging Topic Models for Classification
    Wu, Meng-Sung
    [J]. JOURNAL OF INFORMATION SCIENCE AND ENGINEERING, 2014, 30 (05) : 1585 - 1600
  • [4] A Method for Analyzing Solution Diversity in Topic Models
    Uchiyama, Toshio
    [J]. PROCEEDINGS OF 2018 5TH INTERNATIONAL CONFERENCE ON BUSINESS AND INDUSTRIAL RESEARCH (ICBIR): SMART TECHNOLOGY FOR NEXT GENERATION OF INFORMATION, ENGINEERING, BUSINESS AND SOCIAL SCIENCE, 2018, : 29 - 34
  • [5] Analyzing user reviews in tourism with topic models
    Rossetti, Marco
    Stella, Fabio
    Zanker, Markus
    [J]. INFORMATION TECHNOLOGY & TOURISM, 2016, 16 (01) : 5 - 21
  • [6] Analyzing the history of Cognition using Topic Models
    Priva, Uriel Cohen
    Austerweil, Joseph L.
    [J]. COGNITION, 2015, 135 : 4 - 9
  • [7] Bayesian Analysis of Dynamic Linear Topic Models
    Glynn, Chris
    Tokdar, Surya T.
    Howard, Brian
    Banks, David L.
    [J]. BAYESIAN ANALYSIS, 2019, 14 (01) : 53 - 80
  • [8] Diagnosing and Improving Topic Models by Analyzing Posterior Variability
    Xing, Linzi
    Paul, Michael J.
    [J]. THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / EIGHTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018, : 6005 - 6012
  • [9] Analyzing Topic Models: A Tourism Recommender System Perspective
    Kamal, Maryam
    Romani, Gianfranco
    Ricciuti, Giuseppe
    Anagnostopoulos, Aris
    Chatzigiannakis, Ioannis
    [J]. ADVANCED INFORMATION NETWORKING AND APPLICATIONS, VOL 2, AINA 2024, 2024, 200 : 250 - 262
  • [10] A Bayesian Method for Analyzing Lateral Gene Transfer
    Sjostrand, Joel
    Tofigh, Ali
    Daubin, Vincent
    Arvestad, Lars
    Sennblad, Bengt
    Lagergren, Jens
    [J]. SYSTEMATIC BIOLOGY, 2014, 63 (03) : 409 - 420