Probabilistic Latent Document Network Embedding

Cited by: 49
Authors
Le, Tuan M. V. [1]
Lauw, Hady W. [1]
Affiliations
[1] Singapore Management Univ, Sch Informat Syst, Singapore, Singapore
Keywords
document network; embedding; visualization; topic modeling; generative model; dimensionality reduction
DOI
10.1109/ICDM.2014.119
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
A document network refers to a data type that can be represented as a graph of vertices, where each vertex is associated with a text document. Examples of such a data type include hyperlinked Web pages, academic publications with citations, and user profiles in social networks. Such data have very high-dimensional representations, in terms of text as well as network connectivity. In this paper, we study the problem of embedding, or finding a low-dimensional representation of a document network that "preserves" the data as much as possible. These embedded representations are useful for various applications driven by dimensionality reduction, such as visualization or feature selection. While previous works in embedding have mostly focused on either the textual aspect or the network aspect, we advocate a holistic approach by finding a unified low-rank representation for both aspects. Moreover, to lend semantic interpretability to the low-rank representation, we further propose to integrate topic modeling and embedding within a joint model. The gist is to join the various representations of a document (words, links, topics, and coordinates) within a generative model, and to estimate the hidden representations through MAP estimation. We validate our model on real-life document networks, showing that it outperforms comparable baselines comprehensively on objective evaluation metrics.
Pages: 270-279
Page count: 10