DeHIN: A Decentralized Framework for Embedding Large-Scale Heterogeneous Information Networks

Cited by: 3
Authors
Imran, Mubashir [1]
Yin, Hongzhi [1]
Chen, Tong [1]
Huang, Zi [1]
Zheng, Kai [2]
Affiliations
[1] Univ Queensland, Sch Informat Technol & Elect Engn, Brisbane, Qld 4072, Australia
[2] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu 610056, Sichuan, Peoples R China
Funding
National Natural Science Foundation of China; Australian Research Council
Keywords
Heterogeneous networks; Task analysis; Parallel processing; Data models; Pipelines; Computational modeling; Training; Decentralized network embedding; heterogeneous networks; link prediction; node classification;
DOI
10.1109/TKDE.2022.3141951
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Modeling heterogeneity by extracting and exploiting high-order information from heterogeneous information networks (HINs) has attracted immense research attention in recent times. Such heterogeneous network embedding (HNE) methods effectively harness the heterogeneity of small-scale HINs. However, in the real world, the size of HINs grows exponentially with the continuous introduction of new nodes and different types of links, turning them into billion-scale networks. Learning node embeddings on such HINs creates a performance bottleneck for existing HNE methods, which are commonly centralized, i.e., the complete data and the model both reside on a single machine. To address large-scale HNE tasks with strong efficiency and effectiveness guarantees, we present the Decentralized Embedding Framework for Heterogeneous Information Network (DeHIN) in this paper. In DeHIN, we generate a distributed parallel pipeline that uses hypergraphs to parallelize the HNE task. DeHIN presents a context-preserving partition mechanism that formulates a large HIN as a hypergraph whose hyperedges connect semantically similar nodes. Our framework then adopts a decentralized strategy that partitions HINs efficiently through a tree-like pipeline. Each resulting subnetwork is assigned to a distributed worker, which employs the deep information maximization theorem to locally learn node embeddings from the partition it receives. We further devise a novel embedding alignment scheme that precisely projects the independently learned node embeddings from all subnetworks onto a common vector space, thus enabling downstream tasks such as link prediction and node classification. As our experimental results show, DeHIN significantly improves the efficiency and accuracy of existing HNE models and outperforms large-scale graph embedding frameworks by scaling efficiently to large-scale HINs.
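The abstract describes formulating an HIN as a hypergraph and handing each partition to a separate worker. The toy sketch below illustrates that general idea only; it is not DeHIN's partitioning algorithm or its tree-like pipeline. The function names (`build_hyperedges`, `partition`), the rule of one hyperedge per "hub" node, the greedy overlap heuristic, and the `capacity` parameter are all assumptions made for illustration.

```python
from collections import defaultdict


def build_hyperedges(edges):
    """Group each hub node with its neighbors into one hyperedge.

    Assumption: a hub (e.g. a paper) and everything linked to it
    (e.g. its authors) count as "semantically similar" nodes.
    """
    hyperedges = defaultdict(set)
    for hub, neighbor in edges:
        hyperedges[hub].add(hub)
        hyperedges[hub].add(neighbor)
    return dict(hyperedges)


def partition(hyperedges, num_workers, capacity):
    """Greedily place each hyperedge on the worker it overlaps most.

    A worker is a candidate only if adding the hyperedge keeps it within
    `capacity` nodes; if no worker qualifies, all workers are considered.
    Ties are broken in favor of the emptier worker.
    """
    parts = [set() for _ in range(num_workers)]
    # Place large hyperedges first; sort by name to keep the order deterministic.
    order = sorted(hyperedges, key=lambda h: (-len(hyperedges[h]), h))
    for name in order:
        nodes = hyperedges[name]
        candidates = [p for p in parts if len(p | nodes) <= capacity] or parts
        best = max(candidates, key=lambda p: (len(p & nodes), -len(p)))
        best |= nodes  # in-place union, so the set inside `parts` is updated
    return parts


# Hypothetical (paper, author) links from a small bibliographic HIN.
edges = [
    ("p1", "a1"), ("p1", "a2"),
    ("p2", "a2"), ("p2", "a3"),
    ("p3", "a4"), ("p3", "a5"),
    ("p4", "a5"),
]
hyperedges = build_hyperedges(edges)
parts = partition(hyperedges, num_workers=2, capacity=6)
for worker_id, nodes in enumerate(parts):
    print(worker_id, sorted(nodes))
```

In this sketch, papers that share an author tend to land on the same worker, so each worker sees a locally coherent context. A node may still appear in more than one partition, which is the situation in which independently learned embeddings would need to be aligned onto a common space afterwards.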
Pages: 3645-3657
Page count: 13
Related papers
50 in total
  • [1] Decentralized Embedding Framework for Large-Scale Networks
    Imran, Mubashir
    Yin, Hongzhi
    Chen, Tong
    Shao, Yingxia
    Zhang, Xiangliang
    Zhou, Xiaofang
    [J]. DATABASE SYSTEMS FOR ADVANCED APPLICATIONS (DASFAA 2020), PT III, 2020, 12114 : 425 - 441
  • [2] A General Embedding Framework for Heterogeneous Information Learning in Large-Scale Networks
    Huang, Xiao
    Li, Jundong
    Zou, Na
    Hu, Xia
    [J]. ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, 2018, 12 (06)
  • [3] DDHH: A Decentralized Deep Learning Framework for Large-scale Heterogeneous Networks
    Imran, Mubashir
    Yin, Hongzhi
    Chen, Tong
    Huang, Zi
    Zhang, Xiangliang
    Zheng, Kai
    [J]. 2021 IEEE 37TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2021), 2021, : 2033 - 2038
  • [4] A flexible aggregation framework on large-scale heterogeneous information networks
    Yin, Dan
    Gao, Hong
    [J]. JOURNAL OF INFORMATION SCIENCE, 2017, 43 (02) : 186 - 203
  • [5] Large-Scale Heterogeneous Feature Embedding
    Huang, Xiao
    Song, Qingquan
    Yang, Fan
    Hu, Xia
    [J]. THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 3878 - 3885
  • [6] COSINE: Compressive Network Embedding on Large-Scale Information Networks
    Zhang, Zhengyan
    Yang, Cheng
    Liu, Zhiyuan
    Sun, Maosong
    Fang, Zhichong
    Zhang, Bo
    Lin, Leyu
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2022, 34 (08) : 3655 - 3668
  • [7] An Adaptive Embedding Framework for Heterogeneous Information Networks
    Chen, Daoyuan
    Li, Yaliang
    Ding, Bolin
    Shen, Ying
    [J]. CIKM '20: PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT, 2020, : 165 - 174
  • [8] PTE: Predictive Text Embedding through Large-scale Heterogeneous Text Networks
    Tang, Jian
    Qu, Meng
    Mei, Qiaozhu
    [J]. KDD'15: PROCEEDINGS OF THE 21ST ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2015, : 1165 - 1174
  • [9] A Framework of Transferring Structures Across Large-scale Information Networks
    Xue, Shan
    Lu, Jie
    Zhang, Guangquan
    Xiong, Li
    [J]. 2018 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2018,
  • [10] Decentralized Ranking in Large-Scale Overlay Networks
    Montresor, Alberto
    Jelasity, Mark
    Babaoglu, Ozalp
    [J]. SASOW 2008: SECOND IEEE INTERNATIONAL CONFERENCE ON SELF-ADAPTIVE AND SELF-ORGANIZING SYSTEMS WORKSHOPS, PROCEEDINGS, 2008, : 208 - +