Learning Data Dependency with Communication Cost

Cited by: 2
Authors
Jang, Hyeryung [1]
Song, HyungSeok [2]
Yi, Yung [2]
Affiliations
[1] Kings Coll London, Dept Informat, London, England
[2] Dept Elect Engn, Daejeon, South Korea
Keywords
Graph structure learning; Distributed inference; Sample complexity; BELIEF-PROPAGATION; INFERENCE; PRODUCT;
DOI
10.1145/3209582.3209600
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In this paper, we consider the problem of recovering a graph that represents the statistical data dependency among nodes from a set of data samples generated by those nodes. Such a graph provides the basic structure for performing an inference task, such as MAP (maximum a posteriori) estimation. This problem is referred to as structure learning. When nodes are spatially separated, running an inference algorithm requires a non-negligible amount of message passing, which incurs communication cost. Because the learnt data dependency graph and the physical connectivity graph often differ substantially, there is an inevitable trade-off between the accuracy of structure learning and the cost of performing a given message-passing based inference task. We formalize this trade-off as an optimization problem whose output is a data dependency graph that jointly accounts for learning accuracy and message-passing cost. We focus on distributed MAP as the target inference task due to its popularity, and consider two implementations, ASYNC-MAP and SYNC-MAP, which have different message-passing mechanisms and thus different cost structures. For ASYNC-MAP, we propose an optimal polynomial-time learning algorithm, motivated by the problem of finding a maximum weight spanning tree. For SYNC-MAP, we first prove that the problem is NP-hard and then propose a greedy heuristic. For both implementations, we use the large deviation principle to quantify how the probability that the graph returned by the learning algorithm differs from the ideal data graph decays as the number of data samples grows. The decay rate is characterized by topological properties of both the original data dependency graph and the physical connectivity graph, as well as by the degree of the trade-off, which provides a guideline on how many samples are needed to reach a given learning accuracy. We validate our theoretical findings through extensive simulations, which show a good match with the analysis.
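The abstract describes the ASYNC-MAP learner only as "motivated by the problem of finding a maximum weight spanning tree" and does not give its objective. The sketch below is therefore an illustration under stated assumptions, not the authors' algorithm: it assumes a tree-structured dependency graph, scores each candidate edge by its empirical mutual information minus a penalty `gamma * cost[i][j]` (a hypothetical linear trade-off), and runs Kruskal's algorithm on those scores. The names `empirical_mutual_information`, `cost_aware_spanning_tree`, `cost`, and `gamma` are all invented for this example.

```python
import math
from collections import Counter
from itertools import combinations


def empirical_mutual_information(samples, i, j):
    """Plug-in estimate (in nats) of the mutual information between
    coordinates i and j of the sample tuples."""
    n = len(samples)
    joint = Counter((s[i], s[j]) for s in samples)
    marg_i = Counter(s[i] for s in samples)
    marg_j = Counter(s[j] for s in samples)
    mi = 0.0
    for (a, b), c in joint.items():
        # p(a,b) * log( p(a,b) / (p(a) p(b)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (marg_i[a] * marg_j[b]))
    return mi


def cost_aware_spanning_tree(samples, cost, gamma):
    """Maximum weight spanning tree (Kruskal) over the ASSUMED edge score
    MI(i, j) - gamma * cost[i][j]. Returns a sorted list of edges (i, j)."""
    d = len(samples[0])
    scored = sorted(
        (
            (empirical_mutual_information(samples, i, j) - gamma * cost[i][j], i, j)
            for i, j in combinations(range(d), 2)
        ),
        reverse=True,
    )
    parent = list(range(d))  # union-find forest used to detect cycles

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    for _, i, j in scored:
        ri, rj = find(i), find(j)
        if ri != rj:  # adding the edge keeps the graph acyclic
            parent[ri] = rj
            tree.append((i, j))
    return sorted(tree)
```

With `gamma = 0` this reduces to a plain mutual-information spanning tree. Increasing `gamma` pushes the learnt tree toward edges that are cheap on the physical graph, which is the kind of accuracy-versus-cost trade-off the abstract describes.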
Pages: 171-180
Number of pages: 10