Learning Data Dependency with Communication Cost

Cited by: 2
Authors
Jang, Hyeryung [1]
Song, HyungSeok [2]
Yi, Yung [2]
Affiliations
[1] Kings Coll London, Dept Informat, London, England
[2] Dept Elect Engn, Daejeon, South Korea
Keywords
Graph structure learning; Distributed inference; Sample complexity; BELIEF-PROPAGATION; INFERENCE; PRODUCT;
DOI
10.1145/3209582.3209600
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In this paper, we consider the problem of recovering a graph that represents the statistical data dependency among nodes from a set of data samples generated by those nodes. This graph provides the basic structure for performing an inference task such as MAP (maximum a posteriori), and the problem is referred to as structure learning. When nodes are spatially separated, running an inference algorithm requires a non-negligible amount of message passing, which incurs communication cost. Because the learnt data dependency graph and the physical connectivity graph often differ substantially, there is an inevitable trade-off between the accuracy of structure learning and the cost of performing a given message-passing based inference task. We formalize this trade-off as an optimization problem whose output is a data dependency graph that jointly accounts for learning accuracy and message-passing cost. We focus on distributed MAP as the target inference task because of its popularity, and consider two implementations, ASYNC-MAP and SYNC-MAP, which have different message-passing mechanisms and thus different cost structures. For ASYNC-MAP, we propose an optimal polynomial-time learning algorithm, motivated by the problem of finding a maximum weight spanning tree. For SYNC-MAP, we first prove that the problem is NP-hard and then propose a greedy heuristic. For both implementations, we use the large deviation principle to quantify how the probability that the learnt data graph differs from the ideal data graph decays as the number of data samples grows. The decay rate is characterized by topological properties of both the original data dependency graph and the physical connectivity graph, as well as by the degree of the trade-off, which provides a guideline on how many samples are needed to reach a given learning accuracy. We validate our theoretical findings through extensive simulations, which confirm a good match with the analysis.
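The maximum weight spanning tree idea mentioned in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: the record does not give the objective, so the sketch assumes a Chow-Liu style setup in which each candidate edge is scored by its empirical mutual information minus a penalty `gamma * hop_cost`, where `hop_cost` (a hypothetical per-edge communication cost, for example the hop distance in the physical graph) and `gamma` (a hypothetical trade-off weight) are names introduced here for illustration only.

```python
import math
from collections import Counter
from itertools import combinations


def empirical_mutual_information(samples, i, j):
    """Plug-in estimate (in nats) of the mutual information between columns i and j."""
    n = len(samples)
    joint = Counter((s[i], s[j]) for s in samples)
    marg_i = Counter(s[i] for s in samples)
    marg_j = Counter(s[j] for s in samples)
    return sum(
        (c / n) * math.log(c * n / (marg_i[a] * marg_j[b]))
        for (a, b), c in joint.items()
    )


def max_weight_spanning_tree(num_nodes, weights):
    """Kruskal's algorithm on edges sorted by decreasing weight, using union-find."""
    parent = list(range(num_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    for (i, j), _ in sorted(weights.items(), key=lambda kv: kv[1], reverse=True):
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # adding the edge does not create a cycle
            parent[root_i] = root_j
            tree.append((i, j))
    return sorted(tree)


def learn_tree(samples, hop_cost, gamma):
    """Assumed objective: mutual information minus gamma times a per-edge cost."""
    num_nodes = len(samples[0])
    weights = {
        (i, j): empirical_mutual_information(samples, i, j) - gamma * hop_cost[(i, j)]
        for i, j in combinations(range(num_nodes), 2)
    }
    return max_weight_spanning_tree(num_nodes, weights)


# Toy data: nodes 0 and 1 always agree; node 2 agrees with them 6 times out of 8.
samples = [
    (0, 0, 0), (0, 0, 0), (0, 0, 1), (1, 1, 1),
    (1, 1, 1), (1, 1, 0), (0, 0, 0), (1, 1, 1),
]
cheap_01 = {(0, 1): 1, (0, 2): 3, (1, 2): 1}
costly_01 = {(0, 1): 5, (0, 2): 1, (1, 2): 1}

tree_low_penalty = learn_tree(samples, cheap_01, gamma=0.01)
tree_high_penalty = learn_tree(samples, costly_01, gamma=1.0)
```

In this toy setup, a small penalty keeps the strongly dependent pair (0, 1) in the tree, while a large penalty on a costly (0, 1) link replaces it with cheaper edges. That illustrates, under the stated assumptions, how communication cost can pull the learnt graph away from the ideal data dependency graph.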
Pages: 171-180
Page count: 10