Extendable MQTT Broker for Feedback-based Resource Management in Large-scale Computing Environments

Cited by: 0
Authors
Ouchi, Ryo [1 ]
Sakamoto, Ryuichi [1 ]
Affiliations
[1] Tokyo Inst Technol, Tokyo, Japan
Source
PROCEEDINGS OF THE 7TH ASIA-PACIFIC WORKSHOP ON NETWORKING, APNET 2023 | 2023
DOI
10.1145/3600061.3603129
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
High-performance computing (HPC) systems demand continuous monitoring to ensure efficient resource allocation and application performance. Recent studies indicate that real-time resource utilization monitoring can significantly improve the performance of dynamic scheduling algorithms. However, latency induced by the protocol stack heavily impacts the effectiveness of dynamic scheduling. In this paper, we propose a novel monitoring system that implements the protocol stack on a Field-Programmable Gate Array (FPGA) and adopts a publish/subscribe (pub/sub) communication protocol. Specifically, by introducing an FPGA-based protocol stack, we substantially reduce the latency of protocol stack processing and enable the implementation of custom plugins at the application layer (L7). Our experiments demonstrate that the proposed system effectively reduces protocol stack latency and, with the extensibility provided by user-defined plugins, offers great potential for a wide range of HPC monitoring and feedback applications.
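The pub/sub pattern with user-defined plugins described in the abstract can be illustrated in software. The following is a minimal sketch only, not the authors' FPGA implementation: the `Broker` class, the `topic_matches` helper, the `drop_low_utilization` plugin, the `hpc/<node>/cpu` topic layout, and the 0.5 threshold are all hypothetical names and values chosen for illustration. It assumes MQTT-style topic filters, where `+` matches one level and `#` matches all remaining levels, and it models a plugin as a function that may rewrite or drop a message before delivery.

```python
from typing import Callable, List, Optional, Tuple

Handler = Callable[[str, bytes], None]
# A plugin returns a (possibly modified) payload, or None to drop the message.
Plugin = Callable[[str, bytes], Optional[bytes]]


def topic_matches(topic_filter: str, topic: str) -> bool:
    """MQTT-style match: '+' matches one level, a trailing '#' matches the rest."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # '#' is only valid as the last level of a filter.
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    return len(filter_levels) == len(topic_levels)


class Broker:
    """Hypothetical in-process broker with a plugin chain (illustration only)."""

    def __init__(self) -> None:
        self._subscriptions: List[Tuple[str, Handler]] = []
        self._plugins: List[Plugin] = []

    def subscribe(self, topic_filter: str, handler: Handler) -> None:
        self._subscriptions.append((topic_filter, handler))

    def add_plugin(self, plugin: Plugin) -> None:
        self._plugins.append(plugin)

    def publish(self, topic: str, payload: bytes) -> int:
        """Run plugins in order, then deliver; return the number of deliveries."""
        for plugin in self._plugins:
            result = plugin(topic, payload)
            if result is None:
                return 0  # a plugin dropped the message
            payload = result
        delivered = 0
        for topic_filter, handler in self._subscriptions:
            if topic_matches(topic_filter, topic):
                handler(topic, payload)
                delivered += 1
        return delivered


def drop_low_utilization(topic: str, payload: bytes) -> Optional[bytes]:
    """Hypothetical plugin: forward CPU samples only at or above an assumed 0.5."""
    if topic.endswith("/cpu") and float(payload.decode()) < 0.5:
        return None
    return payload
```

In this sketch a scheduler would subscribe to a filter such as `hpc/+/cpu` and register `drop_low_utilization` with `add_plugin`, so that only samples from heavily loaded nodes reach it. Filtering inside the broker rather than in each subscriber is one way a plugin mechanism could reduce the feedback traffic a scheduler has to process.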
Pages: 190-191
Page count: 2