Partitioning Streaming Parallelism for Multi-cores: A Machine Learning Based Approach

Cited by: 0
Authors
Wang, Zheng [1]
O'Boyle, Michael F. P. [1]
Affiliations
[1] Univ Edinburgh, Sch Informat, Inst Comp Syst Architecture, Edinburgh EH8 9YL, Midlothian, Scotland
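Keywords
Compiler Optimization; Machine Learning; Partitioning Streaming Parallelism
DOI
Not available
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract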
Stream-based languages are a popular approach to expressing parallelism in modern applications. The efficient mapping of streaming parallelism to multi-core processors is, however, highly dependent on the program and the underlying architecture. We address this by developing a portable and automatic compiler-based approach to partitioning streaming programs using machine learning. Our technique predicts the ideal partition structure for a given streaming application using prior knowledge learned off-line. Using the predictor, we rapidly search the program space (without executing any code) to generate and select a good partition. We applied this technique to standard StreamIt applications and compared it against existing approaches. On a 4-core platform, our approach achieves 60% of the best performance found by iteratively compiling and executing over 3000 different partitions per program. We obtain, on average, a 1.90x speedup over the already tuned partitioning scheme of the StreamIt compiler. When compared against a state-of-the-art analytical, model-based approach, we achieve, on average, a 1.77x performance improvement. Ported to an 8-core platform, our approach obtains a 1.8x improvement over the StreamIt default scheme, demonstrating its portability.
Pages: 307-318
Page count: 12