Scheduling dense linear algebra operations on multicore processors

Cited by: 49
Authors
Kurzak, Jakub [1]
Ltaief, Hatem [1]
Dongarra, Jack [1, 2, 3, 4]
Badia, Rosa M. [5, 6]
Affiliations
[1] Univ Tennessee, Dept Elect Engn & Comp Sci, Knoxville, TN 37996 USA
[2] Oak Ridge Natl Lab, Div Math & Comp Sci, Oak Ridge, TN 37831 USA
[3] Univ Manchester, Sch Math, Manchester, Lancs, England
[4] Univ Manchester, Sch Comp Sci, Manchester, Lancs, England
[5] Ctr Nacl Supercomputac, Barcelona, Spain
[6] Barcelona Supercomp Ctr, Barcelona, Spain
Source
Funding
National Science Foundation (USA);
Keywords
task graph; scheduling; multicore; linear algebra; factorization; Cholesky; LU; QR; direct acyclic graph; dynamic scheduling; matrix factorization; QR FACTORIZATION; RECURSION; EQUATIONS; LEADS;
DOI
10.1002/cpe.1467
Chinese Library Classification (CLC) number
TP31 [Computer software];
Discipline classification code
081202; 0835;
Abstract
State-of-the-art dense linear algebra software, such as the LAPACK and ScaLAPACK libraries, suffers performance losses on multicore processors due to its inability to fully exploit thread-level parallelism. At the same time, the coarse-grain dataflow model is gaining popularity as a paradigm for programming multicore architectures. This work looks at implementing classic dense linear algebra workloads (the Cholesky factorization, the QR factorization and the LU factorization) using dynamic data-driven execution. Two emerging approaches to implementing coarse-grain dataflow are examined: the model of nested parallelism, represented by the Cilk framework, and the model of parallelism expressed through an arbitrary Directed Acyclic Graph, represented by the SMP Superscalar framework. Performance and coding effort are analyzed and compared against code manually parallelized at the thread level. Copyright (C) 2009 John Wiley & Sons, Ltd.
Pages: 15-44
Page count: 30
Related papers
50 in total
  • [1] Multi-threaded dense linear algebra libraries for low-power asymmetric multicore processors
    Catalan, Sandra
    Herrero, Jose R.
    Igual, Francisco D.
    Rodriguez-Sanchez, Rafael
    Quintana-Orti, Enrique S.
    Adeniyi-Jones, Chris
    [J]. JOURNAL OF COMPUTATIONAL SCIENCE, 2018, 25 : 140 - 151
  • [2] Optimization of Linear Algebra Core Function Framework on Multicore Processors
    Fang, Zhi
    [J]. APPLIED MATHEMATICS AND NONLINEAR SCIENCES, 2022, 8 (01) : 1585 - 1596
  • [3] Scaling Dense Linear Algebra on Multicore and Beyond: a Survey
    Viviani, Paolo
    Drocco, Maurizio
    Aldinucci, Marco
    [J]. 2018 26TH EUROMICRO INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED, AND NETWORK-BASED PROCESSING (PDP 2018), 2018, : 751 - 758
  • [4] DVFS-control techniques for dense linear algebra operations on multi-core processors
    Alonso, Pedro
    Dolz, Manuel F.
    Igual, Francisco D.
    Mayo, Rafael
    Quintana-Orti, Enrique S.
    [J]. COMPUTER SCIENCE-RESEARCH AND DEVELOPMENT, 2012, 27 (04) : 289 - 298
  • [5] Fast Development of Dense Linear Algebra Codes on Graphics Processors
    Jesus Zafont, M.
    Martin, Alberto
    Igual, Francisco
    Quintana-Orti, Enrique S.
    [J]. 2009 IEEE INTERNATIONAL SYMPOSIUM ON PARALLEL & DISTRIBUTED PROCESSING, VOLS 1-5, 2009, : 1713 - 1720
  • [6] Adaptive Task Scheduling on Multicore Processors
    Nour, Samar
    Mahmoud, Shahira
    Saleh, Mohamed
    [J]. INTERNATIONAL CONFERENCE ON ADVANCED MACHINE LEARNING TECHNOLOGIES AND APPLICATIONS (AMLTA2018), 2018, 723 : 575 - 584
  • [7] Efficient Computation of Linkage Disequilibria as Dense Linear Algebra Operations
    Alachiotis, Nikolaos
    Popovici, Thom
    Low, Tze Meng
    [J]. 2016 IEEE 30TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW), 2016, : 418 - 427
  • [8] Analysis of dynamically scheduled tile algorithms for dense linear algebra on multicore architectures
    Haidar, Azzam
    Ltaief, Hatem
    YarKhan, Asim
    Dongarra, Jack
    [J]. CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2012, 24 (03) : 305 - 321
  • [9] Job Scheduling in a Computational Cluster with Multicore Processors
    Tran Thi Xuan
    Tien Van Do
    [J]. ADVANCED COMPUTATIONAL METHODS FOR KNOWLEDGE ENGINEERING (ICCSAMA 2016), 2016, 453 : 75 - 84
  • [10] Enhanced energy aware scheduling in multicore processors
    Kumar, K. Vinod
    Ranvijay
    [J]. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2018, 35 (02) : 1375 - 1385