50 items in total
- [21] Auto-Tuning GEMM Kernels for a Decoupled Access/Execute Architecture Processor. 2013 FIRST INTERNATIONAL SYMPOSIUM ON COMPUTING AND NETWORKING (CANDAR), 2013, : 233 - 239
- [22] Tensile: Auto-tuning GEMM GPU Assembly for All Problem Sizes. 2018 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW 2018), 2018, : 1066 - 1075
- [24] Optimizing and Auto-tuning Belief Propagation on the GPU. LANGUAGES AND COMPILERS FOR PARALLEL COMPUTING, 2011, 6548 : 121 - 135
- [25] SWIMM 2.0: Enhanced Smith–Waterman on Intel’s Multicore and Manycore Architectures Based on AVX-512 Vector Extensions. International Journal of Parallel Programming, 2019, 47 : 296 - 316
- [28] Vectorization of High-performance Scientific Calculations Using AVX-512 Intruction Set. Lobachevskii Journal of Mathematics, 2019, 40 : 580 - 598
- [29] Parallel GMRES Incomplete Orthogonalization Auto-Tuning. PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE (ICCS), 2011, 4 : 2246 - 2256