Reducing Last Level Cache Pollution in NUMA Multicore Systems for Improving Cache Performance

Cited by: 0

Authors:

An, Deukhyeon [1]

Kim, Jeehong [1]

Han, JungHyun [2]

Eom, Young Ik [1]

Affiliations:

[1] Sungkyunkwan Univ, Coll Informat & Commun Eng, 2066 Seobu Ro, Suwon 440746, Gyeong Gi Do, South Korea

[2] Korea Univ, Coll Informat & Commun, Seoul 136701, South Korea

Source:

COMPUTATIONAL SCIENCE AND ITS APPLICATIONS - ICCSA 2012, PT III | 2012, Vol. 7335

Funding:

National Research Foundation, Singapore;

Keywords:

Cache Pollution; Cache Performance; Last Level Cache; NUMA Scheduling; Task Characteristics; I/O Intensive Task;

DOI:

Not available

Chinese Library Classification number:

TP39 [Computer applications];

Discipline classification code:

081203 ; 0835 ;

Abstract:

A non-uniform memory architecture (NUMA) system has numerous nodes, each with a shared last level cache (LLC). The shared LLC brings many benefits in cache utilization. However, the LLC can be seriously polluted by tasks that generate heavy I/O traffic over long periods, because the inclusive cache architecture of the LLC replaces valid cache lines through back-invalidation. Much research on page coloring, partitioning, and pollute buffer mechanisms has addressed this cache pollution, but no scheduling approach considers I/O-intensive tasks in NUMA systems. OS scheduling that reduces cache pollution is therefore highly needed in NUMA systems. In this paper, we propose a software-based mechanism that reduces shared LLC misses in NUMA systems. Our mechanism consists of I/O traffic measurement and devil conscious scheduling. The experimental results show that the LLC miss rate can be reduced by up to 37.6%, and that our approach improves execution time by up to 1.48%.


Pages: 272-282

Page count: 11
