Optimizing image processing on multi-core CPUs with Intel parallel programming technologies

Authors
Cheong Ghil Kim
Jeom Goo Kim
Do Hyeon Lee
Affiliations
[1] Namseoul University, Department of Computer Science
[2] Namseoul University, IT Convergence Technology Research & Education Center
Keywords
Multi-core; Streaming SIMD extension; Threading building block; Sobel operator; Sub-word parallelism; Task-level parallelism; Multimedia
Abstract
The rapid advance of computer hardware and the popularity of multimedia applications have made multi-core processors with sub-word parallelism instructions a dominant market trend in desktop PCs as well as high-end mobile devices. This paper presents an efficient parallel implementation of the 2D convolution algorithm, which demands high-performance computing power, on multi-core desktop PCs. 2D convolution is a representative computation-intensive algorithm in image and signal processing applications and is accompanied by heavy memory access, although its computational complexity is relatively low. The purpose of this study is to explore the effectiveness of exploiting the Streaming SIMD (Single Instruction, Multiple Data) Extensions (SSE) technology and the TBB (Threading Building Blocks) run-time library on Intel multi-core processors. Doing so makes it possible to use all the hardware features of a multi-core processor concurrently, for both data- and task-level parallelism. For the performance evaluation, we implemented a 3 × 3 kernel-based convolution algorithm using SSE2 and TBB in different combinations and compared their processing speeds. The experimental results show that both technologies have a significant effect on performance and that processing speed can be greatly improved when the two technologies are used together; for example, speedups of 6.2, 6.1, and 1.4 times over implementations using either technology alone are reported for the 256 × 256, 512 × 512, and 1024 × 1024 data sets, respectively.
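The combination described in the abstract, sub-word (SIMD) parallelism inside each row plus task-level parallelism across rows, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes an 8-bit grayscale image, computes only the horizontal Sobel gradient as one instance of a 3 × 3 convolution, and uses `std::thread` over row bands as a stand-in for the TBB run-time library so that the example stays self-contained. The names `sobel_x_rows`, `sobel_x`, `load8_u16`, and `col_sum` are hypothetical. A scalar fallback is used when SSE2 is not available.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>
#if defined(__SSE2__)
#include <emmintrin.h>  // SSE2 intrinsics

// Load 8 bytes and widen them to eight unsigned 16-bit lanes.
static inline __m128i load8_u16(const std::uint8_t* p) {
    return _mm_unpacklo_epi8(_mm_loadl_epi64(reinterpret_cast<const __m128i*>(p)),
                             _mm_setzero_si128());
}
// Weighted column sum (weights 1, 2, 1) for eight adjacent columns.
static inline __m128i col_sum(const std::uint8_t* up, const std::uint8_t* mid,
                              const std::uint8_t* dn) {
    return _mm_add_epi16(_mm_add_epi16(load8_u16(up), load8_u16(dn)),
                         _mm_slli_epi16(load8_u16(mid), 1));
}
#endif

// Horizontal Sobel gradient for the interior pixels of rows [r0, r1).
// src: 8-bit grayscale, row-major, width w. dst: signed 16-bit, same layout.
void sobel_x_rows(const std::uint8_t* src, std::int16_t* dst, int w, int r0, int r1) {
    for (int r = r0; r < r1; ++r) {
        const std::uint8_t* up  = src + static_cast<std::ptrdiff_t>(r - 1) * w;
        const std::uint8_t* mid = src + static_cast<std::ptrdiff_t>(r) * w;
        const std::uint8_t* dn  = src + static_cast<std::ptrdiff_t>(r + 1) * w;
        std::int16_t* out = dst + static_cast<std::ptrdiff_t>(r) * w;
        int c = 1;
#if defined(__SSE2__)
        // Data-level parallelism: 8 output pixels per iteration.
        // The loads touch columns c-1 .. c+8, so c+8 must not exceed w-1.
        for (; c + 8 <= w - 1; c += 8) {
            __m128i left  = col_sum(up + c - 1, mid + c - 1, dn + c - 1);
            __m128i right = col_sum(up + c + 1, mid + c + 1, dn + c + 1);
            _mm_storeu_si128(reinterpret_cast<__m128i*>(out + c),
                             _mm_sub_epi16(right, left));
        }
#endif
        // Scalar tail (and full fallback when SSE2 is unavailable).
        for (; c < w - 1; ++c) {
            int left  = up[c - 1] + 2 * mid[c - 1] + dn[c - 1];
            int right = up[c + 1] + 2 * mid[c + 1] + dn[c + 1];
            out[c] = static_cast<std::int16_t>(right - left);
        }
    }
}

// Task-level parallelism: split the interior rows into bands, one thread each.
// Assumption: std::thread replaces the TBB scheduler used in the paper.
std::vector<std::int16_t> sobel_x(const std::vector<std::uint8_t>& img,
                                  int w, int h, int n_threads) {
    assert(w >= 3 && h >= 3 && img.size() == static_cast<std::size_t>(w) * h);
    std::vector<std::int16_t> out(img.size(), 0);  // border pixels stay 0
    n_threads = std::max(1, n_threads);
    const int band = (h - 2 + n_threads - 1) / n_threads;  // rows per thread
    std::vector<std::thread> pool;
    for (int r0 = 1; r0 < h - 1; r0 += band) {
        int r1 = std::min(r0 + band, h - 1);
        pool.emplace_back(sobel_x_rows, img.data(), out.data(), w, r0, r1);
    }
    for (auto& t : pool) t.join();
    return out;
}
```

Calling `sobel_x(img, w, h, 4)` on a row-major buffer of `w * h` bytes returns a gradient image of the same size. Each thread writes a disjoint band of output rows, so no locking is needed; a work-stealing scheduler such as TBB's would additionally balance bands of uneven cost, which this fixed split does not attempt.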
Pages: 237-251
Page count: 14