Tailored AVX2 Transform Kernels for Versatile Video Coding

Cited by: 0
Authors
Siivonen, Kari [1]
Sainio, Joose [1]
Mercat, Alexandre [1]
Vanne, Jarno [1]
Affiliations
[1] Tampere Univ, Ultra Video Grp, Tampere, Finland
Funding
Academy of Finland
Keywords
Versatile Video Coding (VVC); transform; complexity reduction; Advanced Vector Extensions 2 (AVX2); practical encoder implementation
DOI
10.1109/NorCAS58970.2023.10305449
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Transform coding tools play an integral part in video codecs due to their substantial impact on coding efficiency. The latest video coding standard, Versatile Video Coding (VVC), makes the most of these tools by introducing new DST7, DCT8, and non-square transforms alongside the conventional DCT2 transform. This paper proposes optimized AVX2 kernels for all these transforms to speed up VVC coding. Unlike existing solutions, our kernels are specially tailored for each VVC transform type and block size. Accelerating our open-source uvg266 VVC encoder with the proposed kernels yields up to a 1.1x speedup under the all-intra (AI) coding condition without any coding overhead. Our implementations make forward DCT2 and DST7/DCT8 transforms 4.0x and 6.7x as fast as their respective scalar implementations in the VTM reference encoder. They also outpace the AVX2 kernels of the practical VVenC encoder by factors of 3.0x and 2.8x. With inverse transforms, the respective speedups rise to as much as 5.3x, 11.1x, 3.4x, and 3.0x.
Pages: 6