Tailored AVX2 Transform Kernels for Versatile Video Coding

Authors
Siivonen, Kari [1]
Sainio, Joose [1]
Mercat, Alexandre [1]
Vanne, Jarno [1]
Affiliations
[1] Tampere Univ, Ultra Video Grp, Tampere, Finland
Funding
Academy of Finland
Keywords
Versatile Video Coding (VVC); transform; complexity reduction; Advanced Vector Extensions 2 (AVX2); practical encoder implementation
DOI
10.1109/NorCAS58970.2023.10305449
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Transform coding tools play an integral part in video codecs due to their substantial impact on coding efficiency. The latest video coding standard, Versatile Video Coding (VVC), makes the most of these tools by introducing new DST7, DCT8, and non-square transforms alongside the conventional DCT2 transform. This paper proposes optimized AVX2 kernels for all these transforms to speed up VVC coding. Unlike existing solutions, our kernels are specially tailored for each VVC transform type and block size. Accelerating our open-source uvg266 VVC encoder with the proposed kernels yields up to a 1.1x speedup under the all-intra (AI) coding condition without any coding overhead. Our implementations make forward DCT2 and DST7/DCT8 transforms 4.0x and 6.7x as fast as their respective scalar implementations in the VTM reference encoder. They also outpace the AVX2 kernels of the practical VVenC encoder by factors of 3.0x and 2.8x. With inverse transforms, the respective speedups rise to as much as 5.3x, 11.1x, 3.4x, and 3.0x.
Pages: 6