Applying Transformer-Based Computer Vision Models to Adaptive Bitrate Allocation for 360° Live Streaming

Cited by: 2
Authors
Ao, Alice [1]
Park, Sohee [1]
Affiliations
[1] Yale Univ, Comp Sci, New Haven, CT 06520 USA
Keywords
360° Video; 360° Live Streaming; Adaptive Streaming; Transformers; Machine Learning
DOI
10.1109/WCNC57260.2024.10571028
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Despite the growing popularity of virtual reality (VR) and 360° video, 360° content remains expensive and difficult to stream. 360° live streaming is especially challenging because it requires high bandwidth and low latency to avoid quality degradation and motion sickness. This paper explores how adaptive bitrate allocation, in which only the user's predicted viewport is streamed in high quality and the rest of the view is streamed in low quality, can increase viewport quality. Transformer-based saliency models pre-trained on 2D images are used for viewport prediction. Key contributions include (1) determining whether transformer-based models for 2D images are effective for saliency detection in 360° content, (2) examining the viewport prediction accuracy of saliency-only models, and (3) a novel bitrate allocation algorithm. Empirical results demonstrate that, even without access to head-movement data or fine-tuning, these models increase quality in the user's perceived viewport relative to traditional non-adaptive streaming.
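The general idea of saliency-driven bitrate allocation can be illustrated with a minimal Python sketch. This is not the paper's algorithm, which the abstract does not specify. It assumes the frame is split into tiles, that each tile has a non-negative saliency score, and that every tile gets a low-quality floor before the remaining budget is shared in proportion to saliency. The function name allocate_bitrates and its parameters are hypothetical.

```python
from typing import List


def allocate_bitrates(
    saliency: List[float], budget_kbps: float, floor_kbps: float
) -> List[float]:
    """Split a total bitrate budget across tiles (illustrative only).

    Every tile first receives floor_kbps, the assumed low-quality baseline.
    The budget left over is then shared in proportion to each tile's
    saliency score, so tiles likely to be in the viewport get more bits.
    """
    n = len(saliency)
    if n == 0:
        return []
    if any(s < 0 for s in saliency):
        raise ValueError("saliency scores must be non-negative")

    base = floor_kbps * n
    if base > budget_kbps:
        raise ValueError("budget cannot cover the per-tile floor")

    remaining = budget_kbps - base
    total = sum(saliency)
    if total == 0:
        # No tile stands out: fall back to a uniform (non-adaptive) split.
        return [floor_kbps + remaining / n] * n

    return [floor_kbps + remaining * s / total for s in saliency]


if __name__ == "__main__":
    # Four tiles; the third is the most salient.
    print(allocate_bitrates([0.0, 1.0, 3.0, 0.0], 1000.0, 100.0))
```

With a zero saliency map the sketch degrades to a uniform split, which corresponds to the non-adaptive baseline the abstract compares against.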
Pages: 6