Deep Multimodality Learning for UAV Video Aesthetic Quality Assessment

Cited by: 19
Authors
Kuang, Qi [1]
Jin, Xin [2]
Zhao, Qinping [1]
Zhou, Bin [1]
Affiliations
[1] Beihang Univ, State Key Lab Virtual Real Technol & Syst, Sch Comp Sci & Engn, Beijing 100191, Peoples R China
[2] Beijing Elect Sci & Technol Inst, Beijing 100070, Peoples R China
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China
Keywords
Aesthetic quality assessment; aerial video aesthetic; deep multimodality learning; obstacle avoidance; classification; categorization; generation; photo
DOI
10.1109/TMM.2019.2960656
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Despite the growing number of unmanned aerial vehicles (UAVs) and aerial videos, few studies focus on the aesthetics of aerial videos, although such work can provide valuable information for improving the aesthetic quality of aerial photography. In this article, we present a deep multimodality learning method for UAV video aesthetic quality assessment. Specifically, a multistream framework is designed to exploit aesthetic attributes from multiple modalities, including spatial appearance, drone camera motion, and scene structure. A specially designed motion stream network is proposed for this multistream framework. We construct a dataset of 6,000 UAV video shots captured by drone cameras. Our model can judge whether a UAV video was shot by professional photographers or amateurs and, at the same time, classify the scene type. The experimental results show that our method outperforms video classification methods and traditional SVM-based methods for video aesthetics. In addition, we present three application examples of the proposed method: UAV video grading, professional segment detection, and aesthetic-based UAV path planning.
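The abstract does not specify how the three streams are combined, so the following is only a minimal sketch of one common option, late fusion by concatenation followed by two classification heads. It is not the authors' network: the feature sizes, the five scene classes, the random weights, and every function name (`make_layer`, `fuse_and_classify`, and so on) are hypothetical and chosen purely for illustration.

```python
import math
import random


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(features, weights, bias):
    """Apply one fully connected layer: weights @ features + bias."""
    return [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, bias)
    ]


def make_layer(rng, n_out, n_in):
    """Create a randomly initialised layer (hypothetical, untrained)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def fuse_and_classify(appearance, motion, structure, aesthetic_head, scene_head):
    """Concatenate per-stream features, then run both task heads."""
    fused = list(appearance) + list(motion) + list(structure)
    aesthetic_probs = softmax(linear(fused, *aesthetic_head))
    scene_probs = softmax(linear(fused, *scene_head))
    return aesthetic_probs, scene_probs


rng = random.Random(0)
# Stand-ins for the outputs of the three stream networks (assumed sizes).
appearance = [rng.random() for _ in range(4)]  # spatial appearance
motion = [rng.random() for _ in range(3)]      # drone camera motion
structure = [rng.random() for _ in range(2)]   # scene structure

aesthetic_head = make_layer(rng, 2, 9)  # professional vs. amateur
scene_head = make_layer(rng, 5, 9)      # 5 scene types is an assumption

aesthetic_probs, scene_probs = fuse_and_classify(
    appearance, motion, structure, aesthetic_head, scene_head
)
print("professional/amateur:", aesthetic_probs)
print("scene type:", scene_probs)
```

Sharing one fused feature vector between both heads mirrors the abstract's statement that the aesthetic judgment is made together with scene type classification; a trained system would replace the random stand-in features with learned stream outputs.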
Pages: 2623-2634
Page count: 12