Improving AI-assisted video editing: Optimized footage analysis through multi-task learning

Authors
Li, Yuzhi [1]
Xu, Haojun [1]
Cai, Feifan [1]
Tian, Feng [1]
Affiliations
[1] Shanghai Univ, Shanghai, Peoples R China
Keywords
Footage analysis; Multi-task learning; AI-assisted video editing;
DOI
10.1016/j.neucom.2024.128485
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In recent years, AI-assisted video editing has shown promising applications. Accurate understanding and analysis of camera language is fundamental to video editing, because it guides the subsequent editing and production processes. However, many existing methods for camera language analysis pursue classification accuracy at the expense of computational efficiency and deployment requirements. Consequently, they often fail to meet the demands of scenarios with limited computing power, such as mobile devices. To address this challenge, this paper proposes an efficient multi-task camera language analysis pipeline based on shared representations. The approach employs a multi-task learning architecture with hard parameter sharing, in which the different camera language classification tasks use the same low-level feature extraction network and thereby implicitly learn feature representations of the footage. Each classification sub-task then independently learns the high-level semantic information corresponding to its camera language type. This method significantly reduces computational complexity and memory usage while facilitating efficient deployment on devices with limited computing power. Furthermore, to enhance performance, we introduce a dynamic task priority strategy and a conditional dataset downsampling strategy. The experimental results demonstrate that the proposed method achieved a comprehensive accuracy surpassing all previous methods. Moreover, on the 2-task dataset MovieShots, training time was reduced by 66.33%, inference cost decreased by 59.85%, and memory usage decreased by 31.95%; on the 4-task dataset AVE, training time was reduced by 95.34%, inference cost decreased by 97.23%, and memory usage decreased by 61.21%.
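The hard parameter sharing idea described in the abstract can be illustrated with a minimal, standard-library-only Python sketch. It is not the authors' implementation: the class `SharedTrunkClassifier`, the helper functions, the task names, the class counts, and the layer sizes are all hypothetical and chosen only for illustration. The sketch shows one shared feature extractor evaluated once per input, followed by a small independent head per task, and compares the parameter count against training one full model per task.

```python
import random


def linear(x, weights, bias):
    """Dense layer: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def init_layer(rng, n_in, n_out):
    """Return (weights, bias) with small random weights and zero bias."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def param_count(layer):
    weights, bias = layer
    return sum(len(row) for row in weights) + len(bias)


class SharedTrunkClassifier:
    """Hard parameter sharing: one shared trunk, one small head per task."""

    def __init__(self, n_in, n_hidden, task_classes, seed=0):
        rng = random.Random(seed)
        self.trunk = init_layer(rng, n_in, n_hidden)
        self.heads = {
            task: init_layer(rng, n_hidden, n_classes)
            for task, n_classes in task_classes.items()
        }
        self.trunk_calls = 0  # counts how often the shared features are computed

    def features(self, x):
        self.trunk_calls += 1
        return relu(linear(x, *self.trunk))

    def predict(self, x):
        shared = self.features(x)  # computed once, reused by every head
        return {task: linear(shared, *head) for task, head in self.heads.items()}


# Hypothetical task names and class counts (assumptions, not taken from the paper).
TASKS = {"shot_scale": 5, "shot_movement": 4}

model = SharedTrunkClassifier(n_in=8, n_hidden=16, task_classes=TASKS)
logits = model.predict([0.1] * 8)

head_params = sum(param_count(h) for h in model.heads.values())
shared_params = param_count(model.trunk) + head_params
separate_params = len(TASKS) * param_count(model.trunk) + head_params
print(f"shared: {shared_params} parameters, separate models: {separate_params}")
```

In this toy configuration the trunk is evaluated once for both tasks, and the saving in parameters and computation grows with the number of tasks, which is consistent in direction with the larger reductions the abstract reports for the 4-task dataset than for the 2-task one. The actual figures in the abstract depend on the authors' network and data, which this sketch does not model.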
Pages: 10