An Aesthetic-Driven Approach to Unsupervised Video Summarization

Cited by: 1
Authors
Huang, Hongben [1]
Wu, Zaiqun [2]
Pang, Guangyao [3]
Xie, Jiehang [3]
Affiliations
[1] Wuzhou Univ, Guangxi Key Lab Machine Vis & Intelligent Control, Wuzhou 543002, Peoples R China
[2] Baise Univ, Baise 533000, Peoples R China
[3] Guangxi Coll & Univ Key Lab Intelligent Ind Softwa, Wuzhou 543002, Peoples R China
Source
IEEE ACCESS | 2024 / Vol. 12
Funding
National Natural Science Foundation of China;
Keywords
Video summarization; feature extraction; multimodal information; ATTENTION; NETWORK;
DOI
10.1109/ACCESS.2024.3434508
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Subject classification code
0812;
Abstract
The aim of video summarization is to condense lengthy videos into shorter versions, making them more accessible for viewing. Typically, people can identify important shots within a video by using audiovisual cues and assessing the aesthetic attributes of the frames. However, existing methods either focus only on unimodal features or neglect the aesthetic attributes of videos, resulting in the limited quality of the generated summaries. Particularly, the reliance on annotated data for training models also imposes limitations, as it not only demands significant time and resources but may not capture the diverse and subjective nature across different videos. To tackle these issues, we propose an aesthetic-driven approach to unsupervised video summarization, namely ADUVS. Specifically, ADUVS incorporates an aesthetics encoder to extract key aesthetic attributes. Additionally, we design a multimodal fusion module that assesses how different modalities of information complement each other and highlights the most relevant segments for the desired summary. Moreover, the training process for ADUVS does not require reliance on annotated data, thus reducing both time and labor costs. Extensive experiments demonstrate that our proposed method is better than various benchmark methods across commonly used evaluation metrics.
Pages: 128768-128777
Number of pages: 10