Exploring challenge and explainable shot type classification using SAM-guided approaches

Authors
Fengtian Lu
Yuzhi Li
Feng Tian
Affiliations
[1] Shanghai University, Shanghai Film Academy
Source
Keywords
Shot attribute analysis; Shot type classification; Segment anything; Supplementary dataset;
DOI
Not available
Chinese Library Classification number
Subject classification number
Abstract
The language of film shots is an important component of cinematic narrative: it visually conveys story, emotion, and theme, making film a highly expressive and engaging art form. Previous methods for analyzing film shot attributes have focused mainly on movement and scale, with little interpretable research on the results of shot type analysis. In this study, we build a new dataset that broadens the scope of existing shot attribute analysis tasks, for example by distinguishing film composition, and we introduce a new task: recognizing the key objects that determine shot attributes. Specifically, we propose a framework that uses clues from the Detection Transformer (DETR) to guide Segment Anything (SAM) mask segmentation for classifying shot attributes. To address the variable number of key objects within shots, we develop an adaptive weight allocation strategy that enhances network training and handles the new task more effectively. Additionally, we extract optical flow magnitude and angle information from each pair of frames to enhance training effectiveness. Experimental results on MovieShots and our dataset demonstrate that the proposed method surpasses all prior approaches.
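The optical flow features mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which flow estimator is used or how the features are encoded, so everything below is an assumption: the dense flow field is taken as already computed for one pair of frames, and the hypothetical helper `flow_to_polar` only shows the conversion from per-pixel displacement (dx, dy) to magnitude and angle.

```python
import math


def flow_to_polar(flow):
    """Convert a dense flow field into magnitude and angle maps.

    `flow` is a list of rows; each row is a list of (dx, dy) displacements
    for one pixel between two consecutive frames (assumed precomputed).
    Angles are in radians, wrapped to be non-negative.
    """
    magnitude, angle = [], []
    for row in flow:
        # Euclidean length of each displacement vector.
        magnitude.append([math.hypot(dx, dy) for dx, dy in row])
        # Direction of motion; atan2 returns (-pi, pi], so wrap it.
        angle.append([math.atan2(dy, dx) % (2 * math.pi) for dx, dy in row])
    return magnitude, angle


# Toy 2x2 flow field standing in for the output of a real flow estimator.
flow = [
    [(3.0, 4.0), (0.0, 0.0)],
    [(0.0, 2.0), (-1.0, 0.0)],
]
mag, ang = flow_to_polar(flow)
```

In practice the two maps would likely be stacked as extra input channels alongside the frames, but that design choice is a guess and is not stated in the abstract.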
Pages: 2533-2542
Page count: 9
Related papers
18 in total
  • [1] Exploring challenge and explainable shot type classification using SAM-guided approaches
    Lu, Fengtian
    Li, Yuzhi
    Tian, Feng
    SIGNAL IMAGE AND VIDEO PROCESSING, 2024, 18 (03) : 2533 - 2542
  • [2] ProZe: Explainable and Prompt-Guided Zero-Shot Text Classification
    Harrando, Ismail
    Reboud, Alison
    Schleider, Thomas
    Ehrhart, Thibault
    Troncy, Raphael
    IEEE INTERNET COMPUTING, 2022, 26 (06) : 69 - 77
  • [3] HistoEM: A Pathologist-Guided and Explainable Workflow Using Histogram Embedding for Gland Classification
    Ferrero, Alessandro
    Ghelichkhan, Elham
    Manoochehri, Hamid
    Ho, Man Minh
    Albertson, Daniel J.
    Brintz, Benjamin J.
    Tasdizen, Tolga
    Whitaker, Ross T.
    Knudsen, Beatrice S.
    MODERN PATHOLOGY, 2024, 37 (04)
  • [4] Classification of Impact Echo Signals Using Explainable Deep Learning and Transfer Learning Approaches
    Torlapati, Rahul
    Azari, Hoda
    Shokouhi, Parisa
    TRANSPORTATION RESEARCH RECORD, 2023, 2677 (09) : 464 - 477
  • [5] Few-shot learning using explainable Siamese twin network for the automated classification of blood cells
    Tummala, Sudhakar
    Suresh, Anil K.
    MEDICAL & BIOLOGICAL ENGINEERING & COMPUTING, 2023, 61 (06) : 1549 - 1563
  • [6] Few-shot learning using explainable Siamese twin network for the automated classification of blood cells
    Sudhakar Tummala
    Anil K. Suresh
    Medical & Biological Engineering & Computing, 2023, 61 : 1549 - 1563
  • [7] Shot type classification in sports video using fuzzy information granular
    Lang, CY
    Xu, D
    Cheng, WG
    Jiang, YW
    KNOWLEDGE-BASED INTELLIGENT INFORMATION AND ENGINEERING SYSTEMS, PT 2, PROCEEDINGS, 2005, 3682 : 1217 - 1223
  • [8] PFEMed: Few-shot medical image classification using prior guided feature enhancement
    Dai, Zhiyong
    Yi, Jianjun
    Yan, Lei
    Xu, Qingwen
    Hu, Liang
    Zhang, Qi
    Li, Jiahui
    Wang, Guoqiang
    PATTERN RECOGNITION, 2023, 134
  • [9] Exploring Nutritional Influence on Blood Glucose Forecasting for Type 1 Diabetes Using Explainable AI
    Annuzzi, Giovanni
    Apicella, Andrea
    Arpaia, Pasquale
    Bozzetto, Lutgarda
    Criscuolo, Sabatina
    De Benedetto, Egidio
    Pesola, Marisa
    Prevete, Roberto
    IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2024, 28 (05) : 3123 - 3133
  • [10] Exploring the Impact of Image-Based Audio Representations in Classification Tasks Using Vision Transformers and Explainable AI Techniques
    Masri, Sari
    Hasasneh, Ahmad
    Tami, Mohammad
    Tadj, Chakib
    Information (Switzerland), 2024, 15 (12)