Multi-label movie genre classification based on multimodal fusion

Cited by: 7
Authors
Cai, Zihui [1]
Ding, Hongwei [1]
Wu, Jinlu [1]
Xi, Ying [1]
Wu, Xuemeng [1]
Cui, Xiaohui [1]
Affiliations
[1] Wuhan Univ, Minist Educ, Key Lab Aerosp Informat Secur & Trusted Comp, Sch Cyber Sci & Engn, Wuhan, Peoples R China
Keywords
Multi-label; Movie genre classification; Multimodal fusion; Deep learning; RECOGNITION; NETWORK;
DOI
10.1007/s11042-023-16121-2
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Determining a movie's genres from its associated information is a challenging multi-label classification task. Previous studies have tended to classify movies using only one or two modalities, ignoring other valuable ones. We therefore propose a multimodal movie genre classification framework that jointly considers data from several modalities: audio, poster, plot, and frame sequences from video. Specifically, the framework processes each modality with deep learning techniques and fuses them through decision-level fusion and intermediate fusion (concatenation and element-wise sum), which improves classification performance by exploiting the complementary information across modalities. We train and evaluate the proposed framework on the LMTD-9 dataset. The results show that our best multimodal model outperforms state-of-the-art methods by 8.6% in AU(PRC) and 5.3% in AU(PRC)_w. This indicates that multimodal fusion can effectively improve movie genre classification performance.
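The fusion strategies named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The feature size, the random linear heads, the use of averaging for decision-level fusion, and the 0.5 threshold are all assumptions made for illustration. Nine genre labels are assumed, following the dataset name LMTD-9.

```python
import numpy as np

def sigmoid(x):
    """Map raw scores to independent per-genre probabilities (multi-label)."""
    return 1.0 / (1.0 + np.exp(-x))

def fuse_concat(features):
    """Intermediate fusion: join per-modality feature vectors end to end."""
    return np.concatenate(features, axis=-1)

def fuse_sum(features):
    """Intermediate fusion: element-wise sum; all vectors must share one size."""
    return np.sum(np.stack(features, axis=0), axis=0)

def fuse_decisions(probabilities):
    """Decision-level fusion: average the per-modality genre probabilities.
    Averaging is an assumption; the abstract does not state the rule used."""
    return np.mean(np.stack(probabilities, axis=0), axis=0)

rng = np.random.default_rng(0)
num_genres = 9   # assumed from the dataset name LMTD-9
dim = 16         # hypothetical feature size per modality
modalities = ["audio", "poster", "plot", "frames"]

# Stand-ins for the outputs of per-modality deep encoders.
features = [rng.normal(size=dim) for _ in modalities]

concat = fuse_concat(features)  # shape (64,)
summed = fuse_sum(features)     # shape (16,)

# Hypothetical linear multi-label head on the concatenated features.
w_concat = rng.normal(size=(concat.shape[0], num_genres))
probs_concat = sigmoid(concat @ w_concat)

# Decision-level fusion: one hypothetical head per modality, then average.
heads = [rng.normal(size=(dim, num_genres)) for _ in modalities]
per_modality_probs = [sigmoid(f @ w) for f, w in zip(features, heads)]
probs_late = fuse_decisions(per_modality_probs)

# A movie may receive several genres at once, so threshold each label separately.
predicted = probs_late >= 0.5
```

Concatenation keeps every modality's features separate at the cost of a wider input to the classifier, element-wise sum keeps the width fixed but requires equally sized vectors, and decision-level fusion lets each modality be trained and evaluated independently before combining.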
Pages: 36823-36840
Page count: 18
Related papers
50 in total
  • [1] Multi-label movie genre classification based on multimodal fusion
    Zihui Cai
    Hongwei Ding
    Jinlu Wu
    Ying Xi
    Xuemeng Wu
    Xiaohui Cui
    [J]. Multimedia Tools and Applications, 2024, 83 : 36823 - 36840
  • [2] A multimodal approach for multi-label movie genre classification
    Rafael B. Mangolin
    Rodolfo M. Pereira
    Alceu S. Britto
    Carlos N. Silla
    Valéria D. Feltrim
    Diego Bertolini
    Yandre M. G. Costa
    [J]. Multimedia Tools and Applications, 2022, 81 : 19071 - 19096
  • [3] A multimodal approach for multi-label movie genre classification
    Mangolin, Rafael B.
    Pereira, Rodolfo M.
    Britto, Alceu S., Jr.
    Silla, Carlos N., Jr.
    Feltrim, Valeria D.
    Bertolini, Diego
    Costa, Yandre M. G.
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2022, 81 (14) : 19071 - 19096
  • [4] Evaluating multimodal strategies for multi-label movie genre classification
    Paulino, Marco Aurelio D.
    Costa, Yandre M. G.
    Feltrim, Valeria D.
    [J]. 2022 29TH INTERNATIONAL CONFERENCE ON SYSTEMS, SIGNALS AND IMAGE PROCESSING (IWSSIP), 2022,
  • [5] Video Representation Fusion Network For Multi-Label Movie Genre Classification
    Bi, Tianyu
    Jarnikov, Dmitri
    Lukkien, Johan
    [J]. 2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 9386 - 9391
  • [6] A multi-label movie genre classification scheme based on the movie’s subtitles
    Nikhil Kumar Rajput
    Bhavya Ahuja Grover
    [J]. Multimedia Tools and Applications, 2022, 81 : 32469 - 32490
  • [7] A multi-label movie genre classification scheme based on the movie's subtitles
    Rajput, Nikhil Kumar
    Grover, Bhavya Ahuja
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2022, 81 (22) : 32469 - 32490
  • [8] Movie genre classification: A multi-label approach based on convolutions through time
    Wehrmann, Jonatas
    Barros, Rodrigo C.
    [J]. APPLIED SOFT COMPUTING, 2017, 61 : 973 - 982
  • [9] A Turkish Topic Modeling Dataset For Multi-label Classification of Movie Genre
    Jabrayilzade, Elgun
    Arslan, Algin Poyraz
    Para, Hasan
    Polatbilek, Ozan
    Sezerer, Erhan
    Tekir, Selma
    [J]. 2020 28TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2020,
  • [10] Detecting Musical Genre Borders For Multi-label Genre Classification
    Nakamura, Hiroki
    Huang, Hung-Hsuan
    Kawagoe, Kyoji
    [J]. 2013 IEEE INTERNATIONAL SYMPOSIUM ON MULTIMEDIA (ISM), 2013, : 532 - 533