ARF-Net: a multi-modal aesthetic attention-based fusion

Cited by: 0
Authors
Iffath, Fariha [1]
Gavrilova, Marina [1]
Affiliations
[1] Univ Calgary, Calgary, AB, Canada
Source
VISUAL COMPUTER | 2024 / Vol. 40 / Issue 07
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
Audio-visual aesthetics; Image processing; Multimedia content; Biometric identification; Multi-modal aesthetics; Transfer learning; Attention-based fusion; IDENTIFICATION;
DOI
10.1007/s00371-024-03492-2
Chinese Library Classification number
TP31 [Computer software];
Discipline classification code
081202; 0835;
Abstract
Over the last decade, online social media platforms have expanded dramatically as individuals have come to rely heavily on these communication channels. These platforms are widely used to convey emotions, share opinions, and express preferences through artworks, multimedia content, and blogs. Researchers are exploring these individual-specific traits for biometric identification. Aesthetic biometric systems use a user's unique preferences across subjective forms such as images, music, and textual content. This study introduces a novel multi-modal aesthetic system whose primary contribution is an attention-based fusion method for person identification. The proposed identification system uses a deep pre-trained model to extract high-level features from the visual and auditory modalities. The paper introduces a novel fusion architecture, the attention-based residual fusion network (ARF-Net), to combine the two heterogeneous aesthetic feature vectors. The proposed model achieved 99.38% identification accuracy on the Aesthetic Image Audio 32 (AIA32) dataset and 98.02% identification accuracy on the Aesthetic Image Audio 52 (AIA52) dataset, outperforming other aesthetic biometric systems. The architecture is also efficient: it is lightweight, has few parameters, and performs well across the different modalities.
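The general idea of fusing two feature vectors with attention and a residual connection can be illustrated with a minimal sketch. This is not the published ARF-Net architecture, whose layers are not described in this record. The function name, the scoring vectors `w_visual` and `w_audio` (which a real model would learn), and the choice of adding the input average as the skip path are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(u, v))


def attention_residual_fusion(visual, audio, w_visual, w_audio):
    """Fuse two equal-length feature vectors (illustrative only).

    w_visual and w_audio are hypothetical scoring vectors; in a trained
    model they would be learned parameters.
    """
    if not (len(visual) == len(audio) == len(w_visual) == len(w_audio)):
        raise ValueError("all vectors must have the same length")
    # One scalar relevance score per modality, normalised to attention weights.
    alpha_v, alpha_a = softmax([dot(w_visual, visual), dot(w_audio, audio)])
    # Attention-weighted sum of the two modalities.
    attended = [alpha_v * v + alpha_a * a for v, a in zip(visual, audio)]
    # Residual (skip) path: add the plain average of the inputs back
    # (an assumption of this sketch).
    fused = [f + 0.5 * (v + a) for f, v, a in zip(attended, visual, audio)]
    return fused, (alpha_v, alpha_a)


if __name__ == "__main__":
    # Toy 2-dimensional features; real features would come from a
    # pre-trained extractor and be much longer.
    fused, weights = attention_residual_fusion(
        [1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]
    )
    print(fused, weights)
```

In this sketch the two modalities must already share a dimension; heterogeneous feature vectors would first need a projection to a common size.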
Pages: 4941-4953
Page count: 13