ResMem-Net: memory based deep CNN for image memorability estimation

Cited by: 2
Authors
Praveen, Arockia [1]
Noorwali, Abdulfattah [2]
Samiayya, Duraimurugan [3]
Khan, Mohammad Zubair [4]
Vincent, Durai Raj P. M. [5]
Bashir, Ali Kashif [6]
Alagupandi, Vinoth [3]
Affiliations
[1] Phosphene AI, Madurai, Tamil Nadu, India
[2] Umm Al Qura Univ, Mecca, Saudi Arabia
[3] Optisol Business Solut, Chennai, Tamil Nadu, India
[4] Taibah Univ, Dept Comp Sci, Medina, Saudi Arabia
[5] Vellore Inst Technol, Sch Informat Technol & Engn, Vellore, Tamil Nadu, India
[6] Manchester Metropolitan Univ, Manchester, Lancs, England
Keywords
Deep Learning; Image Memorability; Visual Emotions; Saliency; Object Interestingness
DOI
10.7717/peerj-cs.767
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Predicting image memorability is a hard problem in image processing because memorability is subjective. With the introduction of deep learning and the wide availability of data and GPUs, however, great strides have been made in predicting the memorability of an image. In this paper, we propose a novel deep learning architecture called ResMem-Net, a hybrid of LSTM and CNN that uses information from the hidden layers of the CNN to compute the memorability score of an image. The intermediate layers are important for predicting the output because they contain information about the intrinsic properties of the image. The proposed architecture automatically learns visual emotions and saliency, as shown by the heatmaps generated using the GradRAM technique. We also use the heatmaps and results to analyze and answer one of the most important questions in image memorability: "What makes an image memorable?" The model is trained and evaluated on the publicly available Large-scale Image Memorability dataset (LaMem) from MIT. The results show that the model achieves a rank correlation of 0.679 and a mean squared error of 0.011, which is better than the current state-of-the-art models and close to human consistency (ρ = 0.68). The proposed architecture also has significantly fewer parameters than the state-of-the-art architecture, making it memory efficient and suitable for production.
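The two evaluation metrics named in the abstract, rank correlation and mean squared error, can be illustrated with a minimal Python sketch. It assumes the rank correlation is Spearman's coefficient computed between predicted and ground-truth memorability scores. The function names and the sample scores below are hypothetical and are not taken from the paper.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    std_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (std_x * std_y)


def spearman_rank_correlation(predicted, target):
    """Spearman's rho: the Pearson correlation of the rank vectors."""
    return pearson(average_ranks(predicted), average_ranks(target))


def mean_squared_error(predicted, target):
    """Mean of squared differences between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(predicted)


# Hypothetical memorability scores in [0, 1], for illustration only.
predicted_scores = [0.9, 0.5, 0.7]
target_scores = [0.8, 0.4, 0.6]
rho = spearman_rank_correlation(predicted_scores, target_scores)
mse = mean_squared_error(predicted_scores, target_scores)
```

Because the rank correlation depends only on the ordering of the scores, a model can rank images perfectly and still have a nonzero mean squared error, which is one reason to report both figures.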
Pages: 1-27
Page count: 27