Continuous Emotion-Based Image-to-Music Generation

Cited by: 0
Authors
Wang, Yajie [1 ,2 ]
Chen, Mulin [1 ,2 ]
Li, Xuelong [1 ,2 ]
Affiliations
[1] Northwestern Polytech Univ, Sch Artificial Intelligence, OPt & Elect iOPEN, Xian 710072, Peoples R China
[2] Northwestern Polytech Univ, Key Lab Intelligent Interact & Applicat, Minist Ind & Informat Technol, Xian 710072, Peoples R China
Keywords
Image-to-music generation; valence-arousal space; multi-modal cognitive computing; vicinagearth security
DOI
10.1109/TMM.2023.3338089
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Image-to-music generation aims to generate realistic pure music from a given image. Although many previous works bridge image and music, they mainly focus on content-based cross-modal matching, for example, matching a Christmas song to an image that contains a Christmas tree. By comparison, image-to-music generation is a more challenging task due to its ambiguity and subjectivity. Specifically, without lyrics or human voice, there is no explicit correlation between image content and musical melody. Meanwhile, the perception of generated music varies from person to person. Inspired by the phenomenon of synesthesia, we posit that if an image tends to elicit a certain emotion in humans, the generated music should leave a similar impression. Therefore, in this paper, we propose a continuous emotion-based image-to-music generation framework, which uses emotion as the key for cross-modal generation. Specifically, a new image-music dataset is established, which uses the valence-arousal (VA) space to capture the complex and nuanced nature of emotions. A plug-and-play model is then proposed to translate an image into a piece of music with similar emotion; it projects emotions into continuous-valued labels and explores both intra-modal and inter-modal emotional consistency with contrastive learning. To the best of our knowledge, this is the first end-to-end framework for the task of pure music generation from natural images. Extensive experiments show that the generated music achieves satisfactory emotional consistency with the input images, as well as impressive quality.
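Illustrative note: the abstract describes emotions as continuous points in valence-arousal (VA) space and cross-modal consistency driven by emotional closeness. The sketch below is not the authors' model; it only shows, under stated assumptions, how continuous VA labels could be turned into soft image-to-music matching targets. The function names, the Euclidean distance, the exponential kernel, and the `temperature` parameter are all hypothetical choices made for illustration.

```python
import math

def va_distance(a, b):
    """Euclidean distance between two (valence, arousal) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def emotion_similarity(a, b, temperature=0.5):
    """Map VA distance to a similarity in (0, 1]; closer emotions score higher.
    The exponential kernel and temperature are assumptions, not from the paper."""
    return math.exp(-va_distance(a, b) / temperature)

def soft_targets(image_va, music_vas, temperature=0.5):
    """Normalize similarities into a distribution over candidate music clips.
    Such soft targets could weight pairs in a contrastive objective."""
    weights = [emotion_similarity(image_va, m, temperature) for m in music_vas]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical VA labels in [-1, 1] x [-1, 1].
image_va = (0.8, 0.6)                              # pleasant, energetic image
music_vas = [(0.7, 0.5), (-0.6, -0.4), (0.0, 0.9)]  # three candidate clips
targets = soft_targets(image_va, music_vas)
best = max(range(len(targets)), key=targets.__getitem__)
print(best, [round(t, 3) for t in targets])
```

With continuous labels, pairs are not simply "same class" or "different class": a clip that is emotionally near the image receives a large weight, and a distant one receives a small but nonzero weight.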
Pages: 5670-5679
Number of pages: 10