Explaining digital humanities by aligning images and textual descriptions

Cited: 17
Authors
Cornia, Marcella [1 ]
Stefanini, Matteo [1 ]
Baraldi, Lorenzo [1 ]
Corsini, Massimiliano [1 ]
Cucchiara, Rita [1 ]
Affiliations
[1] Univ Modena & Reggio Emilia, Dept Engn Enzo Ferrari, Via P Vivarelli 10, I-41125 Modena, Italy
Keywords
Visual-semantic retrieval; Semi-supervised learning; Cultural heritage
DOI
10.1016/j.patrec.2019.11.018
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405
Abstract
Replicating the human ability to connect Vision and Language has recently attracted considerable attention in the Computer Vision and Natural Language Processing communities. This research effort has resulted in algorithms that can retrieve images from textual descriptions and vice versa, when realistic images and sentences with simple semantics are employed and paired training data is provided. In this paper, we go beyond these limitations and tackle the design of visual-semantic algorithms in the domain of the Digital Humanities. This setting not only presents more complex visual and semantic structures, but also features a significant lack of training data, which makes fully-supervised approaches infeasible. To this end, we propose a joint visual-semantic embedding that can automatically align illustrations and textual elements without paired supervision. This is achieved by transferring knowledge learned on ordinary visual-semantic datasets to the artistic domain. Experiments performed on two datasets specifically designed for this domain validate the proposed strategies and quantify the domain shift between natural images and artworks. (C) 2019 Elsevier B.V. All rights reserved.
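The record's abstract does not include implementation details, but joint visual-semantic embeddings of the kind it describes are commonly trained by projecting image and text features into a shared space and minimizing a hinge-based triplet ranking loss over in-batch negatives. The sketch below illustrates that general idea only; the projection matrices, feature dimensions, and margin are hypothetical and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def l2_normalize(x):
    # Unit-normalize rows so a dot product equals cosine similarity
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

# Hypothetical linear projections mapping image features (2048-d) and
# text features (768-d) into a shared 256-d embedding space.
W_img = rng.standard_normal((2048, 256)) * 0.01
W_txt = rng.standard_normal((768, 256)) * 0.01

def embed(img_feats, txt_feats):
    v = l2_normalize(img_feats @ W_img)   # image embeddings
    t = l2_normalize(txt_feats @ W_txt)   # text embeddings
    return v, t

def triplet_hinge_loss(v, t, margin=0.2):
    """Hinge-based ranking loss with hardest in-batch negatives."""
    sim = v @ t.T                          # pairwise cosine similarities
    pos = np.diag(sim)[:, None]            # similarities of matching pairs
    # Penalize any non-matching pair that comes within `margin` of its
    # corresponding matching pair, in both retrieval directions.
    cost_t = np.clip(margin + sim - pos, 0, None)    # image -> wrong caption
    cost_v = np.clip(margin + sim - pos.T, 0, None)  # caption -> wrong image
    np.fill_diagonal(cost_t, 0)
    np.fill_diagonal(cost_v, 0)
    return cost_t.max(axis=1).mean() + cost_v.max(axis=0).mean()

# Toy batch of 8 image/text feature pairs
v, t = embed(rng.standard_normal((8, 2048)), rng.standard_normal((8, 768)))
loss = triplet_hinge_loss(v, t)
```

At retrieval time, ranking reduces to sorting the rows (or columns) of the cosine-similarity matrix, which is why both modalities are normalized into the same space.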
Pages: 166-172
Page count: 7
Related Papers (50 records)
  • [1] Aligning textual and model-based process descriptions
    Sanchez-Ferreres, Josep
    van der Aa, Han
    Carmona, Josep
    Padro, Lluis
    DATA & KNOWLEDGE ENGINEERING, 2018, 118 : 25 - 40
  • [2] Aligning Textual and Graphical Descriptions of Processes Through ILP Techniques
    Sanchez-Ferreres, Josep
    Carmona, Josep
    Padro, Lluis
    ADVANCED INFORMATION SYSTEMS ENGINEERING (CAISE 2017), 2017, 10253 : 413 - 427
  • [3] Aligning Actions and Walking to LLM-Generated Textual Descriptions
    Chivereanu, Radu
    Cosma, Adrian
    Catruna, Andy
    Rughinis, Razvan
    Radoi, Emilian
2024 IEEE 18TH INTERNATIONAL CONFERENCE ON AUTOMATIC FACE AND GESTURE RECOGNITION, FG 2024, 2024
  • [4] RETRIEVING IMAGES WITH GENERATED TEXTUAL DESCRIPTIONS
    Hoxha, Genc
    Melgani, Farid
Demir, Begüm
    2019 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS 2019), 2019, : 5816 - 5819
  • [5] Search Tactics of Images' Textual Descriptions
    Lin, Yi-Ling
    Lan, Wen-Lin
    Hong, Ren-Yi
    Hsiao, I-Han
    PROCEEDINGS OF THE 27TH ACM CONFERENCE ON HYPERTEXT AND SOCIAL MEDIA (HT'16), 2016, : 303 - 308
  • [6] GENDER, FEMINISM, TEXTUAL SCHOLARSHIP, AND DIGITAL HUMANITIES
    Robinson, Peter
    INTERSECTIONALITY IN DIGITAL HUMANITIES, 2019, : 89 - 107
  • [7] Transforming remote sensing images to textual descriptions
    Zia, Usman
    Riaz, M. Mohsin
    Ghafoor, Abdul
    INTERNATIONAL JOURNAL OF APPLIED EARTH OBSERVATION AND GEOINFORMATION, 2022, 108
  • [8] Inferring spatial relations from textual descriptions of images
    Elu, Aitzol
    Azkune, Gorka
    Lopez de Lacalle, Oier
    Arganda-Carreras, Ignacio
    Soroa, Aitor
    Agirre, Eneko
    PATTERN RECOGNITION, 2021, 113
  • [10] Evaluating a Taxonomy of Textual Uncertainty for Collaborative Visualisation in the Digital Humanities
    Benito-Santos, Alejandro
    Doran, Michelle
    Rocha, Aleyda
    Wandl-Vogt, Eveline
    Edmond, Jennifer
    Theron, Roberto
    INFORMATION, 2021, 12 (11)