Cross-Modality Personalization for Retrieval

Cited by: 10
Authors
Murrugarra-Llerena, Nils [1]
Kovashka, Adriana [1]
Affiliation
[1] Univ Pittsburgh, Dept Comp Sci, Pittsburgh, PA 15260 USA
Funding
U.S. National Science Foundation
DOI
10.1109/CVPR.2019.00659
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Existing captioning and gaze prediction approaches do not consider the multiple facets of personality that affect how a viewer extracts meaning from an image. Some methods address personalized captioning, but they do not consider personalized perception across modalities, i.e., how a person's way of looking at an image (gaze) affects the way they describe it (captioning). In this work, we propose a model for cross-modality personalized retrieval. In addition to modeling gaze and captions, we explicitly model the personality of the users providing these samples. We incorporate constraints that encourage gaze and caption samples on the same image to be close in a learned space; we refer to this as content modeling. We also model style: we encourage samples provided by the same user to be close in a separate embedding space, regardless of the image on which they were provided. To leverage the complementary information that content and style constraints provide, we combine the embeddings from both networks. We show that our combined embeddings achieve better performance than existing approaches for cross-modal retrieval.
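
The abstract describes two kinds of constraints (content: same image; style: same user) and a combination of the two resulting embeddings. The sketch below is one possible reading of that idea, not the authors' implementation. It assumes a triplet-style hinge loss, squared Euclidean distance, and concatenation of L2-normalized vectors as the combination step; the abstract specifies none of these, and all function names and toy vectors are hypothetical.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge loss (assumed form): pull anchor toward positive, away from negative.

    Content space: positive = sample of the other modality on the same image.
    Style space:   positive = sample provided by the same user.
    """
    return max(0.0, sq_dist(anchor, positive) - sq_dist(anchor, negative) + margin)

def combine(content_vec, style_vec):
    """Assumed combination: concatenate the normalized content and style vectors."""
    return l2_normalize(content_vec) + l2_normalize(style_vec)

def retrieve(query, candidates):
    """Return candidate keys ordered from nearest to farthest from the query."""
    return sorted(candidates, key=lambda k: sq_dist(query, candidates[k]))

# Toy example: a gaze query from user A on image 1, matched against captions.
gaze_query = combine([1.0, 0.0], [0.0, 1.0])
captions = {
    "img1_userA": combine([0.9, 0.1], [0.1, 0.9]),  # same image, same user
    "img1_userB": combine([0.9, 0.1], [1.0, 0.0]),  # same image, other user
    "img2_userA": combine([0.0, 1.0], [0.1, 0.9]),  # other image, same user
}
ranking = retrieve(gaze_query, captions)
```

In this toy setup the caption that agrees with the query in both spaces ranks first, which illustrates why content and style signals might be complementary when combined.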
Pages: 6422-6431
Page count: 10