Learning Program Representations for Food Images and Cooking Recipes

Cited by: 12
Authors
Papadopoulos, Dim P. [1, 3]
Mora, Enrique [2]
Chepurko, Nadiia [1]
Huang, Kuan Wei [1]
Ofli, Ferda [4]
Torralba, Antonio [1]
Affiliations
[1] MIT CSAIL, Cambridge, MA 02139 USA
[2] Nestle, Vevey, Switzerland
[3] DTU Compute, Lyngby, Denmark
[4] HBKU, Qatar Comp Res Inst, Ar Rayyan, Qatar
DOI
10.1109/CVPR52688.2022.01606
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we are interested in modeling a how-to instructional procedure, such as a cooking recipe, with a meaningful and rich high-level representation. Specifically, we propose to represent cooking recipes and food images as cooking programs. Programs provide a structured representation of the task, capturing cooking semantics and sequential relationships of actions in the form of a graph. This allows them to be easily manipulated by users and executed by agents. To this end, we build a model that is trained to learn a joint embedding between recipes and food images via self-supervision and jointly generate a program from this embedding as a sequence. To validate our idea, we crowdsource programs for cooking recipes and show that: (a) projecting the image-recipe embeddings into programs leads to better cross-modal retrieval results; (b) generating programs from images leads to better recognition results compared to predicting raw cooking instructions; and (c) we can generate food images by manipulating programs via optimizing the latent code of a GAN. Code, data, and models are available online.
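The abstract describes a program as a graph of cooking actions that can also be emitted as a sequence and manipulated by users. The following is a minimal illustrative sketch of that idea only; it is not the authors' program format. The `Step` class, the `s<index>` reference convention, and the helper names `edges`, `to_sequence`, and `substitute` are all assumptions made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """One node of a hypothetical cooking program (assumed structure)."""
    action: str
    # Inputs are raw ingredient names or references to earlier steps ("s0", "s1", ...).
    inputs: list[str]
    params: dict[str, str] = field(default_factory=dict)


def edges(program: list[Step]) -> list[tuple[int, int]]:
    """Return graph edges (producer step index, consumer step index)."""
    result = []
    for i, step in enumerate(program):
        for item in step.inputs:
            # An input like "s0" refers to the output of step 0.
            if item.startswith("s") and item[1:].isdigit():
                result.append((int(item[1:]), i))
    return result


def to_sequence(program: list[Step]) -> str:
    """Linearize the program graph into a single token sequence."""
    lines = []
    for i, step in enumerate(program):
        args = step.inputs + [f"{k}={v}" for k, v in step.params.items()]
        lines.append(f"s{i} = {step.action}({', '.join(args)})")
    return "; ".join(lines)


def substitute(program: list[Step], old: str, new: str) -> list[Step]:
    """Return a copy of the program with one ingredient swapped for another."""
    return [
        Step(s.action, [new if x == old else x for x in s.inputs], dict(s.params))
        for s in program
    ]


# A toy program: chop an onion, boil pasta, then mix both with sauce.
program = [
    Step("chop", ["onion"]),
    Step("boil", ["pasta"], {"time": "10min"}),
    Step("mix", ["s0", "s1", "tomato sauce"]),
]

print(edges(program))
print(to_sequence(program))
print(to_sequence(substitute(program, "onion", "garlic")))
```

Keeping the graph implicit in step references means the same list serves both as the sequence a model could generate and as the structure a user could edit, which is the property the abstract emphasizes.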
Pages: 16538-16548
Page count: 11
Related papers
50 in total
  • [1] Learning Cross-modal Embeddings for Cooking Recipes and Food Images
    Salvador, Amaia
    Hynes, Nicholas
    Aytar, Yusuf
    Marin, Javier
    Ofli, Ferda
    Weber, Ingmar
    Torralba, Antonio
    30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 3068 - 3076
  • [2] Learning Cross-Modal Embeddings with Adversarial Networks for Cooking Recipes and Food Images
    Wang, Hao
    Sahoo, Doyen
    Liu, Chenghao
    Lim, Ee-peng
    Hoi, Steven C. H.
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 11564 - 11573
  • [3] Images & Recipes: Retrieval in the cooking context
    Carvalho, Micael
    Cadene, Remi
    Picard, David
    Soulier, Laure
    Cord, Matthieu
    2018 IEEE 34TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING WORKSHOPS (ICDEW), 2018, : 169 - 174
  • [4] Recipe1M+: A Dataset for Learning Cross-Modal Embeddings for Cooking Recipes and Food Images
    Marin, Javier
    Biswas, Aritro
    Ofli, Ferda
    Hynes, Nicholas
    Salvador, Amaia
    Aytar, Yusuf
    Weber, Ingmar
    Torralba, Antonio
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2021, 43 (01) : 187 - 203
  • [5] Cooking with Learning Analytics Recipes
    Jaakonmaeki, Roope
    Drachsler, Hendrik
    Kickmeier-Rust, Michael
    Dietze, Stefan
    Fortenbacher, Albrecht
    Marenzi, Ivana
    SEVENTH INTERNATIONAL LEARNING ANALYTICS & KNOWLEDGE CONFERENCE (LAK'17), 2017, : 572 - 573
  • [6] Learning Korean: Recipes from Home Cooking
    Tansley, Sarah
    Wyatt, Neal
    LIBRARY JOURNAL, 2022, 147 (12) : 26 - 26
  • [7] Data Engineering Challenges in Intelligent Food and Cooking Recipes
    Andres, Frederic
    2023 IEEE 39TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING WORKSHOPS, ICDEW, 2023, : 214 - 217
  • [8] Multi-subspace Implicit Alignment for Cross-modal Retrieval on Cooking Recipes and Food Images
    Li, Lin
    Li, Ming
    Zan, Zichen
    Xie, Qing
    Liu, Jianquan
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT, CIKM 2021, 2021, : 3211 - 3215
  • [9] Multi-modal Cooking Workflow Construction for Food Recipes
    Pan, Liangming
    Chen, Jingjing
    Wu, Jianlong
    Liu, Shaoteng
    Ngo, Chong-Wah
    Kan, Min-Yen
    Jiang, Yugang
    Chua, Tat-Seng
    MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, 2020, : 1132 - 1141
  • [10] Using LLMs to Extract Food Entities from Cooking Recipes
    Pitsilou, Vasiliki
    Papadakis, George
    Skoutas, Dimitrios
    2024 IEEE 40TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING WORKSHOP, ICDEW, 2024, : 21 - 28