Iterative Reorganization with Weak Spatial Constraints: Solving Arbitrary Jigsaw Puzzles for Unsupervised Representation Learning

Cited by: 78
Authors
Wei, Chen [1]
Xie, Lingxi [2, 3]
Ren, Xutong [1]
Xia, Yingda [2]
Su, Chi [4]
Liu, Jiaying
Tian, Qi [3]
Yuille, Alan L. [2]
Affiliations
[1] Peking Univ, Beijing, Peoples R China
[2] Johns Hopkins Univ, Baltimore, MD 21218 USA
[3] Huawei Inc, Noahs Ark Lab, Shenzhen, Guangdong, Peoples R China
[4] Kingsoft Cloud, Beijing, Peoples R China
DOI
10.1109/CVPR.2019.00201
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Learning visual features from unlabeled image data is an important yet challenging task, often addressed by training a model on annotation-free information. We consider spatial context, for which we solve so-called jigsaw puzzles: each image is cut into a grid of patches, the patches are shuffled, and the goal is to recover the correct configuration. Existing approaches formulate this as a classification task by defining a fixed mapping from a small subset of configurations to a class set, but they ignore the underlying relationship between different configurations and are hard to apply to more complex scenarios. This paper presents a novel approach that applies to jigsaw puzzles of arbitrary grid size and dimensionality. We provide a fundamental and general principle: weaker cues are easier to learn in an unsupervised manner and also transfer better. In the context of puzzle recognition, we use an iterative scheme which, instead of solving the puzzle all at once, adjusts the order of the patches in each step until convergence. In each step, we combine both unary and binary features of each patch into a cost function that judges the correctness of the current configuration. By taking the similarity between puzzles into consideration, our approach learns visual knowledge more efficiently. We verify the effectiveness of our approach in two respects. First, it solves arbitrarily complex puzzles, including high-dimensional puzzles, that prior methods have difficulty handling. Second, it serves as a reliable way of network initialization, which leads to better transfer performance in visual recognition tasks including classification, detection and segmentation.
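The iterative idea in the abstract (score the current configuration with unary and binary terms, then adjust the patch order step by step until nothing changes) can be illustrated with a small sketch. This is not the authors' implementation: the cost tables below are hand-written toy values rather than network outputs, the greedy pairwise-swap update is a stand-in chosen for brevity, and the names `configuration_cost` and `iterative_reorganize` are hypothetical.

```python
from itertools import combinations


def configuration_cost(order, unary, binary, width):
    """Cost of a row-major configuration; order[pos] is the patch at grid slot pos.

    unary[patch][pos]   : toy cost of placing `patch` at slot `pos`
    binary[0][a][b]     : toy cost of patch `b` sitting directly right of `a`
    binary[1][a][b]     : toy cost of patch `b` sitting directly below `a`
    """
    cost = 0
    for pos, patch in enumerate(order):
        cost += unary[patch][pos]
        if (pos + 1) % width != 0:           # slot has a right neighbour
            cost += binary[0][patch][order[pos + 1]]
        if pos + width < len(order):         # slot has a neighbour below
            cost += binary[1][patch][order[pos + width]]
    return cost


def iterative_reorganize(order, unary, binary, width, max_steps=100):
    """Greedy refinement (an assumption, not the paper's update rule):
    keep any pairwise swap that lowers the cost, stop when a full pass
    over all pairs yields no improvement."""
    order = list(order)
    best = configuration_cost(order, unary, binary, width)
    for _ in range(max_steps):
        improved = False
        for a, b in combinations(range(len(order)), 2):
            order[a], order[b] = order[b], order[a]
            cost = configuration_cost(order, unary, binary, width)
            if cost < best:
                best, improved = cost, True
            else:
                order[a], order[b] = order[b], order[a]  # undo the swap
        if not improved:
            break
    return order, best


# Toy 2x2 puzzle: patch i belongs in slot i; correct placements cost 0, others 1.
n, width = 4, 2
unary = [[0 if patch == pos else 1 for pos in range(n)] for patch in range(n)]
binary = [[[1] * n for _ in range(n)] for _ in range(2)]
for left, right in [(0, 1), (2, 3)]:
    binary[0][left][right] = 0
for top, bottom in [(0, 2), (1, 3)]:
    binary[1][top][bottom] = 0

solved, cost = iterative_reorganize([1, 0, 3, 2], unary, binary, width)
print(solved, cost)  # [0, 1, 2, 3] 0
```

Because each step only compares the cost of nearby configurations, nothing in the sketch depends on enumerating a fixed set of permutation classes, which is the property the abstract contrasts with classification-based formulations. Greedy swapping can stall in a local minimum on harder cost tables; it is used here only to keep the example short.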
Pages: 1910-1919
Page count: 10
Related papers
19 in total
  • [1] Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles
    Noroozi, Mehdi
    Favaro, Paolo
    COMPUTER VISION - ECCV 2016, PT VI, 2016, 9910 : 69 - 84
  • [2] Unsupervised Fashion Style Learning by Solving Fashion Jigsaw Puzzles
    Chen, Jia
    Yuan, Haidongqing
    Fang, Fei
    Peng, Tao
    Hu, Xinrong
    2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023, : 1847 - 1852
  • [3] Solving Square Jigsaw Puzzles with Loop Constraints
    Son, Kilho
    Hays, James
    Cooper, David B.
    COMPUTER VISION - ECCV 2014, PT VI, 2014, 8694 : 32 - 46
  • [4] Jigsaw Clustering for Unsupervised Visual Representation Learning
    Chen, Pengguang
    Liu, Shu
    Jia, Jiaya
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 11521 - 11530
  • [5] JigsawGAN: Auxiliary Learning for Solving Jigsaw Puzzles With Generative Adversarial Networks
    Li, Ru
    Liu, Shuaicheng
    Wang, Guangfu
    Liu, Guanghui
    Zeng, Bing
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31 : 513 - 524
  • [6] Deepzzle: Solving Visual Jigsaw Puzzles With Deep Learning and Shortest Path Optimization
    Paumard, Marie-Morgane
    Picard, David
    Tabia, Hedi
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2020, 29 : 3569 - 3581
  • [7] MEJIGCLU: MORE EFFECTIVE JIGSAW CLUSTERING FOR UNSUPERVISED VISUAL REPRESENTATION LEARNING
    Zhang, Yongsheng
    Liu, Qing
    Zhao, Yang
    Liang, Yixiong
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 2135 - 2139
  • [8] Siamese-Discriminant Deep Reinforcement Learning for Solving Jigsaw Puzzles with Large Eroded Gaps
    Song, Xingke
    Jin, Jiahuan
    Yao, Chenglin
    Wang, Shihe
    Ren, Jianfeng
    Bai, Ruibin
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 2, 2023, : 2303 - 2311
  • [9] Unsupervised Representation Learning of Spatial Data via Multimodal Embedding
    Jenkins, Porter
    Farag, Ahmad
    Wang, Suhang
    Li, Zhenhui
    PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT (CIKM '19), 2019, : 1993 - 2002
  • [10] Unsupervised Visual Representation Learning by Graph-Based Consistent Constraints
    Li, Dong
    Hung, Wei-Chih
    Huang, Jia-Bin
    Wang, Shengjin
    Ahuja, Narendra
    Yang, Ming-Hsuan
    COMPUTER VISION - ECCV 2016, PT IV, 2016, 9908 : 678 - 694