Large-Scale Coarse-to-Fine Object Retrieval Ontology and Deep Local Multitask Learning

Cited by: 4
Authors
Ly, Ngoc Q. [1]
Do, Tuong K. [2]
Nguyen, Binh X. [1]
Affiliations
[1] VNUHCM Univ Sci, Dept Informat Technol, Hcm 70000, Vietnam
[2] AIOZ Pte Ltd, Hcm 70000, Vietnam
Keywords
45;
DOI
10.1155/2019/1483294
Chinese Library Classification number
Q [Biological Sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Object retrieval plays an increasingly important role in video surveillance, digital marketing, e-commerce, and other applications. It faces challenges such as large-scale datasets, imbalanced data, viewpoint changes, cluttered backgrounds, and fine-grained details (attributes). This paper proposes a model that integrates an object ontology, a local multitask deep neural network (local MDNN), and an imbalanced data solver, exploiting the strengths and overcoming the shortcomings of deep learning network models to improve the performance of a large-scale object retrieval system from the coarse-grained level (categories) to the fine-grained level (attributes). The proposed coarse-to-fine object retrieval (CFOR) system is designed to be robust to the challenges listed above. To the best of our knowledge, the main novelty of the CFOR system is the mutual support of the object ontology, the local MDNN, and the imbalanced data solver in a unified system. The object ontology supports the exploitation of inner-group correlations to improve category classification and attribute classification, and it guides the training flow and the retrieval flow to save computational cost in the training stage and the retrieval stage, respectively, on large-scale datasets. The local MDNN links the object ontology to the raw data, and the imbalanced data solver, based on Matthews' correlation coefficient (MCC), addresses data imbalance; it contributes effectively to the quality of the object ontology realization without adjusting the network architecture or using data augmentation. To evaluate the performance of the CFOR system, we experimented on the DeepFashion dataset. The results show that our local MDNN framework based on the pretrained NASNet architecture achieves better performance (14.2% higher recall rate) than single-task learning (STL) in the attribute learning task, and that our model with the imbalanced data solver achieves better performance (5.14% higher recall rate for attributes with fewer data) than models that do not take imbalance into account. Moreover, MAP@30 hovers around 0.815 in retrieval, averaged over 35 imbalanced fashion attributes.
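The abstract names two metrics, Matthews' correlation coefficient (MCC) and MAP@30, without defining them. The following is a minimal sketch of their standard textbook definitions, not the paper's implementation: the function names are hypothetical, and the AP@k normalization (dividing by the number of relevant items found in the top k) is one common convention that may differ from the one the authors used.

```python
import math
from typing import Sequence


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Assumed convention: return 0 when a row or column of the
        # confusion matrix is empty (for example, a constant predictor).
        return 0.0
    return (tp * tn - fp * fn) / denom


def average_precision_at_k(relevant: Sequence[bool], k: int) -> float:
    """AP@k for one ranked result list; relevant[i] flags rank i + 1."""
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevant[:k], start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def map_at_k(queries: Sequence[Sequence[bool]], k: int) -> float:
    """Mean of AP@k over all queries."""
    return sum(average_precision_at_k(q, k) for q in queries) / len(queries)
```

A predictor that labels every sample positive on a 90/10 split reaches 90% accuracy, yet `mcc(90, 0, 10, 0)` is 0, which is why MCC is a common choice for scoring rare attributes.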
Pages: 40
Related papers
50 in total
  • [21] Coarse-to-Fine Image DeHashing Using Deep Pyramidal Residual Learning
    Wang, Yongwei
    Ward, Rabab
    Wang, Z. Jane
    IEEE SIGNAL PROCESSING LETTERS, 2019, 26 (09) : 1295 - 1299
  • [22] A Coarse-to-Fine Deep Learning Based Framework for Traffic Light Recognition
    Yao, Zikai
    Liu, Qiang
    Fu, Jie
    Xie, Qian
    Li, Bo
    Ye, Qing
    Li, Qing
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2024, : 1 - 13
  • [23] Large-Scale Video Retrieval via Deep Local Convolutional Features
    Zhang, Chen
    Hu, Bin
    Suo, Yucong
    Zou, Zhiqiang
    Ji, Yimu
    ADVANCES IN MULTIMEDIA, 2020, 2020
  • [24] Small Object Detection via Coarse-to-fine Proposal Generation and Imitation Learning
    Yuan, Xiang
    Cheng, Gong
    Yan, Kebing
    Zeng, Qinghua
    Han, Junwei
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION, ICCV, 2023, : 6294 - 6304
  • [25] Multitask learning for large-scale semantic change detection
    Daudt, Rodrigo Caye
    Le Saux, Bertrand
    Boulch, Alexandre
    Gousseau, Yann
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2019, 187
  • [26] LARGE-SCALE FACE IMAGE RETRIEVAL BASED ON HADOOP AND DEEP LEARNING
    Huang Yuanyuan
    Tang Yuan
    Xiong Taisong
    2020 17TH INTERNATIONAL COMPUTER CONFERENCE ON WAVELET ACTIVE MEDIA TECHNOLOGY AND INFORMATION PROCESSING (ICCWAMTIP), 2020, : 326 - 329
  • [27] Coarse-to-Fine: Learning Compact Discriminative Representation for Single-Stage Image Retrieval
    Zhu, Yunquan
    Gao, Xinkai
    Ke, Bo
    Qiao, Ruizhi
    Sun, Xing
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 11226 - 11235
  • [28] An Efficient Deep Learning Based Coarse-to-Fine Cephalometric Landmark Detection Method
    Song, Yu
    Qiao, Xu
    Iwamoto, Yutaro
    Chen, Yen-Wei
    Chen, Yili
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2021, E104D (08) : 1359 - 1366
  • [29] Planning Large-scale Object Rearrangement Using Deep Reinforcement Learning
    Ghosh, Sourav
    Das, Dipanjan
    Chakraborty, Abhishek
    Agarwal, Marichi
    Bhowmick, Brojeshwar
    2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022,
  • [30] A coarse-to-fine deep learning framework for optic disc segmentation in fundus images
    Wang, Lei
    Liu, Han
    Lu, Yaling
    Chen, Hang
    Zhang, Jian
    Pu, Jiantao
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2019, 51 : 82 - 89