Large-Scale Coarse-to-Fine Object Retrieval Ontology and Deep Local Multitask Learning

Cited by: 4
Authors
Ly, Ngoc Q. [1 ]
Do, Tuong K. [2 ]
Nguyen, Binh X. [1 ]
Affiliations
[1] VNUHCM Univ Sci, Dept Informat Technol, Hcm 70000, Vietnam
[2] AIOZ Pte Ltd, Hcm 70000, Vietnam
Keywords
45;
DOI
10.1155/2019/1483294
Chinese Library Classification number
Q [Biological Sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Object retrieval plays an increasingly important role in video surveillance, digital marketing, e-commerce, and related areas. It faces challenges such as large-scale datasets, imbalanced data, viewpoint variation, cluttered backgrounds, and fine-grained details (attributes). This paper proposes a model that integrates object ontology, a local multitask deep neural network (local MDNN), and an imbalanced data solver in order to exploit the strengths and overcome the shortcomings of deep learning models, improving the performance of a large-scale object retrieval system from the coarse-grained level (categories) to the fine-grained level (attributes). The proposed coarse-to-fine object retrieval (CFOR) system is designed to be robust to the challenges listed above. To the best of our knowledge, the main novelty of the CFOR system is the mutual support of object ontology, a local MDNN, and an imbalanced data solver in a unified system. Object ontology supports the exploitation of inner-group correlations to improve system performance in category classification and attribute classification, and it guides the training flow and retrieval flow to save computational costs in the training stage and retrieval stage, respectively, on large-scale datasets. A local MDNN links object ontology to the raw data, and an imbalanced data solver based on the Matthews correlation coefficient (MCC) addresses data imbalance, which contributes effectively to the quality of object ontology realization without adjusting the network architecture or using data augmentation. To evaluate the performance of the CFOR system, we experimented on the DeepFashion dataset.
This paper shows that our local MDNN framework, based on the pretrained NASNet architecture, achieves better performance (14.2% higher recall rate) than single-task learning (STL) in the attribute learning task. It also shows that our model with an imbalanced data solver achieves better performance (5.14% higher recall rate for attributes with fewer data) than models that do not take the imbalance into account. Moreover, MAP@30 is around 0.815 in retrieval, on average over 35 imbalanced fashion attributes.
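The abstract names two metrics, the Matthews correlation coefficient (MCC) and MAP@30, without defining them. The following is a minimal sketch of their textbook definitions, not the authors' implementation. The function names are hypothetical, and the AP@k normalization (dividing by min(number of relevant items, k)) is one common convention assumed here; the paper may use a different one.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: return 0 when a row or column of the matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


def average_precision_at_k(relevant, ranked, k):
    """AP@k for one query (assumed normalization: min(len(relevant), k))."""
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    denom = min(len(relevant), k)
    return precision_sum / denom if denom else 0.0


def map_at_k(queries, k):
    """Mean of AP@k over (relevant_set, ranked_list) pairs."""
    return sum(average_precision_at_k(rel, ranked, k) for rel, ranked in queries) / len(queries)


# A rare attribute (10 positives in 100 samples) that is always predicted
# negative: accuracy is 0.90, but MCC is 0, which exposes the imbalance.
always_negative = mcc(tp=0, tn=90, fp=0, fn=10)

# Two toy retrieval queries with made-up item identifiers.
toy_queries = [({"a", "c"}, ["a", "b", "c", "d"]), ({"x"}, ["y", "x"])]
toy_map = map_at_k(toy_queries, k=3)
```

In this toy setting, a classifier that ignores a rare attribute scores well on accuracy but gets an MCC of zero, which is the kind of behavior that makes MCC a natural basis for handling imbalanced attributes.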
Pages: 40