Improving Object Recognition of CNNs with Multiple Queries and HMMs

Cited by: 1
Authors
Czuni, Laszlo [1]
Nagy, Amr M. [1]
Affiliation
[1] Univ Pannonia, Egyet Str 10, Veszprem, Hungary
Keywords
Computer vision; object recognition; VGG16; Hidden Markov Model; information fusion
DOI
10.1117/12.2559393
Chinese Library Classification number
O43 [Optics]
Discipline classification code
070207; 0803
Abstract
In our paper we combine neural networks with Hidden Markov Models (HMMs) for multiview object recognition. While convolutional neural networks are very efficient in object recognition, there is still a need for improvement in many practical cases. For example, if the training is not satisfactory or object localization is not solved by the neural network, then information fusion from several images and from inertial sensors can still considerably improve the recognition rate. In our use case we are to recognize objects from several directions with the VGG16 network. We assume that no localization of objects is possible in the images due to the lack of bounding box annotations; thus we have to recognize the objects even if they occupy only about 25% of the field of view. To overcome this problem we propose a Hidden Markov Model approach in which the consecutive queries, shots taken from different viewing directions, are first evaluated with VGG16 inference and then with the Viterbi algorithm. The role of the latter is to estimate the most probable sequence of poses of the candidates (from the 8 predefined horizontal views in our experiments), so that we can select the most probable object. The approach, as evaluated with different numbers of queries over a set of 40 objects from the COIL-100 dataset, can result in a significant increase in hit rate compared to one-shot recognition or to combining individual shots without the HMM.
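The decoding step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the ring-shaped transition matrix, its `stay` and `step` values, the uniform prior, and the helper names `ring_transitions`, `viterbi_log`, and `recognize` are all assumptions made for illustration. The sketch assumes the CNN has already produced a score for every (object, view) pair in each query, runs Viterbi over the 8 pose states separately for each candidate object, and selects the object whose best pose sequence has the highest log score.

```python
import numpy as np

N_VIEWS = 8  # the 8 predefined horizontal views mentioned in the abstract


def ring_transitions(n_views=N_VIEWS, stay=0.2, step=0.6):
    """Hypothetical transition matrix: poses mostly move to a neighbouring view."""
    A = np.full((n_views, n_views), 1e-3)  # small mass for large jumps
    for i in range(n_views):
        A[i, i] = stay
        A[i, (i + 1) % n_views] = step / 2
        A[i, (i - 1) % n_views] = step / 2
    return A / A.sum(axis=1, keepdims=True)  # make each row a distribution


def viterbi_log(emissions, A, prior=None):
    """Viterbi in the log domain.

    emissions: (T, S) array of positive scores, one row per query.
    Returns (log score of the best path, best state path as a list).
    """
    T, S = emissions.shape
    log_A = np.log(A)
    log_B = np.log(emissions + 1e-12)
    log_pi = np.log(np.full(S, 1.0 / S) if prior is None else prior)

    delta = log_pi + log_B[0]
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        cand = delta[:, None] + log_A  # cand[i, j]: from state i to state j
        back[t] = cand.argmax(axis=0)
        delta = cand.max(axis=0) + log_B[t]

    # Backtrack from the best final state.
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return float(delta.max()), path[::-1]


def recognize(scores, A):
    """scores: (T, n_objects, n_views) per-query CNN scores (assumed layout).

    Returns the index of the most probable object and its decoded pose path.
    """
    results = [viterbi_log(scores[:, k, :], A) for k in range(scores.shape[1])]
    best = max(range(len(results)), key=lambda k: results[k][0])
    return best, results[best][1]
```

With such a transition model, a candidate whose high scores occur at views that jump across the ring is penalized relative to a candidate whose views change smoothly, which is one plausible reading of why sequence decoding can outperform simply combining individual shots.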
Pages: 7
Related papers
50 in total
  • [1] Improving the robustness with multiple sets of HMMs
    Hirsch, Hans-Guenter
    Kitzig, Andreas
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 572 - 575
  • [2] Improving continuous speech recognition in Spanish by phone-class semicontinuous HMMs with pausing and multiple pronunciations
    Ferreiros, J
    Pardo, JM
    SPEECH COMMUNICATION, 1999, 29 (01) : 65 - 76
  • [3] Gradient adaptive sampling and multiple temporal scale 3D CNNs for tactile object recognition
    Qian, Xiaoliang
    Meng, Jia
    Wang, Wei
    Jiang, Liying
    FRONTIERS IN NEUROROBOTICS, 2023, 17
  • [4] Enhancing CNNs Performance on Object Recognition Tasks with Gabor Initialization
    Rivas, Pablo
    Rai, Mehang
    ELECTRONICS, 2023, 12 (19)
  • [5] Inter-dependent CNNs for Joint Scene and Object Recognition
    Bappy, Jawadul Hasan
    Roy-Chowdhury, Amit K.
    2016 23RD INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2016, : 3386 - 3391
  • [6] Improving Object Detection Accuracy with Region and Regression Based Deep CNNs
    Qu, Liang
    Wang, Shengke
    Yang, Na
    Chen, Long
    Liu, Lu
    Zhang, Xiaoyan
    Gao, Feng
    Dong, Junyu
    2017 INTERNATIONAL CONFERENCE ON SECURITY, PATTERN ANALYSIS, AND CYBERNETICS (SPAC), 2017, : 318 - 323
  • [7] Visual transients improve object recognition of CNNs in sketches but not in natural images
    Schmittwilken, Lynn
    Kestel, Nico
    Maertens, Marianne
    PERCEPTION, 2022, 51 : 74 - 74
  • [8] 3D Object Recognition Method Using CNNs and Slicing
    Dumitru, Razvan Gabriel
    Toma, Sebastian Antonio
    Gorgan, Dorian
    PROCEEDINGS OF 2022 IEEE INTERNATIONAL CONFERENCE ON AUTOMATION, QUALITY AND TESTING, ROBOTICS (AQTR 2022), 2022, : 113 - 118
  • [9] Multiple queries for large scale specific object retrieval
    Arandjelovic, Relja
    Zisserman, Andrew
    PROCEEDINGS OF THE BRITISH MACHINE VISION CONFERENCE 2012, 2012,
  • [10] Memory Vectors for Particular Object Retrieval with Multiple Queries
    Sicre, Ronan
    Jegou, Herve
    ICMR'15: PROCEEDINGS OF THE 2015 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, 2015, : 479 - 482