A branched Convolutional Neural Network for RGB-D image classification of ceramic pieces

Cited by: 0

Authors:

Carreira, Daniel ^{[1]}

Rodrigues, Nuno ^{[1]}

Miragaia, Rolando ^{[1]}

Costa, Paulo ^{[1]}

Ribeiro, Jose ^{[1]}

Gaspar, Fabio ^{[1]}

Pereira, Antonio ^{[1,2]}

Affiliations:

[1] Polytech Inst Leiria, Comp Sci & Commun Res Ctr, Sch Technol & Management, P-2411901 Leiria, Portugal

[2] Leiria Off, Inst New Technol, INOV INESC Inovacao, P-2411901 Leiria, Portugal

Source:

APPLIED SOFT COMPUTING | 2024 / Volume 165

Keywords:

Ceramic manufacturing; Convolutional neural network; Data fusion; Image classification; RGB-D;

DOI:

10.1016/j.asoc.2024.112088

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

From smart sensors on assembly lines to robots performing complex tasks, the fourth industrial revolution is rapidly transforming manufacturing. The growing prominence of 3D cameras in the industry has led the computer vision community to explore innovative ways of integrating depth and color data to achieve higher precision, essential for ensuring product quality in manufacturing. In this study, we introduce an innovative branched convolutional neural network designed to produce high-speed classification of multimodal images, such as RGB-Depth (RGB-D) images. The fundamental concept underlying the branched approach is the specialization of each branch as a dedicated feature extractor for a single modality, followed by their merge (intermediate fusion) to enable effective classification. Feeding our model is our novel multimodal dataset, named CeramicNet, composed of 8 classes that include RGB, depth, and RGB-D variations to enable extensive experimentation and evaluation of the models which, to the best of our knowledge, has not been previously introduced in the computer vision community. We conducted a series of experiments on the CeramicNet dataset. These experiments aimed at fine-tuning the model, assessing the influence of various depth technologies, exploring individual modalities, examining their collective impact, and performing comprehensive data analysis. Comparing our solution against seven widely used models, we achieved remarkable results, securing the top position with a precision of 99.89, with a lead of over 1% against the nearest competitor. What is more, the proposed solution yields an inference time of 127.6 ms - being nearly three times faster than the second-best performer.
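
The branched idea described in the abstract (one feature extractor per modality, merged by intermediate fusion before classification) can be illustrated with a minimal, dependency-free sketch. This is not the authors' architecture: the abstract does not give layer counts, filter sizes, the fusion operator, or the framework used, so the single convolution per branch, the fusion by concatenation, the class name `BranchedClassifier`, and the random untrained weights are all assumptions made for illustration. Only the number of classes (8) comes from the abstract.

```python
import math
import random

NUM_CLASSES = 8  # CeramicNet has 8 classes according to the abstract


def conv2d_relu(image, kernels):
    """Valid 2D convolution over a multi-channel image, followed by ReLU.

    image:   list of C channels, each an H x W list of floats
    kernels: list of F filters, each a list of C (k x k) kernels
    returns: list of F feature maps of size (H-k+1) x (W-k+1)
    """
    h, w = len(image[0]), len(image[0][0])
    k = len(kernels[0][0])
    maps = []
    for filt in kernels:
        fmap = []
        for i in range(h - k + 1):
            row = []
            for j in range(w - k + 1):
                acc = 0.0
                for c, kern in enumerate(filt):
                    for di in range(k):
                        for dj in range(k):
                            acc += image[c][i + di][j + dj] * kern[di][dj]
                row.append(max(0.0, acc))  # ReLU
            fmap.append(row)
        maps.append(fmap)
    return maps


def global_avg_pool(maps):
    """Reduce each feature map to a single scalar feature."""
    return [sum(sum(r) for r in m) / (len(m) * len(m[0])) for m in maps]


def random_kernels(rng, n_filters, n_channels, k=3):
    """Hypothetical untrained weights, for illustration only."""
    return [[[[rng.uniform(-0.5, 0.5) for _ in range(k)] for _ in range(k)]
             for _ in range(n_channels)] for _ in range(n_filters)]


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


class BranchedClassifier:
    """Two modality-specific branches merged by concatenation (assumed)."""

    def __init__(self, seed=0, filters_per_branch=4):
        rng = random.Random(seed)
        self.rgb_kernels = random_kernels(rng, filters_per_branch, 3)    # RGB branch
        self.depth_kernels = random_kernels(rng, filters_per_branch, 1)  # depth branch
        fused_size = 2 * filters_per_branch
        self.weights = [[rng.uniform(-0.5, 0.5) for _ in range(fused_size)]
                        for _ in range(NUM_CLASSES)]
        self.bias = [0.0] * NUM_CLASSES

    def extract(self, rgb, depth):
        rgb_feat = global_avg_pool(conv2d_relu(rgb, self.rgb_kernels))
        depth_feat = global_avg_pool(conv2d_relu(depth, self.depth_kernels))
        return rgb_feat + depth_feat  # intermediate fusion: concatenate features

    def predict_proba(self, rgb, depth):
        fused = self.extract(rgb, depth)
        logits = [sum(w * x for w, x in zip(row, fused)) + b
                  for row, b in zip(self.weights, self.bias)]
        return softmax(logits)


def demo():
    """Run one synthetic 8x8 RGB-D sample through the sketch."""
    rng = random.Random(1)
    rgb = [[[rng.random() for _ in range(8)] for _ in range(8)] for _ in range(3)]
    depth = [[[rng.random() for _ in range(8)] for _ in range(8)]]
    model = BranchedClassifier()
    return model.extract(rgb, depth), model.predict_proba(rgb, depth)
```

Calling `demo()` returns the fused feature vector (four RGB features followed by four depth features) and a probability distribution over the 8 classes. Because the weights are random, the probabilities carry no meaning; the sketch only shows where the two branches join relative to the classifier.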

Pages: 13

50 items in total

[21] RGB-D Object Recognition Using Deep Convolutional Neural Networks
Zia, Saman
Yuksel, Buket
Yuret, Deniz
Yemez, Yucel
2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2017), 2017, : 887 - 894
[22] RGB-D OBJECT RECOGNITION WITH MULTIMODAL DEEP CONVOLUTIONAL NEURAL NETWORKS
Rahman, Mohammad Muntasir
Tan, Yanhao
Xue, Jian
Lu, Ke
2017 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2017, : 991 - 996
[23] SiaTrans: Siamese transformer network for RGB-D salient object detection with depth image classification
Jia, XingZhao
DongYe, ChangLei
Peng, YanJun
IMAGE AND VISION COMPUTING, 2022, 127
[24] A Neural Network Approach to Human Posture Classification and Fall Detection Using RGB-D Camera
Manzi, Alessandro
Cavallo, Filippo
Dario, Paolo
AMBIENT ASSISTED LIVING, 2017, 426 : 127 - 139
[25] Salient object detection for RGB-D image by single stream recurrent convolution neural network
Liu, Zhengyi
Shi, Song
Duan, Quntao
Zhang, Wei
Zhao, Peng
NEUROCOMPUTING, 2019, 363 : 46 - 57
[26] Anisotropic Convolutional Neural Networks for RGB-D Based Semantic Scene Completion
Li, Jie
Wang, Peng
Han, Kai
Liu, Yu
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44 (11) : 8125 - 8138
[27] Image Classification Using PSO-SVM and an RGB-D Sensor
Lopez-Franco, Carlos
Villavicencio, Luis
Arana-Daniel, Nancy
Alanis, Alma Y.
MATHEMATICAL PROBLEMS IN ENGINEERING, 2014, 2014
[28] Revisiting Deep Convolutional Neural Networks for RGB-D Based Object Recognition
Madai-Tahy, Lorand
Otte, Sebastian
Hanten, Richard
Zell, Andreas
ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2016, PT II, 2016, 9887 : 29 - 37
[29] Review on Indoor RGB-D Semantic Segmentation with Deep Convolutional Neural Networks
Barchid, Sami
Mennesson, Jose
Djeraba, Chaabane
2021 INTERNATIONAL CONFERENCE ON CONTENT-BASED MULTIMEDIA INDEXING (CBMI), 2021, : 199 - 202
[30] Learning structured group sparse representation for RGB-D image classification
Tu, Shuqin
Xue, Yueju
Liang, Yun
Zhang, Xiao
Lin, Huankai
Guo, Aixia
Journal of Information and Computational Science, 2015, 12 (11) : 4357 - 4367
