Lip-Reading Classification of Turkish Digits Using Ensemble Learning Architecture Based on 3DCNN

Authors
Erbey, Ali [1 ,2 ]
Barisci, Necaattin [3 ]
Affiliations
[1] Usak Univ, Distance Educ Vocat Sch, Dept Comp Programming, TR-64200 Usak, Turkiye
[2] Gazi Univ, Informat Inst, Informat Syst, TR-06560 Ankara, Turkiye
[3] Gazi Univ, Fac Technol, Dept Comp Engn, TR-06560 Ankara, Turkiye
Source
APPLIED SCIENCES-BASEL | 2025 / Vol. 15 / Issue 02
Keywords
lip-reading; ensemble learning; 3DCNN; RECOGNITION;
DOI
10.3390/app15020563
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Understanding others correctly is of great importance for maintaining effective communication. Factors such as hearing difficulties or environmental noise can disrupt this process. Lip reading offers an effective solution to these challenges. With the growing success of deep learning architectures, research on lip reading has gained momentum. The aim of this study is to create a lip reading dataset for Turkish digit recognition and to conduct predictive analyses. The dataset was divided into two subsets: the face region and the lip region. CNN, LSTM, and 3DCNN-based models, including C3D, I3D, and 3DCNN+BiLSTM, were used. While LSTM models are effective in processing temporal data, 3DCNN-based models, which can process both spatial and temporal information, achieved higher accuracy in this study. Experimental results showed that the dataset containing only the lip region performed better; accuracy rates for CNN, LSTM, C3D, and I3D on the lip region were 67.12%, 75.53%, 86.32%, and 93.24%, respectively. The 3DCNN-based models achieved higher accuracy due to their ability to process spatio-temporal data. Furthermore, an additional 1.23% improvement was achieved through ensemble learning, with the best result reaching 94.53% accuracy. Ensemble learning, by combining the strengths of different models, provided a meaningful improvement in overall performance. These results demonstrate that 3DCNN architectures and ensemble learning methods yield high success in addressing the problem of lip reading in the Turkish language. While our study focuses on Turkish digit recognition, the proposed methods have the potential to be successful in other languages or broader lip reading applications.
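The abstract does not say how the individual models were combined. As a minimal sketch only, assuming a soft-voting scheme (averaging per-class probabilities across models), the idea of an ensemble outperforming its members can be illustrated as follows. The function names, the two "models", and all probability values are hypothetical toy data, not the authors' method or results.

```python
from typing import List, Sequence


def soft_vote(prob_sets: Sequence[Sequence[Sequence[float]]]) -> List[int]:
    """Average class probabilities over models, then pick the top class.

    prob_sets[m][i][c] is the probability that hypothetical model m
    assigns to class c for sample i. Soft voting is an assumption here;
    the source does not name the combination rule.
    """
    n_models = len(prob_sets)
    n_samples = len(prob_sets[0])
    predictions = []
    for i in range(n_samples):
        n_classes = len(prob_sets[0][i])
        averaged = [
            sum(prob_sets[m][i][c] for m in range(n_models)) / n_models
            for c in range(n_classes)
        ]
        # Index of the class with the highest averaged probability.
        predictions.append(max(range(n_classes), key=averaged.__getitem__))
    return predictions


def accuracy(predicted: Sequence[int], expected: Sequence[int]) -> float:
    """Fraction of samples whose predicted class matches the label."""
    return sum(p == e for p, e in zip(predicted, expected)) / len(expected)


# Toy data (invented): three samples, three classes, two models.
labels = [0, 0, 1]
model_a = [[0.6, 0.3, 0.1], [0.4, 0.5, 0.1], [0.2, 0.7, 0.1]]
model_b = [[0.3, 0.4, 0.3], [0.7, 0.2, 0.1], [0.1, 0.6, 0.3]]

ensemble_predictions = soft_vote([model_a, model_b])
print("model_a accuracy:", accuracy(soft_vote([model_a]), labels))
print("model_b accuracy:", accuracy(soft_vote([model_b]), labels))
print("ensemble accuracy:", accuracy(ensemble_predictions, labels))
```

In this toy case each model misclassifies a different sample, so averaging their probabilities corrects both errors. Real gains depend on how complementary the models' mistakes are.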
Pages: 23