Visual speech recognition for small scale dataset using VGG16 convolution neural network

Cited by: 0
Authors
Shashidhar R
Sudarshan Patilkulkarni
Affiliation
[1] JSS Science and Technology University,Department of Electronics and Communication Engineering
Source
Keywords
Visual speech recognition; Lip-reading; Convolutional neural network; VGG16;
DOI
Not available
Chinese Library Classification number
Subject classification code
Abstract
Visual speech recognition is a method that comprehends speech from a speaker's lip movements; the speech is recognized only from the shape and movement of the lips. This technique not only helps people with hearing impairment but can also be used for professional lip reading, with applications in crime investigation and forensics. It plays a crucial role in these domains because a person's speech is converted to text. Here, an enhancement of the visual speech recognition technique from video is proposed. A dataset was created and used for implementation and verification. The aim of the approach is to recognize words from lip movement alone, using video in the absence of audio; this helps extract words from silent video, which is useful in forensic and crime analysis. The proposed method employs the pre-trained VGG16 Convolutional Neural Network architecture for classification and recognition. It was observed that the visual modality improves the performance of the speech recognition system. Finally, the obtained results were compared with the Hahn Convolutional Neural Network (HCNN) architecture. The proposed model achieves an accuracy of 76% in visual speech recognition.
Pages: 28941 - 28952
Page count: 11
Related papers
50 in total
  • [1] Visual speech recognition for small scale dataset using VGG16 convolution neural network
    Shashidhar, R.
    Patilkulkarni, Sudarshan
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2021, 80 (19) : 28941 - 28952
  • [2] Visual Speech Recognition for Kannada Language Using VGG16 Convolutional Neural Network
    Rudregowda, Shashidhar
    Kulkarni, Sudarshan Patil
    Gururaj, H. L.
    Ravi, Vinayakumar
    Krichen, Moez
    [J]. ACOUSTICS, 2023, 5 (01) : 343 - 353
  • [3] Fish species recognition using VGG16 deep convolutional neural network
    Hridayami, Praba
    Putra, I. Ketut Gede Darma
    Wibawa, Kadek Suar
    [J]. Journal of Computing Science and Engineering, 2019, 13 (03) : 124 - 130
  • [4] Flower Recognition Using VGG16
    Rahman, Md. Ashikur
    Laskar, Md. Saif
    Asif, Samir
    Imam, Omar Tawhid
    Reza, Ahmed Wasif
    Arefin, Mohammad Shamsul
    [J]. THIRD INTERNATIONAL CONFERENCE ON IMAGE PROCESSING AND CAPSULE NETWORKS (ICIPCN 2022), 2022, 514 : 748 - 760
  • [5] Diabetic retinopathy classification using VGG16 neural network
    da Rocha D.A.
    Ferreira F.M.F.
    Peixoto Z.M.A.
    [J]. Research on Biomedical Engineering, 2022, 38 (02) : 761 - 772
  • [6] Comparative study of CNN, VGG16 with LSTM and VGG16 with Bidirectional LSTM using kitchen activity dataset
    Aparna, R.
    Chitralekha, C. K.
    Chaudhari, Shilpa
    [J]. PROCEEDINGS OF THE 2021 FIFTH INTERNATIONAL CONFERENCE ON I-SMAC (IOT IN SOCIAL, MOBILE, ANALYTICS AND CLOUD) (I-SMAC 2021), 2021 : 836 - 843
  • [7] Visual Emotion Recognition based on transfer learning technique using VGG16
    Ayadi, Souha
    Lachiri, Zied
    [J]. PRZEGLAD ELEKTROTECHNICZNY, 2024, 100 (08) : 153 - 155
  • [8] Rice Processing Accuracy Classification Method Based on Improved VGG16 Convolution Neural Network
    Qi, Chao
    Zuo, Yi
    Chen, Zheqi
    Chen, Kunjie
    [J]. Nongye Jixie Xuebao/Transactions of the Chinese Society for Agricultural Machinery, 2021, 52 (05) : 301 - 307
  • [9] Development of signature recognition system using VGG16
    Moud, Deepak
    Saxena, Rakesh Kumar
    [J]. JOURNAL OF DISCRETE MATHEMATICAL SCIENCES & CRYPTOGRAPHY, 2023, 26 (03) : 807 - 813
  • [10] Handwritten Digit Recognition based on Improved VGG16 Network
    Cheng Shuhong
    Shang Guochao
    Zhang Li
    [J]. TENTH INTERNATIONAL CONFERENCE ON GRAPHICS AND IMAGE PROCESSING (ICGIP 2018), 2019, 11069