Lexical Interpretation of Visual Cues Using Deep Learning

Cited by: 0
Authors
Budarapu, Amrita [1]
Jain, Komal [1]
Sree, S. Bindu [1]
Varshitha, T. [1]
Niveditha, B. [1]
Affiliations
[1] Narayanamma Inst Technol & Sci, Dept CSE AI&ML, Hyderabad, India
Keywords
Lexical interpretation; Lip reading; CNN; GRU; Visual cues
DOI
10.1007/978-981-97-8031-0_89
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Lexical interpretation of visual cues is an approach to understanding spoken phrases by visually observing the movements and shapes of a speaker's lips. A comprehensive review of existing methods exposes the limitations of traditional lip reading techniques in capturing both the spatial and temporal dimensions of lip movements. To address this gap, this project presents an approach that improves lip reading efficacy by combining Convolutional Neural Networks (CNN) and Gated Recurrent Units (GRU). The system achieved an accuracy of 97% on the GRID dataset. The implications of this research extend to improved communication accessibility for individuals with hearing impairments, as well as broader applications in areas such as criminal investigations and security.
Pages: 833-842
Number of pages: 10