Lexical Interpretation of Visual Cues Using Deep Learning

Cited by: 0

Authors:

Budarapu, Amrita [1]
Jain, Komal [1]
Sree, S. Bindu [1]
Varshitha, T. [1]
Niveditha, B. [1]

Affiliation:

[1] Narayanamma Inst Technol & Sci, Dept CSE AI&ML, Hyderabad, India

Source:

PROCEEDINGS OF THE 5TH INTERNATIONAL CONFERENCE ON DATA SCIENCE, MACHINE LEARNING AND APPLICATIONS, VOL 1, ICDSMLA 2023 | 2025, Vol. 1273

Keywords:

Lexical interpretation; Lip Reading; CNN; GRU; Visual cues;

DOI:

10.1007/978-981-97-8031-0_89

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Lexical interpretation of visual cues is an approach to understanding spoken phrases by visually observing the movements and shapes of a speaker's lips. A comprehensive review of existing methods exposes the limitations of traditional lip reading techniques in capturing both the spatial and temporal dimensions of lip movements. To address this gap, this project presents an approach that improves lip reading efficacy by combining Convolutional Neural Networks (CNN) and Gated Recurrent Units (GRU). The system achieves an accuracy of 97% on the GRID dataset. The implications of this research extend to improved communication accessibility for individuals with hearing impairments, as well as broader applications in areas such as criminal investigations and security.


Pages: 833-842

Page count: 10

