The State of the Art and Prospects of Lip Reading

Authors
Chen X.-D. [1]
Sheng C.-C. [2]
Kuang G.-Y. [1]
Liu L. [2]
Affiliations
[1] College of Electronic Science, National University of Defense Technology, Changsha
[2] College of Systems Engineering, National University of Defense Technology, Changsha
Funding
National Natural Science Foundation of China
Keywords
Computer vision; Deep learning; Lip reading; Spatiotemporal feature extraction; Visual speech recognition
DOI
10.16383/j.aas.c190531
Abstract
Lip reading, also known as visual speech recognition, aims to infer the content of speech from the motion of the speaker's mouth. It is an important problem in computer vision and pattern recognition, with applications in public security, medicine, national defense, and professional film production. In recent years, deep learning has greatly advanced lip reading research. Starting from the definition of the lip reading problem, this paper first describes the scope and significance of lip reading research and analyzes its difficulties and challenges. It then introduces recent achievements in the field and organizes, categorizes, and reviews the current mainstream methods, covering both traditional methods and recent methods based on deep learning. Finally, it discusses open problems and possible research directions, with the aim of drawing attention to this area and promoting progress on related issues. Copyright © 2020 Acta Automatica Sinica. All rights reserved.
Pages: 2275-2301
Page count: 26