50 items in total
- [1] Multi-pose lipreading and audio-visual speech recognition. EURASIP Journal on Advances in Signal Processing, 2012: 1-23
- [2] Part-Based Lipreading for Audio-Visual Speech Recognition. 2020 IEEE International Conference on Systems, Man, and Cybernetics (SMC), 2020: 2722-2726
- [3] Multi-pose lipreading and audio-visual speech recognition. EURASIP Journal on Advances in Signal Processing, 2012
- [4] AdaptFormer: Adapting Vision Transformers for Scalable Visual Recognition. Advances in Neural Information Processing Systems 35 (NeurIPS 2022), 2022
- [5] Vision Transformers and Transfer Learning Approaches for Arabic Sign Language Recognition. Applied Sciences-Basel, 2023, 13 (21)
- [6] A robust hierarchical lip tracking approach for lipreading and audio visual speech recognition. Proceedings of the 2004 International Conference on Machine Learning and Cybernetics, Vols 1-7, 2004: 3620-3624
- [7] Applying Generative Adversarial Networks and Vision Transformers in Speech Emotion Recognition. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2022, 13519 LNCS: 67-75
- [9] Concatenated Frame Image Based CNN for Visual Speech Recognition. Computer Vision - ACCV 2016 Workshops, Pt II, 2017, 10117: 277-289