50 items in total
- [1] Low-Latency Streaming Scene-aware Interaction Using Audio-Visual Transformers. INTERSPEECH 2022, 2022: 4511-4515
- [2] Visually-Aware Audio Captioning With Adaptive Audio-Visual Attention. INTERSPEECH 2023, 2023: 2838-2842
- [3] VIDEO CAMERA IDENTIFICATION USING AUDIO-VISUAL FEATURES. 2014 5TH EUROPEAN WORKSHOP ON VISUAL INFORMATION PROCESSING (EUVIP 2014), 2014
- [4] AUDIO-VISUAL SCENE-AWARE DIALOG AND REASONING USING AUDIO-VISUAL TRANSFORMERS WITH JOINT STUDENT-TEACHER LEARNING. 2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022: 7732-7736
- [7] SYNCHRONIZED AUDIO-VISUAL FRAMES WITH FRACTIONAL POSITIONAL ENCODING FOR TRANSFORMERS IN VIDEO-TO-TEXT TRANSLATION. 2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2022: 2041-2045
- [8] Audio-visual quality and interactions between television audio and video. ISSPA 2001: SIXTH INTERNATIONAL SYMPOSIUM ON SIGNAL PROCESSING AND ITS APPLICATIONS, VOLS 1 AND 2, PROCEEDINGS, 2001: 438-441