50 entries in total
- [31] An audio-visual approach to web video categorization [J]. Multimedia Tools and Applications, 2014, 70 : 1007 - 1032
- [33] An audio-visual distance for audio-visual speech vector quantization [J]. 1998 IEEE SECOND WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING, 1998 : 523 - 528
- [34] Catching audio-visual mice: The extrapolation of audio-visual speed [J]. PERCEPTION, 2003, 32 : 96 - 96
- [35] Learning word-like units from joint audio-visual analysis [J]. PROCEEDINGS OF THE 55TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2017), VOL 1, 2017 : 506 - 517
- [36] Joint audio-video processing for biometric speaker identification [J]. 2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL II, PROCEEDINGS: SPEECH II; INDUSTRY TECHNOLOGY TRACKS; DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS; NEURAL NETWORKS FOR SIGNAL PROCESSING, 2003 : 377 - 380
- [37] Joint audio-video processing for biometric speaker identification [J]. 2003 INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOL III, PROCEEDINGS, 2003 : 561 - 564
- [38] Spoken Moments: Learning Joint Audio-Visual Representations from Video Descriptions [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021 : 14866 - 14876
- [39] Somatosensory contribution to audio-visual speech processing [J]. CORTEX, 2021, 143 : 195 - 204