50 entries in total
- [22] Video-Based Cross-Modal Recipe Retrieval [J]. Proceedings of the 27th ACM International Conference on Multimedia (MM'19), 2019: 1685-1693
- [23] VCVTS: Multi-Speaker Video-to-Speech Synthesis via Cross-Modal Knowledge Transfer from Voice Conversion [J]. 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022: 7252-7256
- [25] Cross-modal correspondences in sine wave: Speech versus non-speech modes [J]. Attention, Perception, & Psychophysics, 2020, 82: 944-953
- [26] End-to-End Voice Conversion via Cross-Modal Knowledge Distillation for Dysarthric Speech Reconstruction [J]. 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2020: 7744-7748
- [27] Video and audio are images: A cross-modal mixer for original data on video–audio retrieval [J]. Knowledge-Based Systems, 2024
- [28] Weakly Supervised Dense Video Captioning via Jointly Usage of Knowledge Distillation and Cross-modal Matching [J]. Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI 2021, 2021: 1157-1164
- [30] Infants Detect Cross-modal Cues to Identity in Speech and Singing [J]. Neurosciences and Music III: Disorders and Plasticity, 2009, 1169: 508-511