50 items in total
- [21] Deep medical cross-modal attention hashing [J]. WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2022, 25 (04): 1519-1536
- [22] Utilizing visual attention for cross-modal coreference interpretation [J]. MODELING AND USING CONTEXT, PROCEEDINGS, 2005, 3554: 83-96
- [24] Mucko: Multi-Layer Cross-Modal Knowledge Reasoning for Fact-based Visual Question Answering [J]. PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020: 1097-1103
- [25] Gated Multi-modal Fusion with Cross-modal Contrastive Learning for Video Question Answering [J]. ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2023, PT VII, 2023, 14260: 427-438
- [27] VCD: Visual Causality Discovery for Cross-Modal Question Reasoning [J]. PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT VII, 2024, 14431: 309-322
- [29] Cross-modal body representation based on visual attention by saliency [J]. 2008 IEEE/RSJ INTERNATIONAL CONFERENCE ON ROBOTS AND INTELLIGENT SYSTEMS, VOLS 1-3, CONFERENCE PROCEEDINGS, 2008: 2041-+