50 entries in total
- [1] Deconfounded Multimodal Learning for Spatio-temporal Video Grounding [J]. PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023: 7521-7529
- [2] Deconfounded Visual Question Generation with Causal Inference [J]. PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023: 5132-5142
- [5] Flexible Visual Grounding [J]. PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022): STUDENT RESEARCH WORKSHOP, 2022: 285-299
- [7] Visual-Semantic Graph Matching for Visual Grounding [J]. MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, 2020: 4041-4050
- [9] Grounding Language in Visual and Conversational Contexts [J]. WEB CONFERENCE 2021: COMPANION OF THE WORLD WIDE WEB CONFERENCE (WWW 2021), 2021: 366
- [10] Visual Grounding via Accumulated Attention [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 7746-7755