50 items in total
- [31] Large Language Models are Temporal and Causal Reasoners for Video Question Answering. 2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2023, 2023: 4300-4316
- [32] Application of temporal information extraction techniques to question answering systems. PROCESAMIENTO DEL LENGUAJE NATURAL, 2009, (42): 25-30
- [33] Spatio-Temporal Graph Convolution Transformer for Video Question Answering. IEEE ACCESS, 2024, 12: 131664-131680
- [34] Dynamic Spatio-Temporal Modular Network for Video Question Answering. PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022: 4466-4477
- [36] Video Captioning Based on the Spatial-Temporal Saliency Tracing. ADVANCES IN MULTIMEDIA INFORMATION PROCESSING, PT I, 2018, 11164: 59-70
- [37] Deep Video Harmonization by Improving Spatial-temporal Consistency. Machine Intelligence Research, 2024, 21: 46-54
- [38] Spatial-Temporal Separable Attention for Video Action Recognition. 2022 INTERNATIONAL CONFERENCE ON FRONTIERS OF ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING, FAIML, 2022: 224-228
- [40] ShiftFormer: Spatial-Temporal Shift Operation in Video Transformer. 2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023: 1895-1900