50 entries in total
- [21] A Hierarchical Multimodal Attention-based Neural Network for Image Captioning [J]. SIGIR'17: PROCEEDINGS OF THE 40TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2017 : 889 - 892
- [22] Incorporating the Graph Representation of Video and Text into Video Captioning [J]. 2022 IEEE 34TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, ICTAI, 2022 : 396 - 401
- [23] Video Captioning with Guidance of Multimodal Latent Topics [J]. PROCEEDINGS OF THE 2017 ACM MULTIMEDIA CONFERENCE (MM'17), 2017 : 1838 - 1846
- [24] Reconstruction Network for Video Captioning [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018 : 7622 - 7631
- [25] Graph Convolutional Neural Network for Multimodal Movie Recommendation [J]. 38TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING, SAC 2023, 2023 : 1633 - 1640
- [26] Using Spatial Temporal Graph Convolutional Network Dynamic Scene Graph for Video Captioning of Pedestrians Intention [J]. 2020 4TH INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND INFORMATION RETRIEVAL, NLPIR 2020, 2020 : 179 - 183
- [28] Discriminative Latent Semantic Graph for Video Captioning [J]. PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021 : 3556 - 3564
- [29] Pivot Correlational Neural Network for Multimodal Video Categorization [J]. COMPUTER VISION - ECCV 2018, PT XIV, 2018, 11218 : 402 - 417
- [30] Multimodal attention-based transformer for video captioning [J]. Applied Intelligence, 2023, 53 : 23349 - 23368