47 entries in total
- [1] Vinyals O, Toshev A, Bengio S, Erhan D, Show and tell: Lessons learned from the 2015 MSCOCO image captioning challenge, IEEE Transactions on Pattern Analysis and Machine Intelligence, 39, 4, pp. 652-663, (2017)
- [2] Li Liang, Yan Chenggang Clarence, Chen Xing, et al., Distributed image understanding with semantic dictionary and semantic expansion, Neurocomputing, 174, pp. 384-392, (2016)
- [3] Li Liang, Jiang Shuqiang, Huang Qingming, Learning hierarchical semantic description via mixed-norm regularization for image understanding, IEEE Transactions on Multimedia, 14, 5, pp. 1401-1413, (2012)
- [4] Antol S, Agrawal A, Lu Jiasen, et al., VQA: Visual question answering, Proceedings of the International Conference on Computer Vision, pp. 2425-2433, (2015)
- [5] Anderson P, He Xiaodong, Buehler C, et al., Bottom-up and top-down attention for image captioning and visual question answering, Proceedings of the Conference on Computer Vision and Pattern Recognition, pp. 6077-6086, (2018)
- [6] Yu Zhou, Yu Jun, Cui Yuhao, et al., Deep modular co-attention networks for visual question answering, Proceedings of the Conference on Computer Vision and Pattern Recognition, pp. 6281-6290, (2019)
- [7] Xiao Junbin, Shang Xindi, Yao Angela, Chua Tat-Seng, NExT-QA: Next phase of question-answering to explaining temporal actions, Proceedings of the Conference on Computer Vision and Pattern Recognition, pp. 9777-9786, (2021)
- [8] Li Liang, Wang Shuhui, Jiang Shuqiang, Huang Qingming, Attentive recurrent neural network for weak-supervised multi-label image classification, Proceedings of the International Conference on Multimedia, pp. 1092-1100, (2018)
- [9] Zhou Baohang, Cai Xiangrui, Zhang Ying, et al., MTAAL: Multi-task adversarial active learning for medical named entity recognition and normalization, Proceedings of the AAAI Conference on Artificial Intelligence, pp. 14586-14593, (2021)
- [10] Zhang Beichen, Li Liang, Su Li, et al., Structural semantic adversarial active learning for image captioning, Proceedings of the International Conference on Multimedia, pp. 1112-1121, (2020)