Image captioning in Turkish language: Database and model

Cited by: 4
Authors
Yildiz, Tugba [1]
Sonmez, Elena Battini [1]
Yilmaz, Berk Dursun [1]
Demir, Ali Emre [1]
Affiliation
[1] Istanbul Bilgi Univ, Dept Comp Engn, TR-34060 Istanbul, Turkey
Keywords
Turkish image captioning; Turkish MS COCO database; computer vision; natural language processing; CNN; RNN
DOI
10.17341/gazimmfd.597089
Chinese Library Classification number
T [Industrial Technology];
Discipline classification code
08 ;
Abstract
Automatic image captioning is a challenging problem in artificial intelligence that spans both computer vision and natural language processing. Inspired by recent advances in machine translation, the encoder-decoder technique is currently the state of the art in English-language captioning. In this study, we propose an image captioning model for the Turkish language. The paper evaluates the encoder-decoder model on the MS COCO database by coupling an encoder Convolutional Neural Network (CNN), the component responsible for extracting features from the given images, with a decoder Recurrent Neural Network (RNN), the component responsible for generating captions from the given inputs, to generate Turkish captions. We conducted the experiments using the most common evaluation metrics, such as BLEU, METEOR, ROUGE and CIDEr. Results show that the performance of the proposed model is satisfactory in both qualitative and quantitative evaluations. Finally, this study introduces a free-to-use Web platform (http://mscoco-contributor.herokuapp.com/website/) intended to improve the dataset via crowd-sourcing. The Turkish MS COCO dataset is available for research purposes. Once all the images have been completed, a Turkish dataset will be available for comparative studies.
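As a rough illustration of what one of the metrics named in the abstract measures, the following is a minimal sketch of unsmoothed sentence-level BLEU (clipped n-gram precision combined with a brevity penalty). It is not the authors' evaluation code; the function names and the example Turkish captions are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, references, n):
    """Clipped n-gram precision: a candidate n-gram is credited at most as
    many times as it occurs in the reference where it is most frequent."""
    cand_counts = ngrams(candidate, n)
    if not cand_counts:
        return 0.0
    max_ref_counts = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref_counts[gram] = max(max_ref_counts[gram], count)
    clipped = sum(min(count, max_ref_counts[gram])
                  for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def bleu(candidate, references, max_n=4):
    """Unsmoothed sentence-level BLEU with uniform n-gram weights."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # without smoothing, any zero precision gives BLEU = 0
    log_mean = sum(math.log(p) for p in precisions) / max_n
    # Reference length closest to the candidate length (ties: shorter one).
    ref_len = min((len(r) for r in references),
                  key=lambda length: (abs(length - len(candidate)), length))
    if len(candidate) > ref_len:
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - ref_len / len(candidate))
    return brevity_penalty * math.exp(log_mean)


# Hypothetical captions, tokenized on whitespace (not taken from the dataset).
references = ["bir kedi kanepede uyuyor".split(),
              "kanepede uyuyan bir kedi".split()]
candidate = "bir kedi kanepede uyuyor".split()
score = bleu(candidate, references)
```

Whitespace tokenization is a simplification here; for an agglutinative language such as Turkish, the choice of tokenization can noticeably affect n-gram overlap scores.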
Pages: 2089-2100
Number of pages: 12