Image captioning in Turkish language: Database and model

Cited by: 4

Authors:

Yildiz, Tugba [1]

Sonmez, Elena Battini [1]

Yilmaz, Berk Dursun [1]

Demir, Ali Emre [1]

Affiliations:

[1] Istanbul Bilgi Univ, Dept Comp Engn, TR-34060 Istanbul, Turkey

Source:

JOURNAL OF THE FACULTY OF ENGINEERING AND ARCHITECTURE OF GAZI UNIVERSITY | 2020, Vol. 35, Issue 04

Keywords:

Turkish image captioning; Turkish MS COCO database; computer vision; natural language proc; CNN; RNN;

DOI:

10.17341/gazimmfd.597089

Chinese Library Classification (CLC) number:

T [Industrial Technology];

Discipline classification code:

08;

Abstract:

Automatic image captioning is a challenging problem in artificial intelligence that spans both computer vision and natural language processing. Inspired by recent advances in machine translation, the encoder-decoder technique is currently the state of the art in English-language captioning. In this study, we propose an image captioning model for the Turkish language. The paper evaluates the encoder-decoder model on the MS COCO database by coupling an encoder Convolutional Neural Network (CNN), the component responsible for extracting features from the given images, with a decoder Recurrent Neural Network (RNN), the component responsible for generating captions from the given inputs, to generate Turkish captions. We conducted the experiments using the most common evaluation metrics: BLEU, METEOR, ROUGE and CIDEr. Results show that the performance of the proposed model is satisfactory in both qualitative and quantitative evaluations. Finally, this study introduces a Web platform (http://mscoco-contributor.herokuapp.com/website/), which is free to use and is intended to improve the dataset via crowdsourcing. The Turkish MS COCO dataset is available for research purposes. When all the images are completed, a Turkish dataset will be available for comparative studies.
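As an illustration of the first metric named in the abstract, the following is a minimal sketch of sentence-level BLEU (clipped n-gram precision combined with a brevity penalty). It is not the authors' evaluation code, which the abstract does not describe. The function names, the whitespace tokenization, the choice of `max_n=2`, the absence of smoothing, and the Turkish example captions are all assumptions made for this sketch.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, references, n):
    """Clipped n-gram precision: each candidate n-gram is credited at most
    as many times as it occurs in any single reference."""
    cand_counts = ngrams(candidate, n)
    if not cand_counts:
        return 0.0
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def bleu(candidate, references, max_n=2):
    """Sentence-level BLEU with uniform weights and no smoothing."""
    if not candidate:
        return 0.0
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0
    # Brevity penalty uses the reference length closest to the candidate length
    # (ties resolved in favor of the shorter reference).
    c = len(candidate)
    r = min((len(ref) for ref in references),
            key=lambda length: (abs(length - c), length))
    bp = 1.0 if c > r else math.exp(1 - r / c)
    log_mean = sum(math.log(p) for p in precisions) / max_n
    return bp * math.exp(log_mean)


# Hypothetical Turkish captions, made up for illustration only.
candidate = "bir köpek çimlerde koşuyor".split()
references = [
    "bir köpek çimlerde koşuyor".split(),
    "çimlerde koşan bir köpek".split(),
]
score = bleu(candidate, references)
```

Splitting on whitespace is the simplest possible tokenization. Because Turkish is agglutinative, a real evaluation may tokenize differently, and that choice can change the resulting scores.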


Pages: 2089-2100

Number of pages: 12

