Look, Read and Ask: Learning to Ask Questions by Reading Text in Images

Cited by: 2
Authors
Jahagirdar, Soumya [1]
Gangisetty, Shankar [1]
Mishra, Anand [2]
Affiliations
[1] KLE Technol Univ, Hubballi, India
[2] Indian Inst Technol Jodhpur, Vis Language & Learning Grp VL2G, Jodhpur, Rajasthan, India
Keywords
Visual question generation (VQG); Conversational AI; Visual question answering (VQA); Scene text
DOI
10.1007/978-3-030-86549-8_22
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We present a novel problem: text-based visual question generation, or TextVQG for short. Given the growing interest of the document image analysis community in combining text understanding with conversational artificial intelligence, e.g., text-based visual question answering, TextVQG is an important task. TextVQG aims to generate a natural language question for a given input image and a text automatically extracted from it, also known as an OCR token, such that the OCR token is the answer to the generated question. TextVQG is an essential ability for a conversational agent. However, it is challenging because it requires an in-depth understanding of the scene and the ability to semantically bridge the visual content with the text present in the image. To address TextVQG, we present an OCR-consistent visual question generation model that Looks into the visual content, Reads the scene text, and Asks a relevant and meaningful natural language question. We refer to our proposed model as OLRA. We perform an extensive evaluation of OLRA on two public benchmarks and compare it against baselines. OLRA automatically generates questions similar to those in public text-based visual question answering datasets, which were curated manually. Moreover, it significantly outperforms baseline approaches on the performance measures commonly used in the text generation literature.
Pages: 335-349
Page count: 15