Making PDFs Accessible for Visually Impaired Users (and Findable for Everybody Else)

Cited by: 1
Authors
van Heusden, Ruben [1]
Ling, Hazel [1]
Nelissen, Lars [1]
Marx, Maarten [1]
Affiliations
[1] Univ Amsterdam, Informat Inst, Informat Retrieval Lab, Amsterdam, Netherlands
Keywords
Optical Character Recognition; Corpus Curation; Quality Control; Digital Libraries
DOI
10.1007/978-3-031-43849-3_21
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
We treat documents released under the Dutch Freedom of Information Act as FAIR scientific data and find that they are neither findable nor accessible, due to text malformations caused by redaction software. Our aim is to repair these documents. We propose a simple but strong heuristic for detecting wrongly OCRed text segments, and we then repair only these OCR mistakes by prompting a large language model. This makes the documents easier to find through full-text search, but the repaired PDFs still do not adhere to accessibility standards. Converting them into HTML documents, keeping all essential layout and markup, not only makes them accessible to the visually impaired but also reduces their size by up to two orders of magnitude. The cost of repairing the documents this way is roughly one dollar for the 17K pages in our corpus, which is very little compared to the large gains in information quality.
Pages: 239-245
Page count: 7