Making PDFs Accessible for Visually Impaired Users (and Findable for Everybody Else)

Cited by: 1
Authors
van Heusden, Ruben [1]
Ling, Hazel [1]
Nelissen, Lars [1]
Marx, Maarten [1]
Affiliations
[1] Univ Amsterdam, Informat Inst, Informat Retrieval Lab, Amsterdam, Netherlands
Keywords
Optical Character Recognition; Corpus Curation; Quality Control; Digital Libraries
DOI
10.1007/978-3-031-43849-3_21
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We treat documents released under the Dutch Freedom of Information Act as FAIR scientific data and find that they are neither findable nor accessible, due to text malformations caused by redaction software. Our aim is to repair these documents. We propose a simple but strong heuristic for detecting wrongly OCRed text segments, and we then repair only these OCR mistakes by prompting a large language model. This makes the documents easier to find through full-text search, but the repaired PDFs still do not adhere to accessibility standards. Converting them into HTML documents, keeping all essential layout and markup, not only makes them accessible to the visually impaired but also reduces their size by up to two orders of magnitude. The cost of this repair is roughly one dollar for the 17K pages in our corpus, which is very little compared to the large gains in information quality.
Pages: 239-245
Number of pages: 7