Making PDFs Accessible for Visually Impaired Users (and Findable for Everybody Else)

Cited by: 1
Authors
van Heusden, Ruben [1]
Ling, Hazel [1]
Nelissen, Lars [1]
Marx, Maarten [1]
Affiliations
[1] Univ Amsterdam, Informat Inst, Informat Retrieval Lab, Amsterdam, Netherlands
Keywords
Optical Character Recognition; Corpus Curation; Quality Control; Digital Libraries
DOI
10.1007/978-3-031-43849-3_21
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
We treat documents released under the Dutch Freedom of Information Act as FAIR scientific data and find that they are neither findable nor accessible, owing to text malformations caused by redaction software. Our aim is to repair these documents. We propose a simple but strong heuristic for detecting wrongly OCRed text segments, and we then repair only these OCR mistakes by prompting a large language model. This makes the documents easier to find through full-text search, but the repaired PDFs still do not adhere to accessibility standards. Converting them into HTML documents, while keeping all essential layout and markup, not only makes them accessible to the visually impaired but also reduces their size by up to two orders of magnitude. This repair costs roughly one dollar for the 17K pages in our corpus, which is very little compared to the large gains in information quality.
Pages: 239-245
Page count: 7