Developing computer vision and machine learning strategies to unlock government-created records

Authors
Jansen, Greg [1]
Marciano, Richard [1]
Affiliations
[1] Univ Maryland, College Pk, MD 20742 USA
Keywords
Computer vision; Machine learning; Artificial intelligence; 1950 US Census records; Sacramento; WWII Japanese American incarceration
DOI
10.1007/s00146-025-02231-y
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper outlines the development of a proof-of-concept workflow using machine learning and computer vision techniques to unlock the data within digitized handwritten forms from the 1950 US Census. The 1950 US Census includes over 6.5 million page images and was made available to the public only recently, on April 1, 2022, following a 72-year access restriction period. Our project uses computational treatments to assist researchers in their efforts to recover and preserve the history of the erased Sacramento Japantown. Sacramento once housed the fourth largest Japantown in the United States before experiencing WWII Japanese American incarceration and the 1950s US Government program of urban renewal. The goal is to augment a researcher's work in selecting a subset of Census pages for further transcription and analysis. We demonstrate a workflow for extracting demographic information using computer vision for image segmentation and machine learning for handwritten character recognition. The workflow consists of a computational filtering process for Census records and a user interface for page review. These computational techniques are suitable for other cities, states, and communities, and demonstrate new strategies to unlock vital demographic information. The approach highlights the potential benefits of computational techniques for analyzing form-based historical records of the twentieth century, work that can have an impact on social justice.
Pages: 17