Hierarchical approaches to Text-based Offense Classification

Cited by: 1
Authors
Choi, Jay [1]
Kilmer, David [2]
Mueller-Smith, Michael [1]
Taheri, Sema A. [2]
Affiliations
[1] Univ Michigan, Ann Arbor, MI 48109 USA
[2] Measures for Justice, Rochester, NY USA
Funding
National Science Foundation (USA);
Keywords
REPORTING SYSTEM; CRIME; FUTURE;
DOI
10.1126/sciadv.abq8123
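Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];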
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Researchers working with administrative crime data often must classify offense narratives into a common scheme for analysis purposes. No comprehensive standard currently exists, nor is there a mapping tool to transform raw descriptions into offense types. This paper introduces a new schema, the Uniform Crime Classification Standard (UCCS), and the Text-based Offense Classification (TOC) tool to address these shortcomings. The UCCS schema draws from existing efforts, aiming to better reflect offense severity and improve type disambiguation. The TOC tool is a machine learning algorithm that uses a hierarchical, multilayer perceptron classification framework, built on 313,209 hand-coded offense descriptions from 24 states, to translate raw descriptions into UCCS codes. We test how variations in data processing and modeling approaches affect recall, precision, and F1 scores to assess their relative influence on model performance. The code scheme and classification tool are collaborations between Measures for Justice and the Criminal Justice Administrative Records System.
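The abstract evaluates models by recall, precision, and F1. As a rough illustration of how those three scores relate for a multi-class labeling task, the sketch below computes them per class in plain Python. It is not the authors' evaluation code: the function name `per_class_scores` and the three category labels are hypothetical placeholders, not UCCS codes or TOC outputs.

```python
from collections import Counter


def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label seen."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[guess] += 1   # predicted class was wrong
            fn[truth] += 1   # true class was missed

    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]   # times the label was predicted
        actual = tp[label] + fn[label]      # times the label was the truth
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        # F1 is the harmonic mean of precision and recall.
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


# Hypothetical hand-coded labels versus model predictions (illustrative only).
y_true = ["violent", "property", "drug", "property", "violent", "drug"]
y_pred = ["violent", "property", "property", "property", "drug", "drug"]

scores = per_class_scores(y_true, y_pred)
for label, (p, r, f1) in scores.items():
    print(f"{label}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Reporting the scores per class rather than as a single accuracy figure shows whether a model trades missed cases (low recall) for false alarms (low precision) within a given category.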
Pages: 15