Improved variable and value ranking techniques for mining categorical traffic accident data

Cited by: 6
Authors
Wang, HJ [1]
Parrish, A [1]
Smith, RK [1]
Vrbsky, S [1]
Affiliations
[1] Univ Alabama, Dept Comp Sci, Tuscaloosa, AL 35487 USA
Keywords
variable and feature selection; variable ranking; value ranking; performance
DOI
10.1016/j.eswa.2005.06.007
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The ever-increasing size of datasets used for data mining and machine learning applications has placed a renewed emphasis on algorithm performance and processing strategies. This paper addresses algorithms for ranking variables in a dataset, as well as for ranking values of a specific variable. We propose two new techniques, called Max Gain (MG) and Sum Max Gain Ratio (SMGR), which are well correlated with existing techniques, yet are much more intuitive. MG and SMGR were developed for the public safety domain using categorical traffic accident data. Unlike the typical abstract statistical techniques for ranking variables and values, the proposed techniques can be motivated as useful, intuitive metrics for non-statistician practitioners in a particular domain. Additionally, the proposed techniques are generally more efficient than the more traditional statistical approaches. © 2005 Elsevier Ltd. All rights reserved.
Pages: 795-806
Number of pages: 12