Taking the drudgery out of data checking: Automatic data validation using FORMATS to validate the data, PROC DATASETS to drive the process and MACROS to hang it all together.
Author: Trenery, D
Affiliation: Hoechst Mar Roussel, Uxbridge UB9 5HP, Middx, England
Checking data is a tedious but necessary stage before data analysis. This paper describes a utility that takes some of the drudgery out of the data checking process and enables one to be more confident about the quality of the data. You often have formats that label valid data values and are permanently associated with variables in a data set. This utility takes advantage of these formats to validate the data. I use the DATASETS procedure to drive the validation process: it provides a list of variables and their formats. Modified versions of those formats then validate the data. Macros, the SQL procedure and the DATA step provide the tools to link the process together. You supply the name of the data set you wish to check and, optionally, any identifying variable. The utility does the rest, listing observations with invalid values for those variables that have an associated format. This utility has been tested and run on SAS(R) software versions 6.11 and 6.12. It should run on all operating systems. The intended audience is SAS users of moderate or greater experience.
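
The idea in the abstract can be illustrated for a single variable with the minimal sketch below. It is not the paper's code. The data set `work.demo`, the variables `patid` and `sex`, and the formats `sexfmt` and `sexchk` are hypothetical names invented for the example. The abstract does not say how the utility modifies formats, so the check format here is written by hand and simply adds an OTHER range; the real utility presumably generates this step with macros.

```sas
/* Hypothetical format that labels the valid values of SEX */
proc format;
   value sexfmt 1 = 'Male'
                2 = 'Female';
run;

/* Hypothetical data set with the format permanently associated */
data work.demo;
   input patid sex;
   format sex sexfmt.;
   datalines;
101 1
102 2
103 9
104 .
;
run;

/* Step 1: ask PROC DATASETS for the variables and their formats.
   The OUT= data set has one row per variable (NAME, TYPE, FORMAT). */
proc datasets library=work nolist;
   contents data=demo out=work.varlist(keep=name type format) noprint;
quit;

proc print data=work.varlist;
   where format ne ' ';   /* only variables with an associated format */
run;

/* Step 2 (assumption): a "modified" format that maps every value
   labelled by SEXFMT to OK and everything else to INVALID.
   Missing values also fall into OTHER and are therefore flagged. */
proc format;
   value sexchk 1, 2  = 'OK'
                other = 'INVALID';
run;

/* Step 3: list the observations whose value the original format
   does not cover, together with the identifying variable */
data work.invalid;
   set work.demo;
   if put(sex, sexchk.) = 'INVALID';
run;

proc print data=work.invalid;
   var patid sex;
run;
```

In this sketch the final listing should contain patients 103 (unlabelled value 9) and 104 (missing value). Whether missing values should count as invalid is a design choice that the abstract does not address.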