Examples from a current project
Original N's:
- records in original raw data file: 4,398,590 persons
- number of id values occurring 2+ times: 81,532
- total # of records whose id occurs 2+ times: 170,029
- # of observations in SAS data set of persons after removing all records with a duplicated id: 4,228,561 (= 4,398,590 - 170,029)
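The counts above can be reproduced on a small scale. The following is a minimal Python sketch of the bookkeeping, not the project's actual SAS code; the function name `drop_duplicated_ids`, the `id` field name, and the toy records are assumptions made for illustration. It drops every record whose id occurs two or more times (not just the extra copies) and reports the same four N's.

```python
from collections import Counter


def drop_duplicated_ids(records, key="id"):
    """Drop every record whose id occurs 2+ times; return (kept, summary)."""
    # Count how many records carry each id value.
    counts = Counter(rec[key] for rec in records)
    # Ids that occur two or more times are treated as unusable.
    dup_ids = {value for value, n in counts.items() if n >= 2}
    # Keep only records whose id is unique in the file.
    kept = [rec for rec in records if rec[key] not in dup_ids]
    summary = {
        "n_raw": len(records),                       # records in raw file
        "n_dup_ids": len(dup_ids),                   # id values occurring 2+ times
        "n_dup_records": len(records) - len(kept),   # records with a duplicated id
        "n_kept": len(kept),                         # observations remaining
    }
    return kept, summary


# Hypothetical toy data: id 2 occurs twice, id 3 occurs three times.
people = [{"id": 1}, {"id": 2}, {"id": 2}, {"id": 3},
          {"id": 3}, {"id": 3}, {"id": 4}]
kept, summary = drop_duplicated_ids(people)

# The counts reported above satisfy the same identity:
# raw records - records with a duplicated id = observations kept.
assert 4_398_590 - 170_029 == 4_228_561
```

With the toy data, two id values are duplicated, five records are dropped, and two remain. Note that the number of dropped records exceeds twice the number of duplicated ids whenever some id occurs three or more times, which is also the case in the project counts (170,029 > 2 x 81,532).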
Files built for analysis, including intermediate files:
- N's: 1,408,233 to 252,172,992 observations
- File sizes: 75 MB to 6.8 GB
- Job times: real (elapsed) time from 26 1/2 minutes to 3 1/2 hours; CPU time from 21 1/2 minutes to 2 3/4 hours