ISID

Test if variable(s) uniquely identify each observation in a dataset

If you want to use ISID, feel free to download it here:

 

Download isid.sas

 

If you are not prompted to "Save To Disk", right-click the link and choose "Save Link Target As...".  If that option is not available, you will need to save the web page to your computer.  Make sure you save the isid.sas file as a plain text file, not as an htm/html file.

 

Disclaimer:  There is no warranty on this software, either express or implied.  This program is released under the terms and conditions of the GNU General Public License.

 

About ISID

Programmer:  Dan Blanchette
The Carolina Population Center
The University of North Carolina
Chapel Hill, NC  USA

Date:  13Feb2009
Last updated:  06Mar2009


Check a dataset to see if variable(s) uniquely identify each observation

 

%isid(dset= libref.data_set_name, varlist= variable name(s) );

 

Description

The ISID SAS macro works like the Stata command isid.  It tests whether the variable(s) uniquely identify each observation in the dataset.  It's a good idea to run ISID before doing a data merge so that you know your merge variables are appropriate to use for the merge.  In most cases you do not want to do a merge that generates the SAS note:
 NOTE: MERGE statement has more than one data set with repeats of BY values.
If no SAS dataset name (including the libref if not in the WORK library) is provided, ISID will check the most recently created SAS dataset.  If no variable list is provided, ISID will use the variable(s) the dataset is sorted by.
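
As a rough sketch of the pre-merge check described above, the following builds two small datasets and runs ISID on each before they would be merged by id.  The datasets work.people and work.visits and the variable id are made up for illustration, and the %include path assumes isid.sas was saved to "c:\SASmacros".

```sas
%include "c:\SASmacros\isid.sas";

/* Hypothetical lookup table: one row per id */
data work.people;
  input id name $;
  datalines;
1 Ann
2 Bob
3 Cy
;
run;

/* Hypothetical event table: id 1 appears twice */
data work.visits;
  input id visit_date :date9.;
  format visit_date date9.;
  datalines;
1 02Jan2009
1 15Jan2009
2 03Feb2009
;
run;

/* Check the intended merge key in each dataset before merging */
%isid(dset= work.people, varlist= id);
%isid(dset= work.visits, varlist= id);
```

Here id should pass the check in work.people and fail it in work.visits.  A key that is unique in only one of the two datasets means a one-to-many merge, which is often intended; when the key repeats in both datasets, the merge produces the SAS note shown above.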

 

Options

Option
Explanation
dset= name of dataset
The string following "dset=" is the SAS dataset name (and libref if not in the WORK library).  If dset= is not used, then ISID will use the most recently created SAS dataset.

varlist= list of variable(s) to check
The string following "varlist=" is a list of variables in the dataset to check to see if they uniquely identify each observation in the dataset.  If varlist= is not used, then ISID will use the variables the dataset is sorted by.

missok=missok
missok specifies that it is okay for the variables being checked by ISID to have missing values.  Missing values in variables that are supposed to uniquely identify an observation are generally not a good idea.  With the missok option, ISID writes a note to the SAS log naming the checked variables that have at least one missing value.  Without the missok option, ISID returns an error message in that case.

 

How to use the ISID macro:

Using the ISID SAS macro requires that you understand how to use the %include SAS statement and that you know how to call a SAS macro.  The general form of the %include statement is:
   %include "LOCATION AND NAME OF A FILE THAT CONTAINS SAS CODE";
For example, if you have copied this file to "c:\SASmacros", then you tell SAS about this macro by adding the following line to your SAS program:
  %include "c:\SASmacros\isid.sas"; 
The %include statement makes SAS aware of the ISID macro which is in the file "isid.sas".  To use the macro you have to make a call to it.  To do that you add a line like the following to your SAS program:
  %isid(dset= sashelp.shoes, varlist= stores); 
The information inside the parentheses is passed on to the ISID macro.  The string following "dset=" is the SAS dataset name (and libref if not in the WORK library).  The string following "varlist=" is one or more variables to test for whether they uniquely identify each observation in the dataset.  If the variables do not uniquely identify each observation, an error message like the following is written to the SAS log:
 ERROR: variables  Product Stores do not uniquely identify each observation
and a note will be returned if they do:
 NOTE: variables Product Subsidiary Stores uniquely identify each observation
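
When ISID reports an error, it can help to see which combinations of values are repeated.  The PROC SQL step below is one possible way to list them for sashelp.shoes.  It is a separate sketch and not part of the ISID macro, and the column name n_obs is chosen only for illustration.

```sas
/* List every Region/Stores combination that occurs more than once */
proc sql;
  title "Region/Stores combinations that occur more than once";
  select region, stores, count(*) as n_obs
    from sashelp.shoes
    group by region, stores
    having count(*) > 1
    order by n_obs desc;
quit;
title;
```

If the query returns no rows, the listed variables uniquely identify each observation.  Otherwise each row shows a combination of values and how many observations share it.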

 

Examples

 
  %include "C:\SASmacros\isid.sas"; ** Include macro once in a SAS session and call it **;
                                     *  as many times as you like in that session.     **;
  
  %isid(dset= sashelp.shoes, varlist= region product stores sales);

  %isid(dset= sashelp.shoes, varlist= region stores);


  ** if the most recently created dataset is sorted then no dataset name 
   *  nor variable list is required **;

  proc sort data= sashelp.shoes
             out= work.new;
    by region product;
  run;

  %isid;

  %isid(dset= new);

  %isid(varlist= region product stores sales);

  %isid(varlist= region product stores sales, missok=missok);
