USESAS
Stata command to load SAS datasets into Stata. Once your data is in Stata it is a Stata dataset and should be treated as such. Check out Stata's nearly equivalent command fdause that allows you to use SAS transport/xport datasets.
If you want to use usesas, then feel free to use Stata's command ssc install to download and install usesas:
ssc install usesas, replace
Disclaimer: There is no warranty on this software either expressed or implied. This program is released under the terms and conditions of GNU General Public License.
About usesas
Programmer: Dan Blanchette (dan_blanchette@unc.edu)
The Carolina Population Center
The University of North Carolina
Chapel Hill, NC USA
Last updated: 09Dec2010
Use a SAS dataset in Stata 8.1+
usesas using filename [ , formats char2lab check clear xport describe keep(variable names) if(SAS if statement) in(firstobs/lastobs) quotes messy ]
Description
usesas loads a SAS data file into memory. This usually occurs by supplying usesas a SAS dataset (*.sas7bdat, *.sd7, *.sd2, *.ssd01, *.xpt) or an SPSS portable file (*.por), but usesas can also load a SAS dataset into memory via a SAS program (*.sas) that creates a SAS dataset. The last dataset created by the SAS program will be the SAS dataset processed by usesas.
usesas figures out how much memory the SAS dataset will require to be loaded into Stata and sets Stata's
memory for you.
usesas indicates that it has finished running by reporting how many observations and variables are in the dataset now in memory. For example, Stata might report that the dataset has 200 observations and 11 variables.
usesas uses the SAVASTATA SAS macro to create the Stata dataset from the SAS dataset. Stata's net install command will download the SAVASTATA SAS macro for you.
NOTE: usesas calls SAS to run a SAS program. This requires the ability to run SAS on your computer.
Options
formats specifies to create value labels from SAS user-defined formats that are stored in a SAS formats catalog file that has the same name as the dataset and is in the same directory as the SAS dataset. For example: MySasData.sas7bcat. If this file doesn't exist, usesas will look for the file formats.sas7bcat in the same directory as the dataset.
char2lab specifies to encode long SAS character variables like the Stata command encode. Character variables that are too long for a Stata string variable are maintained in value labels. This is all done with the CHAR2FMT SAS macro.
check specifies that basic statistics be generated for both datasets so that the SAS input dataset can be compared with the Stata output dataset to make sure usesas created the files correctly. This comparison should be done after any data file is converted to another type of data file by any software. The SAS file is created in the same directory as the SAS data file and is named with the name of the data file followed by "_SAScheck.lst", e.g. "mydata_SAScheck.lst".
clear specifies to clear the data currently in memory before running usesas.
float specifies that numeric variables that would otherwise be stored as numeric type double be stored with numeric type float. This option should only be used if you are certain you have no integer variables that have more than 7 digits (like an ID variable).
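The 7-digit limit reflects the precision of Stata's float storage type. The following minimal sketch runs in plain Stata with no SAS dataset involved. The variable names id_d and id_f are made up for illustration. It stores the same 9-digit ID as a double and as a float, and shows that only the double keeps it exactly.

```stata
* Illustration only: id_d and id_f are hypothetical variable names
clear
set obs 1
generate double id_d = 123456789
generate float  id_f = 123456789

* Show both values without scientific notation
format id_d id_f %12.0f
list id_d id_f

* The double holds the ID exactly; the float has been rounded
assert id_d == 123456789
assert id_f != 123456789
```

If an ID variable like this exists in the SAS dataset, leaving the float option off keeps it intact.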
xport specifies that the input dataset is a SAS Transport/Xport dataset. Since there is no standard
file extension for SAS Xport datasets, this option is required.
describe makes usesas act somewhat like the Stata command describe using. It does not bring the full dataset into memory. Instead, usesas loads only the descriptive information about the using dataset (variable names, variable types, and the variable labels and formats associated with the variables) into Stata's memory as a Stata dataset and prints it. You can clear the descriptive data out of Stata's memory, or use it however you like to create variable lists for your actual invocation of usesas. This may be helpful when the SAS dataset has more variables than your version of Stata can handle: you can create a variable list from the variable called "name" and use it in another invocation of usesas to read in only the variables you need.
If you do not want to have the describe option list the descriptive information of the imported dataset, you can use the option listnot with describe. The descriptive information will still be loaded into Stata as a Stata dataset. The descriptive data are sorted in the variable order of the using dataset so a variable list for usesas could be created like so:
. display "`= trim(name[1])'--`= name[2047]'"
id--income88
which could then be used like so to keep the first 2,047 variables in the using dataset (2,047 is the maximum number of variables that Stata Intercooled can handle):
. usesas using "mySASdata.sas7bdat", clear keep(`= trim(name[1])'--`= name[2047]')
SAS variable lists using two dashes "--" tell SAS to use the variables that exist positionally between the first variable and the last variable in the using dataset, inclusive. Read more about this under the documentation of the keep option. The describe option makes usesas return the following in r():
| Scalars: | |
| r(N) | number of observations in using dataset |
| r(k) | number of variables in using dataset |
| Macros: | |
| r(varlist) | variables in using dataset |
| r(sortlist) | variables by which using data are sorted |
The above scalars and macros contain information about the dataset that was described, not about the dataset of descriptive information that usesas loaded into Stata with the describe option.
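As a rough sketch of how these returned results might be used, the lines below copy them into local macros right after a describe run and then check the variable count against the 2,047-variable limit mentioned above. The file name is the placeholder used elsewhere on this page. The local macro names are hypothetical, and combining describe with clear is an assumption.

```stata
* Sketch: "mySASdata.sas7bdat" is a placeholder file name
usesas using "mySASdata.sas7bdat", describe clear

* Copy the r() results into locals before running other commands
local nobs  = r(N)
local nvars = r(k)
local sortedby `r(sortlist)'

display "observations in SAS dataset: `nobs'"
display "variables in SAS dataset:    `nvars'"
display "sorted by: `sortedby'"

* 2,047 is the Stata Intercooled limit quoted above
if `nvars' > 2047 {
    display "too many variables for Stata Intercooled; use keep() to subset"
}
```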
keep allows for a list of variables from the SAS dataset to be read in. This list is used in the SAS code portion of usesas, so it must be written in the SAS varlist style. SAS does not allow varlists to contain stars/asterisks ( * ) or question marks ( ? ).
For example:
keep(var1-var20) includes only vars that start with "var" and end in a number between 1 and 20.
keep(var1 var20) includes only the two vars var1 and var20.
keep(var1--var20) includes only vars that are positioned in the dataset between var1 and var20, inclusive. This is like Stata's varlist style var1-var20.
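Because SAS varlists cannot contain wildcards, one possible workaround is to build the list in Stata from the descriptive dataset that the describe option loads. The sketch below assumes that dataset's variable "name" holds the SAS variable names, as described under describe. The prefix "inc" and the macro name keeplist are hypothetical.

```stata
* Sketch: emulate a wildcard such as inc*, which keep() does not accept
usesas using "mySASdata.sas7bdat", describe clear

* Keep only the rows describing variables whose names start with "inc"
keep if substr(lower(name), 1, 3) == "inc"

* Collect the remaining names into one space-separated list
local keeplist
forvalues i = 1/`=_N' {
    local keeplist `keeplist' `=trim(name[`i'])'
}

* Read in just those variables from the SAS dataset
usesas using "mySASdata.sas7bdat", clear keep(`keeplist')
```

If no names match, keeplist is empty, so it is worth checking the list before the final call.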
if allows for a SAS if statement to subset the data before it's read in. Any valid SAS style if statement will work.
in allows for subsetting the data before it's read in. Use only #/# where both numbers are positive, for example 1/30 for the first 30 observations.
quotes specifies that double quotes that exist in string variables are to be replaced with single quotes. Since the data are written out to an ASCII file and then read into Stata, there are rare instances when double quotes are not allowed inside string variables.
messy specifies that all the intermediary files created by usesas during its operation are not to be deleted. The messy option prevents usesas from cleaning up after it has finished. This option is mostly useful for debugging purposes in order to find out where something went wrong. All intermediary files have a name starting with an underscore ( _ ) followed by the process ID and are located in Stata's temp directory.
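To look at the files that messy leaves behind, something like the following sketch may help. It assumes your version of Stata provides the c(tmpdir) system value for the temp directory, which may not hold for the oldest releases usesas supports. The file name is a placeholder.

```stata
* Sketch: keep the intermediary files, then list them
usesas using "mySASdata.sas7bdat", clear messy

* Show where Stata's temp directory is (assumes c(tmpdir) is available)
display "`c(tmpdir)'"

* List files in that directory whose names start with an underscore
dir "`c(tmpdir)'/_*"
```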
Examples
. usesas using mySASdata.sas7bdat
. usesas using "c:\data\mySASdata.sas7bdat", check
. usesas using "mySASdata.xpt", xport
. usesas using "mySASdata.sas7bdat", formats
. usesas using "mySASdata.sas7bdat", keep(id--qvm203a) if(1980<year<2000) in(1/500)
. usesas using "mySASdata.sas7bdat", describe
. usesas using "mySASdata.sas7bdat", describe nolist
// then submit the following actual invocation of usesas:
. usesas using "mySASdata.sas7bdat", clear keep(`r(sortlist)' `= trim(name[1])'--`= name[2047]')
Setting up usesas
NOTE: If you are setting up this program on your computer for the first time, you may need to edit the sasexe.ado file to set the location of your SAS executable file (sas.exe). If you do not, usesas will try to set it for you. usesas may also need to have the location of the SAS macros savastata.sas and char2fmt.sas set. The sasexe.ado file is an ASCII text file and should be saved as such after editing. Stata's do-file editor will do the job.
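One way to find the installed sasexe.ado file before editing it is sketched below. It uses the standard Stata commands which, findfile, and doedit, and assumes sasexe.ado was installed somewhere along your ado-path.

```stata
* Sketch: confirm usesas is installed and locate sasexe.ado
which usesas

* findfile searches the ado-path and returns the full path in r(fn)
findfile sasexe.ado
display "`r(fn)'"

* Open the file in the do-file editor to set the SAS executable location
doedit "`r(fn)'"
```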


