SAVAS (c-shell) script

Save SAS datasets as Stata datasets, or Stata datasets as SAS datasets. Based on the file extension, savas either makes SAS (version 6.09 or later) datafile copies of Stata datafiles (any version up to and including the version of Stata being run), or makes Stata (version 7 or later) datafile copies of SAS (version 6.09 or later) datafiles.

 

If you want to use savas on UNIX/Linux machines, then feel free to download it here:

Download savas.csh (c-shell script).

 

If you are not prompted to "Save To Disk", then right-click the link and choose "Save Link Target As..." Otherwise, you will need to save the web page as a plain text file (not as htm/html). You may want to subsequently rename the file from savas.csh to savas.

 

You will also need the SAS macros SAVASTATA and CHAR2FMT:

 

Download: savastata.sas and char2fmt.sas.

 

If you are not prompted to "Save To Disk", then right-click the link and choose "Save Link Target As...". Otherwise, you will need to save the web pages as plain text files (not as htm/html). The best way to download SAVASTATA and CHAR2FMT is to use Stata's command ssc install to get the Stata program usesas, which also uses the SAVASTATA and CHAR2FMT SAS macros. The SAS macro file savastata.sas will be in the same directory as other ado-files that start with the letter "s", and the char2fmt.sas file will be in the same directory as ones that start with the letter "c":

 ssc install usesas , replace

You will also need the Stata program savasas. Use Stata's command ssc install to get all the files required for savasas:

 ssc install savasas , replace

 

Here is the savas man page.

Click here to download the savas man page. If you are not prompted to "Save To Disk", then right-click the link and choose "Save Link Target As...". Otherwise, you will need to save the web page as a plain text file (not as htm/html).

 

Disclaimer: There is no warranty on this software, either expressed or implied. This program is released under the terms and conditions of the GNU General Public License.

 

About savas

Programmer: Dan Blanchette
The Carolina Population Center
The University of North Carolina
Chapel Hill, NC USA
Date: 02Dec2003
Last updated: 25Mar2008


Make a Stata datafile from a SAS datafile or a SAS datafile from a Stata datafile

 

savas [-options] DataSetName.ext ...

 

Examples

(The dollar sign indicates a UNIX/Linux prompt. Do not type the dollar sign to use savas.)
 $ savas mystata.dta
 $ savas mysas.sas7bdat
 $ savas -fmts mystata.dta
 $ savas -r mystata.dta
 $ savas -r ../group/mysas.sas7bdat
 $ savas -c ../group/mystata.dta
 $ savas analysis.dta analysis2.sas7bdat child_data.dta
 $ savas -fmts mysas.sas7bdat
 $ savas -x analysis.xpt analysis2.exp child_data.Apr02.stx

 

Description

savas uses both SAS and Stata installed on the same Linux/Unix machine to make copies of one or more SAS/Stata datasets as Stata/SAS files. The output dataset will have the same name, but with the appropriate filename extension:

 

SAS Version 9/8: .sas7bdat

SAS Version 6 UNIX: .ssd01

SAS Version 6 Linux: .ssd02

SAS 6 Transport/Xport: .xpt, .xport, .exp, .export, .sasx, .stx, .v5x, .v6x, .trans, or .expt file extensions plus whatever file extension the file might have.

SAS Transport files created by PROC CPORT: .cport and .ssp file extensions plus whatever file extension the file might have.

SPSS portable files: .por file extension.

Stata: .dta
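
The extension list above can be sketched as a small shell function. This is only an illustration and not code taken from savas: the function name input_kind and the labels it prints are hypothetical, and it assumes that only the final extension matters and that matching is case-sensitive.

```sh
# Hypothetical helper (not part of savas): classify a file by its final
# extension, following the list of recognized extensions above.
input_kind() {
    case "$1" in
        *.dta)                       echo stata ;;
        *.sas7bdat|*.ssd01|*.ssd02)  echo sas ;;
        *.xpt|*.xport|*.exp|*.export|*.sasx|*.stx|*.v5x|*.v6x|*.trans|*.expt)
                                     echo xport ;;
        *.cport|*.ssp)               echo cport ;;
        *.por)                       echo spss ;;
        *)                           echo unknown ;;
    esac
}

# Show how a few of the file names from the Examples section are classified.
for f in mystata.dta mysas.sas7bdat analysis2.exp child_data.Apr02.stx; do
    printf '%s -> %s\n' "$f" "$(input_kind "$f")"
done
```

A file such as child_data.Apr02.stx is classified by its last extension (.stx), so any extra extension earlier in the name is ignored in this sketch.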

 

savas can convert SPSS portable files to Stata thanks to SAS's SPSS read-only engine. savas recognizes these files by the ".por" file extension. NOTE: Starting in SPSS 11, SPSS will open and save SAS sas7bdat files.

By default the Stata/SAS file is created in the same directory as the SAS/Stata datafile, but with the appropriate filename extension and contains all observations and every variable in the SAS/Stata datafile. savas requires the use of both Stata and SAS on the same machine.
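
As a rough illustration of the default output location, the sketch below derives an output path from an input path. It is not taken from savas: the function name output_path is hypothetical, it assumes the default SAS output type is .sas7bdat, and it handles only the -c option described under Options.

```sh
# Hypothetical helper (not part of savas): compute where the converted
# file would be written. Usage: output_path INPUT [-c]
output_path() {
    in=$1
    dir=$(dirname "$in")      # default: same directory as the input file
    file=$(basename "$in")
    base=${file%.*}           # file name without its final extension
    case "$file" in
        *.dta) ext=sas7bdat ;;   # Stata input  -> SAS output (assumed default)
        *)     ext=dta ;;        # other input  -> Stata output
    esac
    if [ "${2:-}" = "-c" ]; then
        dir=.                 # -c/-curdir: write to the current directory
    fi
    printf '%s/%s.%s\n' "$dir" "$base" "$ext"
}

output_path ../group/mystata.dta
output_path ../group/mystata.dta -c
```

The first call prints a path in ../group, the second a path in the current working directory.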

savas cannot process files that have filenames or are in directories that contain single or double quotes.

The procedure is as follows:
    1. savas creates a Stata/SAS program that loads the Stata/SAS dataset into Stata/SAS and calls the savas Stata/SAS program.

    2. savas uses either Stata's command fdasave to save the dataset in memory temporarily as a SAS xport datafile or has SAS write the data to an ASCII text file.

    3. savas writes a Stata/SAS input program to load the dataset into Stata/SAS and to assign variable names, labels (and formats).

    4. savas runs the program in Stata/SAS in batch mode to load the data.

    5. Stata/SAS saves the data as whatever version Stata/SAS file type specified.

NOTE: When saving to an older version of SAS or Stata whose limit on variable name length is shorter than that of the dataset being processed, savas checks for variable names that are too long for the output dataset. If the "-rename" option is issued, savas renames them to the first 8 characters, or to up to 7 characters plus a number, and displays the list of renamed variables.
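
One possible reading of the renaming rule (first 8 characters, or up to 7 characters plus a number when that would collide) is sketched below. This is an assumption about the scheme, not savas's actual code: the function name shorten_names is hypothetical, and the comparison is case-sensitive for simplicity.

```sh
# Hypothetical sketch (not savas's code): shorten names to 8 characters and
# append a number to a shorter stem when the shortened name is already taken.
shorten_names() {
    used=" "
    for name in "$@"; do
        cand=$(printf '%.8s' "$name")        # first 8 characters
        n=0
        while case "$used" in *" $cand "*) true ;; *) false ;; esac; do
            n=$((n + 1))
            keep=$((8 - ${#n}))              # leave room for the number
            cand=$(printf "%.${keep}s%s" "$name" "$n")
        done
        used="$used$cand "
        printf '%s\n' "$cand"
    done
}

shorten_names household_income household_size age
```

With these three names, the second one collides with the first after truncation and so receives a numeric suffix, while the short name is left unchanged.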

 

If the SAS/Stata dataset is sorted by one or more variables, the Stata/SAS dataset will also be sorted by those same variables. The maximum length for a string variable passed on to SAS is 200 characters; for longer strings, only the first 200 characters are passed on to SAS (this is a limitation of the SAS xport dataset used to transfer data from Stata to SAS). If saving a SAS dataset as a Stata dataset, long character variables will be truncated to the maximum length that Stata will allow. This maximum may be 80 or 244 depending on what version of Stata is being used. Stata's help page on limits will let you know which applies. savas will report which, if any, variables were truncated and to what length they were truncated. Stata variable labels can be up to 80 characters in length.

 

Options

Option Explanation
-c/-curdir
savas saves the Stata/SAS dataset to the current working directory, even though the Stata/SAS dataset may be located elsewhere.
-rename
specifies that any required renaming of variable names is to be done. The -rename option is only necessary when saving to an older version of SAS or Stata or when variable names are not unique in SAS. When saving to an older version, -rename attempts to make long variable names (more than 8 characters) unique by shortening them to the first 8 characters, or to up to 7 characters plus a number. savas lists all variables that were renamed. If more than one dataset is submitted to savas, then this option will only work for the first dataset. See the -force option.
-r/-replace
By default, savas warns the user if the output dataset already exists, and asks permission to overwrite it. Option -replace suppresses this interactive behavior and replaces any existing output dataset without warning. If more than one dataset is submitted to savas, then this option will only work for the first dataset. Check out the -force option.
-force
is equivalent to using both -rename and -replace and will maintain these options if more than one dataset is submitted to savas.
-check
creates two check files for the user to compare the input dataset with the output dataset to make sure savas created the files correctly. This is a comparison that should be done after any datafile is converted to any other type of datafile by any software. The files are created in the same directory as the output datafile and are named starting with the name of the datafile followed by either "_SAScheck.lst" (SAS) or "_STATAcheck.log" (Stata), e.g. "mydata_SAScheck.lst" and "mydata_STATAcheck.log".
-fmts/-formats
specifies to either save value labels that exist in the Stata dataset as SAS formats in a file that will have the same name as the datafile but with the ".sas7bcat" file extension, or to use such a file if creating a Stata dataset. This formats catalog file will be created in, or needs to be in, the same directory as the SAS datafile. By default value labels are not saved or created. NOTE: SAS format names have to be 8 characters or less and cannot end in a number. savas makes some attempt to rename invalid SAS formats, but it would be best for you to rename or drop them in Stata before using savas. Stata does not allow string variables or numbers with decimal values to have user-defined formats (value labels).
-sas6
indicates to save the Stata file as a SAS version 6 file. SAS 9 will read/open SAS 6 files but will not save to a version 6 SAS dataset.
-sasx
indicates to save the Stata file as a SAS version 6 transport/xport file using the xport engine.
-o/-old
indicates to save the Stata file as the version of Stata previous to the current version, e.g., version 8.
-i/-intercooled
indicates to save the Stata file as Intercooled. This is only necessary if Stata SE or Stata MP is being used.
-char2lab
indicates to use the SAS macro CHAR2FMT to convert long character variables to numeric with Stata value labels. This is like Stata's encode command. This option is only helpful when saving to a Stata 9 or higher dataset since Stata 9 added the feature of allowing value labels to be up to 32,000 characters long.
-q/-quotes
indicates to replace double quotes ( " ) occurring in character variables with single quotes ( ' ) and replace compound quotes ( `" or "' ) occurring in variable labels or formats with single quotes ( ' ). savas cannot process character variables with double quotes or variable labels or formats with compound quotes when converting a dataset from SAS to Stata.
-x/-xport
savas converts SAS transport files into Stata datafiles. NOTE: Multiple transport datafiles can be processed at a time but all datafiles need to be SAS transport files. There can be no intermixing of regular SAS/Stata datafiles and transport files when using this option.
-f/-float
prevents the use of Stata's variable type `double'. All variables whose SAS precision would require Stata's double type are created as float. This option may lead to a loss of precision, but saves space: a float is stored in 4 bytes, a double in 8 bytes.
-rights
sets the file permission of the new SAS file to be whatever default file permissions would be for a new file in that directory. The default permissions are the same as the Stata datafile.
-b/-beep
beeps upon completion.
-s/-silent
be silent; in this case, savas does not print any output to the screen, except for error messages. By default, savas tells what stage of the conversion process is currently being executed, and it reports number of variables, number of observations, and more.
-ascii/-sascode
specifies that only a datafile and an input program are to be created. By default, savas executes all four steps outlined above. The -ascii/-sascode option aborts this process after step 3. The user then needs to read in the data manually using Stata/SAS. savas writes a SAS program (mydata_infile.sas) to read in the SAS datafile (mydata.xpt) or savas writes a Stata do-file (_mydata_infile.do) to read in the ASCII datafile (_mydata_.raw).
-m/-messy
savas specifies that all the intermediary files created by savas during its operation are not to be deleted. The -messy option prevents savas from cleaning up after it has finished. This option is mostly useful for debugging purposes in order to find out where something went wrong. All intermediary files have a name starting with an underscore ( _ ) followed by the process ID and are located in the temp directory.
-obs=n
converts only the first n observations. By default, savas converts all observations of the Stata/SAS dataset.
-varfile=filename
may be used to select only a subset of variables to be included in the Stata/SAS dataset. This will speed up the conversion process and is useful in situations where the number of variables is too large for a non-Stata SE (Special Edition) file, more than 2,047 variables. The filename is the name of a file whose contents are variable names only. These variable names are case-insensitive when saving to Stata. If saving to SAS, multiple variables can be listed using any of Stata's specified varlist rules. For example, var* is understood as var1, var2, ... or if saving to Stata, multiple variables with the same stem may be specified as ranges according to general SAS rules. For example, var1-var20 is understood as var1, var2, ..., var20.
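
A minimal sketch of preparing a variable list for -varfile follows. The file name keepvars.txt and the variable names are made up for illustration, and the one-name-per-line layout is an assumption, since the text above only says the file contains variable names. The range var1-var20 uses the SAS rule described above, so this example is for saving a SAS dataset to Stata.

```sh
# Write a hypothetical variable list; layout (one name per line) is assumed.
cat > keepvars.txt <<'EOF'
id
age
var1-var20
EOF

# Run the conversion only where savas is actually installed.
if command -v savas >/dev/null 2>&1; then
    savas -varfile=keepvars.txt mysas.sas7bdat
fi
```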
-n/-nice=n
runs SAS/Stata nicely. The default is 20. This should be used if you have a very large datafile and there are others using the UNIX/Linux box. For example:
 $ savas -n=10 mystata.dta

 

Features

savas attempts to transfer Stata value labels to SAS formats and vice versa. savas creates only one format per value label and vice versa, rather than creating a new format or value label for each variable that was assigned that format or value label. So, if you have a SAS dataset with one yes_no. format assigned to twenty variables, the new Stata dataset will have one yes_no value label assigned to those twenty variables. Fixed SAS formats (Fw.d) translate into Stata's %w.df format. SAS date formats are translated as closely as possible. Unformatted variables get Stata's default formats for the appropriate data type (%8.0g for bytes and ints, %9.0g for floats, and %10.0g for doubles), except for long variables, which savas formats as %12.0g. savas can process multiple files at a time. Try:
 $ savas *.sas7bdat 
or:
 $ savas *.dta


savas stamps the SAS creation date and time on the Stata dataset name, so that the Stata user knows not only when the Stata dataset was created, but also the original SAS creation date and time.

Not all SAS variable names are acceptable in Stata. savas attempts to prevent conflicts by using uppercase names for reserved names. These reserved names are:
  • _all
  • _B
  • byte
  • _coef
  • _cons
  • double
  • float
  • if
  • in
  • int
  • long
  • _pi
  • _pred
  • _rc
  • _se
  • _skip
  • _uniform
  • using
  • with
  • names starting with `str' and followed by an integer. (For example, name "street" does not pose any problems, but a SAS variable named "str10" will be translated into a Stata variable named "STR10")
  • A SAS variable named "_n" translates into "_______N" (and a warning is issued.)
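
The reserved-name handling listed above can be sketched as follows. This is an illustration only, not savas's own code: the function name stata_safe_name is hypothetical, and comparing names case-insensitively is an assumption made because SAS variable names are not case-sensitive.

```sh
# Hypothetical sketch (not savas's code): map a SAS variable name to a name
# that avoids the reserved names listed above.
stata_safe_name() {
    lower=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    case "$lower" in
        _n)
            # Special case from the list above.
            printf '%s\n' "_______N"; return ;;
        _all|_b|byte|_coef|_cons|double|float|if|in|int|long|_pi|_pred|_rc|_se|_skip|_uniform|using|with)
            printf '%s\n' "$1" | tr '[:lower:]' '[:upper:]'; return ;;
        str*)
            # Reserved only when "str" is followed by digits and nothing else.
            digits=${lower#str}
            case "$digits" in
                ''|*[!0-9]*) ;;
                *) printf '%s\n' "$1" | tr '[:lower:]' '[:upper:]'; return ;;
            esac ;;
    esac
    printf '%s\n' "$1"    # not reserved: keep the name unchanged
}

for v in street str10 byte _n income; do
    printf '%s -> %s\n' "$v" "$(stata_safe_name "$v")"
done
```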
Not all Stata variable names are acceptable in SAS, because Stata variable names are case-sensitive and SAS variable names are not. The variable "gender" can be in the same Stata dataset as "Gender" or "GENder", etc. savas tests for such conflicts and, when the -rename option is issued, attempts to make the names unique by adding a number to the end of the variable name. If saving to an older version, then -rename will shorten all variable names that are longer than 8 characters.

 

Acknowledgements

This script was inspired by the sas2stata script developed at RAND.

 

Bugs

None known.

SAS character variables may be up to 32,767 characters in length. Stata 7 and 8 Intercooled limit string variables to 80 characters; starting with Stata 9, Intercooled and SE limit string variables to 244 characters. savas will truncate longer variables and write out a warning.

Stata Intercooled datasets are limited to 2,047 variables. Stata 6 datasets have a maximum width (number of bytes) of 8,192. Stata Intercooled datasets have a maximum width (number of bytes) of 24,564. Stata SE datasets can store as many variables as a SAS 8 dataset (32,767) and have a maximum width of 12 times the number of variables. SAS 9 datasets may have an unlimited number of variables.
