Basic Conceptual Differences Between SAS and Stata

To learn Stata you will nearly have to unlearn SAS. Think of this page as the "door" to the Stata mindset.




In SAS, there is only one observation in memory at a time. Typically, you can only access the values of variables on the current observation.

In Stata, the whole dataset is in memory at once. You can access the values of variables in any observation in the whole dataset. For this reason the RETAIN statement makes no sense in Stata. The norm, however, is that a command uses the variable values of the current observation.
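
As an illustration of reaching across observations in Stata, here is a minimal sketch. The variable names x, prev_x and running are made up for this example; _n is Stata's built-in counter for the current observation number.

```stata
* Build a tiny three-observation dataset (values are made up)
clear
input x
1
4
9
end

* x[_n-1] is the value of x on the previous observation;
* it is missing on the first observation, which has no predecessor
generate prev_x = x[_n-1]

* sum() returns a running total, so no RETAIN-style bookkeeping is needed
generate running = sum(x)

list
```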

In SAS, a series of statements modifies the dataset one observation at a time.

In Stata, one command at a time works through the observations of the dataset, then the next command does the same, and so on.
Think of each Stata command as a SAS data step with only one statement in it:
  data mydata;
    set mydata;
    some statement;
  run;
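
A small sketch of that command-by-command flow in Stata follows. The variable names and values are invented for illustration; each command finishes its pass over all observations before the next command starts.

```stata
* Create three empty observations
clear
set obs 3

* First pass over the data: x takes the observation number
generate x = _n

* Second pass: y is computed from the finished x
generate y = x * 2

* Third pass: change y only where the condition holds
replace y = 0 if x == 2

list
```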

In SAS, you generally never need to worry about how much memory SAS will require during your session.

In Stata, if you are using a version earlier than version 12, you need to be aware of how much of the computer's memory will be required to load the dataset. Stata's default memory setting may not be large enough. A quick rule of thumb is to set Stata's memory to 20% more than the size of the dataset. If the dataset is smaller than 30 megabytes, simply set the memory to 40 megabytes.
 set memory 40m 
To make this setting your default every time, you can do one of two things, but not both:
1) Use the "permanent" option:
 set memory 40m, permanent
2) Or right-click the Stata icon, choose Properties, and edit the Target field to look something like this:
 C:\Stata\stata.exe /m40
"/m40" sets the memory to 40 megabytes.

In Linux/UNIX you would have to assign an alias to call Stata that would look something like this:
 alias xstata='xstata -m40' 

In SAS, for numeric variables, missing values ( . ) are the smallest values.

In Stata, for numeric variables, missing values ( . ) are the LARGEST values.

Starting in version 8, Stata offers special missing values .a through .z where ( . ) is less than .a and .z is the largest value.
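
The ordering of Stata's missing values can be checked directly with display, which prints 1 when a comparison is true. This is a small sketch; the number 1000 is an arbitrary stand-in for "any nonmissing value".

```stata
* Each comparison below is true in Stata, so each line prints 1
display (. > 1000)
display (. < .a)
display (.a < .z)
```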

In SAS, using capital or lower case while programming does not matter. For example, variable Age is the same variable as age or aGe, etc.

In Stata, case matters. Lower case is most often used. For example, variables Age, age and aGe are all different variables, albeit not very well named ones.

All Stata commands are in lower case.
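
A quick sketch of case sensitivity in Stata, using made-up values: the two generate commands below create two separate variables rather than colliding.

```stata
clear
set obs 1

* age and Age are distinct variables because Stata is case sensitive
generate age = 30
generate Age = 31

* describe lists both variables
describe
```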

In SAS, the end-of-line delimiter is the semi-colon ( ; ).
 if oldvar1 = 2 and oldvar2 = 1 then newvar = 1;

In Stata, the default end-of-line delimiter is the carriage return, though you can set the end-of-line delimiter to be the semi-colon with the "#delimit ;" command.

Since the carriage return normally tells Stata where a command ends, the "#delimit ;" command allows you to write a command that spans many lines in your do-file. "#delimit cr" returns Stata to using carriage returns.
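
A minimal do-file sketch of switching delimiters follows. The values assigned to oldvar1 and oldvar2 are invented so that the example runs on its own. Note that #delimit is meant for do-files, not for commands typed interactively.

```stata
* Set up example data (made-up values)
clear
set obs 3
generate oldvar1 = 2
generate oldvar2 = 1

* From here on, a command ends at the semi-colon, not at the line break
#delimit ;
generate newvar = 1
    if oldvar1 == 2
     & oldvar2 == 1 ;

* Switch back to carriage returns
#delimit cr
list
```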

Starting with Stata 8, you can just use 3 forward slashes at the end of the line to tell Stata to continue to the next line before finishing the command:
 generate newvar= 1  if oldvar1 == 2 ///
                      & oldvar2 == 1   

In SAS, conditions come first and then something happens. For example:
 if age>10 then age1=1;

In Stata, conditions come at the end of a command. For example:
 generate age1 = 1 if age > 10 & age < .
Remember that the variable age may be missing for some cases and that, in Stata, missing values are greater than any other number. Without the "age < ." condition, observations with missing age would also get age1 = 1.
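
The pitfall can be seen on a three-observation dataset. This is a sketch with made-up ages; the variable names wrong and right are hypothetical and chosen only to label the two approaches.

```stata
* Example data: the third observation has a missing age
clear
input age
5
12
.
end

* Missing counts as larger than 10, so observation 3 is flagged too
generate wrong = 1 if age > 10

* Excluding missing values leaves observation 3 unflagged
generate right = 1 if age > 10 & age < .

list
```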

In SAS, all statements have to be spelled out completely, and usually spelled correctly, for SAS to recognize the statement.

In Stata, a command is recognized once enough of it is spelled out for Stata to understand which command you intend. For example, the command generate can be typed as just gen. In the Stata documentation you will notice commands with the first few letters underlined, which indicates the minimum characters that need to be typed.
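
A short sketch of abbreviated commands, with made-up data: gen stands in for generate, and su for summarize.

```stata
clear
set obs 2

* gen is an accepted abbreviation of generate
gen x = _n

* These two lines run the same command
summarize x
su x
```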

In SAS, if you want to keep a record of your session in a log file, you save your log window when you are finished with your SAS session.

In Stata, if you want to keep a record of your session in a log file, you have to decide to do that at the beginning of the session you want to log. You can control when you are logging to the same file or to different files, but you can only log to one file at a time. In Stata batch mode, a log file is automatically created, just as it is with SAS in batch mode.
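
A minimal sketch of logging an interactive Stata session follows. The file name mysession.log is hypothetical; the text option asks for a plain-text log and replace overwrites any existing file of that name.

```stata
* Start logging before the work you want recorded
* ("mysession.log" is a made-up file name for this sketch)
log using "mysession.log", text replace

clear
set obs 3
generate x = _n
summarize x

* Stop logging and close the file
log close
```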

In SAS, if you want to create a file that contains a series of SAS statements (a SAS program), create a normal ASCII text file with the file extension ".sas" like "". An ASCII text file is a basic text file that can be opened with any text editing software like WordPad, Notepad, pico, vi, etc.

In Stata, if you want to create a file that contains a series of Stata commands (a Stata do-file), create a normal ASCII text file with the file extension ".do" like "".
