SAS Functions Matched to Stata Functions
SAS functions (SAS Institute's Online SAS documentation) |
Stata functions |
byte() lets you use ASCII code to represent characters. |
char() |
ceil() returns the smallest integer that is greater than or equal to the
value of the number or numeric variable. |
ceil() |
compbl() replaces multiple blank spaces with just one in a character string. |
itrim() |
countw() counts how many words are in a character string. The default delimiter is not just a blank space. |
wordcount() |
day() returns the day of the month (1-31) from a numeric date variable. |
day() |
floor() returns the largest integer that is less than or equal to the
value of the number or numeric variable. |
floor() |
in() is true if the value of the preceding variable matches
at least one of the values listed. |
inlist() |
index() returns the first position of a character in a
character string. find() is similar. |
strpos() |
indexw() returns the first position of the first character of a word in a
character string. findw() is similar. |
There is no equivalent Stata function, but regexm() can at least be used to test if the word exists in a string. |
input() creates numeric data from character data.
input() can be used like Stata's date():
acquisition_date= "25Apr2001";
new_date= input(acquisition_date, date9.);
format new_date date9.; |
real() but the destring command is likely a better solution.
// Stata 10 or higher
date(acquisition_date, "DMY")
// earlier Stata versions
date(acquisition_date, "dmy")
real() cannot have formats specified like SAS's input(). |
int() returns the integer value of the number or numeric variable. |
int() |
left() left-aligns a character variable by moving its leading blank spaces
(on the left-hand side) to the end of the value. |
ltrim() |
length() returns how long a string is. |
length() |
lowcase() REPLACES ALL CHARACTERS IN A STRING with lowercase
characters. |
lower() |
max() returns the largest value in a list of numeric variables. |
max() |
mean() returns the average of the arguments. |
egen's
rowmean() |
mdy() creates a numeric date variable from month, day and year variables. |
mdy() |
min() returns the smallest value in a list of numeric variables. |
min() |
missing() is true (returns the value 1) if the
argument/variable has a missing value. nmiss() can be used like
Stata's missing() to check if
any of a set of variables have missing values, but it can only handle numeric variables. Starting
in SAS 9.2, cmiss() is also available; it counts how many of the character or numeric
variables listed have missing values. |
missing() |
mod() returns the remainder from the division of the first
argument by the second argument. |
mod() |
month() returns the month of the year from a numeric date variable. |
month() |
propcase() replaces the first letter of each word With A Capital Letter
and changes the following letters to lowercase. |
proper() |
prxchange() lets you use regular expressions to match patterns in strings
and replace them with other text. |
regexr()
regexs() lets you use the groups captured by regexm():
// change "Smith, John" into "John Smith"
replace name= regexs(2) + " " + regexs(1) ///
if regexm(name, "(.*), *(.*)")
|
prxposn() returns the character string captured by a group in
a regular expression. prxposn() requires a regular expression
id variable.
** create at most 3 groups from characters
* that were separated by at least 1 space: *;
reg_id= prxparse("/([^ ]+) ([^ ]+) ([^ ]+)/");
if prxmatch(reg_id, my_char_var) then do;
word1= prxposn(reg_id, 1, my_char_var);
word2= prxposn(reg_id, 2, my_char_var);
word3= prxposn(reg_id, 3, my_char_var);
end;
|
regexs() but the split command is easier to use and is more powerful in that it can create as many new string variables as are needed. |
put() creates character data from numeric data. |
string() but the tostring command is likely a better solution. |
ranuni() returns a random number between 0 and 1. |
runiform() starting
in Stata 10 or uniform() in previous versions. |
right() right-aligns a character variable by moving its trailing blank spaces
(on the right-hand side) to the start of the value. |
rtrim() |
round() returns the closest integer value of the number or numeric
variable. |
round() |
scan() returns the nth word in a string. |
word() |
strip() removes both leading and trailing blanks, while trim() removes
only trailing blanks. strip() will leave no spaces if
the character value has only spaces, whereas trim() will leave 1 space. |
trim() |
substr() returns a substring of characters from a string. |
substr() |
translate() replaces all occurrences of a character in a string
with another character. You cannot limit the replacement to the first N occurrences like you can
with Stata's subinstr(). |
subinstr() |
tranwrd() replaces all occurrences of a substring of characters
in a string with another substring. You cannot limit the replacement to the first N occurrences of
the string, nor replace only whole words, like you can with Stata's
subinword().
Using tranwrd() like so:
string_var= "Good Partnership";
string_var= tranwrd(string_var,"art","");
will change string_var to "Good P nership".
Stata's subinword()
would not modify string_var because string_var
does not contain the string art as a word (starting the string and followed
by a space, or ending the string and preceded by a space, or surrounded by spaces).
|
subinword() |
upcase() replaces all characters in a string WITH UPPERCASE
CHARACTERS. |
upper() |
year() returns the four-digit year (century year) from
a numeric date variable. |
year() |
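A few of the Stata functions listed above can be tried together in a short do-file. This is only a minimal sketch: it assumes a throwaway one-observation dataset, reuses the illustrative variable names acquisition_date and string_var from the table, and uses assert so that each line stops with an error if the function does not behave as described.

```stata
* Hypothetical one-observation dataset for illustration only
clear
set obs 1

* date() converts a string such as "25Apr2001" into a numeric date
generate str20 acquisition_date = "25Apr2001"
generate new_date = date(acquisition_date, "DMY")
format new_date %td
assert new_date == mdy(4, 25, 2001)
assert day(new_date) == 25 & month(new_date) == 4 & year(new_date) == 2001

generate str30 string_var = "Good Partnership"

* subinstr() replaces the substring wherever it occurs; "." means all occurrences
assert subinstr(string_var, "art", "", .) == "Good Pnership"

* subinword() only replaces whole words, so the value is unchanged here
assert subinword(string_var, "art", "", .) == "Good Partnership"

* strpos(), wordcount(), word() and itrim() from the table
assert strpos(string_var, "Part") == 6
assert wordcount(string_var) == 2
assert word(string_var, 2) == "Partnership"
assert itrim("Good    Partnership") == "Good Partnership"
```

Note that subinstr() with an empty replacement removes the substring outright, whereas the SAS tranwrd() example in the table leaves a blank in its place.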


