Sas Basicsppt
-
Upload
saurabh-kumar-singh -
Category
Documents
-
view
218 -
download
0
Transcript of Sas Basicsppt
-
8/6/2019 Sas Basicsppt
1/60
-
8/6/2019 Sas Basicsppt
2/60
2
-
8/6/2019 Sas Basicsppt
3/60
SAS Overview
3
-
8/6/2019 Sas Basicsppt
4/60
4
` Data Portion
` Rules for SAS name (Varibale/Data Set)
-
8/6/2019 Sas Basicsppt
5/60
5
-
8/6/2019 Sas Basicsppt
6/60
6
` Observation flow via Data step
-
8/6/2019 Sas Basicsppt
7/60
` Program data vector (PDV) is a logical area in memory where SAS builds a data set, oneobservation at a time. When a program executes, SAS reads data values from the input buffer (alogical area in memory into which SAS reads each record of raw data when SAS executes anINPUT statement) or creates them by executing SAS language statements. The data values areassigned to the appropriate variables in the program data vector. From here, SAS writes the valuesto a SAS data set as a single observation. When the DATA step reads a SAS data set, SAS readsthe data directly into the program data vector.
1. The SAS automatic variables ( _N_ and _ERROR_ )
2. Temp variable (First., Last. , IN , END etc)
While reading a Raw data file:-
` Compilation Phase - It checks for syntax errors and conversion of data step into the machine codealong with creation of two things viz. INPUT BUFFER and PDV.
`
Execution Phase - Before beginning of the this phase all the variables will be initialized tomissing(if character) and periods(if numeric)[use "put _all_" statement to check PDV status in thelog]. Then when the input statement is encountered for the first time the first record from raw datafile is moved from it to INPUT BUFFER after that one observation is then again moved to PDV.When run statement is encountered, then the implicit OUTPUT statement forces the read data rowto the output data set.
7
-
8/6/2019 Sas Basicsppt
8/60
` Sample programe
Data Total_points (drop=TeamName); 1
input TeamName $ ParticipantName $ Event1 Event2 Event3; 2
TeamTotal + (Event1 + Event2 + Event3); 3
datalines;
Knights Sue 6 8 8
Kings Jane 9 7 8Knights John 7 7 7
;
Run;
` Values from the First Record are Read (Input buffer, PDV, Output)
8
Computed Value of the Sum Statement
http://support.sas.com/documentation/cdl/en/lrcon/62955/HTML/default/viewer.htm#a000961108.htm
-
8/6/2019 Sas Basicsppt
9/60
` Parts of a Program
Introduction comments
Data import
Data manipulation/cleaning
Data analysis
Data export
9
-
8/6/2019 Sas Basicsppt
10/60
` Semi-colon at the end of every statement
` Spacing for aesthetics, not syntax
` Comments
/* Comment */
* Comment ;
*/; - completes open comments from prior runs
` Run; at the end to execute the code
10
-
8/6/2019 Sas Basicsppt
11/60
` OPTIONS statement
Specifies SAS system options
Often used at the beginning of a program to specify
page formatting, limit the number of observationsprocessed, etc.
Unless re-specified, options stay the samethroughout the entire SAS session for all DATA stepsand PROCs that execute (even if the currentprogram does not specify them).
OPTIONNOPRINT MLOGIC SYMBOLGENpagesize=60 ;
DM LOG 'CLEAR' LOG;
11
-
8/6/2019 Sas Basicsppt
12/60
` A Libname is a reference to a directory.
libname test c:\temp\;
Must be used for a permanent SAS data set
` A Fileref is a reference to a specific file. filename flatfile c:\temp\test.txt;
Provides an easy way to change the file used whentesting
12
-
8/6/2019 Sas Basicsppt
13/60
` % Let allows you to create substitution for namesthroughout the data set
Use % Let when high chance that it will change
Allows user to make changes in one place, rather
than throughout the program
Often use % Let for path names and file names
13
-
8/6/2019 Sas Basicsppt
14/60
` Example of a %Let statement
%LET OUTPUTPATH=C:\Documents and Settings\MyDocuments\SAS99 v3\Testing v3\Output;
Filename PROGFILE "&OUTPUTPATH.\PROGFILE.sas";Filename EXCLUDE "&OUTPUTPATH.\EXCLUDE.sas";
Filename OUT_ENT "&OUTPUTPATH.\Entry_nonzero.xls";
Filename OUT_DQ "&OUTPUTPATH.\Mat_Exceptions.xls";
14
-
8/6/2019 Sas Basicsppt
15/60
Importing Data into SAS
15
-
8/6/2019 Sas Basicsppt
16/60
` SAS can import the following types of files:
Text (fixed-length, variable-length and delimited)
Excel workbooks
Access databases
DBF files
EBCIDIC files (IBM Mainframe)
16
SAS File Transfer.sas
-
8/6/2019 Sas Basicsppt
17/60
` There are two types of data sets:
Working
x Temporary data set stored in the Work directory
x Once SAS is closed, the data sets are deleted
Permanent
x Data sets stored in folder on computer
x Use libname reference as a pipeline to point to
permanent SAS datasets for programx If SAS is closed, data sets are saved on the
computer
17
-
8/6/2019 Sas Basicsppt
18/60
` SAS views dates as the number of days sinceJanuary 1, 1960.
` Since SAS considers dates to be numeric,
calculations are easily performed.
` SAS reads and prints a wide variety of datesincluding mm/dd/yy, yyyymmdd, Julian, etc..
` YEARCUTOFF= system optionhttp://ftp.sas.com/service/techsup/tsnews-l/0040.html
18
-
8/6/2019 Sas Basicsppt
19/60
` When raw data contains invalid dates (e.g. 000000),dates can be read in as character and then convertedto a date format to eliminate errors in the log. Forexample,
IF DATEVAR1 NE '000000' THEN DATEVAR =INPUT(DATEVAR1,YYMMDD6.);
` Comparisons to SAS date constants
IF date > 12312000; WRONG!
IF date > 31dec2000d; RIGHT!
` SAS Date Functions
19
-
8/6/2019 Sas Basicsppt
20/60
` Informat Tells SAS how to
read data valuesinto a variable
Used in inputstatements
` Format
Tells SAS how towrite data valueseither to a file, areport, or a screen
Used in outputstatements
20
DATA TEMP;INFILE "G:\Mydocument\GLDET.TXT" DELIMITER = '~'
MISSOVER DSD LRECL=32767 FIRSTOBS=2 ;
INFORMAT GL_ACCOUNT__ $33. ;
INFORMAT GL_ACCOUNT_DESCRIPTION $128. ;
INFORMAT ACCOUNT_TYPE_INDICATOR $1. ;
INFORMAT FDC_CODE $10. ;
FORMAT GL_ACCOUNT__ $33. ;FORMAT GL_ACCOUNT_DESCRIPTION $128. ;
FORMAT ACCOUNT_TYPE_INDICATOR $1. ;
FORMAT FDC_CODE $10. ;
INPUT
GL_ACCOUNT__ $
GL_ACCOUNT_DESCRIPTION $
ACCOUNT_TYPE_INDICATOR $
FDC_CODE $
;
RUN;
-
8/6/2019 Sas Basicsppt
21/60
Data Analysis
and Manipulation
21
-
8/6/2019 Sas Basicsppt
22/60
` Once SAS imports the data, there are two types ofstatements for manipulation and analysis:
Data Step
x If-Then-Else statements
x Mergesx Loops
Proc Step
x
SAS-defined procedures for analysis (summaries,statistics, etc.)
22
-
8/6/2019 Sas Basicsppt
23/60
` DATA steps in SAS are used to:
Read in raw data files
x Use INPUT statement
Read in and manipulate SAS data sets
x Use SETor MERGEstatements
x Subset data
x Create additional fields
x Perform calculations
x Merge SAS data sets
x Append SAS data setsx Loop through SAS data sets
x etc.
23
-
8/6/2019 Sas Basicsppt
24/60
` The DROP= option eliminates the specified variables fromthe output data set. For example, DATA newdata1(DROP=field1 field2);
` The KEEP= option keeps only the specified variables in the
output data set. For example, DATA newdata2(KEEP=field3 field4);
` The RENAME= option renames variables in the output dataset. This is commonly used for merges. For example,
DATA new(REN
AME=(oldname=newname));
24
-
8/6/2019 Sas Basicsppt
25/60
` Example of a Data Step
data rev_chart (keep = acct_type rev_debit_amt rev_debit_ct rev_credit_amtrev_credit_ct
exp_debit_amt exp_debit_ct exp_credit_amt exp_credit_ct period);set input_data year_input_data;if Acct_Type = 'Revenue' then do;
rev_debit_amt = debit_amt;rev_debit_ct = debit_cnt;rev_credit_amt = credit_amt;rev_credit_ct = credit_cnt;
end;else if Acct_Type = 'Expense' then do;
exp_debit_amt = debit_amt;exp_debit_ct = debit_cnt;
exp_credit_amt = credit_amt;exp_credit_ct = credit_cnt;
end;else delete;run;
25
-
8/6/2019 Sas Basicsppt
26/60
` If-Then-Else Statements Simple logic in order to perform data manipulation
data data1;set libname1.perm_data1;IF day_of_Week = "MON" THEN ID = 2; ELSE
IF day_of_Week = "TUE" THEN ID = 3; ELSEIF day_of_Week = "WED" THEN ID = 4; ELSEIF day_of_Week = "THU" THEN ID = 5; ELSEIF day_of_Week = "FRI" THEN ID = 6; ELSEIF day_of_Week = "SAT" THEN ID = 7; ELSEIF day_of_Week = "SUN" THEN ID = 1; ELSEIF day_of_Week = "HOL" THEN ID = 8;
run;
26
-
8/6/2019 Sas Basicsppt
27/60
` Use If-Then-Else statements to subset your data
Use If condition to keep records with condition
Break up data set into several data sets by condition
` Example of SAS code to subset data:
DATA excep(RENAME=(sumfield=excepamt)gooddata(DROP=field1 field2);
SET origdata;
sumfield = field1 + field2;
IF sumfield GT 100 THENOUTPUT excep;ELSEOUTPUT gooddata;
RUN;
27
-
8/6/2019 Sas Basicsppt
28/60
` If there are several steps in an If-Then-Else statement, use a Then Do orElse Do
Use End statement at end of Do statement
` Example of code:
DATA DATAKEEP DATAEXCLUDE;SET DATA1;IF VAR1="KEEP" THEN DO;
DATASET="DATAKEEP";
OUTPUT DATAKEEP;
END;ELSE DO;
DATASET="DATAEXCLUDE";
OUTPUT DATAEXCLUDE;
END;RUN;
28
-
8/6/2019 Sas Basicsppt
29/60
` Blanks and Null Values
There are several ways to flag blank / null records:
x charvar = ;
x numvar = .;
29
-
8/6/2019 Sas Basicsppt
30/60
` A SAS merge combines records that have equalvalues based on the field(s) in the BYstatement.
` The data sets must be sorted by the field(s) in theBYstatement prior to performing the merge.
`
The IF statement will select the records to includein the resulting table.
` Many to many merges will not have the same resultas a many to many join.
` Overlaying of common variables will occur (Update
will be helpful) Use PROC CONTENTS and RENAME= data setoption
30
-
8/6/2019 Sas Basicsppt
31/60
31
SAS Dataset 1
CodeIn2
A 1A 1
B 1
C 1
D 1F 1
SAS Dataset 4
CodeAmt
A 9
B 1
C .
E 2
F .
3
1
SAS Dataset 2
CodeAmt
A 1
B .
C 2
D 5
F 3
SAS Dataset 2
CodeIn1
A 1
C 1
C 1
E 1
F 1
F 1
-
8/6/2019 Sas Basicsppt
32/60
Merge SAS Data Set1-2
CodeIn2 In1A 1 1A 1 1
B 1 0C 1 1C 1 1D 1 0E 0 1
F 1 1F 1 1
32
Example SAS MERGE.SAS
Merge SAS Data Set3-4
Code AmtA 9B 1C .D 5E 2F .
Update SAS Data Set3-4
Code Amt
A 9B 1C 2D 5E 2F 3
-
8/6/2019 Sas Basicsppt
33/60
` Example of Merge in Data step:
data data_merge notina notinb;
merge data1 (in=a) data2 (in=b);
by field1 field2 field3;if a and b then output data_merge;
else if a and not b then output notinb;
else if not a and b then output noina;
run;
http://support.sas.com/techsup/technote/ts705.pdf
33
-
8/6/2019 Sas Basicsppt
34/60
` Loops allow the users to iterate through a variable or data step ina controlled manner
` Two types of loops in SAS Do Until Do While
` Must use an END statement
` Example of Do Loop code:
DATA newfile;SET oldfile;
DO WHILE (condition);task to perform during condition;
END;RUN;
34
-
8/6/2019 Sas Basicsppt
35/60
` CAT, CATS,CATT,CATX
35
-
8/6/2019 Sas Basicsppt
36/60
` Count
36
-
8/6/2019 Sas Basicsppt
37/60
` Countc
37
-
8/6/2019 Sas Basicsppt
38/60
` Ifc
ifc (condition, value #1, value #2, value #3)
looks at a condition and returns user specified, character values if the condition is true, false, orresults in a missing value.
Parameters: condition (numeric expression)
value #1: char expression if condition is true.
value #2: char expression if condition is false.
value #3: char expression if condition results in a missing value.
` Ifn
ifn (condition, value #1, value #2, value #3)
looks at a condition and returns user specified, numeric values if the condition is true, false, orresults in a missing value.
Parameters:
condition (numeric expression)
value #1: num expression if condition is true.
value #2: num expression if condition is false.
value #3: num expression if condition results in a missing value.
` ifn comparison to if/then/else
com= ifn(sales >= quota, sales* mqpct , sales* nmqpct );
is the same as:
if sales >= quota then
com=sales* mqpct
` else
com=sales* nmqpct );
38
ifc.txt
-
8/6/2019 Sas Basicsppt
39/60
` lengthc, lengthm, lengthn
Differences:
lengthc: returns the length of a string including trailing blanks.
lengthm: returns the length of the variable allocated in memory.
lengthn: returns the length of a string excluding the trailing blanks.
1. returns 0 if a string is blank, compared to the length function whichreturns 1.
` subpad
parameters:
-string string to take an excerpt (substring) from.
-position location of first character in the substring.
-length the number of characters in the subpad WILL return a variable with the length specified, padding the resultswith spaces.
subpad can return a string with a length of zero.
39
-
8/6/2019 Sas Basicsppt
40/60
` substrn
parameters:
-string string to take an excerpt (substring) from.
-position - location of first character in the substring.
-length - the number of characters in the
same as the substr function except:
substrn truncates the result when length exceeds the length of the string.
there will be no error messages for invalid third arguments.
` ANYALNUM (variable)beats using: indexc(lowcase(variable),qwertyuiopasdfghjklzxcvbnm1234567890)
` ANYALPHA (variable)
beats using: indexc(lowcase(variable),qwertyuiopasdfghjklzxcvbnm)
` NOTALNUM (variable)
beats using: indexc(lowcase(variable),qwertyuiopasdfghjklzxcvbnm1234567890) =0
40
substr.txt
-
8/6/2019 Sas Basicsppt
41/60
` Proc statements are SAS procedures for dataanalysis
Proc Contents
Proc Datasets
Proc Means Proc Summary
Proc Freq
Proc Format
Proc Univariate Proc Print
41
-
8/6/2019 Sas Basicsppt
42/60
` Provides field information and data set information
Field Info: Name, Type, Format, Label
Data Set Info: Creation date, Number of observations
Automatically prints to output
Very useful to perform prior to a merge to ensurevariables from one data set will not write over variablesfrom the other data set.
` Example of code:
PROC CONTENTS DATA=dataset;
RUN;
42
-
8/6/2019 Sas Basicsppt
43/60
` Provides information regarding a library, including datasets in thelibrary
Can perform the following:
x Append datasets together
x Copy datasets to other libraries
x Delete datasets
x Modify the contents of a dataset
x Rename a dataset
` Example of code:
PROC DATASETS LIBRARY=libname;
RUN;
43
Proc dataset.txt
-
8/6/2019 Sas Basicsppt
44/60
` Computes descriptive statistics on numeric variables in a SASdata set. Results can be output to a new SAS data set. Defaults to print all descriptive statistics (e.g. mean, standard
deviation, minimum, etc.) Output automatically prints If no numeric variables are specified in the VAR statement, then
all numeric variables in the data set are analyzed
` Example of code:
PROC MEANS DATA=oldfile;CLASS division dept;
VAR units value;
RUN;
44
-
8/6/2019 Sas Basicsppt
45/60
` Computes descriptive statistics on numeric variables in a SASdata set and outputs the results into a new SAS data set
Descriptive statistics must be specified in the OUTPUTstatement
Output does not automatically print
If no numeric variables are specified in the VAR statement, then
a simple count of the observations is generated.
` Example of Code:
PROC SUMMARY DATA=oldfile NWAY;
CLASS division dept;
VAR units value;
OUTPUT OUT=currfile SUM=;
RUN;
45
-
8/6/2019 Sas Basicsppt
46/60
` Produces 1-way to n-way frequency and cross-tabulationtables
` Example of Code:
PRO
C FREQ DATA=dataset
NOPRI
NT;TABLES variable / LIST MISSINGOUT=datanew;
RUN;
DATA datanew2;SET datanew;
IF COUNT > 1 THENOUTPUT datanew2;RUN;
46
-
8/6/2019 Sas Basicsppt
47/60
` PROC FORMAT, because of its name, is most often used tochange the appearance of data for presentation.
` Example of Code:Proc format;value tempfmt low -< 61 = ' 1'
61 -< 63 = ' 2'63 - high = ' 3'other = ' ';run;data pme;set pme;tempcode=put(avgtemp, tempfmt.);Run;http://www2.sas.com/proceedings/sugi30/001-30.pdf
474
7
Proc format.txt
-
8/6/2019 Sas Basicsppt
48/60
` Provides statistics on numeric variables
Automatically prints the results to output
` From the report:
Number of missing observations
Number of records > 0, = 0, and
-
8/6/2019 Sas Basicsppt
49/60
` Print data set to screen Less frequently used if using PC SAS However, when using SAS on the Mainframe, Proc
Print would allow you to see your data
*SET EXTERNAL LOG;
PROC PRINTTO
PRINT = "&PATH.\PROGRAMS\&CLIENT. &PERIOD. &PROGRAM..DOC"
LOG = "&PATH.\PROGRAMS\&CLIENT. &PERIOD. &PROGRAM..DOC
NEW;
RUN;
http://support.sas.com/documentation/cdl/en/proc/61895/HTML/default/viewer.htm#a000146809.htm
49
-
8/6/2019 Sas Basicsppt
50/60
` What is a macro?
A macro is a block of code meant for repetitiveruns through the data
50
-
8/6/2019 Sas Basicsppt
51/60
` Example of macro code:
%MACRO Macro_name(DATASET);
DATA &DATASET.;
SET &DATASET.;
IF Var1=. THEN Var1=0;
RUN;
%MEND Macro_name;
%Macro_name(TOTAL);
51
-
8/6/2019 Sas Basicsppt
52/60
SAS Output
52
-
8/6/2019 Sas Basicsppt
53/60
` There are several ways you can export data:
Proc Export
ODS (Output Delivery System)
DDE (Data Dynamic Exchange)
53
-
8/6/2019 Sas Basicsppt
54/60
` Proc Export is the mostcommon way to export data
` Very similar to the ProcImport step, but works in
the reverse` SAS can export to the
following files:
Excel
Access
DBF Text
54
Example
SAS Program appended
at #17
-
8/6/2019 Sas Basicsppt
55/60
55
DOC File
Inven ory Data for SAS Basi s.txt
Input Data
SAS Basi s Exer ise1 Solution.sas
Input Data
Exercise1 Exercise2
DOC File
AR Inventory for SAS Training.txt AR Data SAS Training.txt
SAS Basi s Exer ise2 Solution.sas
-
8/6/2019 Sas Basicsppt
56/60
Debugging and Checking SAS
Programs
56
-
8/6/2019 Sas Basicsppt
57/60
` It is essential to check your SAS log for errors:
Error (in red): Very Bad!
Warning (in green): Maybe bad!
` SAS will not stop its processing if it finds errors!
57
-
8/6/2019 Sas Basicsppt
58/60
` Lets run Debug Program.sas to see different types oferrors.
` Things to look out for: Colors or perform search for error and warning
Dropped records or observations
` Visualize and understand how your results shouldlook. Did the results turn out as planned?
5858
Use inventory raw file appended on slide #50SAS Basics ebug Program.sas
-
8/6/2019 Sas Basicsppt
59/60
` proc SQL
` Macros
` Proc Tabulate/Report
` Merge using Proc format
59
-
8/6/2019 Sas Basicsppt
60/60