Accessing Data from SAS Code
In the previous blog, I have discussed some of the basics of the SAS. As discussed the SAS table has an extension of .sas7bdat.
In this blog, I am going to discuss how we can access SAS data and tables from the SAS code.
Accessing Structured Data
SAS tables can be accessed using a proc contents
step, and by specifying the path of the table to the data parameter.
This step creates a summary of the SAS table, with the table name, number of observations, variables name, sort information, file access permissions, and a lot more.
Using the hardcoded full path for the table might cause some issues, for eg. When the path changes sometimes then we have to change the entire codebase to accommodate the new file path. This is taken care of by including libraries.
SAS Libraries
Accessing a table requires tow information:
- Path
- Type
But this is tedious sometimes. Libraries, on the other hand, gives us a way to specify this information, in a very simple and efficient way. A library is a collection of similar data files in the same location and of the same type.
First, we specify what library we want to create using libname
which is a global variable. After the libname
, we specify libref
which is the name of the library we want to create with and then we specify the engine parameter, which tells what type of data is to be imported. By default, the engine is set to base
which is the SAS table. Other parameters are xlsx
, hadoop
, teradata
.After that we specify the path of the file. The library includes only those files for which the engine is specified.
These libraries get destroyed when you close the SAS session, but the data are not destroyed. There are several libraries that are created automatically. One such library is the work library, which is also the default name when creating a library using libname step. This library is temporary, and hence the data is deleted after each SAS session. Some libraries are also created by the SAS administrator, hence we can use it directly in the code.
Once the library is created we can access the SAS table just by referencing to the table we want to access as: “ libname.tablename”. To clear the library after use, we use libname libref clear
.
To create an Excel library, in SAS we use the following code:
We use the validvarname, to specify that the spaces in the columns name be replaced by the underscore.
Accessing Unstructured Data
To use and play around with the unstructured data in the SAS, we first have to import it into the SAS table. proc import
statement is used to import the unstructured data into the SAS table. It has various different options, including DATAFILE
which is the file path, DBMS
which is the filetype, and the OUT
which specifies the output table in SAS.
Notice the keyword replace
after the OUT
option. This is done to replace any existing SAS table or to run the same code twice without any conflicts.
To avoid data truncation, use the guessingrows option, and set it to MAX. This is by default set to 20.
Importing Excel File
Sometimes we want to import an excel file into a SAS table, this can be done using the proc import statement. As follows:
While importing the Excel workbook, the DBMS option must be specified as xlsx. By default the import reads only the first sheet of the excel file, this can be dealt with by including the sheet parameter, with the sheet name specified.
SAS assumes that the headers are in the first line and the data portion starts after that.