This page summarizes SAS import and export of data files such as ASCII text (e.g., Tab and CSV delimited), spreadsheet (Excel and Lotus 1-2-3), and database (dBase III and Access).
If you have simple data, read them using INFILE and DATALINES; otherwise, use IMPORT and EXPORT procedures. Personally, I prefer the IMPORT/EXPORT Wizard to IMPORT and EXPORT procedures mainly due to its user-friendly interface and flexibility.
If you have a data set generated in other software packages (e.g., Excel, dBASE III, Paradox, Stata, and SPSS), simply use data conversion utilities.
In UNIX machines, IMPORT and EXPORT procedures must be executed under the X window. Otherwise, you have to use the "noterminal" option in non-interactive (or batch) mode ("sas -noterminal file_name.sas").
This web document may not be used for any commercial purposes. You
should know that this page may contain mistakes and errors.
If you have any question and suggestion, please leave a message on SAS bulletin board
.
OVERVIEW
In SAS, data are read through the INFILE statement and the IMPORT procedure.
You may use the PUT statement in a Data Step or the EXPORT procedure to export data sets into external files.
INFILE should be used in a DATA step, while IMPORT and EXPORT are independent procedures.
Using the INFILE statement, you can read data directly using the DATALINES (CARDS) statement, import various ASCII text files, import data sets through network (i.e., FTP and HTTP). INFILE and DATALINES also read data in a matrix form.
IMPORT read ASCII text files, database (ACCESS, dBASE), and spreadsheet (Lotus 1-2-3, Excel).
INFILE STATEMENT
The INFILE statement identifies external files to be read in the INPUT statement.
It reads an ASCII text file that is typically delimited by a space (default), Tab, comma, or other delimiters.
Let us take the following example first.
LIBNAME js 'c:\sas';
FILENAME egov 'c:\sas\egov.txt';
DATA js.seoul;
INFILE egov LRECL=250 FIRSTOBS=7 OBS=700;
INPUT name $ male training corrupt negative;
RUN;
- The LIBNAME statement designate a library, an alias of the collection of data sets, to the specified directory (c:\sas).
- The FILENAME statement associates a file reference with a external file (drive+path+filename). Without the statement, you should explicitly specify the drive, paths, and file name in the INFILE statement as "INFILE 'c:\sas\egov.txt';
- The "DATA js.seoul" creates a SAS data set "seoul" in the "js" library. The result of this DATA step is stored into the "seoul.sas7bdat" in the c:\sas\.
- The INFILE statement specifies the file to be read. You may add options, such as LRECL=n FIRSTOBS=n OBS=n, if necessary.
- The "LRECL=250", the logical record length, tells SAS to read a line up to 250 columns.
- The "FIRSTOBS=7" tells SAS to read data from the 7th line; The first 6 lines are ignored.
- The "OBS=700" tells SAS to read first 700 observations. Omitting this option reads all observations in a data file.
- The INPUT statement specifies a list of variable name, type, and length depending on input styles.

INFILE OPTIONS
- DSD (delimiter-sensitive data) tells SAS to treats two consecutive delimiters as a missing value and removes quotation marks from character values.
- FIRSTOBS specifies a record number that SAS begin reading input data records.
- FLOWOVER (default) reads the next data line if there is no enough data in the current data line for all variables specified. This method may be problematic when there is missing values.
- LRECL (logical record length) specifies the logical record length. This option is not valid when the DATALINES is specified.
- MISSOVER sets all remaining variables without values to missing. And then read the next line for new observations.
- STOPOVER stops a DATA step when INPUT does not find values for all the variables specified. This option is useful when we need to check missing values in raw data.
- TRUNCOVER does not read a new data line when INPUT does not find values in the current data lines for all variables specified. This option sets all remaining variables without values to missing.
DATA js.miss;
INFILE DATALINES MISSOVER;
INPUT id korean english;
DATALINES;
01 87 65
02 86
03 95
04
RUN;
The above example reads four observations with missing. Note that "english" of the second observation is missing since consecutive blanks are interpreted as a delimiter.
If MISSOVER option is omitted, SAS reads only two observations with data massed up; the "english" of the second observations is set "03".
If STOPOVER option is used instead of MISSOVER, SAS will stop at the second data line since there is not enough data values for variables specified.

IMPORT & EXPORT PROCEDURES
Since the version 8.0, SAS provides the IMPORT procedure to read various types of external files, such as ACESS, EXCEL, and dBASE III.
In previous versions, the DBF and DIF procedures deal with dBASE III and spreadsheet files, respectively.
The IMPORT procedure creates data sets from various types of external files.
So an additional DATA step is not necessary unless the data set needs to be manipulated. In other words, the IMPORT procedure do not require the INPUT and DATALINES (or CARDS) statements.
There are two ways of using the IMPORT procedure. One is to write a IMPORT statement in a standard SAS program, the other is to use the IMPORT/EXPORT Wizard which is available under the FILE menu. Personally,
I prefer the IMPORT Wazard to the procedure, since the former provides a user-frendly interface, high flexibility, and other useful features.
Is the IMPORT statement always better than the INFILE? It depends.
For example, when you need to read a huge ACCESS file (say larger than 1GB), but only several variables are needed for analyses, the INFILE statement is recommended.
IMPORT & EXPORT WIZARD
The Import/Export Wizard provides extremely flexible ways of reading external data.
The wizard works in the graphic user interface (GUI) environment such as X-window and Microsoft Windows.
If your data file is messy and ill-organized, the wizard will be a good solution.
The Import/Export Wizard actually works in the same way that the Import and Export procedures in a SAS program do; The Wizard can generate SAS programs for Import and Export procedures.
In order to use the Wizard,
- Click "File" from the top menu bar and choose "Import Data..."
- Choose a file type from the list. You may choose "User-defined formats" for messy data.
- Locate an external file to be imported.
- Provide a library and SAS data set name.
- If you want SAS to create a SAS program file that is equivalent to what the Wizard does, specify the file name.

READ DIRECT DATA ENTRY
You can directly input data in a Data Step. INFILE DATALINES tells SAS to construct a data set from data lines after the DATALINES statement.
The following example reads a numeric variable id, a string variable depart, and a numeric variable price.
Note that data items are delimited (separated) by a space. In this case of the space-delimited format, INFILE DATALINES can be omitted.
DATA fruit;
INFILE DATALINES;
INPUT id depart $ employee;
DATALINES;
1 sales 16
2 finance 25
3 research 10
RUN;

IMPORT & EXPORT ASCII TEXT FILES
The above example implies that ASCII text data (egov.txt) are delimited with space (or blank).
If a ASCII file is delimited by other than a space, you need to explicitly specify delimiter using DELIMITER (or DLM) option with using various characters or combinations of chracters as a delimiter.
For example, DLM='09'x for Tab, DLM='&' for ampersand , DLM='^' for caret, DLM=';' for semicolon, DLM='END', and so forth.
DATA js.seoul;
INFILE egov.txt DELIMITER=',';
INPUT name $ male training corrupt negative;
RUN;
The INFILE is also able to read the data stream of DATALINES (or CARDS).
You may specify DATALINES instead of an external file name.
Note that each data element in the following example is separated by caret (^).
DATA js.fruit;
INFILE DATALINES DELIMITER='^';
INPUT id fruit $ sales;
DATALINES;
1^Grape^100
2^Pear^77
RUN;
What if a comma is used as a delimiter and there is a missing value in the second data line?
The DSD option is the answer. DSD reads a value as missing between two consecutive delimiters (,,). Without this option, the above program reads only first observation.
DATA fruit;
INFILE DATALINES DLM=',' DSD;
INPUT id fruit $ sales;
DATALINES;
1,Grape,100
2,,77
RUN;
- "DATA fruit" does not specify a library, thus the data set "fruit" is stored in the "work" library, which is the default library existing on the RAM only.
- Since the DSD option sets a comma as the default delimiter, DLM=',' is not necessary in this example. But if other delimiter is used, both DLM and DSD option should be used to read missing values correctly.
If you want to export a data set into an external file, use PUT statement. PUT allows you to control the output format flexibly, but it is a bit difficult, especially for beginners, to use this statement correctly.

Now, let us use IMPORT and EXPORT procedures to handle ASCII text files. You need to specify delimiter of the ASCII text file in the DBMS option.
PROC IMPORT DATAFILE="c:\sas\ego.csv" OUT=jeeshim.egov DBMS=CSV REPLACE;
GETNAMES=YES;
DATAROW=2;
RUN;
- The DATAFILE specifies an external file to be imported.
- The OUT specifies the SAS data file to be created.
- The DBMS specifies the type of the external file. For example,"DBMS=DLM", "DBMS=CSV", and "DBMS=TAB".
- The REPLACE overwrites an existing file, if any.
- The GETNAMES reads variable names from the first line of the data file.
- The DATAROW option tells the row from which SAS reads observations.
- GETNAMES and DATAROW require semicolon at the end.
The following example exports a data set to a space delimited ASCII text file.
The DATA and the OUTFILE respectively specify a data set to be exported and an external file to which data are exported.
PROC EXPORT
DATA=jeeshim.egov
OUTFILE="c:\sas\ego.txt"
DBMS=DLM REPLACE;
RUN;

IMPORT & EXPORT EXCEL FILES
The IMPORT procedure imports spreadsheet files such as Excel and Lotus 1-2-3.
PROC IMPORT DATAFILE="c:\sas\assets.xls" OUT=js.pc DBMS=EXCEL2000 REPLACE;
SHEET="computer";
GETNAMES=YES;
RUN;
- The SHEET statement specifies the worksheet to be imported. If this statement is omitted, the worksheet will be referred to "sheet1", "sheet2", and so on.
- The EXCEL2000 indicates the Microsoft EXCEL 2000 format. Other releases are represented by EXCEL4, EXCEL5, and EXCEL97. For LOTUS 123 files, use "DBMS=WK1" (WK3 or WK4) option.
- The GETNAMES=YES reads variable names from the first row of the worksheet. The default is YES.
The following example exports a data set to an Excel file.
Note that there is only one semicolon at the end of EXPORT statement.
PROC EXPORT
DATA=js.egov
OUTFILE="c:\sas\egov.xls"
DBMS=EXCEL2000 REPLACE;
RUN;

IMPORT & EXPORT dBASE III FILES
A dBase III file has ony one table that has a well defined data structure.
You may omit the "DBMS=DBF" option since SAS can recognized file format with the extension, "dbf."
PROC IMPORT
DATAFILE="c:\sas\comuter.dbf"
OUT=js.pc
DBMS=DBF REPLACE;
RUN;
The following example exports a data set to a dBase III file.
PROC EXPORT
DATA=js.pc
OUTFILE="c:\sas\pc.dbf"
DBMS=DBF REPLACE;
RUN;

IMPORT & EXPORT ACCESS FILES
Unlike a dBASE III or FoxPro file, an Access file can has more than one table with security features.
Thus, you have to provide database, table, account (identification), and password to access a database file.
PROC IMPORT TABLE="comuter" OUT=js.pc DBMS=ACCESS REPLACE;
UID=""; PWD=""; WGDB="";
DATABASE="c:\sas\asset.mdb";
RUN;
- The DATABASE statement specifies the drive, path, and access file name (e.g., "asset.mdb").
- The TABLE option specifies the table name in the access file (e.g., "computer").
- The UID statement specifies the user ID who is allowed to access the database. Similarly, the PWD specifies the password to log in the database. You may ignore them unless specific UID and PWD are designated.
The EXPORT procedure can export a SAS data set to a Access table.
Note that the OUTTABLE specifies a table name to be generated in the ACCESS file.
PROC EXPORT
DATA=js.egov
OUTTABLE="egov2000"
DBMS=ACCESS2000 REPLACE;
DATABASE="c:\sas\egov.mdb";
RUN;

USING NETWORK RESOURCES
SAS can access data through FTP, HTTP, and TCP/IP.
This network functionality provides high flexibility and convenience in the information era.
The following example uses FTP to access the data resource.
Note that FTP, USER, PASS, and HOST are reserved words for protocol, account name, password, and host computer name, respectively.
The PROMPT specifies to prompt for the user login password, if necessary.
FILENAME myftp FTP 'gov.txt' USER='kucc625' PASS='xxxxx' PROMPT
HOST='mdss.iu.edu' CD='/sas/egov';
DATA js.egov;
INFILE myftp;
INPUT year country $ domain index;
RUN;
The following examples use URL to access data files available on a website.
FILENAME myurl URL 'http://masil.org/archives/airline.txt';
DATA js.egov;
INFILE myurl;
INPUT airline year output cost fuel load;
RUN;
FILENAME myurl URL 'http://www.masil.org/archives/smoking.txt';
DATA js.smoking;
INFILE myurl FIRSTOBS=34;
INPUT state $ cigar bladder lung kidney leukemia area;
RUN;
In order to use the TCP/IP connection, you have to use the SOCKET instead of URL.

USING INFORMATS
If data include comma in numbers, dollar ($), and percent (%), you have to use informats such as COMMAn.n, DOLLARn.n, and PERCENTn.n in the formatted input style.
DATA sales;
INFILE DATALINES;
INPUT year quan COMMA9.0 sales DOLLAR10.0 rate PERCENT3.;
DATALINES;
2000 184,871 $2,875,879 80%
2001 875,877 $5,987,972 89%
RUN;
If data are in date informats, you may use MMDDYY10. and MMDDYY8. formats. Note that the period (.) in the third observation indicates a missing value.
DATA book;
INFILE DATALINES;
INPUT start MMDDYY10. end MMDDYY8.;
DATALINES;
08/12/1999 03/01/04
03/02/1987 02/25/91
03/02/1991 .
RUN;

READING MATRICES
SAS can also read such matrix forms as correlation coefficients (TYPE=CORR), covariance (TYPE=COV), and parameter estimators (TYPE=EST).
DATA corr_mat (TYPE=CORR);
INFILE DATALINES MISSOVER;
matrix ='Correlation';
INPUT _NAME_ $ var1 var2 var3;
DATALINES;
var1 1.00000
var2 0.25757 1.00000
var3 0.57844 0.54865 1.00000
RUN;
- The _TYPE_, a special SAS variable, is used to distinguish the various statistics such as _TYPE_='CORR', _TYPE='MEAN', and _TYPE_'STD'.
- The _NAME_ is needed to identify the row of the correlation matrix.

USING CLIPBOARD
SAS can read text data from and write text data to the clipboard.
You may import a worksheet of an Excel file. First highlight the part of worksheet in Excel and copy it into the Windows Clipboard. Suppose you choose 5 variables.
FILENAME clipboard CLIPBRD;
DATA excel;
INFILE clipboard
INPUT x1-x5;
RUN;
ACCESSING THROUGH ODBC
SAS can access data through the ODBC connection.
You need to define a DSN for database or spreadsheet files in advance ("access_dsn").
- Run the "ODBC Data Source Administrator" at the Control Panel
- Click "System DSN" and then choose "Addd" to create a new DSN
- Select a proper driver (mdb, xls, etc.) and then provide DSN name and description
- Specify database, spreadsheet, or database server (local or SQL server name)
- Click "Advanced" to enter login ID and password, if necessary
The DSN is connected by the CONNECT statement in the SQL procedure.
A series of SQL statements such as CREATE and SELECT follow the CONNECT stateemnt.
PROC SQL;;
CONNECT TO ODBC AS db_con
(DATASRC="access_dsn" USER=kucc625 PASSWORD=xxxx);
SELECT * FROM CONNECTION TO ODBC(SELECT * FROM egov2004);
...;
QUIT;

DATA CONVERSION UTILITIES
SAS cannot directly read customized data formats such as STATA *.dta. Accordingly, you have to export a data set in the software into a general file format (e.g., *.csv) and then import it into a SAS data set.
Or you may use the COPY procedure to create a transport file that can be recognized by other software.
However, these tasks often become cumbersome and burdensome to most researchers.
The most efficient solution in this case is to use professional data transferring utilities such as Stat/Transfer and DBMS/COPY that transfer data from one software to another.
These utilities support a variety of file formats such as SAS, Stata, Gauss, Rats, ACCESS, dBASE, FoxPro, Paradox, Lotus 1-2-3, Quattro, Excel, Sigmaplot, Minitab, and SPSS.

REFERENCES
- SAS Institute. 2005. SAS Language Reference: Concepts, Version 9. Cary, NC: SAS Institute.
- SAS Institute. 2005. SAS Language Reference: Dictionary, 2nd ed., Version 9, Volumn 3. Cary, NC: SAS Institute.
- Burlew, Michele M. 2002. Reading External Data Files Using SAS: Examples Handbook. Cary, NC: SAS Institute.