PROC TABULATE "summarizes values for all observations in a data set and prints the summaries in table format. It can produce tables with one to three dimensions, and within each dimension it allows multiple variables to be reported one after another or hierarchically."
"PROC TABULATE constructs tables of descriptive statistics from class variables, analysis variables, and keywords for statistics." It "treats the values of variables used to create a table in one of the two ways: as discrete classifications of data (class variables) or as continuous values for which you can request descriptive statistics (analysis variables)."
If you want to generate complicated, customized tables directly, without a spreadsheet program, TABULATE is the best alternative. Its syntax takes some getting used to, but learning it is well worth the time and effort (I never agree with Mr. Cho!). As a tool for customized tables, TABULATE offers high flexibility, broad functionality, and diverse formats.
Although PROC FREQ provides similar features, including the chi-square test (see the cross table below), it does not support statistics such as the mean, sum, and standard deviation. PROC REPORT provides excellent functionality for generating professional reports, but its focus is on dealing efficiently with huge data sets, not on producing table-based output.
Available statistics are as follows. N and NMISS do not require any nonmissing observations. SUM, MEAN, MAX, MIN, RANGE, USS, and CSS require at least one nonmissing observation. VAR, STD, STDERR, CV, T, and PRT require at least two observations.
The PROC TABULATE statement is always accompanied by one or more TABLE statements specifying the tables to be produced. Every variable used in a TABLE statement must be declared in either a VAR statement (for analysis variables) or a CLASS statement (for class variables), but not in both.
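The relationship among the CLASS, VAR, and TABLE statements can be sketched as follows. This is a minimal illustration, not a listing from the original text: the data set survey and its variables (gender, marital, income, tax) are hypothetical.

```sas
/* Hypothetical sample data: names and values are illustrative only */
DATA survey;
   INPUT gender $ marital $ income tax;
   DATALINES;
F single  32000 4100
F married 41000 5600
M single  29000 3500
M married 53000 7900
M married 47000 6800
;
RUN;

PROC TABULATE DATA=survey;
   CLASS gender marital;      /* class variables define the categories    */
   VAR income tax;            /* analysis variables supply the statistics */
   TABLE gender*marital,      /* row dimension                            */
         (income tax)*MEAN;   /* column dimension                         */
RUN;
```

Each variable appears in exactly one of the two declarations: gender and marital only in CLASS, income and tax only in VAR.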
Class variable means "any variable, numeric or character, that you want to use to classify data into groups or categories of information." Normally each class variable has a small number of discrete values or unique levels. It can have character, integer, or even decimal values, but the number of unique values should be limited. The CLASS statement is used to "identify variables in the input data set as class variables."
Analysis variable means "any numeric variable for which you want to compute statistics." It may be a quantitative (continuous) variable or a qualitative (discrete) numeric variable for which you want descriptive statistics. The VAR statement identifies analysis variables in the input data set.
Missing values: "If an observation contains missing values for any variable listed in the CLASS statement, the observation is not included in the table unless you specify the MISSING option in the PROC TABULATE statement." In other words, missing values for class variables cause observations to be omitted (skipped) from the analysis performed to produce tables. "If an observation contains missing values for a variable listed in the VAR statement, the value is omitted from calculations of all statistics except N and NMISS." Missing values for an analysis variable affect only the statistics for that variable. A different type of missing value, often confused with missing values in observations, is the missing value used to represent empty table cells.
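The effect of a missing class value can be sketched as follows. The data set survey and its variables (gender, party, income) are hypothetical, made up for illustration. The first step drops the observation whose party is missing; the second, with MISSING, treats the missing value as a valid level of party.

```sas
/* Hypothetical data: the last observation has a missing party */
DATA survey;
   INPUT gender $ party $ income;
   DATALINES;
F Dem 32000
F Rep 41000
M Dem 29000
M .   53000
;
RUN;

/* Default: the observation with missing party is excluded */
PROC TABULATE DATA=survey;
   CLASS gender party;
   VAR income;
   TABLE gender, party*income*(N MEAN);
RUN;

/* MISSING: the missing party becomes its own column */
PROC TABULATE DATA=survey MISSING;
   CLASS gender party;
   VAR income;
   TABLE gender, party*income*(N MEAN);
RUN;
```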
"By default, PROC TABULATE evaluates each page it prints and omits columns and rows for combinations of class variable values that do not exist. To change the default, you can specify the PRINTMISS option in the TABLE statement, and TABULATE will print the same headings for each subtable."
TABULATE provides the PCTN and PCTSUM statistics to print the percentage that the value in a single cell represents of the value in another cell or of the total of a group of cells. You construct the PCTN or PCTSUM expression with a denominator definition that tells TABULATE which categories of information should be summed to arrive at the denominator. Without a denominator definition, TABULATE uses the default denominator: the total of all SUM cells for PCTSUM, or the total of all N cells for PCTN.
The format for specifying a denominator definition is to enclose it in angle brackets <> appended to the PCTN or PCTSUM statistic (e.g., PCTN
When you concatenate ALL with other elements in the column dimension, TABULATE prints a separate column that summarizes the observations reported in each row of the table.
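The percentage statistics and the ALL class variable are often used together, which can be sketched as follows. The data set survey and the variables gender and party are hypothetical. The denominator definition <party ALL> asks for each cell's count as a percentage of its row total, so the ALL column shows 100 in every row.

```sas
/* Hypothetical data for illustration */
DATA survey;
   INPUT gender $ party $;
   DATALINES;
F Dem
F Dem
F Rep
M Dem
M Rep
M Rep
;
RUN;

PROC TABULATE DATA=survey;
   CLASS gender party;
   /* ALL adds a summary column after the party columns;        */
   /* PCTN<party ALL> uses the row total as the denominator     */
   TABLE gender,
         (party ALL='All parties')*(N PCTN<party ALL>='Row %');
RUN;
```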
Removing levels of headings: there are times when the multiple levels of headings for class and analysis variables are not necessary in your tables. This often occurs when you replace a default heading with more descriptive text that already covers the information in two levels of heading. For example, in TABLE share='number of shares'*sum=' '; the blank label removes the heading cell for SUM.
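A minimal sketch of this technique, built around the TABLE fragment quoted above. The data set stocks, the class variable stock, and the data values are assumptions added to make the step complete.

```sas
/* Hypothetical data for illustration */
DATA stocks;
   INPUT stock $ share;
   DATALINES;
AAA 100
AAA 250
BBB 300
BBB  50
;
RUN;

PROC TABULATE DATA=stocks;
   CLASS stock;
   VAR share;
   /* share gets a descriptive label; the blank label on SUM */
   /* suppresses the heading cell for the statistic          */
   TABLE stock, share='Number of shares'*SUM=' ';
RUN;
```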

TABULATE Specifications (Basic grammar)
PROC TABULATE
ORDER=FORMATTED (ordered by the formatted values); ORDER=DATA (ordered as the observations are read from the data set); ORDER=FREQ (ordered so that the value that occurs most frequently in the data set appears first); ORDER=INTERNAL (ordered by the unformatted values, as the SORT procedure would order them; this is the default).
FORMCHAR= takes a string of 11 characters that defines the table outlines and dividers. The default is FORMCHAR='|----|+|---';. A string of 11 blank spaces between the quotes produces a table with no horizontal or vertical lines. FORMCHAR='4FBFACBFBC4F8F4FABBFBB'X; is for the IBM 1403 printer, and FORMCHAR='FABFACCCBCEB8FECABCBBB'X; is for the IBM 6670 printer. Another hexadecimal specification is FORMCHAR='B3C4DAC2BFC3C5B2C0C1D9'X;.
MISSING considers missing values as valid levels for the class variables. Unless the MISSING option is specified, TABULATE excludes from the analysis any observation with a missing value for one or more class variables.
NOSEPS eliminates horizontal separator lines from the row titles and body of the printed table except the column title section of the table.
VARDEF=divisor specifies the divisor to be used in the calculation of the variances. If divisor is DF (the default), the degrees of freedom (N-1) is used as the divisor.
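Several of these options can be combined on one PROC TABULATE statement, as in the following sketch. The data set survey and its variables (party, income) are hypothetical. NOSEPS affects only traditional line-printer style (listing) output.

```sas
/* Hypothetical data for illustration */
DATA survey;
   INPUT party $ income;
   DATALINES;
Dem 32000
Rep 41000
Rep 47000
Rep 53000
Dem 29000
;
RUN;

/* ORDER=FREQ lists the most frequent party first;           */
/* VARDEF=DF (the default) divides by N-1 when computing STD */
PROC TABULATE DATA=survey ORDER=FREQ MISSING NOSEPS VARDEF=DF;
   CLASS party;
   VAR income;
   TABLE party, income*(N MEAN STD);
RUN;
```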
TABLE page-expression, row-expression, column-expression / option-list;
/BOX=value specifies the text to be placed in the empty box above the row titles. It allows you to move the page heading into the top left box of the table or insert either a variable name (or label) or a descriptive string in the box. When BOX=_PAGE_, the page-dimension text appears in the box. If the page-dimension text does not fit, it is placed in its default position and the box is left empty. When BOX='string', the quoted string appears in the box. Any name, label, or quoted string that does not fit is truncated. When BOX=variable, the name or label of a variable appears in the box. (TABLE gender*marital, income tax /BOX='Income and Tax';).
/CONDENSE prints multiple logical pages on a single physical page.
/PRINTMISS specifies that row and column headings are the same for all logical pages of the table.
/ROW=spacing specifies whether all title elements in a row crossing are allotted space even when they are blank. When ROW=CONSTANT (or CONST), the default, all row title elements have space allotted to them; when ROW=FLOAT, the row title space is divided equally among the nonblank title elements in the crossing.
/RTSPACE=number or RTS=number supplies an integer value that specifies the number of print positions allotted to the headings in the row dimension. The default value is one-fourth of the LINESIZE= value.
Format usage in the TABLE statement: variable*F=format (income*F=6.1).
Removing or changing a variable name in the TABLE statement: assign a label to change the heading, or a blank to suppress it (TABLE sum*stock=' ';)
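The TABLE statement options and the F= format modifier can be sketched together as follows, reusing the BOX= example given above. The data set survey, its values, and the specific widths (F=10.1, RTS=20) are assumptions chosen for illustration.

```sas
/* Hypothetical data for illustration */
DATA survey;
   INPUT gender $ marital $ income tax;
   DATALINES;
F single  32000 4100
F married 41000 5600
M single  29000 3500
M married 53000 7900
;
RUN;

PROC TABULATE DATA=survey;
   CLASS gender marital;
   VAR income tax;
   TABLE gender*marital,
         (income tax)*MEAN*F=10.1      /* cell format: width 10, 1 decimal  */
         / BOX='Income and Tax'        /* text for the top left box         */
           RTS=20                      /* 20 print positions for row titles */
           PRINTMISS;                  /* keep headings for empty crossings */
RUN;
```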
FORMAT variable-list formats the values of class variables used as headings in the page, row, and column dimensions. It may be used in combination with PROC FORMAT to group values of class variables. This statement has no effect on either analysis variables or the content of table cells.
KEYLABEL keyword='label' ...; replaces the heading text wherever the specified keyword is used, unless another label is assigned in the TABLE statement (KEYLABEL MEAN='Average';).
LABEL variable='label'; replaces the name of the variable in the page, row, or column heading where the variable appears. The maximum length of the label (in both KEYLABEL and LABEL) is 40 characters.
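A minimal sketch of how FORMAT, LABEL, and KEYLABEL work together. The data set people, the variables age and income, and the user-defined format agegrp are hypothetical names introduced for illustration; the KEYLABEL line is the example given above.

```sas
/* Hypothetical data for illustration */
DATA people;
   INPUT age income;
   DATALINES;
24 28000
27 31000
35 45000
52 61000
;
RUN;

/* User-defined format that groups ages into two categories */
PROC FORMAT;
   VALUE agegrp LOW-<30  = 'Under 30'
                30-HIGH  = '30 and over';
RUN;

PROC TABULATE DATA=people;
   CLASS age;
   VAR income;
   FORMAT age agegrp.;              /* class levels follow the formatted values */
   LABEL income='Annual income';    /* replaces the variable name in headings   */
   KEYLABEL MEAN='Average';         /* replaces the keyword MEAN in headings    */
   TABLE age, income*MEAN;
RUN;
```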
Hierarchical positioning: An asterisk '*' between variables indicates that their statistics will be listed hierarchically (nested). A space between variables, on the other hand, indicates that their statistics will be listed in parallel (one after another), not hierarchically. For instance, 'TABLE gender marital, income*...' will add the rows for marital status below the rows for gender.
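The difference between the two operators can be seen by running the same data through both forms. This is a sketch with a hypothetical data set survey (gender, marital, income); MEAN is an arbitrary choice of statistic.

```sas
/* Hypothetical data for illustration */
DATA survey;
   INPUT gender $ marital $ income;
   DATALINES;
F single  32000
F married 41000
M single  29000
M married 53000
;
RUN;

PROC TABULATE DATA=survey;
   CLASS gender marital;
   VAR income;
   /* Asterisk: marital is nested within gender (one row per combination) */
   TABLE gender*marital, income*MEAN;
   /* Space: rows for gender, followed by separate rows for marital */
   TABLE gender marital, income*MEAN;
RUN;
```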

/RTS=8; sets the width of the row-title column to eight print positions (equivalent to /RTSPACE=8;). You can compare the above with the following result of the FREQ procedure (PROC FREQ; TABLE gender*party /NOCOL MISSING; RUN;).





The following SAS script illustrates how to construct two-dimensional tables.

The following SAS script constructs a three-dimensional table.
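A minimal sketch of a three-dimensional table, assuming a hypothetical data set survey with class variables region, gender, and marital and an analysis variable income. The first expression in the TABLE statement is the page dimension, so one logical page is produced for each region.

```sas
/* Hypothetical data for illustration */
DATA survey;
   INPUT region $ gender $ marital $ income;
   DATALINES;
East F single  32000
East M married 53000
East F married 41000
West M single  29000
West F married 44000
West M married 47000
;
RUN;

PROC TABULATE DATA=survey;
   CLASS region gender marital;
   VAR income;
   /* page dimension, row dimension, column dimension */
   TABLE region, gender, marital*income*MEAN / BOX=_PAGE_;
RUN;
```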




The following SAS scripts are examples of more complicated tables.


The following SAS script, adapted from Kim (1993), generates the normal probability distribution table using PROBNORM(), which returns the cumulative probability from the standard normal distribution.
Notice that prob=' '*SUM=' '; assigns blank labels to the variable prob and the statistic keyword SUM, which removes both headings from the table; otherwise they would appear at the top of the table.
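One way such a table might be built is sketched below. This is an illustration, not Kim's listing: the data set normal, the helper variables row, col, zrow, and zcol, and the range of z (0.00 to 3.09) are assumptions. Each cell holds exactly one observation, so SUM simply reproduces the probability.

```sas
/* Build one observation per table cell */
DATA normal;
   DO row = 0 TO 30;            /* first decimal of z: 0.0 to 3.0   */
      DO col = 0 TO 9;          /* second decimal of z: .00 to .09  */
         zrow = row/10;
         zcol = col/100;
         prob = PROBNORM(zrow + zcol);   /* P(Z <= z), Z standard normal */
         OUTPUT;
      END;
   END;
RUN;

PROC TABULATE DATA=normal FORMAT=6.4;
   CLASS zrow zcol;
   VAR prob;
   FORMAT zrow 3.1 zcol 4.2;    /* headings such as 1.9 and 0.06 */
   /* blank labels remove the headings for zcol, prob, and SUM */
   TABLE zrow='z', zcol=' '*prob=' '*SUM=' ' / RTS=6;
RUN;
```

Integer loop counters are used instead of a fractional BY increment so that the class values are not affected by accumulated floating-point error.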

The following SAS script generates the Student's t probability distribution table using TINV(), which returns a quantile from the t distribution.
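A table of this kind might be produced along the following lines. The data set tdist, the variable names df, alpha, and t, and the particular tail probabilities and degrees of freedom are assumptions chosen for illustration. As before, each cell contains a single observation, so SUM returns the quantile itself.

```sas
/* One observation per combination of df and upper-tail probability */
DATA tdist;
   DO df = 1 TO 30;
      DO alpha = 0.10, 0.05, 0.025, 0.01, 0.005;
         t = TINV(1 - alpha, df);   /* quantile with upper-tail area alpha */
         OUTPUT;
      END;
   END;
RUN;

PROC TABULATE DATA=tdist FORMAT=8.3;
   CLASS df alpha;
   VAR t;
   FORMAT alpha 5.3;
   /* blank labels remove the headings for t and SUM */
   TABLE df='df', alpha='Upper-tail probability'*t=' '*SUM=' ' / RTS=6;
RUN;
```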
