Handbook of SAS® DATA Step Programming
By Arthur Li
Chapman and Hall/CRC – 2013 – 275 pages
To write an accomplished program in the DATA step of SAS®, programmers must understand programming logic and know how to implement and even create their own programming algorithms. Handbook of SAS® DATA Step Programming shows readers how best to manage and manipulate data by using the DATA step.
The book helps novices avoid common mistakes that result from not understanding fundamental and unique SAS programming concepts. It explains that learning syntax does not solve all problems; rather, a thorough comprehension of SAS processing is needed for successful programming. The author also guides readers through each programming task: in most of the examples, he first presents strategies and steps for solving the problem, then offers a solution, and finally gives a more detailed explanation of the solution.
Understanding the DATA step, particularly the program data vector (PDV), is critical to proper data manipulation and management in SAS. This book helps SAS programmers thoroughly grasp the concept of DATA step processing and write accurate programs in the DATA step. Numerous supporting materials, including data sets and programs used in the text, are available on the book’s CRC Press web page.
"…covers a good number of topics attractive to SAS users. The strength of the book is a simple and straight-to-the-point approach taken by the author. Examples and a few exercises appearing at the conclusion of each chapter are very helpful in understanding the topics. It is a good source as a supplement to a textbook or a statistics course using SAS. … All of the code is written and explained very efficiently. Anyone interested in the topic will find this handbook very easy to follow."
—The American Statistician, May 2014
"Handbook of SAS DATA Step Programming provides a thorough introduction to the statements and functionalities of the SAS DATA step. The book can be used as an introductory tutorial for beginning SAS programmers or as a reference book for experienced users. … Many users of DATA step have a strong statistical background, but not all of them really try to understand DATA step as a programming language. The Handbook covers this gap and helps statisticians improving their programming efficiency using SAS DATA step."
—Journal of Statistical Software, January 2014
Introduction to SAS
SAS Program and Language
Reading Data into SAS
Creating and Modifying Variables
Base SAS Procedures
Subsetting Data by Selecting Variables
Changing the Appearance of Data
Creating Variables Conditionally
The IF-THEN/ELSE Statement
Executing One of Several Statements
Modifying the IF-THEN/ELSE Statement with the Assignment Statement
Understanding How the DATA Step Works
DATA Step Processing Overview
Retaining the Value of Newly Created Variables
Conditional Processing in the DATA Step
BY-Group Processing in the DATA Step
Introduction to BY-Group Processing
Applications Utilizing BY-Group Processing
Writing Loops in the DATA Step
Implicit and Explicit Loops
Utilizing Loops to Create Samples
Using Looping to Read a List of External Files
Introduction to Array Processing
Functions and Operators Related to Array Processing
Some Array Applications
Applications That Use Multi-Dimensional Arrays
Combining Data Sets
Vertically Combining Data Sets
Horizontally Combining Data Sets
Data Input and Output
Introduction to Reading and Writing Text Files
Reading Text Files
Creating Text Files
DATA Step Functions
Introduction to Functions and CALL Routines
Date and Time Functions
Functions for Converting Variable Types
Useful SAS Procedures
Using the SORT Procedure to Eliminate Duplicate Observations
Using the COMPARE Procedure to Compare the Contents of Two Data Sets
Restructuring Data Sets Using the TRANSPOSE Procedure
Creating the User-Defined Format Using the FORMAT Procedure
Using the OPTIONS Procedure to Modify SAS System Options
Exercises appear at the end of each chapter.
Arthur Li is a biostatistician at the City of Hope National Medical Center in Los Angeles County, California. He is also a part-time statistical programming instructor at the University of Southern California, where he received an MS in biostatistics. He often gives presentations and seminars on DATA step programming and statistical analysis using SAS software at SAS conferences.