# A Gentle Introduction to Stata, Revised Third Edition

#### By **Alan C. Acock**

Stata Press – 2012 – 401 pages

Updated to reflect the new features of Stata 11, **A Gentle Introduction to Stata, Third Edition** continues to help new Stata users become proficient in Stata. After reading this introductory text, you will be able to enter, build, and manage a dataset as well as perform fundamental statistical analyses.

**New to the Third Edition**

- A new chapter on the analysis of missing data and the use of multiple-imputation methods
- Extensive revision of the chapter on ANOVA
- Additional material on the application of power analysis

The book covers data management; good work habits, including the use of basic do-files; basic exploratory statistics, including graphical displays; and analyses using the standard array of basic statistical tools, such as correlation, linear and logistic regression, and parametric and nonparametric tests of location and dispersion. Rather than being split off by Stata implementation, the material on graphics and postestimation is woven into the text in a natural fashion. The author teaches Stata commands by using the menus and dialog boxes while still stressing the value of do-files. Each chapter includes exercises, and real datasets are used throughout.

**Getting started**

Conventions

Introduction

The Stata screen

Using an existing dataset

An example of a short Stata session

Summary

Exercises

**Entering data**

Creating a dataset

An example questionnaire

Develop a coding system

Entering data using the Data Editor

Value labels

The Variables Manager

The Data Editor (Browse) view

Saving your dataset

Checking the data

Summary

Exercises

**Preparing data for analysis**

Introduction

Planning your work

Creating value labels

Reverse-code variables

Creating and modifying variables

Creating scales

Save some of your data

Summary

Exercises

**Working with commands, do-files, and results**

Introduction

How Stata commands are constructed

Creating a do-file

Copying your results to a word processor

Logging your command file

Summary

Exercises

**Descriptive statistics and graphs for one variable**

Descriptive statistics and graphs

Where is the center of a distribution?

How dispersed is the distribution?

Statistics and graphs—unordered categories

Statistics and graphs—ordered categories and variables

Statistics and graphs—quantitative variables

Summary

Exercises

**Statistics and graphs for two categorical variables**

Relationship between categorical variables

Cross-tabulation

Chi-squared test

Degrees of freedom

Probability tables

Percentages and measures of association

Odds ratios when dependent variable has two categories

Ordered categorical variables

Interactive tables

Tables—linking categorical and quantitative variables

Power analysis when using a chi-squared test of significance

Summary

Exercises

**Tests for one or two means**

Introduction to tests for one or two means

Randomization

Random sampling

Hypotheses

One-sample test of a proportion

Two-sample test of a proportion

One-sample test of means

Two-sample test of group means

Testing for unequal variances

Repeated-measures t test

Power analysis

Nonparametric alternatives

Mann–Whitney two-sample rank-sum test

Nonparametric alternative: Median test

Summary

Exercises

**Bivariate correlation and regression**

Introduction to bivariate correlation and regression

Scattergrams

Plotting the regression line

Correlation

Regression

Spearman’s rho: Rank-order correlation for ordinal data

Summary

Exercises

**Analysis of variance**

The logic of one-way analysis of variance

ANOVA example

ANOVA example using survey data

A nonparametric alternative to ANOVA

Analysis of covariance

Two-way ANOVA

Repeated-measures design

Intraclass correlation—measuring agreement

Summary

Exercises

**Multiple regression**

Introduction to multiple regression

What is multiple regression?

The basic multiple regression command

Increment in R-squared: Semipartial correlations

Is the dependent variable normally distributed?

Are the residuals normally distributed?

Regression diagnostic statistics

Outliers and influential cases

Influential observations: DFbeta

Combinations of variables may cause problems

Weighted data

Categorical predictors and hierarchical regression

A shortcut for working with a categorical variable

Fundamentals of interaction

Power analysis in multiple regression

Summary

Exercises

**Logistic regression**

Introduction to logistic regression

An example

What is an odds ratio and a logit?

The odds ratio

The logit transformation

Data used in rest of chapter

Logistic regression

Hypothesis testing

Testing individual coefficients

Testing sets of coefficients

Nested logistic regressions

Power analysis when doing logistic regression

Summary

Exercises

**Measurement, reliability, and validity**

Overview of reliability and validity

Constructing a scale

Generating a mean score for each person

Reliability

Stability and test–retest reliability

Equivalence

Split-half and alpha reliability—internal consistency

Kuder–Richardson reliability for dichotomous items

Rater agreement—kappa (K)

Validity

Expert judgment

Criterion-related validity

Construct validity

Factor analysis

PCF analysis

Orthogonal rotation: Varimax

Oblique rotation: Promax

But we wanted one scale, not four scales

Scoring our variable

Summary

Exercises

**Working with missing values—multiple imputation**

The nature of the problem

Multiple imputation and its assumptions about the mechanism for missingness

What variables do we include when doing imputations?

Multiple imputation

A detailed example

Preliminary analysis

Setup and multiple-imputation stage

The analysis stage

For those who want an R² and standardized βs

When impossible values are imputed

Summary

Exercises

**A What’s next?**

Introduction to the appendix

Resources

Web resources

Books about Stata

Short courses

Acquiring data

Summary

**References**

**Author index**

**Subject index**

**Alan C. Acock** is a University Distinguished Professor of Family Science and the Knudson Chair for Family Research in the College of Health and Human Sciences at Oregon State University. He has published more than 120 articles in leading social and behavioral sciences journals. Dr. Acock’s research interests encompass quantitative methodology and family studies.

