Analytic Methods in Sports
Using Mathematics and Statistics to Understand Data from Baseball, Football, Basketball, and Other Sports
Chapman and Hall/CRC – 2014 – 254 pages
The Most Useful Techniques for Analyzing Sports Data
One of the greatest changes in the sports world in the past 20 years has been the use of mathematical methods to analyze performances, recognize trends and patterns, and predict results. Analytic Methods in Sports: Using Mathematics and Statistics to Understand Data from Baseball, Football, Basketball, and Other Sports provides a concise yet thorough introduction to the analytic and statistical methods that are useful in studying sports.
The book gives you all the tools necessary to answer key questions in sports analysis. It explains how to apply the methods to sports data and interpret the results, demonstrating that the analysis of sports data is often different from standard statistical analysis. Requiring familiarity with mathematics but no previous background in statistics, the book integrates a large number of motivating sports examples throughout and offers guidance on computation and suggestions for further reading in each chapter.
"A comprehensive and up-to-date look at the primary tools and techniques in sports analytics, covering every major sport, Analytic Methods in Sports condenses what took me five years to learn into 200 pages. It’s both easy to read and complete with mathematic rigor. If you’re serious about getting into analytics in any sport at any level, this needs to be on your bookshelf."
—Brian Burke, Founder of Advanced Football Analytics and NFL Team Consultant
"Many people enter the rapidly growing sports analytics industry without the adequate tools to perform analysis. In his book, Severini details the fundamental statistical skill set needed to succeed with examples from every major sport. It will appeal to readers just introduced to the field of statistics as well as the more experienced looking to further develop their ability to manage and interpret data. A worthy addition to any analyst’s library."
—Keith Goldner, Chief Analyst, numberFire
Organization of the book
Describing and Summarizing Sports Data
Types of data encountered in sports
Summarizing results by a single number: mean and median
Measuring the variation in sports data
Sources of variation: comparing between-team and within-team variation
Measuring the variation in a qualitative variable such as pitch type
Using transformations to improve measures of team and player performance
Home runs per at-bat or at-bats per home run?
Applying the rules of probability to sports
Modeling the results of sporting events as random variables
Summarizing the distribution of a random variable
Point distributions and expected points
Relationship between probability distributions and sports data
Tailoring probability calculations to specific scenarios: conditional probability
Relating unconditional and conditional probabilities: the law of total probability
The importance of scoring first in soccer
Using the law of total probability to adjust sports statistics
Comparing NFL field goal kickers
Two important distributions for modeling sports data: the binomial and normal distributions
Using Z-scores to compare top NFL season receiving performances
Applying probability theory to streaks in sports
Using probability theory to evaluate "statistical oddities"
Using the margin of error to quantify the variation in sports statistics
Calculating the margin of error of averages and related statistics
Using simulation to measure the variation in more complicated statistics
The margin of error of the NFL passer rating
Comparison of teams and players
Could this result be due to chance? Understanding statistical significance
Comparing the American and National Leagues
Margin of error and adjusted statistics
Important considerations when applying statistical methods to sports
Using Correlation to Detect Statistical Relationships
Linear relationships: the correlation coefficient
Can the "Pythagorean theorem" be used to predict a team’s second-half performance?
Using rank correlation for certain types of nonlinear relationships
The importance of a top running back in the NFL
Recognizing and removing the effect of a lurking variable
The relationship between ERA and LOBA for MLB pitchers
Using autocorrelation to detect patterns in sports data
Quantifying the effect of the NFL salary cap
Measures of association for categorical variables
Measuring the effect of pass rush on Brady’s performance
What does Nadal do better on clay?
A caution on using team-level data
Are batters more successful if they see more pitches?
Modeling Relationships Using Linear Regression
Modeling the relationship between two variables using simple linear regression
The uncertainty in regression coefficients: margin of error and statistical significance
The relationship between WAR and team wins
Regression to the mean: why the best tend to get worse and the worst tend to get better
Trying to detect clutch hitting
Do NFL coaches expire? A case of missing data
Using polynomial regression to model nonlinear relationships
The relationship between passing and scoring in the EPL
Models for variables with a multiplicative effect on performance using log transformations
An issue to be aware of when using multi-year data
Regression Models with Several Predictor Variables
Multiple regression analysis
Interpreting the coefficients in a multiple regression model
Modeling strikeout rate in terms of pitch velocity and movement
Another look at the relationship between passing and scoring in the EPL
Multiple correlation and regression
Measuring the offensive contribution of players in La Liga
Models for variables with a synergistic or antagonistic effect on performance using interaction
A model for 40-yard dash times in terms of weight and strength
Interaction in the model for strikeout rate in terms of pitch velocity and movement
Using categorical variables, such as league or position, as predictors
The relationship between rebounding and scoring in the NBA
Identifying the factors that have the greatest effect on performance: the relative importance of predictors
Factors affecting the scores of PGA golfers
Choosing the predictor variables: finding a model for team scoring in the NFL
Using regression models for adjustment
Adjusted goals-against average for NHL goalies
Descriptions of Available Datasets
Suggestions for further reading appear at the end of each chapter.
Thomas A. Severini is a professor of statistics at Northwestern University. He is a fellow of the American Statistical Association and the author of the books Likelihood Methods in Statistics and Elements of Distribution Theory. He received his PhD in statistics from the University of Chicago. His research areas include likelihood inference, nonparametric and semiparametric methods, and applications to econometrics.