Reproducible Research with R and RStudio

By Christopher Gandrud

Chapman and Hall/CRC – 2013 – 294 pages

Series: Chapman & Hall/CRC The R Series

Purchasing Options:

  • Paperback: $69.95
    978-1-46-657284-3
    July 15th 2013

Description

Bringing together computational research tools in one accessible source, Reproducible Research with R and RStudio guides you in creating dynamic and highly reproducible research. Suitable for researchers in any quantitative empirical discipline, it presents practical tools for data collection, data analysis, and the presentation of results.

With straightforward examples, the book takes you through a reproducible research workflow, showing you how to use:

  • R for dynamic data gathering and automated results presentation
  • knitr for combining statistical analysis and results into one document
  • LaTeX for creating PDF articles and slide shows, and Markdown and HTML for presenting results on the web
  • Cloud storage and versioning services that can store data, code, and presentation files; save previous versions of the files; and make the information widely available
  • Unix-like shell programs for compiling large projects and converting documents from one markup language to another
  • RStudio to tightly integrate reproducible research tools in one place

Whether you’re an advanced user or just getting started with tools such as R and LaTeX, this book saves you time searching for information and helps you successfully carry out computational research. It provides a practical reproducible research workflow that you can use to gather and analyze data as well as dynamically present results in print and on the web. Supplementary files used for the examples and a reproducible research project are available on the author’s website.

Reviews

"Gandrud has written a great outline of how a fully reproducible research project should look from start to finish, with brief explanations of each tool that he uses along the way. … the readers who will get the most use from this book are those who are already working in R and just need a way to organize their work. That being said, advanced undergraduate students in mathematics, statistics, and similar fields as well as students just beginning their graduate studies would benefit the most from reading this book. Many more experienced R users or second-year graduate students might find themselves thinking, ‘I wish I’d read this book at the start of my studies, when I was first learning R!’ … a good text for beginning graduate students or advanced undergraduate students who are just starting to do technical research. … This book could be used as the main text for a class on reproducible research …"

The American Statistician, November 2014

"Three recent books have significantly influenced how I use R in reproducible work: Dynamic Documents with R and knitr by Yihui Xie, Reproducible Research with R and RStudio by Christopher Gandrud, and Implementing Reproducible Research edited by Victoria Stodden, Friedrich Leisch, and Roger D. Peng … I recommend all three books to R users at any level. There really is something here for everyone."

—Richard Layton, PhD, PE, Rose-Hulman Institute of Technology, Terre Haute, Indiana, USA

Contents

Getting Started

Introducing Reproducible Research

What Is Reproducible Research?

Why Should Research Be Reproducible?

Who Should Read This Book?

The Tools of Reproducible Research

Why Use R, knitr, and RStudio for Reproducible Research?

Book Overview

Getting Started with Reproducible Research

The Big Picture: A Workflow for Reproducible Research

Practical Tips for Reproducible Research

Getting Started with R, RStudio, and knitr

Using R: The Basics

Using RStudio

Using knitr: The Basics

Getting Started with File Management

File Paths and Naming Conventions

Organizing Your Research Project

Setting Directories as RStudio Projects

R File Manipulation Commands

Unix-like Shell Commands for File Management

File Navigation in RStudio

Data Gathering and Storage

Storing, Collaborating, Accessing Files, and Versioning

Saving Data in Reproducible Formats

Storing Your Files in the Cloud: Dropbox

Storing Your Files in the Cloud: GitHub

RStudio and GitHub

Gathering Data with R

Organize Your Data Gathering: Makefiles

Importing Locally Stored Data Sets

Importing Data Sets from the Internet

Advanced Automatic Data Gathering: Web Scraping

Preparing Data for Analysis

Cleaning Data for Merging

Merging Data Sets

Analysis and Results

Statistical Modeling and knitr

Incorporating Analyses into the Markup

Dynamically Including Modular Analysis Files

Reproducibly Random: set.seed

Computationally Intensive Analyses

Showing Results with Tables

Basic knitr Syntax for Tables

Table Basics

Creating Tables from R Objects

Showing Results with Figures

Including Non-knitted Graphics

Basic knitr Figure Options

Knitting R’s Default Graphics

Including ggplot2 Graphics

JavaScript Graphs with googleVis

Presentation Documents

Presenting with LaTeX

The Basics

Bibliographies with BibTeX

Presentations with LaTeX Beamer

Large LaTeX Documents: Theses, Books, and Batch Reports

Planning Large Documents

Large Documents with Traditional LaTeX

knitr and Large Documents

Child Documents in a Different Markup Language

Creating Batch Reports

Presenting on the Web with Markdown

The Basics

Markdown with Pandoc and Custom CSS

Slideshows with Markdown, knitr, and HTML

Publishing Markdown Documents

Conclusion

Citing Reproducible Research

Licensing Your Reproducible Research

Sharing Your Code in Packages

Project Development: Public or Private?

Is it Possible to Completely Future Proof Your Research?

Bibliography

Index

Author Bio

Christopher Gandrud is a research associate at the Hertie School of Governance. He was previously a lecturer of international relations at Yonsei University and a fellow in government at the London School of Economics (LSE). He has published articles on political economy and quantitative methods in the Review of International Political Economy and the International Political Science Review. He earned a PhD in political science from the LSE.

Categories: Bioinformatics, Statistical Computing, Statistical Theory & Methods