This tutorial demonstrates how to work with theoretical probability distributions in R, generate random data from distributions or data sets, and plot probability distribution curves. We will be working with the diamonds data frame, which is loaded with the tidyverse package. Please install the gridExtra package before executing the R code below; it lets us place multiple ggplot objects on a single panel.
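
One way to handle the installation step is sketched below. It installs gridExtra only when the package is not already available, so it is safe to run more than once. This assumes a working internet connection and a configured CRAN mirror.

```r
# Install gridExtra only if it is not already available on this machine
if (!requireNamespace("gridExtra", quietly = TRUE)) {
  install.packages("gridExtra")
}
```

If the package is already installed, the code does nothing.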


Please click the button below to open an interactive version of all course R tutorials through RStudio Cloud.

Note: you will need to register for an account before opening the project. Please remember to use your GMU e-mail address.



Click the button below to launch an interactive RStudio environment using Binder.org. This opens a pre-configured RStudio environment in your browser. Unlike RStudio Cloud, this service has no monthly usage limits, but it may take up to 10 minutes to launch and you will not be able to save your work.





Data

The R code below loads the tidyverse and gridExtra packages. The gridExtra package is useful for combining multiple plots created with ggplot into a single visualization.

We will be working with the built-in diamonds data which is automatically loaded with the tidyverse package. A row in this dataset represents a diamond with its associated characteristics and measurements.


library(tidyverse)
library(gridExtra)
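
To illustrate what gridExtra is for, here is a minimal sketch that places two ggplot objects on one panel with grid.arrange(). The plot names (price_hist, carat_hist), the number of bins, and the choice of variables are illustrative assumptions, not part of the tutorial.

```r
library(tidyverse)
library(gridExtra)

# Histogram of diamond prices (object name is illustrative)
price_hist <- ggplot(diamonds, aes(x = price)) +
  geom_histogram(bins = 30, fill = "steelblue", color = "white") +
  labs(title = "Diamond prices", x = "Price", y = "Count")

# Histogram of diamond weights in carats (object name is illustrative)
carat_hist <- ggplot(diamonds, aes(x = carat)) +
  geom_histogram(bins = 30, fill = "darkorange", color = "white") +
  labs(title = "Diamond weights", x = "Carat", y = "Count")

# Draw both plots side by side on a single panel
combined <- grid.arrange(price_hist, carat_hist, ncol = 2)
```

Setting ncol = 2 places the plots in two columns; using nrow = 2 instead would stack them vertically.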



The diamonds data set contains the prices and other attributes of almost 54,000 diamonds. To learn more about the data, execute ?diamonds in your R console.


diamonds
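
Printing the data frame shows only the first few rows. As a complementary sketch, the calls below give a quick overview of its size and columns. The choice of price for the summary is just an example.

```r
library(tidyverse)

# Number of rows and columns in the data
n_rows <- nrow(diamonds)
n_cols <- ncol(diamonds)
c(rows = n_rows, columns = n_cols)

# Compact view of every column, its type, and its first few values
glimpse(diamonds)

# Five-number summary plus the mean for diamond prices
summary(diamonds$price)
```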