Practice Activity Week 2

Accessing the Practice Activity

Download the template Practice Activity Quarto file here: pa-2.qmd

Important

Be sure to save the file in the Week 2 folder inside your STAT 431 (or 541) folder!

Writing Functions

  1. Write a function called summ_stats() that takes a vector x as input and produces the following output:
  • for numeric variables: returns the mean, median, standard deviation, and IQR as a data frame
  • for categorical variables: returns the number of levels (categories) of the variable as a data frame

Hint: You can use tibble() to create the data frame. For example, tibble(a = 1:2, b = 2:3) creates a data frame with variables a and b.
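
As a rough illustration of the hint, the sketch below builds a one-row tibble whose columns depend on the type of the input vector. The function name range_or_count() and the statistics it reports are hypothetical and chosen only for illustration; they are not the ones the question asks for.

```r
library(tibble)

# Hypothetical helper (not part of the assignment): returns a one-row
# tibble whose columns depend on whether the input vector is numeric.
range_or_count <- function(x) {
  if (is.numeric(x)) {
    # Numeric input: report the smallest and largest values.
    tibble(min = min(x, na.rm = TRUE), max = max(x, na.rm = TRUE))
  } else {
    # Any other input: report how many values are missing.
    tibble(n_missing = sum(is.na(x)))
  }
}

range_or_count(c(3, 1, 2))
range_or_count(c("a", NA, "b"))
```

Each named argument to tibble() becomes a column, so returning one tibble() call per branch keeps the output a data frame in both cases.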

  2. Confirm that your function works on each type of variable by running the code below.
summ_stats(diamonds$carat)

summ_stats(diamonds$color)

Iterating Functions

  3. Use map() to apply your summ_stats() function to every column in the diamonds dataset.

Hint: Look up the bind_rows() documentation from dplyr to combine summary statistics for all the variables into one data frame. The .id argument will be especially helpful in adding the variable names!
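
To see how the .id argument behaves, here is a minimal sketch on a toy named list rather than on diamonds. The list toy and the column names n, total, and variable are made up for illustration.

```r
library(purrr)
library(dplyr)
library(tibble)

# Toy named list standing in for the columns of a data frame.
toy <- list(a = c(1, 2, 3), b = c(10, 20))

# map() returns a list of one-row tibbles and keeps the names of `toy`.
pieces <- map(toy, function(v) tibble(n = length(v), total = sum(v)))

# bind_rows() stacks the tibbles; .id stores each list name in a new column.
combined <- bind_rows(pieces, .id = "variable")
combined
```

When the tibbles being stacked do not all share the same columns, bind_rows() fills the missing entries with NA.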

Data Transformation

  4. Let’s make the output more intuitive, with the variables as columns and the summary statistics as rows.

Hint: You will need to do a double pivot (pivot_longer() then pivot_wider()) to achieve this result!
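
The double pivot can be sketched on a small made-up table. The tibble toy and the column names statistic and value are assumptions for illustration, and the native pipe |> assumes R 4.1 or later.

```r
library(tidyr)
library(tibble)

# Toy summary table: one row per variable, one column per statistic.
toy <- tibble(
  variable = c("a", "b"),
  n = c(3, 2),
  total = c(6, 30)
)

flipped <- toy |>
  # Step 1: stack the statistic columns into name/value pairs.
  pivot_longer(cols = -variable, names_to = "statistic", values_to = "value") |>
  # Step 2: spread the variables back out as columns.
  pivot_wider(names_from = variable, values_from = value)

flipped
```

After the two steps, each row of flipped is a statistic and each remaining column is one of the original variables.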

Using Helper Functions (STAT 541 only)

  5. Now that we have the output we want, let’s use our code to write a summ_df() function that takes a data frame as an input and outputs a table of summary statistics for every variable in the data frame.

Hint: The body of your function should contain all the code from Question 3.
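
The general pattern of wrapping a pipeline in a function that takes a data frame might look like the sketch below. The function count_missing_df() and the data frame toy_df are hypothetical; the sketch deliberately computes a different statistic than summ_df() should, and it assumes R 4.1 or later for the native pipe.

```r
library(purrr)
library(dplyr)
library(tibble)

# Hypothetical function (not the assignment's summ_df()): counts the
# missing values in every column of a data frame.
count_missing_df <- function(df) {
  df |>
    # Apply a per-column helper; map() iterates over the columns of df.
    map(function(col) tibble(n_missing = sum(is.na(col)))) |>
    # Stack the one-row results, keeping column names in `variable`.
    bind_rows(.id = "variable")
}

toy_df <- tibble(x = c(1, NA, 3), y = c("a", "b", NA), z = 1:3)
count_missing_df(toy_df)
```

The only change from an interactive pipeline is that the hard-coded dataset is replaced by the function argument df.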

  6. Demonstrate that your function works using the mpg dataset.