Project 3, Checkpoint 2

Now it’s time to fill out the functions in your package.

Set up basic structure

Choose two of the “aspirational” code chunks from Project 3, Checkpoint 1 - one plot, and one summary.

Run usethis::use_r("some-name") to create a new .R file for each function, choosing a file name that relates to the name of the function. Then run usethis::use_test("some-name") with the same file name to create the corresponding test file.

In the files some-name.R, fill out the roxygen2 style documentation according to how you plan for the function to work.
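As a rough sketch of what that documentation might look like, here is a roxygen2 header for the medal_summary() function used in the test examples on this page. The parameter descriptions and the placeholder body are assumptions for illustration; adapt them to your own function.

```r
#' Summarize medal counts for one team
#'
#' @param dat A data frame with columns `team` and `medal`.
#' @param country A single string giving the team name to summarize.
#'
#' @return A one-row tibble with columns `Gold`, `Silver`, and `Bronze`.
#'
#' @examples
#' \dontrun{
#' medal_summary(load_data(), country = "United States")
#' }
#'
#' @export
medal_summary <- function(dat, country) {
  # Placeholder body (an assumption): the real code gets filled in later
  stop("medal_summary() is not implemented yet", call. = FALSE)
}
```

Writing the documentation first forces you to decide on the inputs and the return value before you write any function code.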

Create the tests

Creating the tests before the functions exist??? Yes!

You’ve written “aspirational” code - so you already know what you want to have happen when your functions are run. Let’s formalize this first, so that as we construct the functions, we can easily test them.

In the file tests/testthat/test-some-name.R, edit the code of the unit test.

If your output is a plot, it’s hard to check specifics of the plot, but you’ll still want to make sure a plot was actually created. Your test will probably look something like this:

test_that("plot gold medals works", {
  dat <- load_data()
  
  gold_plot_simple <- plot_olympic_golds(dat, "United States")
  gold_plot_year <- plot_olympic_golds(dat, "United States", min_year = 2000)
  gold_plot_gender <- plot_olympic_golds(dat, "United States", by_gender = TRUE)
  
  expect_s3_class(gold_plot_simple, "ggplot")
  expect_s3_class(gold_plot_year, "ggplot")
  expect_s3_class(gold_plot_gender, "ggplot")
})

If your output is summary calculations, you’ll want to also check the values themselves, by computing them outside the function and comparing. Your test will probably look like this:

test_that("summarize medals works", {
  dat <- load_data()
  
  result <- medal_summary(dat, country = "United States")
  
  # Check object type (tibbles carry the S3 class "tbl_df")
  expect_s3_class(result, "tbl_df")
  
  # Check expected column names
  expect_equal(names(result), c("Gold", "Silver", "Bronze"))
  
  # Check expected values
  expect_equal(result$Gold[1], 2474)
  expect_equal(result$Silver[1], 1512)
  expect_equal(result$Bronze[1], 1233)
})

How did I know the numbers to expect? I ran code on my data and looked at it!

tuesdata <- tidytuesdayR::tt_load('2024-08-06')

---- Compiling #TidyTuesday Information for 2024-08-06 ----
--- There is 1 file available ---


── Downloading files ───────────────────────────────────────────────────────────

  1 of 1: "olympics.csv"

dat <- tuesdata$olympics

dat |>
    dplyr::filter(team == "United States") |>
    dplyr::summarize(
      Gold = sum(medal == "Gold", na.rm = TRUE),
      Silver = sum(medal == "Silver", na.rm = TRUE),
      Bronze = sum(medal == "Bronze", na.rm = TRUE)
    )

# A tibble: 1 × 3
   Gold Silver Bronze
  <int>  <int>  <int>
1  2474   1512   1233

At first this might seem silly - you’re running the exact code from inside your function to get the expected result and then comparing it to what the function actually produces.

But this approach “locks down” the correct output, so that if you make the function more complicated in the future, you’ll still be checking that it works correctly.

Write the functions

Now it’s time to fill out your actual function code!

Since these functions are meant to be shortcuts for code you already wrote, you can begin by simply copying that code into the function body. Then, you’ll need to generalize the function for different inputs and outputs.

Don’t forget to validate your function inputs!
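One possible way to validate inputs is sketched below in base R. The helper name check_medal_inputs() and the specific checks are assumptions based on the Olympics example, not a required design; your own functions may need different checks.

```r
# Hypothetical validation helper; column names follow the Olympics example
check_medal_inputs <- function(dat, country) {
  if (!is.data.frame(dat)) {
    stop("`dat` must be a data frame.", call. = FALSE)
  }

  # Make sure the columns the function relies on are present
  missing_cols <- setdiff(c("team", "medal"), names(dat))
  if (length(missing_cols) > 0) {
    stop("`dat` is missing column(s): ",
         paste(missing_cols, collapse = ", "), call. = FALSE)
  }

  if (!is.character(country) || length(country) != 1 || is.na(country)) {
    stop("`country` must be a single non-missing string.", call. = FALSE)
  }

  # Catch typos in the team name early, with a readable message
  if (!country %in% dat$team) {
    stop("No rows found for team '", country, "'.", call. = FALSE)
  }

  invisible(TRUE)
}
```

Calling a check like this at the top of a function means a bad input fails right away with a clear message, instead of producing an empty plot or a confusing error further down.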

It can be challenging to write functions with unquoted variable names in them. For example,

dat |> select(team)

will work but

select_team <- function(dat, team) {
  dat |> select(team)
}

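will not behave as intended: inside the function, select() first looks for a column literally named team, so it ignores whatever variable name the caller passed in.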

Stat 541 students should review the Unit 1 material about non-standard evaluation.

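Stat 431 students, you may always take variable names as string input. Then use .data[[var_name]] wherever you used to have the name of the variable inside data-masking verbs such as filter(), mutate(), and summarize(). Inside select(), use all_of(var_name) instead, because using .data in selection contexts is deprecated as of tidyselect 1.2.0, e.g.

select_team <- function(dat, var_name) {
  dat |> select(all_of(var_name))
}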
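For data-masking verbs such as filter() and summarize(), the .data[[var_name]] pattern might look like the sketch below. The function count_matches() and the toy data are made up for illustration.

```r
library(dplyr)

# Hypothetical helper: count the rows where the column named by the
# string `var_name` equals `value`
count_matches <- function(dat, var_name, value) {
  stopifnot(is.character(var_name), length(var_name) == 1)

  dat |>
    filter(.data[[var_name]] == value) |>
    summarize(n = n())
}

# Toy data standing in for the Olympics data set
toy <- tibble(
  team  = c("United States", "Canada", "United States"),
  medal = c("Gold", NA, "Silver")
)

count_matches(toy, "team", "United States")
```

Because var_name is an ordinary string, the same function works for any column, for example count_matches(toy, "medal", "Gold").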

Refactor your functions

Look for some steps in your functions where something is repeated, or might be repeated in the future. Some examples might include:

Loading and immediately filtering your data.
Validating function input.
Computing frequently used summary statistics.

Create a new helper function for at least one of these steps, and include this function (and its documentation and tests!) in your package.
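As one illustration, a helper that filters the data down to a single team could be sketched like this, together with a small unit test. The name filter_team() and the toy data are assumptions; pick whichever repeated step actually appears in your own functions.

```r
library(dplyr)
library(testthat)

# Hypothetical helper: keep only the rows for one team
filter_team <- function(dat, country) {
  stopifnot(is.data.frame(dat), "team" %in% names(dat))
  dat |> filter(.data$team == country)
}

# A matching unit test, using a small made-up data set
test_that("filter_team keeps only the requested team", {
  toy <- tibble(
    team  = c("United States", "Canada", "United States"),
    medal = c("Gold", "Silver", NA)
  )

  result <- filter_team(toy, "United States")

  expect_equal(nrow(result), 2)
  expect_true(all(result$team == "United States"))
})
```

In a package, the helper would live in its own file under R/ and the test in tests/testthat/. Both a plot function and a summary function could then call filter_team() instead of repeating the filter step.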

Document, build, test!

Make sure to re-document, re-build, and re-test your package before you do your final push to GitHub!