Package-based Workflow

What is a package workflow?

Why a package workflow?

  • Ease students into developer mindset.

  • Emphasize unit-testing and documentation frameworks.

  • Practice with code review of others’ code.

  • Easier to performance check code.

  • Grading via automatic tests.

How-to

  1. Create a skeleton R package on GitHub.

  2. Include pre-written unit tests.

  3. Include vignette-style function demos.

  4. Duplicate unit tests with a new dataset for your own use in grading.

  5. Require code review as part of assignment.

Student Assignment

  1. Complete the functions.

  2. Document the functions.

  3. Build the package.

  4. Run the unit tests.

  5. Render the demo document.

What do students turn in?

  1. A link to their GitHub of the package. I recommend GitHub Classroom for this.

  2. A rendered html, showing the functions being run on data.

  3. A code review of someone else’s package.

Grading

Four grading elements to combine:

  • Does the package properly build and install or are there errors?

  • Does the package pass the provided unit tests?

  • Do the package functions behave as expected in the demo document?

  • Are the peer code reviews positive?

  • Does the package pass the secret unit tests?

OUR TURN

Setup

Make a sandbox for yourself

  1. Make a folder somewhere called “Package Assignment Practice” or similar.

  2. Run usethis::create_package("YOUR PATH/Package Assignment Practice/example_skeleton_package"). This will open a new R Project.

  3. In the new project, run usethis::use_r("simple_linear_regression")

  4. Run usethis::use_test("simple_linear_regression")

  5. New File > Quarto > slr_demo.qmd

Create the demo document first

Include code that you hope will work properly…

simple_linear_regression(data = mtcars,
                         explanatory = "gear",
                         response = "mpg")

… and code you hope will error in a reasonable way.

simple_linear_regression(data = mtcars,
                         explanatory = "gear",
                         response = "not a variable")

Sketch out the function

How much detail you provide depends on the level of your students.

For the first pass, I recommend:

  • Include all the roxygen-style documentation

  • Include the inputs and names.

  • Give them a few lines to get started.

  • Use comments to show where you want them to edit.

  • Provide the object structure of the expected output.

  • Make sure the code is all runnable as-is. (But not fully correct yet!)

Example

#' Implements simple linear regression by hand
#'
#' @param dat A data frame
#' @param response The name of a response variable in the data frame.
#' @param explanatory The name of the explanatory variable in the data frame.
#'
#' @return A data frame of coefficients
#'
#' @import dplyr
#'
#' @export
simple_linear_regression <- function(dat, response, explanatory){

  x <- dat[[explanatory]]
  y <- dat[[response]]

  x_bar <- mean(x)
  y_bar <- mean(y)

  ### Edit code after here

  sd_x <- 1
  sd_y <- 1

  beta_0 <- 1
  beta_1 <- 1

  results <- tibble::tibble(
    Intercept = 0,
    Slope = 1
  )

  return(results)

}

Unit tests

test_that("simple linear regression is correct", {

  my_result <- simple_linear_regression(mtcars, "mpg", "hp")

  mass_result <- lm(mpg ~ hp, data = mtcars)

  expect_equal(coef(mass_result)[['hp']], my_result$Slope,
               tolerance = 0.05)
  expect_equal(coef(mass_result)[[1]], my_result$Intercept,
               tolerance = 0.05)

})

Secret Unit Tests

  • Make a copy of the tests folder in your package.

  • Save that copy outside the package folder, in your “Package Assignment Practice” folder.

  • Change the unit test code to use a different dataset or different variables.

Activity: Be the student

Auto-grading with secret tests

Open the script “testthat.R” …

library(testthat)
library(example_skeleton_package)

test_check("example_skeleton_package")

… and run it!

Working with GitHub

If student work is on GitHub rather than in folders…

remotes::install_github("imastudent/example_skeleton_package")

… and loop through each.

Automation

urls <- # list of student repos

all_test_results <- tibble()
all_time_results <- tibble()

for (url in submissions){
  
    unloadNamespace("example_skeleton_package")
    remove.packages("example_skeleton_package", ".")
    
    remotes::install_github(submissions)

    library(example_skeleton_package)

    lr_test <- testthat::test_dir("./tests/testthat",
               reporter = "minimal",
               stop_on_failure = FALSE)
      
      ## save output however you prefer
    
}

Leveling up

Function design/development

Package creation

  • Don’t provide roxygen2 documentation; require students do it.

  • Require additional unit tests

  • Require multiple .R files and multiple functions per file.

  • Practice CRAN submission?

  • R Packages Textbook

Statistical computing

  • Matrix operations, e.g. for multiple regression

  • Iteration to convergence, e.g. kmeans

  • Model fitting via gradient descent

  • Emphasize speed/efficiency

Speed/memory tests

big_dat <- #some large dataset
  
for (url in submissions){
  
    unloadNamespace("example_skeleton_package")
    remove.packages("example_skeleton_package", ".")
    
    remotes::install_github(submissions)

    library(example_skeleton_package)

    benches <- bench::mark(
      slr = safely(simple_linear_regression)(big_dat, price, gearbox),
      memory = FALSE,
      check = FALSE,
      time_unit = "ms",
      min_time = Inf,
      max_iterations = 20,
      filter_gc = FALSE
    ) %>%
      select(
        expression, median
      ) 
    
}