Practice Activity 9: Find the bugs
Overview
The repository Cal-Poly-Advanced-R/entomology contains a package, mostly complete and mostly correctly structured, that performs some simple analyses and summaries on a dataset concerning a few species of beetle [1].
Unfortunately, this package contains many bugs, and your job is to track them down! It is also missing any @example documentation or testthat tests to help identify the issues.
To make things a little easier, there is exactly one problem in each of the nine different .R files (not counting package.R, which is just a formality). There is also one extra bonus problem in the summary.R file.
Run the following to download the package source code:
usethis::use_course("Cal-Poly-Advanced-R/entomology")
Then open pa-9.qmd.
Step One: Understand goals
The primary goal of this package is to guess the species of an unknown beetle, based on two measurements of its horns (or aedeagus): the width in microns, and the angle in degrees.
The entomology package supplies a function called beetle_classify_nearest, which takes in a width and angle and returns the predicted species. However, this function doesn’t work, due to errors in its code and the functions it depends on.
Begin by looking at the source code of beetle_classify_nearest, which is in the classify.R file. In your own words, what do the steps of this function seem to be doing?
Then, notice that beetle_classify_nearest depends on two new functions: beetle_species and beetle_centroid. What .R file are these in? What do they do? What new functions do they depend on? What do those do? And so on.
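To make the overall goal concrete, here is a minimal sketch of classifying by nearest centroid. It assumes that is what the function names suggest; it is not the package's code. The data values and the names toy_beetles, toy_centroids, and toy_classify_nearest are invented for illustration.

```r
# Hypothetical measurements, invented for illustration only;
# these are NOT values from the package's dataset.
toy_beetles <- data.frame(
  species = c("A", "A", "B", "B"),
  width   = c(100, 110, 140, 150),  # microns
  angle   = c(10, 12, 14, 16)       # degrees
)

# Centroid = mean width and mean angle within each species
toy_centroids <- aggregate(
  cbind(width, angle) ~ species,
  data = toy_beetles,
  FUN = mean
)

# Predict the species whose centroid is closest in (unscaled) Euclidean distance
toy_classify_nearest <- function(width, angle, centroids) {
  distances <- sqrt((centroids$width - width)^2 + (centroids$angle - angle)^2)
  as.character(centroids$species[which.min(distances)])
}

toy_classify_nearest(width = 108, angle = 11, centroids = toy_centroids)
```

Note that this sketch mixes microns and degrees in one distance without rescaling, which is a simplification. As you read the real source, compare each step with what the documentation says the function should do.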
Step Two: Find syntax errors
If our code is poorly formed, we won’t even be able to run it to test it!
In your entomology package project, attempt to build the package. (Reminder: Ctrl/Cmd-Shift-B is a shortcut!)
What error does this build encounter? Fix it, then build your package!
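If the build message is hard to read, it can help to see what a syntax error looks like in isolation. The sketch below uses a made-up function, add_one, with a missing closing brace; it is not taken from the package. It shows that malformed code fails at the parsing stage, before any of it runs.

```r
# A hypothetical function definition with a missing closing brace
broken_code <- "add_one <- function(x) {\n  x + 1\n"

# parse() fails on malformed code; capture the message instead of stopping
parse_result <- tryCatch(
  parse(text = broken_code),
  error = function(e) conditionMessage(e)
)
is.character(parse_result)  # TRUE when parsing failed
cat(parse_result, "\n")     # the message points at a line and column

# Once the brace is restored, the code parses and can be evaluated
fixed_code <- "add_one <- function(x) {\n  x + 1\n}"
eval(parse(text = fixed_code))
add_one(1)
```

The line and column in the message tell you where the parser gave up, which is often just after the actual mistake.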
Step Three: Test basic functions
Run beetle_data(). Does this seem to work as expected? If not, fix it.
Run beetle_species(). Does this seem to work as expected? If not, fix it.
Rebuild your package once you have fixed data.R.
Remember, there is exactly one bug in each file, except summary.R, which has two.
Step Four: Test validation functions
Notice that the package provides two helper functions to make sure that users input appropriate values to the main functions. These are in validation.R.
Try to run the following unit test:
validate_species("Con")
You’ll find that the validate_species function is not found, even if your package is fully built and loaded. This is because it is an unexported function - it was made to help the developer simplify their design, not so that users could access it and use it. Notice that the roxygen2 documentation comments above these functions do not include an @export line, and that these functions do not show up in the NAMESPACE file.
To access and run an unexported function, use triple colons :::, e.g.
entomology:::validate_species("Con")
Run some other unit tests on these functions until you discover the hidden bug.
Hint: Make sure to try every possible input that the function claims it can accept.
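The difference between exported and unexported objects can be seen in any installed package. The sketch below uses the built-in stats package rather than entomology, since it ships with R: sd is exported, while plot.lm exists inside the namespace but is not exported. The variable name unexported_fn is just for illustration.

```r
# "sd" is exported by stats, so users can call it directly
"sd" %in% getNamespaceExports("stats")

# "plot.lm" is not in the list of exports...
"plot.lm" %in% getNamespaceExports("stats")

# ...but it does exist inside the package namespace
exists("plot.lm", envir = asNamespace("stats"), inherits = FALSE)

# Triple colons reach the unexported object
unexported_fn <- stats:::plot.lm
is.function(unexported_fn)
```

The same checks, with "entomology" in place of "stats", are one way to confirm which functions your package exports after a rebuild.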
Step Five: Test small functions
Now, write your own unit tests for the basic calculation and data wrangling functions in counts.R, correlation.R, angle.R, and filter.R.
These unit tests should include:
The simplest possible function calls that you expect to work correctly.
A more complicated function call that you still expect to work correctly.
One or more function calls that you expect to cause an error.
Make sure to state what you expect the output to be before you run the function, and be sure to consider the object types and structures you expect, not just the values.
Continue developing and running unit tests until you find the bug in each of these files. Once you have fixed all of them, re-build and re-load your package, and then try the tests again to make sure they behave as desired.
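The three kinds of tests described above might look like the following sketch. The helper count_species is hypothetical and stands in for a small package function; it is not one of the functions in entomology. The checks use base R only, so they run without any extra packages.

```r
# Hypothetical helper standing in for a small package function
count_species <- function(species) {
  if (!is.character(species)) stop("`species` must be a character vector")
  counts <- table(species)
  data.frame(species = names(counts), n = as.integer(counts))
}

# 1. Simplest call. Expectation: a data frame with integer counts 2 and 1.
simple <- count_species(c("a", "a", "b"))
stopifnot(is.data.frame(simple))          # structure, not just values
stopifnot(is.integer(simple$n))           # type of the count column
stopifnot(identical(simple$n, c(2L, 1L)))

# 2. More complicated call. Expectation: the missing value is not counted,
#    and the rows come back in alphabetical order.
complicated <- count_species(c("b", NA, "a", "b"))
stopifnot(identical(complicated$species, c("a", "b")))
stopifnot(identical(complicated$n, c(1L, 2L)))

# 3. Call expected to fail. Expectation: an error for non-character input.
outcome <- tryCatch(
  count_species(1:3),
  error = function(e) "error"
)
stopifnot(identical(outcome, "error"))
```

Writing the expectation as a comment before each call keeps you honest about what you predicted, including the object type.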
Step Six: Test complicated functions
Now that you’ve fixed problems in the helper functions, we can revisit the main ones.
Create unit tests as above for the functions in classify.R and summary.R.
Use debugging techniques like debugonce() or browser() to find where problems are happening.
Hint: While there is only one bug directly in this file, there is also a problem in another file that causes this code to break, which you should track down via your debugging.
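When a main function fails because of one of its helpers, a useful first step is to find out which call raised the error. The sketch below shows one way to do that; outer_summary and inner_mean are hypothetical functions, not part of the package.

```r
# Hypothetical functions: outer_summary() relies on inner_mean()
inner_mean <- function(values) {
  if (!is.numeric(values)) stop("`values` must be numeric")
  mean(values)
}
outer_summary <- function(values) {
  round(inner_mean(values), 1)
}

# Capture the call in which the error was raised
failing_call <- NULL
result <- tryCatch(
  outer_summary("not a number"),
  error = function(e) {
    failing_call <<- conditionCall(e)
    NULL
  }
)
deparse(failing_call)  # names the helper, not the outer function

# In an interactive session you could then step through the culprit:
# debugonce(inner_mean)
# outer_summary("not a number")
```

Once you know which helper is involved, debugonce() on that helper lets you inspect its arguments at the moment it is called.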
Test Coverage (Stat 541 Only)
All your unit tests should be included formally in the package using the testthat structure.
Run covr::package_coverage("."). You should have 100% coverage!
Hints and Advice
Most of the errors in this package are not computational; instead, they are related to syntax problems, inconsistent object types and structures, etc.
The bugs are all in the code itself, not in the descriptions or documentation of the functions; you can trust that the documentation describes what the functions should do.
Don’t overlook the power of “rubber duck debugging” - describe what the function does, step by step, to a partner and see if you notice anything amiss as you say it out loud.
Debugging is hard! If you find yourself stumped and frustrated, take a break and come back later with fresh eyes. And don’t forget to work together with your peers - don’t just split the list in half; discuss each bug together to track them down faster.
Footnotes
[1] Lubischew, A.A. (1962) On the use of discriminant functions in taxonomy. Biometrics, 18, 455-477.