Code Speed

Fortunately, we stand on the shoulders of giants. Many skilled computer scientists have put a lot of time and effort into writing R functions that run as fast as possible.

Even without deep knowledge of algorithms or R's inner workings, you can often speed up your code by building on that work and avoiding a few common pitfalls.

First, though: as you start thinking about writing faster code, you’ll need a way to check whether your improvements actually sped up the code.
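As a minimal sketch of such a check, the example below times two versions of the same calculation with base R's system.time(). The functions `sum_squares_loop()` and `sum_squares_vectorized()` are hypothetical examples written for this illustration, not part of the course materials. Exact timings will differ from machine to machine.

```r
# Two ways to compute the same thing: the sum of squares of 1..n.
sum_squares_loop <- function(n) {
  total <- 0
  for (i in seq_len(n)) {
    total <- total + i^2
  }
  total
}

sum_squares_vectorized <- function(n) {
  sum(seq_len(n)^2)
}

n <- 1e6

# Check first that both versions agree; a faster wrong answer is no improvement.
stopifnot(isTRUE(all.equal(sum_squares_loop(n), sum_squares_vectorized(n))))

# system.time() reports user, system, and elapsed time in seconds.
time_loop <- system.time(sum_squares_loop(n))
time_vectorized <- system.time(sum_squares_vectorized(n))

print(time_loop[["elapsed"]])
print(time_vectorized[["elapsed"]])
```

A single system.time() call can be noisy for very fast code. Packages such as tictoc and microbenchmark, which appear in the check-in further down this page, are built for this kind of comparison.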

Required Reading

Stat 541 Only: lobstr::obj_size()

Use faster existing functions

Because R has so many packages, there are often many functions to perform the same task. Not all functions are created equal!
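To illustrate, here is a small sketch comparing two base R ways of computing row means: a general-purpose `apply()` call and the specialized `rowMeans()`. The matrix `m` and its dimensions are made up for this example. On most machines `rowMeans()` is noticeably faster because it is implemented in compiled code, but you should run the comparison yourself rather than take that on faith.

```r
set.seed(1)
m <- matrix(rnorm(2000 * 500), nrow = 2000)  # hypothetical example data

# Both approaches return one mean per row.
means_apply <- apply(m, 1, mean)
means_rowmeans <- rowMeans(m)

# Same answer (up to floating-point tolerance), so the comparison is fair.
stopifnot(isTRUE(all.equal(means_apply, means_rowmeans)))

# Repeat each call 20 times so the timings are large enough to compare.
time_apply <- system.time(for (i in 1:20) apply(m, 1, mean))
time_rowmeans <- system.time(for (i in 1:20) rowMeans(m))

print(c(apply = time_apply[["elapsed"]],
        rowMeans = time_rowmeans[["elapsed"]]))
```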

data.table

For speeding up work with data frames, no package is better than data.table!

Required Reading
Check-in: Efficiency
  1. Here is some dplyr code. Re-write it in data.table syntax.
congress_age %>%
  group_by(state) %>%
  summarize(mean_age = mean(age))
  2. Using either the tictoc or the microbenchmark package, perform a speed test to see which code is faster, and by how much.

Tip 3: Only improve what needs improving

“We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%. A good programmer will not be lulled into complacency by such reasoning, [they] will be wise to look carefully at the critical code; but only after that code has been identified.” — Donald Knuth.

Speed and computational efficiency are important, but there can be a trade-off. If you make a small speed improvement but your code becomes overly complex, confusing, or difficult to edit and test, it's not worth it!

Also, consider your time and energy as the programmer. Suppose you have a function that takes 30 minutes to run. It relies on a subfunction that takes 30 seconds to run. Should you really spend your efforts making the subfunction take only 10 seconds? If the subfunction is called just once, that saves 20 seconds out of 1,800, or about 1% of the total run time.

The art of finding the slow bits of your code where you can make the most improvement is called profiling.
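As a rough sketch of what profiling looks like in practice, the example below uses base R's `Rprof()` and `summaryRprof()`. The functions `grow_vector()`, `quick_summary()`, and `analysis()` are hypothetical, written only so that one step is deliberately slow. The required reading may use different tools.

```r
# A deliberately slow helper: growing a vector with c() copies it every time.
grow_vector <- function(n) {
  x <- numeric(0)
  for (i in seq_len(n)) {
    x <- c(x, i)
  }
  x
}

# A helper that does comparatively little work.
quick_summary <- function(x) {
  mean(x)
}

analysis <- function(n) {
  x <- grow_vector(n)
  quick_summary(x)
}

profile_file <- tempfile(fileext = ".out")

Rprof(profile_file, interval = 0.01)  # start sampling the call stack
result <- analysis(3e4)
Rprof(NULL)                           # stop profiling

# by.total shows the share of sampled time spent in each function,
# including the functions it calls.
profile <- summaryRprof(profile_file)
print(head(profile$by.total))
```

In the printed summary, most of the sampled time should be attributed to `grow_vector()` and the `c()` calls inside it, which tells you that this is the part worth improving and that `quick_summary()` is not.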

📖 Required Reading: Profiling

Optional: Super advanced mode

The following are some suggestions for ways to level up your code speed. These are outside the scope of this class, but you are welcome to explore them!