Challenge 2: Spicing things up with ggplot2

For this week’s Challenge, you will have three different options to explore. I’ve arranged these options in terms of their “spiciness,” or the difficulty of completing the task. You are only required to complete one task, but if you are interested in exploring more than one, feel free!

This is a great place to let your creativity show! Make sure to indicate what additional touches you added, and provide any online references you used.

Tip

You will be working in the challenge-2.qmd file, found in the repository you cloned for Lab 2. Within this document, make sure to specify that your plots are contained in your document (self-contained: true) and that your code is visible to the reader (echo: true). If there are other formatting specifics you would like to include, feel free to toss those in the YAML, too!

Then, create a setup code chunk to load the packages and read in the surveys.csv data file exactly how you did in Lab 2.

Create another code chunk and paste in your code from Question 14 or Question 15 in Lab 2 – we will be modifying the box plot of weights by species!

🌶 Medium: Ridge Plots

In Lab 2, you used side-by-side boxplots to visualize the distribution of weight within each species of rodent. Boxplots have substantial flaws, namely that they disguise distributions with multiple modes.

A “superior” alternative is the density plot. However, ggplot2 does not allow for side-by-side density plots using geom_density(). Instead, we will need to make use of the ggridges package to create side-by-side density (ridge) plots.

For this challenge you are to change your boxplots to ridge plots. You will need to install the ggridges package and explore the geom_density_ridges() function.

🌶 🌶 Spicy: Exploring Color Themes

The built-in ggplot() color scheme may not include the colors you were looking for. Don’t worry – there are many other color palettes available to use!

You can change the colors used by ggplot() in a few different ways.

Manual Specification

Add the scale_color_manual() or scale_fill_manual() functions to your plot and directly specify the colors you want to use. You can either:

  1. define a vector of colors within the scale functions (e.g. values = c("blue", "black", "red", "green")) OR

  2. create a vector of colors using hex numbers and store that vector as a variable. Then, call that vector in the scale_color_manual() function.

Here are some example hex color schemes:

# A vector of a RG color deficient friendly palette with gray:
cdPalette_grey <- c("#999999", "#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2", "#D55E00", "#CC79A7")

# A vector of a RG color deficient friendly palette with black:
cdPalette_blk <- c("#000000", "#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2", "#D55E00", "#CC79A7")
Note

If you are interested in using specific hex colors, I like the image color picker app to find the colors I want.

Package Specification

While manual specification may be necessary for some contexts, it can be a real pain to handpick 5+ colors. This is where color scales built-in to R packages come in handy! Popular packages for colors include:

  • RColorBrewer – change colors by using scale_fill_brewer() or scale_colour_brewer().

  • viridis – change colors by using scale_colour_viridis_d() for discrete data, scale_colour_viridis_c() for continuous data.

  • ggsci – change colors by using scale_color_<PALNAME>() or scale_fill_<PALNAME>(), where you specify the name of the palette you wish to use (e.g. scale_color_aaas()).

Note

This website provides an exhaustive list of color themes available through various packages.

In this challenge you are expected to use this information to modify the boxplots you created Lab 2. First, you are to color the boxplots based on the variable genus. Next, you are to change the colors used for genus using either manual color specification or any of the packages listed here (or others!).

🌶 🌶 🌶 Hot: Exploring ggplot2 Annotation

Some data scientists advocate that we should try to eliminate legends from our plots to make them more clear. Instead of using legends, which cause the reader’s eye to stray from the plot, we should use annotation.

We can add annotation(s) to a ggplot() using the annotate() function:

ggplot(data = surveys, 
       mapping = aes(x = weight, y = species, color = genus)
       ) +
  geom_boxplot() +
  scale_color_manual(values = cdPalette_grey) + 
  annotate("text", y = 6, x = 200, label = "Sigmodon") +
  annotate("text", y = 4, x = 200, label = "Perognathus") +
  theme(legend.position = "none") +
  labs(x = "Weight (g)",
       y = "",
       subtitle = "by Species and Genera",
       title = "Rodent Weight")

Note that I’ve labeled the “Sigmodon” and “Perognathus” genera, so the reader can tell that these boxplots are associated with their respective genus.

In this challenge you are expected to use this information to modify the boxplots you created in Lab 2. First, you are to color the boxplots based on the variable genus. Next, you are to add annotations for each genus next to the boxplot(s) associated with that genus. Finally, you are expected to use the theme() function to remove the color legend from the plot, since it is no longer needed!

Challenge 2 Submission

For Lab 2 you will submit only your HTML file. Your HTML file is required to have the following specifications in the YAML options (at the top of your document):

  • have the plots embedded (embed-resources: true)
  • include your source code (code-tools: true)
  • include all your code and output (echo: true)

If any of the options are not included, your Lab 2 or Challenge 2 assignment will receive an “Incomplete” and you will be required to submit a revision.

In addition, your document should not have any warnings or messages output in your HTML document. If your HTML contains warnings or messages, you will receive an “Incomplete” for document formatting and you will be required to submit a revision.