Basics of Graphics

Tuesday, September 23

Today we will…

  • Warm-up for ggplot2 practice activity (30 minutes)
  • Set-up for the practice activity (15 minutes)
    • Review pair programming norms
    • Learn how to access practice activities
  • Take a 10-minute break
  • Complete the practice activity (60 minutes)
    • Find your partner!
    • Fill out the collaboration survey (2 minutes)

Data Context for Today

library(palmerpenguins)

?penguins

A screenshot of the R documentation page for the penguins dataset. The documentation page describes the context of the data (body measurements of penguins from the Palmer Archipelago of Antarctica), the variables included in the data (species, island, bill_length_mm, etc.), and the units for each variable.

What do you notice about these data?

Grammar of Graphics

The Grammar of Graphics (GoG) is a principled way of specifying exactly how to create a particular graph from a given data set. It helps us to systematically design new graphs.


Think of a graph or a data visualization as a mapping…

FROM variables in the data set (or statistics computed from the data)…

TO visual attributes (or “aesthetics”) of marks (or “geometric elements”) on the page/screen.

How to Build a Graphic

Complete this template to build a basic graphic:

ggplot(
  data = <DATA>, 
  mapping = aes(<MAPPINGS>)
  ) +
  <GEOM FUNCTION>() + 
  any other arguments...

Notice, every + adds another layer to our graphic.
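As a rough sketch of a filled-in template (the variable choices here are only an illustration, and loading ggplot2 explicitly is an assumption about your session):

```r
library(ggplot2)
library(palmerpenguins)

# <DATA> = penguins
# <MAPPINGS> = x = flipper_length_mm, y = body_mass_g
# <GEOM FUNCTION> = geom_point
p_template <- ggplot(
  data = penguins,
  mapping = aes(x = flipper_length_mm, y = body_mass_g)
  ) +
  geom_point()

# Display the plot. ggplot2 warns about rows with missing
# measurements, but the plot is still drawn.
p_template
```

Each placeholder in the template is swapped for one concrete choice, and the single + adds a single layer.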

Tip

Notice that I’m using named arguments to make my code easier to read.

ggplot(data = penguins)

What do you expect to see after running this code?

An image of a blank gray square representing a blank plotting canvas.

ggplot(data = penguins, 
       mapping = aes(x = species, y = bill_length_mm)
       )

What do you expect to see after running this code?

An image of a gray square with white gridlines representing a plotting canvas where the variables have been assigned to the x and y aesthetics. On the x-axis there is a variable named 'species' with three different values, each mapped to a particular white gridline: Adelie, Chinstrap, and Gentoo. On the y-axis there is a variable named 'bill_length_mm' with printed values mapped to different white gridlines. The spaces between these gridlines represent values between the printed values.

ggplot(data = penguins, 
       mapping = aes(x = species, y = bill_length_mm)
       ) +
  geom_jitter() +
  geom_boxplot()

What do you expect to see after running this code?

The same visualization is presented, except there are now boxplots superimposed on top of the points (dots) for each penguin species. The boxplots display the median (center line), quartiles (box edges), and the range of non-outlying values (whiskers) of each species' bill length.

Aesthetics

We map variables (columns) from the data to aesthetics on the graphic using the aes() function.

What aesthetics can we set?

  • x, y
  • color, fill
  • linetype
  • shape
  • size

Tip

See ggplot2 cheat sheet for more!
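Here is a small sketch of several aesthetics used at once. The choice of variables is only an example. It also contrasts mapping an aesthetic to a variable (inside aes()) with setting it to a fixed value (outside aes()).

```r
library(ggplot2)
library(palmerpenguins)

p_aes <- ggplot(
  data = penguins,
  mapping = aes(
    x = bill_length_mm,
    y = bill_depth_mm,
    color = species,  # mapped: point color varies with species
    shape = island    # mapped: point shape varies with island
  )
  ) +
  geom_point(size = 2)  # set: every point gets the same size

p_aes
```

Aesthetics inside aes() get a legend because they carry information from the data. A value set outside aes() applies to every point and gets no legend.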

Geometric Objects

We use a geom_XXX() function to represent data points.

one variable

  • geom_bar()
  • geom_density()
  • geom_dotplot()
  • geom_histogram()
  • geom_boxplot()

two variables

  • geom_boxplot()
  • geom_point()
  • geom_line()

Tip

See ggplot2 cheat sheet for more!
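To see how the geom controls the representation, this sketch draws the same one-variable mapping with two different geoms. The variable and the bins = 30 value are illustrative choices, not recommendations.

```r
library(ggplot2)
library(palmerpenguins)

# One shared canvas: a single numeric variable on the x-axis
base_plot <- ggplot(
  data = penguins,
  mapping = aes(x = flipper_length_mm)
  )

# Same data and mapping, two different geometric objects
p_hist <- base_plot + geom_histogram(bins = 30)  # bars of counts
p_dens <- base_plot + geom_density()             # smoothed curve

p_hist
p_dens
```

Only the geom_XXX() call changes between the two plots. The data and the aesthetic mapping stay the same.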

ggplot2 Resources

Every person should have a ggplot2 cheatsheet!

On the Front

  • Column 1: the “template” for making a ggplot
  • Column 2: creating plots for one continuous or one discrete variable
  • Column 3: creating plots for two continuous variables

On the Back

  • Column 4: adding facets and labels to your plot

A picture of the ggplot2 cheatsheet, which contains helpful information on assembling a variety of visualizations all using the ggplot2 package.
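As a brief sketch of what the facet and label sections of the cheatsheet cover, the example below splits a scatterplot into one panel per species and adds readable labels. The variables and label text are illustrative assumptions.

```r
library(ggplot2)
library(palmerpenguins)

p_facet <- ggplot(
  data = penguins,
  mapping = aes(x = flipper_length_mm, y = body_mass_g)
  ) +
  geom_point() +
  facet_wrap(~ species) +  # one panel for each species
  labs(
    x = "Flipper length (mm)",
    y = "Body mass (g)",
    title = "Body mass and flipper length by species"
  )

p_facet
```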

PA 2: Using Data Visualization to Find the Penguins

Artwork by Allison Horst

Essential Abilities

This puzzle will require knowledge of:

  • types of variables
  • types of visualizations
  • what visualization(s) go with different variable types
  • ggplot2 functions to create visualizations
  • choosing between different aesthetic options

None of us have all these abilities. Each of us has some of these abilities.

Essential Abilities

This collaboration will require:

🌐 Coordination & Collaboration

  • Clear communication
  • Teamwork
  • Interpersonal skills

🔧 Engineering Mindset

  • Problem-solving
  • Decision-making

📊 Management

  • Leadership
  • Time management

💡 Innovation

  • Critical thinking

🤝 Social Responsibility

  • Empathy

Collaborative Protocol

During your collaboration, you and your partner will alternate between two roles:

Computer

  • Reads out the prompt and ensures the group understands what is being asked.
  • Encourages the Coder to vocalize their thinking.
  • Asks the Coder to explain their thinking.
  • Types the code specified by the Coder into the Quarto document.
  • Runs the code provided by the Coder.
  • Works with Coder to debug the code.
  • Evaluates the output against the question prompt.

Coder

  • Confirms they understand what the prompt is asking.
  • Talks with Computer about their ideas.
  • Explains their thinking.
  • Directs the Computer what to type.
  • Manages resources (e.g., cheatsheets, textbook).
  • Works with Computer to debug the code.

Collaborative Protocol

During your collaboration, you and your partner will alternate between two roles:

Computer

  • Does not give hints to the Coder for how to solve the problem.
  • Does not solve the problem themselves.
  • Does not tell the Coder how to correct an error.

Coder

  • Does not ask the Computer how they would solve the problem.
  • Does not ask the Computer what functions / tools they should use.
  • Does not ask the Computer to debug the code.

PA 2 Warm-up

Creating a Graphic

To create a specific type of graphic, we will combine aesthetics and geometric objects. When sitting down to create a plot, it’s great to start with a game plan!

  1. What variables are you interested in?
  2. What types of variables are these?
  3. Where do you want to put each of these variables? (i.e., what aesthetics)
  4. What type(s) of geometries do you need?
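One possible walk-through of the game plan is sketched below. The question (how body mass differs across islands) is a made-up example, not one of the tasks.

```r
library(ggplot2)
library(palmerpenguins)

# 1. Variables of interest: island and body_mass_g
# 2. Types: island is categorical, body_mass_g is numeric
# 3. Aesthetics: island on x, body_mass_g on y
# 4. Geometry: side-by-side boxplots, one per island
p_plan <- ggplot(
  data = penguins,
  mapping = aes(x = island, y = body_mass_g)
  ) +
  geom_boxplot()

p_plan
```

Answering the four questions first means each part of the code comes from a decision you have already made.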

Task 1

Create a plot showing the number of penguins captured on each island.

  1. What type of variable is island?
  2. What type of plot would you make for this type of variable?
  3. What geom would you use to get this plot?

Task 2

Create a plot displaying the most common bill lengths for the penguins in these data.

  1. What type of variable is bill_length_mm?
  2. What type of plot would you make for this type of variable?
  3. What geom would you use to get this plot?

Task 3

Create a plot showing the relationship between a penguin’s bill length and body mass.

  1. What type of variables are bill_length_mm and body_mass_g?
  2. What type of plot would you make for these types of variables?
  3. What geom would you use to get this plot?

Working with Your Partner

Group Norms

  1. Be curious. Don’t correct.
  2. Be open minded.
  3. Ask questions rather than contribute.
  4. Respect each other.
  5. Allow each teammate to contribute to the activity through their role. Do not divide the work.
  6. Ask Dr. T or Jasmine group questions. No cross talk with other groups.
  7. Communicate with each other!

Task Cards

A diagram shows a collaborative software development process in four stages arranged in a cycle. At the top, a woman speaks with the label 'VOCALIZE.' To the right, she points to a diagram with the label 'EXPLAIN.' At the bottom, a man types on a laptop with the label 'IMPLEMENT.' On the left, a computer monitor displays a bug symbol with the label 'DEBUG.' Arrows connect the stages in a loop: Vocalize → Explain → Implement → Debug → back to Vocalize.

10-minute Break

Team Assignments

Section 70 (9:00 am)

Section 71 (12:00 pm)

Accessing the Practice Activity in Google Colab

The partner whose family name starts first alphabetically starts as the Computer! The Computer needs to:

  • Click on the Practice Activity 2 link from Canvas
  • Log in to your Google account
  • Save a copy of the Colab notebook in your Google Drive

A screenshot of the options provided when you click on the File pane within Google Colab. The option to save a copy in Drive is highlighted, to demonstrate how each student needs to make a copy of the notebook before sharing it with their partner.

Sharing with Your Partner

Once you have your copy, you need to:

  • Share your copy with your partner’s Google account
  • Make sure the Coder can open the file
  • The Computer should plug their laptop into the monitor
  • The Coder must close their computer

A screenshot of the options provided when you click on the Share pane (in the upper right corner) within Google Colab. The user has typed in a Gmail address to share it with Laura Smith, who will receive an email with a link to the document when the Notify option is checked.

Things to Know About Colab

Only one person can type at a time

If two people type at the same time, only one document will be able to save.

This requires your group to adhere to the collaborative protocol!

Code that was run on one person’s computer will not appear on another person’s computer

When you switch roles, the new Computer will need to run all the code that was typed by the previous Computer.

A screenshot of the 'Run all' button from Google Colab, which is located at the top of the notebook, to the right of the 'Code' and 'Text' buttons for adding content to the notebook. Clicking on this button will run every code chunk in the entire notebook.

The “Run all” button at the top of the document can help you do this!

Submission

  • When you have completed the visualization tasks, you will work as a group to answer the five questions posed at the end of the document.

  • Each person will input the answers to these questions in the PA2 Canvas quiz.

  • The person who last occupied the role of Computer will print the notebook as a PDF and submit the PDF for the group.

    • Only one submission per group!