Today we will…
Extracts subsets of data and places them in side-by-side plots.
You can let axis limits vary across facets using the scales argument.
"free_x" – only x-axis limits adjust to individual facets.
"free_y" – only y-axis limits adjust to individual facets.
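As a minimal sketch of the scales argument, the plot below lets only the y-axis limits vary by facet. It assumes the penguins data come from the palmerpenguins package; the object name free_y_plot is only illustrative.

```r
library(ggplot2)
library(palmerpenguins)  # assumption: source of the `penguins` data

# Facet by island and let each facet choose its own y-axis limits.
# The x-axis limits stay shared across all facets.
free_y_plot <- ggplot(data = penguins,
                      mapping = aes(x = bill_length_mm,
                                    y = bill_depth_mm)) +
  geom_point(na.rm = TRUE) +  # silently drop rows with missing measurements
  facet_wrap(~island, scales = "free_y")

free_y_plot
```

Swapping in scales = "free_x" frees the x-axis instead.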
Leave a space between the # and the header (e.g., # Guinea Pig Teeth).
The embed-resources: true line in the YAML?
By default, Quarto does not embed plots in the HTML document. Instead, it creates a “Lab-1-files” folder, which stores all your plots.
This means that by default your plots are not included in your HTML file!
---
title: "Lab 1"
author: "Dr. T!"
format: html
editor: source
embed-resources: true
---
The embed-resources: true line in the YAML forces the visualizations to be embedded in your HTML!
Artwork by Allison Horst
Look at the file extension for the type of data file.
.csv: “comma-separated values”
Name, Age
Bob, 49
Joe, 40
.xls, .xlsx: Microsoft Excel spreadsheet
Can be saved as .csv or read with the readxl package.
.txt: plain text
Using base R functions:
read.csv() is for reading in .csv files.
read.table() and read.delim() are for any data with “columns” (you specify the separator).
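A small sketch of the base R readers, using the Name/Age example from earlier. The temporary file and the object names are only illustrative.

```r
# Write a tiny example file so the sketch is self-contained
csv_path <- tempfile(fileext = ".csv")
writeLines(c("Name, Age", "Bob, 49", "Joe, 40"), csv_path)

# read.csv() assumes a comma separator and a header row;
# strip.white = TRUE removes the spaces that follow the commas
people <- read.csv(csv_path, strip.white = TRUE)

# read.table() reads the same file once you specify the separator yourself
people_tbl <- read.table(csv_path, sep = ",", header = TRUE,
                         strip.white = TRUE)

people
```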
The tidyverse has some cleaned-up versions of the base R functions in the readr and readxl packages:
read_csv() is for comma-separated data.
read_tsv() is for tab-separated data.
read_table() is for white-space-separated data.
read_delim() is for any data with “columns” (you specify the separator). read_csv() and read_tsv() are special cases of it.
read_xls() and read_xlsx() are specifically for dealing with Excel files.
Remember to load the readr and readxl packages first!
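A short sketch of the tidyverse readers. The temporary .csv file is only illustrative, and the Excel file is the example workbook that ships with readxl (located with readxl_example()), not a course file.

```r
library(readr)
library(readxl)

# Write a tiny example file so the sketch is self-contained
csv_path <- tempfile(fileext = ".csv")
writeLines(c("Name,Age", "Bob,49", "Joe,40"), csv_path)

# read_csv() assumes commas; read_delim() needs the separator spelled out
people_csv   <- read_csv(csv_path, show_col_types = FALSE)
people_delim <- read_delim(csv_path, delim = ",", show_col_types = FALSE)

# read_xlsx() reads the first sheet of an .xlsx workbook by default
xlsx_path  <- readxl_example("datasets.xlsx")
excel_data <- read_xlsx(xlsx_path)
```

Both people_csv and people_delim hold the same two rows, which is what "special case" means here.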
Structure: boxplot, scatterplot, etc.
Aesthetics: features such as color, shape, and size that map other variables to structural features.
Both the structure and aesthetics should help viewers interpret the information.
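To make the distinction concrete, here is a minimal sketch in which the structure is a scatterplot and two aesthetics (color and shape) map extra variables onto the points. It assumes the penguins data from the palmerpenguins package; the variable choices are only illustrative.

```r
library(ggplot2)
library(palmerpenguins)  # assumption: source of the `penguins` data

aesthetics_plot <- ggplot(data = penguins,
                          mapping = aes(x = flipper_length_mm,
                                        y = body_mass_g,
                                        color = species,   # aesthetic
                                        shape = island)) + # aesthetic
  geom_point(na.rm = TRUE)  # structure: a scatterplot

aesthetics_plot
```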
The next slide will have one point that is not like the others.
Raise your hand when you notice it.
Pre-attentive features are features that we see and perceive before we even think about them.
They will jump out at us in less than 250 ms.
E.g., color, form, movement, spatial location.
There is a hierarchy of features (e.g., color is stronger than shape).
Gestalt Hierarchy | Graphical Feature |
---|---|
1. Enclosure | Facets |
2. Connection | Lines |
3. Proximity | White Space |
4. Similarity | Color/Shape |
Implications for practice:
Do not use rainbow color gradients!
Be conscious of what certain colors “mean”.
For categorical data, try not to use more than 7 colors:
If you need to, you can use colorRampPalette() (from base R's grDevices package) with a palette from the RColorBrewer package to produce larger palettes:
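A minimal sketch of that approach, assuming the "Set2" palette as the starting point (any RColorBrewer palette would do) and twelve as the target number of colors:

```r
library(RColorBrewer)

# Start from an 8-color qualitative palette
base_colors <- brewer.pal(n = 8, name = "Set2")

# colorRampPalette() returns a function that interpolates between the colors
make_palette <- colorRampPalette(base_colors)

# Ask that function for as many colors as you need
twelve_colors <- make_palette(12)
twelve_colors
```

Interpolated colors sit close to their neighbors, so the more colors you request, the harder they are to tell apart.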
To make your graphic friendly for people with different color vision deficiencies…
There are several packages with color scheme options:
These packages have color palettes that are aesthetically pleasing and, in many cases, color deficiency friendly.
You can also take a look at other ways to find nice color palettes.
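As one hedged example, ggplot2 itself ships the viridis scales, which are designed to stay readable under common color vision deficiencies. This is an illustrative choice, not necessarily one of the packages listed on the slide, and it assumes the penguins data from the palmerpenguins package.

```r
library(ggplot2)
library(palmerpenguins)  # assumption: source of the `penguins` data

viridis_plot <- ggplot(data = penguins,
                       mapping = aes(x = bill_length_mm,
                                     y = bill_depth_mm,
                                     color = species)) +
  geom_point(na.rm = TRUE) +
  scale_color_viridis_d()  # discrete viridis palette for a categorical variable

viridis_plot
```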
library(ggplot2)
library(palmerpenguins)  # assumed source of the penguins data

# Map each island name to the label shown on its facet strip
custom_labels <- c(
  Dream = "Dream Island",
  Torgersen = "Torgersen Island",
  Biscoe = "Biscoe Island"
)

ggplot(data = penguins,
       mapping = aes(x = bill_length_mm,
                     y = bill_depth_mm,
                     color = species)) +
  geom_point() +
  facet_wrap(~island, labeller = as_labeller(custom_labels)) +
  labs(x = "Bill Length",
       y = "",
       color = "Species of Penguin",
       title = "Do penguins with shorter bills have deeper bills?")
# Same plot with a black-and-white theme and the legend moved to the top
ggplot(data = penguins,
       mapping = aes(x = bill_length_mm,
                     y = bill_depth_mm,
                     color = species)) +
  geom_point() +
  facet_wrap(~island) +
  labs(x = "Bill Length",
       y = "",
       color = "Species of Penguin",
       title = "Do penguins with shorter bills have deeper bills?") +
  theme_bw() +
  theme(legend.position = "top")
# Same plot with the y-axis fixed to 10-30 and a break every 5 units
ggplot(data = penguins,
       mapping = aes(x = bill_length_mm,
                     y = bill_depth_mm,
                     color = species)) +
  geom_point() +
  facet_wrap(~island) +
  labs(x = "Bill Length",
       y = "",
       color = "Species of Penguin",
       title = "Do penguins with shorter bills have deeper bills?") +
  scale_y_continuous(limits = c(10, 30),
                     breaks = seq(from = 10, to = 30, by = 5))
Starting with Lab 2, your labs will have an appearance / code format portion.
Review the code formatting guidelines before you submit your lab!
Each week, you will be assigned one of your peers’ labs and will review its code formatting.
It is good practice to put each geom and aes on a new line.
Part of learning to program is learning from a variety of resources. Thus, I expect you will use resources that you find on the internet.
In this class, the assumed knowledge is the course materials, including the course textbook, coursework pages, and course slides. Any functions or code used from outside these materials require direct references.
A research team from The University of Illinois is studying students’ ability to decipher what data visualizations in the media do and do not reveal. They have developed an assessment to measure students’ visual data literacy, and our class will be among the first to test it out and offer feedback.
The assessment form is linked in the “ADDITIONAL CHALLENGE OPPORTUNITIES” module on Canvas.
Completing the survey can count as an additional demonstration of you “extending” your thinking.
As with all research, it is up to you whether you give consent for your data to be used for research purposes! You will be asked about this at the end of the survey. That choice is completely up to you!