trout %>%
  group_by(section) %>%
  summarize(
    mean_length = mean(length_1_mm, na.rm = TRUE)
  )
“Complete” = Satisfactory
“Incomplete” = Growing
Lab 1 revisions are due by Friday, April 18 (at midnight).
Reflections
Revisions must be accompanied by reflections on what you learned while completing them. Reflections must be written in your Lab 1 Quarto file (next to the problems you revised).
How would you describe a categorical variable?
In R, categorical variables can have either character or factor data types.
factor – structured & fixed number of levels / options
character – unstructured & variable number of levels
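The difference between the two types can be sketched with a small made-up vector. The object names and values below are hypothetical and are not taken from the course data.

```r
# Hypothetical values for illustration only
section_chr <- c("clear cut", "old growth", "clear cut")

# A character vector has no fixed set of allowed values
class(section_chr)
levels(section_chr)  # NULL: character vectors have no levels

# Converting to a factor fixes the set of levels
section_fct <- factor(section_chr, levels = c("clear cut", "old growth"))
class(section_fct)
levels(section_fct)

# A value outside the declared levels becomes NA when made a factor
out_of_levels <- factor(
  c("clear cut", "meadow"),
  levels = c("clear cut", "old growth")
)
out_of_levels
```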
Fill in the data type (e.g. character, factor, integer, double) associated with each type of variable.
Variable | Data Type in R |
---|---|
Categorical variable | |
Continuous numerical variable | |
Discrete numerical variable | |
dplyr – a tool bag for data wrangling
filter()
select()
mutate()
summarize()
arrange()
group_by()
Brainstorm definitions for each verb
filter()
select()
mutate()
group_by()
summarize()
arrange()
If you wanted means for each level of a categorical variable, what would you do?
The HJ Andrews Experimental Forest houses one of the largest long-term ecological research stations, specifically researching cutthroat trout and salamanders in clear cut or old growth sections of Mack Creek.
# A tibble: 2 × 2
  section                               mean_length
  <chr>                                       <dbl>
1 clear cut forest                             85.3
2 upstream old growth coniferous forest        81.4
Why na.rm = TRUE?
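One way to explore this question is with a tiny made-up vector that contains a missing value. The lengths below are hypothetical, not measurements from the trout data.

```r
# Hypothetical lengths (mm) with one missing measurement
lengths_mm <- c(80, NA, 90)

# With a missing value present, the mean is itself missing
mean_with_na <- mean(lengths_mm)

# na.rm = TRUE drops the missing values before averaging
mean_without_na <- mean(lengths_mm, na.rm = TRUE)

mean_with_na
mean_without_na
```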
The sampled channels of Mack Creek were classified into the following groups:
unittype | Channel type |
---|---|
"C" | cascade |
"I" | riffle |
"IP" | isolated pool |
"P" | pool |
"R" | rapid |
"S" | step (small falls) |
"SC" | side channel |
NA | not sampled by unit |
filter()-ing Specific Channel Types
The majority of the cutthroat trout were captured in cascades (C), pools (P), and side channels (SC). Suppose we want to retain only these levels of the unittype variable.
%in%
not ==
!
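The contrast between %in%, ==, and ! can be sketched on a toy table. The tibble toy_trout and the vector keep are hypothetical stand-ins for the real trout data; only the unittype column name comes from the text above.

```r
library(dplyr)

# Hypothetical stand-in for the trout data; only unittype matters here
toy_trout <- tibble(
  unittype = c("C", "P", "SC", "R", "C", "P")
)

keep <- c("C", "P", "SC")

# %in% asks, row by row, "is this value one of the values in keep?"
with_in <- toy_trout %>%
  filter(unittype %in% keep)

# == compares element by element and recycles keep along the column,
# so rows are kept only where the positions happen to line up
with_eq <- toy_trout %>%
  filter(unittype == keep)

# ! negates the condition: keep everything except these levels
without_keep <- toy_trout %>%
  filter(!unittype %in% keep)

nrow(with_in)
nrow(with_eq)
nrow(without_keep)
```

In this toy example == silently drops two rows that %in% retains, which is why %in% is the safer choice when matching against several values.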
As a variable on the x- or y-axis
As a color / fill
As a facet
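Assuming this list describes ways of including a categorical variable in a ggplot2 plot, the three options might look like the sketch below. The data frame toy and its columns are hypothetical, and the choice of geoms is only one possibility.

```r
library(ggplot2)

# Hypothetical data for illustration only
toy <- data.frame(
  section   = rep(c("clear cut", "old growth"), each = 4),
  length_mm = c(80, 85, 90, 86, 78, 82, 84, 81)
)

# Categorical variable on the x-axis
p_axis <- ggplot(toy, aes(x = section, y = length_mm)) +
  geom_boxplot()

# Categorical variable as a fill
p_fill <- ggplot(toy, aes(x = length_mm, fill = section)) +
  geom_histogram(bins = 5)

# Categorical variable as a facet
p_facet <- ggplot(toy, aes(x = length_mm)) +
  geom_histogram(bins = 5) +
  facet_wrap(~ section)
```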
How would this histogram look if there was no variation in salamander length?
What are possible causes for the variation in salamander length?
What are the aesthetics included in this plot?
What is one aspect of this plot that you believe is well done? What is one aspect of the plot that could be improved?
aesthetics included in the plots
Read the Background and Methods in the summary at the beginning of the Tuan et al. paper. Then answer the following questions:
What data is involved in this research study?
What are the research questions / research goals?