library(openintro)
Lab 3 Revisions are due next Wednesday (May 7)
Statistical Critique revisions are due next Wednesday (May 7)
The first draft of your Midterm Project is due on Sunday at midnight.
Deadline Extension
A deadline extension is permitted for the first draft. Deadline extensions are not permitted for the final version (due next week).
The description of your data goes in your Introduction.
The description of your variables goes at the beginning of your Methods, in the Variables subsection!
Be cautious in how you use the resources I provided: do not copy these descriptions.
Inserting a verbatim copy of the descriptions seen in the data resources is plagiarism.
In-text citation
If you wish to borrow elements of these descriptions, you need to quote them and provide a reference to the resource. For example: “This data set has been of interest to medical researchers who are studying the relation between habits and practices of expectant mothers and the birth of their children” (United States Department of Health and Human Services, 2014).
Be sure to review the feedback provided in your Midterm Project Proposal before starting your first draft.
Specifically, make sure I did not have any objections to the variables you chose for your analysis!
For the and_vertebrates data, you should include species as an explanatory variable. If you don’t, you are assuming the same relationship applies to trout AND salamanders.
For the hbr_maples data, you cannot use year as a numerical variable. There are only two years of data!
If you want to use year as a categorical variable, let Dr. T know and they will help you write code to change the data type of this variable!
Locate what package your data live in (found in the directions for the midterm project proposal)
Load in the package you need!
Get started!
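For the hbr_maples case mentioned earlier, a minimal sketch of changing year from a number to a category might look like the following. The maples_toy data frame, its y column, and all of its values are made up for illustration; only the column name year comes from the notes above. The conversion uses factor() from base R, which is one common way to do this and not necessarily the code Dr. T will suggest.

```r
# Hypothetical stand-in data (NOT the real hbr_maples data):
# two distinct years only, mirroring the situation described above.
maples_toy <- data.frame(
  year = c(2003, 2003, 2004, 2004),
  y    = c(10.2, 11.5, 9.8, 12.1)
)

# year starts out numeric
class(maples_toy$year)

# Convert year to a categorical variable (a factor)
maples_toy$year <- factor(maples_toy$year)

# year is now a factor with one level per year
class(maples_toy$year)
levels(maples_toy$year)
```

Once year is a factor, plotting and modeling functions treat it as a grouping variable instead of a number line.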
moderndive Package
We will be using the moderndive package to get our regression tables, so do not remove this package from your project!
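You will make two visualizations:
One where the lines are allowed to have different slopes, using geom_smooth(method = "lm")
One where the lines are forced to be parallel, using geom_parallel_slopes()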
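The two plots could be sketched roughly as follows. This example assumes the ggplot2 and moderndive packages are installed, and it uses R's built-in iris data as a stand-in for your project data; the variable choices (Petal.Length, Sepal.Length, Species) are placeholders, not a recommendation for your analysis.

```r
library(ggplot2)
library(moderndive)

# Plot 1: different slopes.
# With color mapped to the categorical variable, geom_smooth(method = "lm")
# fits a separate line for each group.
plot_different <- ggplot(
  iris,
  aes(x = Petal.Length, y = Sepal.Length, color = Species)
) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", se = FALSE)

# Plot 2: parallel slopes.
# geom_parallel_slopes() (from moderndive) gives every group the same slope
# but its own intercept.
plot_parallel <- ggplot(
  iris,
  aes(x = Petal.Length, y = Sepal.Length, color = Species)
) +
  geom_point(alpha = 0.5) +
  geom_parallel_slopes(se = FALSE)

print(plot_different)
print(plot_parallel)
```

Swap in your own data frame and variable names; the structure of the two plots stays the same.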
Next, you will decide which of these two models seems like the better model.
Look at the plot where the lines are allowed to be different! Does it look like they are?
If the lines look different – you should use the different slopes (interaction) model!
If the lines look similar – you should use the parallel slopes (additive) model!
No p-values
Your model decision needs to rely exclusively on the visualizations; you cannot use p-values to make your decision.
lm()
Are the slopes different? You need to fit a different slopes model! Use a * to separate the variables!
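As a rough illustration of the formula syntax, here are both models fit with lm(). The built-in iris data and the variable names are stand-ins for your own project data. The + version for parallel slopes is standard R formula syntax and is included here for contrast.

```r
# Different slopes (interaction) model: use * between the explanatory variables
model_different <- lm(Sepal.Length ~ Petal.Length * Species, data = iris)

# Parallel slopes (additive) model: use + between the explanatory variables
model_parallel <- lm(Sepal.Length ~ Petal.Length + Species, data = iris)

# The interaction model has extra coefficients:
# one slope adjustment for each non-baseline group.
coef(model_different)
coef(model_parallel)
```

Fit only the model you chose from your visualizations.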
get_regression_table()
Regardless of the model you fit, you need to get your estimated coefficients using the get_regression_table() function!
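A minimal sketch of this step, assuming the moderndive package is installed and again using the built-in iris data as a placeholder for your own data and model:

```r
library(moderndive)

# Stand-in model: replace the data and variables with your own
model_different <- lm(Sepal.Length ~ Petal.Length * Species, data = iris)

# get_regression_table() returns one row per coefficient,
# with the estimated value in the estimate column
coef_table <- get_regression_table(model_different)
coef_table
```

The estimate column holds the values you will interpret: the baseline intercept and slope, plus the adjustments for each other group.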
Now interpret!
Comments from Project Proposals