Week Four: Simple Linear Regression (Basic Regression)
Welcome!
In this week’s coursework we start exploring statistical methods, beginning with linear regression. We will start with the “basic” case, simple linear regression, which has a single quantitative explanatory variable and a quantitative response. We will review the concepts behind linear regression, learn how to visualize regression models, and practice obtaining estimated regression lines in R.
0.1 Learning Outcomes
By the end of this coursework you should be able to:
- describe the difference between explanatory and predictive modeling
- produce data visualizations for a simple linear regression
- describe how R decides on the “best” regression line
- fit a simple linear regression in R
- write out an estimated regression line
- interpret the slope of an estimated regression line
- interpret the intercept of an estimated regression line
- describe what an “offset” is in a simple linear regression
- interpret the offsets of an estimated regression line
- calculate a residual for an observation
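As a rough preview of several of these outcomes, here is a minimal sketch in R. It uses the built-in mtcars data (mpg as the response, wt as the explanatory variable), which is an illustrative choice and not a dataset from this course. It also assumes that “best” means least squares, which is the criterion lm() uses. The helper ssr() is a hypothetical name introduced only for this example.

```r
# Fit a simple linear regression on the built-in mtcars data
# (illustrative choice; not a dataset from this course).
fit <- lm(mpg ~ wt, data = mtcars)

# Estimated intercept and slope of the regression line
b <- coef(fit)
b

# Residual for the first observation: observed value minus fitted value
observed     <- mtcars$mpg[1]
fitted_value <- b[["(Intercept)"]] + b[["wt"]] * mtcars$wt[1]
resid_1      <- observed - fitted_value

# Agrees with the residual that R stores for the same observation
all.equal(resid_1, unname(residuals(fit)[1]))

# Hypothetical helper: sum of squared residuals for any candidate line
ssr <- function(intercept, slope) {
  sum((mtcars$mpg - (intercept + slope * mtcars$wt))^2)
}

# The fitted line has a smaller sum of squared residuals than a
# line with a slightly different slope
ssr(b[["(Intercept)"]], b[["wt"]]) < ssr(b[["(Intercept)"]], b[["wt"]] + 0.1)
```

The last comparison illustrates the least-squares idea: nudging the slope away from the estimate makes the sum of squared residuals larger.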
1 Prepare
1.1 Textbook Reading
Chapter 5 (https://moderndive.com/5-regression.html)
Don’t worry about Section 5.3.3; it goes further into the weeds than we need.
Reading Guide
Please use a different color for your answers to the reading guide, so it is easier to find your responses! 😊
1.2 Concept Quiz – Due Monday by the start of class
1. What does the correlation coefficient measure?
- strength of a relationship between two variables
- strength of a linear relationship between two variables
- strength of a relationship between two numerical variables
- strength of a linear relationship between two numerical variables
2. How do you interpret a slope coefficient?
- For any increase of \(x\), the slope is the expected increase in \(y\)
- For any increase of \(x\), the slope is the expected increase in the mean of \(y\)
- For a 1-unit increase in \(x\), the slope is the expected increase in \(y\)
- For a 1-unit increase in \(x\), the slope is the expected increase in the mean of \(y\)
3. How do you interpret an intercept coefficient?
- For a 1-unit increase in \(x\), the intercept is the expected increase in \(y\)
- For a 1-unit increase in \(x\), the intercept is the expected increase in the mean of \(y\)
- For an \(x\) value of 0, the intercept is the expected value of \(y\)
- For an \(x\) value of 0, the intercept is the expected value of the mean of \(y\)
4. Match the explanatory and response variables to the correct variables in the lm() function syntax.
lm(variable1 ~ variable2, data = <NAME OF DATASET>)
5. Which geom_ function adds a linear regression line to a scatterplot?
- geom_line()
- geom_smoother()
- geom_regression()
- geom_smooth()