species_lm <- lm(bill_length_mm ~ bill_depth_mm * species,
                 data = penguins)
For 1 numerical & 1 categorical explanatory variables
When we test the interaction term (bill_depth_mm:species), we are testing if the slopes are different!

term | Df | Sum Sq | Mean Sq | F value | Pr(>F) |
---|---|---|---|---|---|
bill_depth_mm | 1 | 561.5708 | 561.570850 | 93.96508 | 9.384795e-20 |
species | 2 | 7460.3201 | 3730.160057 | 624.15060 | 7.116342e-114 |
bill_depth_mm:species | 2 | 134.2515 | 67.125732 | 11.23184 | 1.897629e-05 |
Residuals | 336 | 2008.0631 | 5.976378 | NA | NA |
bill_depth_mm:species
: With a p-value of 0.0000189, I would conclude there is evidence that the slopes are different!
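The interaction F-test compares a parallel-slopes model to one with a separate slope per species. A minimal sketch of that comparison (a Python stand-in for the R workflow, on simulated made-up data, not the penguins data):

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated stand-in: three groups whose true slopes genuinely differ
n_per, true_slopes = 100, [0.5, 1.0, 1.5]
x = rng.normal(17, 2, 3 * n_per)
g = np.repeat([0, 1, 2], n_per)
y = 2 + np.array(true_slopes)[g] * x + rng.normal(0, 2, 3 * n_per)

def sse(X, y):
    """Residual sum of squares from a least-squares fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.sum((y - X @ beta) ** 2)

n = len(y)
ones, d1, d2 = np.ones(n), (g == 1).astype(float), (g == 2).astype(float)

# Reduced model: common slope (x + group); full model adds the x:group terms
X_reduced = np.column_stack([ones, x, d1, d2])
X_full = np.column_stack([ones, x, d1, d2, x * d1, x * d2])

# F = (drop in SSE / extra df) / (full-model MSE), here with 2 and n - 6 df
f_stat = ((sse(X_reduced, y) - sse(X_full, y)) / 2) / (sse(X_full, y) / (n - 6))
```

Because the simulated slopes really do differ, the resulting F statistic is large, just like the 11.23 in the ANOVA table above.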
For 2 numerical explanatory variables
term | Df | Sum Sq | Mean Sq | F value | Pr(>F) |
---|---|---|---|---|---|
body_mass_g | 1 | 3599.71136 | 3599.71136 | 186.673943 | 3.685224e-34 |
bill_depth_mm | 1 | 27.41605 | 27.41605 | 1.421742 | 2.339508e-01 |
Residuals | 339 | 6537.07813 | 19.28342 | NA | NA |
body_mass_g
: With a p-value of approximately 0, I would conclude there is a relationship between body mass and bill length (after accounting for bill depth)!
bill_depth_mm
: With a p-value of 0.234, I would conclude there is not a relationship between bill depth and bill length (after accounting for body mass)!
\[H_0: \beta_1 = 0\]
\[H_A: \beta_1 \neq 0\]
What do these hypotheses mean in words?
\(\widehat{score} = 3.88 + 0.066 \times \text{bty_avg}\)
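As a worked example of using this fitted line (the rating of 5 here is just an illustrative value):

```python
# score-hat = 3.88 + 0.066 * bty_avg, the fitted line above
def predict_score(bty_avg):
    """Predicted course score for a given average beauty rating."""
    return 3.88 + 0.066 * bty_avg

# A professor with an average beauty rating of 5 has a predicted
# score of 3.88 + 0.066 * 5 = 4.21
predicted = predict_score(5)
```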
Will we reject \(H_0\)?
by generating statistics that could have happened if \(H_0\) were true and comparing our observed statistic to these statistics to see how unusual it is
\(\beta_1 = 0\)
which means…
score | bty_avg | shuffled_score |
---|---|---|
4.7 | 3.000 | 2.8 |
3.6 | 4.833 | 4.6 |
4.3 | 4.333 | 4.5 |
4.8 | 3.500 | 2.8 |
3.7 | 3.000 | 3.7 |
3.6 | 3.167 | 3.9 |
4.3 | 3.167 | 4.1 |
3.5 | 4.833 | 3.6 |
3.6 | 4.333 | 3.5 |
4.6 | 7.333 | 4.6 |
4.4 | 2.500 | 4.4 |
4.5 | 4.333 | 4.3 |
4.2 | 2.833 | 3.4 |
4.8 | 6.500 | 3.3 |
4.3 | 4.000 | 4.8 |
4.8 | 3.667 | 4.1 |
3.1 | 1.667 | 4.7 |
4.6 | 6.500 | 4.0 |
4.8 | 5.000 | 4.5 |
4.1 | 6.167 | 4.8 |
4.5 | 4.167 | 3.5 |
3.6 | 3.000 | 4.7 |
4.8 | 6.167 | 4.1 |
4.8 | 3.667 | 4.1 |
3.8 | 3.167 | 4.4 |
Which results in a slope statistic of…
[1] 0.06663704
We want a sampling distribution of lots of statistics that could have happened if \(H_0\) were true!
If it is unlikely to happen, then we should reject \(H_0\)!
Why???
The probability of observing a statistic as or more extreme than 0.0666 if \(H_0\) were true is approximately 0%.
So, it seems like \(H_0\) is not a reasonable characterization of the relationship.
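The shuffle-and-refit procedure above can be sketched in a few lines. This is a Python stand-in for the R workflow, and the data are simulated with roughly the evals slope, not the real dataset:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated stand-in data: y has a real positive slope on x
n = 463
x = rng.normal(4.4, 1.5, n)
y = 3.88 + 0.066 * x + rng.normal(0, 0.5, n)

def slope(x, y):
    """Least-squares slope: b1 = r * (s_y / s_x)."""
    return np.corrcoef(x, y)[0, 1] * y.std(ddof=1) / x.std(ddof=1)

observed = slope(x, y)

# Null distribution: shuffling y breaks any x-y pairing (H0: beta_1 = 0)
null_slopes = np.array([slope(x, rng.permutation(y)) for _ in range(1000)])

# Two-sided p-value: how often is a shuffled slope as or more extreme?
p_value = np.mean(np.abs(null_slopes) >= np.abs(observed))
```

Since the simulated data have a genuine slope, almost no shuffled slope is as extreme as the observed one, so the p-value is tiny, mirroring the conclusion above.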
How can we know if our p-value is reliable?
For our p-value to be trustworthy, we need to know that the conditions of our model are not violated.
For linear regression we are assuming…
Linear relationship between \(x\) and \(y\)
Independent observations
Normality of residuals
Equal variance of residuals
In general, when the conditions associated with these methods are violated, we will underestimate the true standard error (spread) of the sampling distribution.
Linear relationship between \(x\) and \(y\)
What should we do?
Variable transformation!
Independence of observations
The evals dataset contains 463 observations on 94 professors, meaning most professors appear multiple times.
What should we do?
Best – use a random effects model
Reasonable – collapse the multiple scores into a single score
Equal variance of residuals
What should we do?
Variable transformation!
Normality of residuals
What should we do?
Variable transformation!
For the L, I, and E conditions…
we need to ask for help!
For the N condition…
we need to use simulation instead of mathematical formulas to get our p-value and confidence interval.
bty_avg
0.06663704
How did R calculate the \(t\)-statistic?
\(SE_{b_1} = \frac{\frac{s_y}{s_x} \cdot \sqrt{1 - r^2}}{\sqrt{n - 2}}\)
[1] 0.01495204
\(t = \frac{b_1}{SE_{b_1}} = \frac{0.067}{0.014952} = 4.4809947\)
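A quick numerical check (on simulated data) that this formula matches the usual residual-based standard error, \(\sqrt{MSE / \sum (x_i - \bar{x})^2}\):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 463
x = rng.normal(4.4, 1.5, n)
y = 3.88 + 0.066 * x + rng.normal(0, 0.5, n)

r = np.corrcoef(x, y)[0, 1]
s_x, s_y = x.std(ddof=1), y.std(ddof=1)

# SE of the slope from the formula above
se_formula = (s_y / s_x) * np.sqrt(1 - r**2) / np.sqrt(n - 2)

# Same quantity from the residual-based definition: sqrt(MSE / Sxx)
b1 = r * s_y / s_x
b0 = y.mean() - b1 * x.mean()
resid = y - (b0 + b1 * x)
mse = np.sum(resid**2) / (n - 2)
se_resid = np.sqrt(mse / np.sum((x - x.mean())**2))
```

The two expressions agree exactly: the residual sum of squares equals \((1 - r^2)(n - 1)s_y^2\), and \(\sum (x_i - \bar{x})^2 = (n - 1)s_x^2\), so the \((n - 1)\) factors cancel.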
# A tibble: 2 Ă— 7
term estimate std_error statistic p_value lower_ci upper_ci
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 intercept 3.88 0.076 51.0 0 3.73 4.03
2 bty_avg 0.067 0.016 4.09 0 0.035 0.099
Different \(t\)-distributions
A \(t\)-distribution has a slightly different shape depending on the sample size. Technically, we are using a \(t\)-distribution with \(n - 2\) degrees of freedom to get our p-value.
Why do you think that is?
A \(t\)-distribution can be a reasonable approximation for the permutation distribution if the normality condition is not violated.
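With \(n - 2 = 461\) degrees of freedom the \(t\) curve is nearly normal, so a stdlib-only sketch of the two-sided p-value for the bty_avg statistic can use the normal tail (the exact \(t\)-based value is slightly larger):

```python
import math

t_stat = 4.09   # t-statistic for bty_avg from the regression table
df = 463 - 2    # n - 2 degrees of freedom; large enough that t is nearly normal

# Two-sided tail area under a standard normal: 2 * P(Z > |t|) = erfc(|t| / sqrt(2))
p_approx = math.erfc(abs(t_stat) / math.sqrt(2))
```

The result is far below any usual significance level, which is why the table above rounds the p-value to 0.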
What is the relationship between life expectancy and GDP per capita?