Week 7 – Inference for Regression
This week we finally turn to p-values and confidence intervals! The coursework has two parts. First, we will read about confidence intervals for multiple linear regression, building on the sampling variability chapter we read last week. Second, we will read about p-values for multiple linear regression and how they connect to confidence intervals.
1 Textbook Reading – Part 1 (Due Monday by the beginning of class)
Required Reading: Confidence Intervals for the Slope
2 Textbook Reading – Part 2 (Due Wednesday by the beginning of class)
Required Reading: Hypothesis Test for Slope & Inference Conditions
3 Concept Quizzes
3.1 Confidence Intervals
1. Match each item to its respective analogy:
point estimate
confidence interval
fishing with a net
fishing with a spear
2. To create a 95% confidence interval using the percentile method, what percentiles of the bootstrap distribution do you need to calculate?
- 0th
- 2.5th
- 5th
- 90th
- 95th
- 97.5th
3. To create a 95% confidence interval using the standard error method, what standard error do you use?
- the sample standard deviation
- the bootstrap distribution standard deviation
- a resample standard deviation
- 1.96
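If you would like to see the two bootstrap interval methods side by side, here is a minimal sketch in Python. It assumes a simple linear regression with one explanatory variable, and the data and the helper names (`slope`, `bootstrap_slopes`, `percentile`) are made up for illustration. It is not the course's software.

```python
import random
import statistics

def slope(xs, ys):
    """Least-squares slope of y on x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

def bootstrap_slopes(xs, ys, reps=1000, seed=1):
    """Resample (x, y) pairs with replacement and refit the slope each time."""
    rng = random.Random(seed)
    pairs = list(zip(xs, ys))
    slopes = []
    while len(slopes) < reps:
        sample = rng.choices(pairs, k=len(pairs))
        bx = [p[0] for p in sample]
        by = [p[1] for p in sample]
        if len(set(bx)) > 1:  # skip resamples where every x is identical
            slopes.append(slope(bx, by))
    return slopes

def percentile(sorted_values, q):
    """Percentile (q in 0..100) by linear interpolation between sorted values."""
    pos = (len(sorted_values) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (pos - lo)

# Made-up data, for illustration only
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [2.1, 2.9, 4.2, 4.8, 6.1, 6.7, 8.3, 8.8, 10.2, 10.9]

level = 0.95
alpha = 1 - level
observed = slope(xs, ys)
boot = sorted(bootstrap_slopes(xs, ys))

# Percentile method: cut off alpha/2 of the bootstrap distribution in each tail
pct_lower = percentile(boot, 100 * alpha / 2)
pct_upper = percentile(boot, 100 * (1 - alpha / 2))

# Standard error method: observed statistic +/- multiplier * bootstrap SD
boot_se = statistics.stdev(boot)
multiplier = statistics.NormalDist().inv_cdf(1 - alpha / 2)
se_lower = observed - multiplier * boot_se
se_upper = observed + multiplier * boot_se

print(f"percentile method:     ({pct_lower:.3f}, {pct_upper:.3f})")
print(f"standard error method: ({se_lower:.3f}, {se_upper:.3f})")
```

Try changing `level` to see how the width of each interval responds.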
4. We almost never know if our confidence interval captured the true population parameter.
- True
- False
5. What percentage of 99% confidence intervals do you expect to capture the true population parameter?
6. The word “confident” in a confidence interval interpretation corresponds to what aspect of the interval?
- the accuracy of the original sample
- the reliability of the procedure for constructing confidence intervals
- the precision of the bootstrap samples
7. Which of the following are true?
- Smaller sample sizes tend to produce narrower confidence intervals.
- Smaller sample sizes tend to produce wider confidence intervals.
- Lower confidence levels tend to produce wider confidence intervals.
- Lower confidence levels tend to produce narrower confidence intervals.
8. In a regression table, what does the “std_error” value associated with the slope represent?
- the standard deviation of the sample
- the standard deviation of the bootstrap distribution
- the estimated standard deviation of the sampling distribution
- the standard error of the sample
9. In a regression table, how is the “std_error” value calculated?
- a mathematical formula
- the standard deviation of the sample
- the standard deviation of the bootstrap distribution
10. What percentage confidence interval is output in a regression table?
- 99%
- 95%
- 90%
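For a look at where a formula-based standard error can come from, here is a rough sketch of the usual closed-form calculation for the slope of a simple linear regression (one explanatory variable). The function name `slope_and_std_error` and the data are made up for illustration, and whether a particular regression table uses exactly this calculation is an assumption.

```python
import math

def slope_and_std_error(xs, ys):
    """Return the least-squares slope and its formula-based standard error."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    # Residual sum of squares around the fitted line
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    # Estimated residual standard deviation, using n - 2 degrees of freedom
    sigma_hat = math.sqrt(sse / (n - 2))
    # Estimated standard deviation of the slope's sampling distribution
    std_error = sigma_hat / math.sqrt(sxx)
    return b1, std_error

# Made-up data, for illustration only
b1, std_error = slope_and_std_error([1, 2, 3, 4], [1, 3, 2, 4])
print(f"slope = {b1:.3f}, std_error = {std_error:.3f}")
```

Notice that no resampling happens here: the estimate comes from a single pass over the original sample.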
3.2 Hypothesis Tests
1. Match each procedure to the question it addresses.
confidence intervals
hypothesis tests
- What are plausible values for the population parameter?
- What are plausible values for the sample statistic?
- Is the population parameter different from 0?
- Is the value of the parameter different from a specified quantity?
2. Which of the following are always true for hypothesis statements?
- They are about sample statistics.
- They are about relationships between variables.
- They are about population parameters.
- They are about differences in groups.
3. To simulate what could have happened if the [null hypothesis was true / alternative hypothesis was true], we [separate the (x, y) pairs / keep the (x, y) pairs together] and [resample with replacement / randomly reassign a new y to each x].
4. Which of the following are true about a null distribution? (select all that apply)
- It is a distribution of statistics.
- The values on the distribution represent what might have happened if the null hypothesis was true.
- The values on the distribution represent what might have happened if the alternative hypothesis was true.
- It is a distribution of sample observations.
5. Name one similarity between a permutation distribution and a bootstrap distribution.
6. Name one difference between a permutation distribution and a bootstrap distribution.
7. For linear regression, the null distribution is always centered at ____.
8. Which of the following are true about a p-value? (select all that apply)
- It is calculated assuming the null hypothesis is true.
- It is the probability the null hypothesis is true.
- It quantifies how “surprising” our data are.
- It compares the observed statistic to a distribution of values that could have happened if the null was true.
- It is calculated assuming the alternative hypothesis is true.
- It is a probability.
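To make the null distribution and p-value ideas concrete, here is a small sketch of a permutation test for a slope. It assumes a simple linear regression and a two-sided alternative, and the data and helper names (`slope`, `permutation_slopes`, `two_sided_p_value`) are made up for illustration.

```python
import random
import statistics

def slope(xs, ys):
    """Least-squares slope of y on x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

def permutation_slopes(xs, ys, reps=1000, seed=1):
    """Shuffle the y values across the x values and refit the slope each time."""
    rng = random.Random(seed)
    shuffled = list(ys)
    slopes = []
    for _ in range(reps):
        rng.shuffle(shuffled)  # breaks any link between x and y
        slopes.append(slope(xs, shuffled))
    return slopes

def two_sided_p_value(null_stats, observed):
    """Proportion of null statistics at least as far from 0 as the observed one."""
    extreme = sum(1 for s in null_stats if abs(s) >= abs(observed))
    return extreme / len(null_stats)

# Made-up data, for illustration only
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [2.0, 4.5, 3.1, 5.9, 4.2, 7.8, 6.0, 8.1, 9.5, 7.4]

observed = slope(xs, ys)
null_slopes = permutation_slopes(xs, ys)
p_value = two_sided_p_value(null_slopes, observed)

print(f"observed slope: {observed:.3f}")
print(f"mean of null distribution: {statistics.mean(null_slopes):.3f}")
print(f"p-value: {p_value:.3f}")
```

Every value in `null_slopes` is a statistic computed under the assumption that x and y are unrelated, and the p-value only measures how unusual the observed slope looks against that backdrop.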
9. Which of the following is true about a small p-value?
- The sample statistic is unlikely to have happened by chance.
- The sample size was large.
- The sample statistic is unlikely to have happened if the null hypothesis was true.
- The sample statistic was large.
10. If you obtain a large p-value, what can you conclude about your hypotheses?
- We cannot say the alternative hypothesis is false.
- We cannot say the null hypothesis is false.
- The null hypothesis is true.
- The alternative hypothesis is true.
11. If the probability of a Type I error goes down, what can you say about the probability of a Type II error?
- The probability of a Type II error goes down.
- The probability of a Type II error stays the same.
- The probability of a Type II error goes up.
12. If you obtained a small p-value (e.g., 0.02), what could you say about what you would expect if you constructed a 95% confidence interval?
- It would contain the null hypothesized value.
- It would not contain the null hypothesized value.
- It would contain the sample statistic.
- It would contain the true population parameter.
13. In a regression table, what is the “statistic” value associated with the slope?
- a bootstrap statistic
- a z-statistic
- the sample slope statistic
- a t-statistic
14. In a regression table, how is the p-value calculated?
- Using a permutation distribution with 1000 resamples
- Using a bootstrap distribution with 1000 samples
- Using a Normal distribution
- Using a t-distribution
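For the formula-based route, here is a hedged sketch of one common convention for turning a slope estimate and its standard error into a test statistic and a two-sided p-value. It assumes a simple linear regression with n - 2 degrees of freedom; the input numbers and the function names (`t_density`, `two_sided_t_p_value`) are made up for illustration, and the software behind any particular regression table may do this differently.

```python
import math

def t_density(t, df):
    """Density of a t-distribution with df degrees of freedom (small df only)."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def two_sided_t_p_value(t_stat, df, steps=2000):
    """Area in both tails beyond |t_stat|, found with Simpson's rule (steps must be even)."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_density(0, df) + t_density(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    area_zero_to_t = total * h / 3
    # The density is symmetric, so the middle area is twice the 0-to-|t| area
    return max(0.0, 1 - 2 * area_zero_to_t)

# Hypothetical values, for illustration only
estimate = 0.8
std_error = math.sqrt(0.18)
n = 4

t_stat = estimate / std_error  # null hypothesized slope of 0
p_value = two_sided_t_p_value(t_stat, df=n - 2)
print(f"statistic = {t_stat:.3f}, p-value = {p_value:.3f}")
```

Compare this with a simulation-based p-value: here the null distribution is a mathematical curve rather than a collection of shuffled-data slopes.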