Week 7 – Inference for Regression

This week we are finally talking about p-values and confidence intervals! The coursework has two parts. First, we will read about confidence intervals for multiple linear regression, building on the sampling variability chapter we read last week. Second, we will read about p-values for multiple linear regression and how they connect to confidence intervals!

1 Textbook Reading – Part 1 (Due Monday by the beginning of class)

Required Reading: Confidence Intervals for the Slope

Reading Guide

Download the Word Document

2 Textbook Reading – Part 2 (Due Wednesday by the beginning of class)

Required Reading: Hypothesis Test for Slope & Inference Conditions

Reading Guide

Download the Word Document

3 Concept Quizzes

Note there are two concept quizzes this week!

3.1 Confidence Intervals

Due Monday by the beginning of class

1. Match each item to its respective analogy:

point estimate

confidence interval

fishing with a net

fishing with a spear

2. To create a 95% confidence interval using the percentile method, what percentiles of the bootstrap distribution do you need to calculate?

0th
2.5th
5th
90th
95th
97.5th

3. To create a 95% confidence interval using the standard error method, what standard error do you use?

the sample standard deviation
the bootstrap distribution standard deviation
a resample standard deviation
1.96

4. We almost never know if our confidence interval captured the true population parameter.

True
False

5. What percentage of 99% confidence intervals do you expect to capture the true population parameter?

6. The word “confident” in a confidence interval interpretation corresponds to what aspect of the interval?

the accuracy of the original sample
the reliability of the procedure for constructing confidence intervals
the precision of the bootstrap samples

7. Which of the following are true? (select all that apply)

Smaller sample sizes tend to produce narrower confidence intervals.
Smaller sample sizes tend to produce wider confidence intervals.
Lower confidence levels tend to produce wider confidence intervals.
Lower confidence levels tend to produce narrower confidence intervals.

8. In a regression table, what does the “std_error” value associated with the slope represent?

the standard deviation of the sample
the standard deviation of the bootstrap distribution
the estimated standard deviation of the sampling distribution
the standard error of the sample

9. In a regression table, how is the “std_error” value calculated?

a mathematical formula
the standard deviation of the sample
the standard deviation of the bootstrap distribution

10. What percentage confidence interval is output in a regression table?

3.2 Hypothesis Tests

Due Wednesday by the beginning of class

1. Match each procedure to the question it addresses.

confidence intervals

hypothesis tests

What are plausible values for the population parameter?
What are plausible values for the sample statistic?
Is the population parameter different from 0?
Is the value of the parameter different from a specified quantity?

2. Which of the following are always true for hypothesis statements?

They are about sample statistics.
They are about relationships between variables.
They are about population parameters.
They are about differences in groups.

3. To simulate what could have happened if the [null hypothesis was true / alternative hypothesis was true], we [separate the (x, y) pairs / keep the (x, y) pairs together] and [resample with replacement / randomly reassign a new y to each x].

4. Which of the following are true about a null distribution? (select all that apply)

It is a distribution of statistics.
The values on the distribution represent what might have happened if the null hypothesis was true.
The values on the distribution represent what might have happened if the alternative hypothesis was true.
It is a distribution of sample observations.

5. Name one similarity between a permutation distribution and a bootstrap distribution.

6. Name one difference between a permutation distribution and a bootstrap distribution.

7. For linear regression, the null distribution is always centered at ____.

8. Which of the following are true about a p-value? (select all that apply)

It is calculated assuming the null hypothesis is true.
It is the probability the null hypothesis is true.
It quantifies how “surprising” our data are.
It compares the observed statistic to a distribution of values that could have happened if the null was true.
It is calculated assuming the alternative hypothesis is true.
It is a probability.

9. Which of the following is true about a small p-value?

The sample statistic is unlikely to have happened by chance.
The sample size was large.
The sample statistic is unlikely to have happened if the null hypothesis was true.
The sample statistic was large.

10. If you obtain a large p-value, what can you conclude about your hypotheses?

We cannot say the alternative hypothesis is false.
We cannot say the null hypothesis is false.
The null hypothesis is true.
The alternative hypothesis is true.

11. If the probability of a Type I error goes down, what can you say about the probability of a Type II error?

The probability of a Type II error goes down.
The probability of a Type II error stays the same.
The probability of a Type II error goes up.

12. If you obtained a small p-value (e.g., 0.02), what would you expect of a 95% confidence interval constructed from the same data?

It would contain the null hypothesized value.
It would not contain the null hypothesized value.
It would contain the sample statistic.
It would contain the true population parameter.

13. In a regression table, what is the “statistic” value associated with the slope?

a bootstrap statistic
a z-statistic
the sample slope statistic
a t-statistic

14. In a regression table, how is the p-value calculated?

Using a permutation distribution with 1000 resamples
Using a bootstrap distribution with 1000 samples
Using a Normal distribution
Using a t-distribution

4 R Tutorial (Due Wednesday by the beginning of class)

Required Tutorial: Randomization Test for the Slope
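The tutorial itself is in R. As a language-neutral preview of the idea, here is a minimal Python sketch of a randomization (permutation) test for a slope. The data are made up for illustration, and the function names `slope` and `permutation_test_slope` are hypothetical, not part of the tutorial or any course package.

```python
import random
import statistics


def slope(x, y):
    """Least-squares slope of y regressed on x."""
    x_bar = statistics.fmean(x)
    y_bar = statistics.fmean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


def permutation_test_slope(x, y, reps=1000, seed=0):
    """Shuffle y across the x values to simulate 'no relationship'."""
    rng = random.Random(seed)
    observed = slope(x, y)
    y_shuffled = list(y)
    null_slopes = []
    for _ in range(reps):
        # Breaking the (x, y) pairing mimics a world where the slope is 0.
        rng.shuffle(y_shuffled)
        null_slopes.append(slope(x, y_shuffled))
    # Two-sided p-value: share of shuffled slopes at least as extreme
    # as the slope observed in the original data.
    extreme = sum(abs(s) >= abs(observed) for s in null_slopes)
    return observed, null_slopes, extreme / reps


# Hypothetical data, invented purely for this sketch.
x = list(range(1, 13))
y = [2.1, 2.9, 3.2, 4.8, 4.1, 5.9, 6.2, 6.8, 8.1, 7.7, 9.3, 9.9]

observed, null_slopes, p_value = permutation_test_slope(x, y)
print("observed slope:", round(observed, 3))
print("mean of null slopes:", round(statistics.fmean(null_slopes), 3))
print("p-value:", p_value)
```

Changing `reps` or `seed` changes the simulated null distribution slightly, so the p-value from a randomization test will vary a little from run to run.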

Required Tutorial: Parameters and Confidence Intervals
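The tutorial itself is in R. As a language-neutral preview, here is a minimal Python sketch of a bootstrap confidence interval for a slope, built two ways: from percentiles of the bootstrap distribution, and from the observed slope plus or minus a multiple of the bootstrap standard error. The data are made up, the function names are hypothetical, and the 1.96 multiplier assumes the bootstrap distribution is roughly bell-shaped.

```python
import random
import statistics


def slope(x, y):
    """Least-squares slope of y regressed on x."""
    x_bar = statistics.fmean(x)
    y_bar = statistics.fmean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


def bootstrap_slopes(x, y, reps=1000, seed=0):
    """Resample (x, y) pairs with replacement and refit the slope."""
    rng = random.Random(seed)
    n = len(x)
    slopes = []
    while len(slopes) < reps:
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        if len(set(xs)) < 2:
            continue  # slope is undefined when every resampled x is equal
        ys = [y[i] for i in idx]  # pairs stay together
        slopes.append(slope(xs, ys))
    return slopes


# Hypothetical data, invented purely for this sketch.
x = list(range(1, 13))
y = [2.1, 2.9, 3.2, 4.8, 4.1, 5.9, 6.2, 6.8, 8.1, 7.7, 9.3, 9.9]

observed = slope(x, y)
boot = bootstrap_slopes(x, y)

# Percentile method: n=40 yields cut points every 2.5%, so the first
# and last cut points bound the middle 95% of the bootstrap slopes.
cuts = statistics.quantiles(boot, n=40)
pct_low, pct_high = cuts[0], cuts[-1]

# Standard error method: observed slope +/- 1.96 bootstrap SEs.
boot_se = statistics.stdev(boot)
se_low = observed - 1.96 * boot_se
se_high = observed + 1.96 * boot_se

print("percentile interval:", (round(pct_low, 3), round(pct_high, 3)))
print("standard error interval:", (round(se_low, 3), round(se_high, 3)))
```

When the bootstrap distribution is close to symmetric, the two intervals tend to be similar; noticeable disagreement between them is a hint that the distribution is skewed.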