Lab 7: Investigating Bergmann’s Rule

Author

The names of your group members here!

Data

Today we will explore the pie_crab dataset contained in the lterdatasampler R package. The data come from a study by Johnson et al. at the Plum Island Ecosystem Long Term Ecological Research site, which examined the relationship between the size (carapace width) of a Fiddler Crab and the geographical location of its habitat. These data can be used to investigate whether Bergmann’s Rule applies to Fiddler Crabs, specifically whether the size of a crab increases as the distance from the equator increases.

Our Investigation

The focus of this lab is on quantifying the relationship between the size of fiddler crabs (response) and the environment in which they live (explanatory).

Exploring the Data

1. How many crabs were included in this study?

Visualizing Relationships

2. Create a scatterplot modeling the relationship between latitude (explanatory) and the size of a fiddler crab (response). Don’t forget to add descriptive axis labels!

3. Describe the relationship you see in the scatterplot. Be sure to address the four aspects we discussed in class: form, direction, strength, and unusual points!

Testing for a Relationship

Frequently, the first step in a statistical analysis is to conduct a hypothesis test, testing if there is a relationship between the variables. Let’s go through the four steps needed to determine whether there is a relationship between these variables.

Step 1: Summarizing the Relationship

Specifically, we are interested in the slope statistic, as it captures the relationship between latitude and crab size.
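
If it helps to recall what a slope statistic measures, here is a minimal base R sketch. It uses a small made-up data frame (the name toy and its values are hypothetical, not the pie_crab data) and lm() rather than the infer pipeline used in this lab.

```r
# Hypothetical toy data (NOT the pie_crab data): five made-up points
toy <- data.frame(
  x = c(1, 2, 3, 4, 5),
  y = c(2.1, 3.9, 6.2, 7.8, 10.1)
)

# Fit a simple linear regression of y (response) on x (explanatory)
fit <- lm(y ~ x, data = toy)

# The slope is the coefficient on x: the estimated change in y
# for a one-unit increase in x
toy_slope <- coef(fit)[["x"]]
toy_slope
```

A positive slope means the response tends to increase as the explanatory variable increases; a negative slope means it tends to decrease.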

4. Fill in the code below to calculate the observed slope for the relationship between the size of a fiddler crab (response) and latitude (explanatory). Note: Nothing will be output when you run this code!

obs_slope <- pie_crab %>% 
  specify(response = ____, 
          explanatory = ____) %>% 
  calculate(stat = ____)

Step 2: Obtain the Permuted Statistics

After you have calculated your observed statistic, you need to create a permutation distribution of statistics that might have occurred if the null hypothesis were true.

5. Fill in the code below to generate 500 permuted statistics for the permutation distribution and save these statistics in an object named null_dist.

null_dist <- pie_crab %>% 
  specify(response = ____, 
          explanatory = ____) %>% 
  hypothesize(null = ____) %>% 
  generate(reps = ____, 
           type = "permute") %>% 
  calculate(stat = ____)
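
To picture what a permutation does behind the scenes, here is a rough base R sketch on made-up data. The names toy, slope_of, and one_permuted_slope are hypothetical helpers for illustration; this is not how infer is implemented, just the same idea done by hand.

```r
set.seed(7)  # arbitrary seed so the sketch is reproducible

# Hypothetical toy data (NOT the pie_crab data)
toy <- data.frame(
  x = c(1, 2, 3, 4, 5, 6, 7, 8),
  y = c(2.3, 2.9, 4.1, 4.8, 6.2, 6.8, 8.1, 9.0)
)

# Helper: slope of the least-squares line of y on x
slope_of <- function(x, y) coef(lm(y ~ x))[[2]]

# One permuted statistic: shuffle the response, leave the explanatory
# variable alone, and recompute the slope. Shuffling breaks any real
# link between x and y, which is what the null hypothesis assumes.
one_permuted_slope <- function(data) {
  slope_of(data$x, sample(data$y))
}

# Repeat the shuffle 500 times to build a permutation distribution
perm_slopes <- replicate(500, one_permuted_slope(toy))

length(perm_slopes)
mean(perm_slopes)  # should sit close to 0
```

Because the shuffling removes the relationship, the permuted slopes should be centered near zero.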

Step 3: Visualize the Null Distribution

We can visualize this null distribution by running the following code:

visualise(null_dist) +
  labs(x = "Slope Statistic for Relationship Between Crab Size & Latitude", 
       title = "Simulated Null Distribution with 500 Reps")

Step 4: Obtain the p-value

Now that you have calculated the observed statistic and generated a permutation distribution, you can calculate the p-value for your hypothesis test using the get_p_value() function.

6. Fill in the code below to calculate the p-value for the hypothesis test.

get_p_value(null_dist, 
            obs_stat = ____, 
            direction = ____)
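
For intuition, a p-value from a permutation distribution is just a proportion. The sketch below uses ten made-up permuted slopes and a made-up observed slope (all values are hypothetical). The two-sided version shown, doubling the smaller tail, is one common convention and is assumed here for illustration.

```r
# Hypothetical permuted slopes and observed slope (made-up numbers)
perm_slopes <- c(-0.30, -0.12, -0.05, 0.02, 0.08,
                 0.15, 0.21, 0.33, -0.22, 0.04)
obs <- 0.25

# One-sided, "greater": proportion of permuted slopes at least as
# large as the observed slope
p_greater <- mean(perm_slopes >= obs)

# One-sided, "less": proportion of permuted slopes at least as small
p_less <- mean(perm_slopes <= obs)

# Two-sided: double the smaller tail proportion, capped at 1
p_two_sided <- min(1, 2 * min(p_greater, p_less))

c(greater = p_greater, less = p_less, two_sided = p_two_sided)
```

Which direction is appropriate depends on your alternative hypothesis, so think about what Bergmann’s Rule claims before choosing.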

7. Based on your p-value and a significance level of \(\alpha = 0.1\), what decision would you reach regarding the null hypothesis?

8. Based on your decision, what would you conclude about the relationship between these variables? Be specific!

Quantifying the Relationship

Now that you’ve confirmed that there is a relationship between these variables, let’s summarize how large this relationship is!

Let’s go through the three steps needed to build a confidence interval for the slope.

Step 1: Approximate Other Statistics that Could Have Happened

First, we’ll use bootstrapping to obtain statistics that might have happened if we went out and collected a different sample of fiddler crabs.

9. Fill in the code to generate 500 bootstrap slope statistics (from 500 bootstrap resamples).

bootstrap_dist <- pie_crab %>% 
  specify(response = ____, 
          explanatory = ____) %>% 
  generate(reps = ____, 
           type = ____) %>% 
  calculate(stat = ____)
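
A bootstrap resample differs from a permutation: instead of shuffling one column, whole rows are drawn with replacement. Here is a rough base R sketch on made-up data (toy, slope_of_rows, and one_boot_slope are hypothetical names; this illustrates the idea, not infer’s internals).

```r
set.seed(11)  # arbitrary seed so the sketch is reproducible

# Hypothetical toy data (NOT the pie_crab data)
toy <- data.frame(
  x = 1:10,
  y = c(2.2, 3.1, 3.9, 5.2, 5.8, 7.1, 8.3, 8.8, 10.1, 11.0)
)

# Helper: slope of the least-squares line for a data frame with x and y
slope_of_rows <- function(data) coef(lm(y ~ x, data = data))[["x"]]

# One bootstrap statistic: draw as many rows as the original sample,
# WITH replacement (so some rows repeat and others are left out),
# then recompute the slope. Each (x, y) pair stays together.
one_boot_slope <- function(data) {
  rows <- sample(nrow(data), replace = TRUE)
  slope_of_rows(data[rows, ])
}

# Repeat 500 times to build a bootstrap distribution
boot_slopes <- replicate(500, one_boot_slope(toy))

slope_of_rows(toy)  # slope in the original toy sample
mean(boot_slopes)   # bootstrap slopes center near that value
```

Unlike the null distribution, the bootstrap distribution is centered near the observed slope rather than near zero.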

Step 2: Visualize the Bootstrap Distribution

Alright, now that we have the bootstrap slope statistics, let’s see how they look!

visualise(bootstrap_dist) +
  labs(x = "Slope Statistic for Relationship Between Crab Size & Latitude", 
       title = "Bootstrap Distribution with 500 Resamples")

Step 3: Obtain Confidence Interval

The next step is to obtain our confidence interval! First we need to determine what percentage of statistics we want to keep in the confidence interval. 90%? 95%? 99%? 80%?

This seems like a study where we care a bit less about our interval capturing the true value, at least compared to something like a medical study. So, I think this could be a great instance to use a 90% confidence interval.
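
To see what “keeping 90% of the statistics” means, here is a small base R sketch of the percentile idea. The vector boot_stats is made up (100 evenly spaced values, not real bootstrap slopes), and quantile() with its default settings is assumed as a stand-in for the percentile method.

```r
# Hypothetical "bootstrap statistics": 0.01, 0.02, ..., 1.00
boot_stats <- (1:100) / 100

level <- 0.90        # confidence level we chose
alpha <- 1 - level   # proportion of statistics to cut off in total

# Cut alpha / 2 from each tail: the 5th and 95th percentiles
ci <- quantile(boot_stats, probs = c(alpha / 2, 1 - alpha / 2))
ci

# Proportion of statistics that fall inside the interval
inside <- mean(boot_stats >= ci[[1]] & boot_stats <= ci[[2]])
inside
```

The endpoints are the values that leave 5% of the statistics in each tail, so the middle 90% sit inside the interval.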

10. Use the get_confidence_interval() function to find the 90% confidence interval from your bootstrap distribution, using the percentile method!

11. Interpret the confidence interval you obtained in #10. Make sure to include (1) your confidence level, (2) the statistic you are estimating, (3) the population of interest, and (4) your endpoints!

12. Based on this interval, does Bergmann’s Rule apply to Fiddler Crabs? Specifically, does the size of a crab increase as its distance from the equator increases?