STAT 313: Midterm Project Proposal

Due February 5, 2023 by 5pm

This week you will get started on your midterm project by selecting which dataset you wish to analyze and writing an introduction about the dataset you chose.

1 Step 1: Picking a Dataset

1.1 Option 1 – Use your own data

If you have a dataset from another course or your own research which you believe can be appropriately modeled with a linear regression, you can propose to use these data.

Deliverable

For the Midterm Project Proposal assignment on Canvas, you are required to state at the beginning of the document that you have chosen to use your own dataset. You are also required to submit your dataset as an Excel file (saved as either .csv or .xlsx).
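If your data already live in R, one way to produce a file you can upload is sketched below. The data frame `my_data`, its column names, and the file name are all hypothetical placeholders; substitute your own.

```r
# Hypothetical stand-in for your own dataset; replace with your real data frame
my_data <- data.frame(
  plant_id  = 1:4,
  height_cm = c(12.1, 15.4, 9.8, 14.0),
  treatment = c("control", "fertilizer", "control", "fertilizer")
)

# Write the data to a .csv file (a temporary folder here; use your own path)
csv_path <- file.path(tempdir(), "my_data.csv")
write.csv(my_data, csv_path, row.names = FALSE)

# Read the file back in to confirm it saved correctly
check <- read.csv(csv_path)
str(check)
```

Setting `row.names = FALSE` keeps R from adding an extra column of row numbers to the file.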

1.2 Option 2 – Use a public dataset

I’ve compiled a list of datasets from a variety of contexts, all of which are relatively tidy and ready for analysis. Each of these datasets is included in an R package, so there is no need for you to download the dataset! All you will need to do is load the necessary package (e.g., library(lterdatasampler)) at the beginning of your analysis.

From the lterdatasampler package:

  • and_vertebrates: Size data for Cutthroat trout and salamanders in different sections of forest (from Lab 3).
  • ntl_icecover: Data on duration of ice cover of lakes in the Madison, WI area (from Lab 4).
  • hbr_maples: Data on the growth of Sugar Maple (Acer saccharum) seedlings in response to calcium addition.
  • pie_crab: Data on Fiddler crab body size in salt marshes from Florida to Massachusetts.

From the openintro package:

From the moderndive package:

From the gapminder package:

  • gapminder: Data on life expectancy, GDP per capita, and population by country (discussed in ModernDive textbook).
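Loading one of these datasets might look like the sketch below, which uses pie_crab from the lterdatasampler package as an example. It assumes the package is already installed; the commented-out install line is only needed the first time.

```r
# install.packages("lterdatasampler")  # run once if the package is not installed
library(lterdatasampler)

# The dataset is available by name as soon as the package is loaded
dim(pie_crab)     # number of rows and columns
names(pie_crab)   # variable names
head(pie_crab)    # first six rows
```

The same pattern should work for the other datasets: load the package, then refer to the dataset by its name.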

Deliverable

For the Midterm Project Proposal assignment on Canvas, you are required to state at the beginning of the document the name of the dataset you have chosen to use.

2 Step 2: Writing an Introduction

2.1 If you chose to use your own data

Step 1: Describe the context of your dataset in your own words! How were the data collected? Was there a study these data came from?

I don’t know anything about your data, so I expect this description to be extensive. If the data you are using are from a class, you are responsible for obtaining the context of the data; you cannot simply say “These were data used in BIO 253.”

Step 2: Choose your variables

We will be using a simple linear regression to analyze the data you chose. Thus, there are some stipulations for the variables you can choose. You must choose

  • one numeric variable for the response variable
  • one numeric variable for the explanatory variable

Once you have decided on these two variables, you then need to choose one additional explanatory variable. This additional explanatory variable can be either numerical or categorical.
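A quick way to confirm that your chosen variables meet these requirements is sketched below. The data frame `my_data` and its columns are hypothetical stand-ins for your own data.

```r
# Hypothetical stand-in for your own data; replace with your real data frame
my_data <- data.frame(
  height_cm   = c(12.1, 15.4, 9.8, 14.0),
  sunlight_hr = c(4, 6, 3, 5),
  treatment   = c("control", "fertilizer", "control", "fertilizer")
)

# The response and the first explanatory variable must both be numeric
is.numeric(my_data$height_cm)    # response
is.numeric(my_data$sunlight_hr)  # explanatory

# The additional explanatory variable may be numeric or categorical
class(my_data$treatment)
```

If `is.numeric()` returns FALSE for a variable you expected to be numeric, the column may have been read in as text and is worth inspecting before you commit to it.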

Write-up

Describe each variable you chose for your analysis. How was the variable measured? What units was the variable measured in? What types of values does the variable take on (this is especially important if you chose a categorical variable)?

2.2 If you chose a dataset from the list above

Step 1: Describe the context of your dataset in your own words! How were the data collected? Was there a study these data came from? Were these data included in any publications?

Getting information about your dataset

To obtain information on the dataset, click on the link provided in its name!
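Each dataset's documentation is also available from within R through its package's help pages. As a sketch, assuming the lterdatasampler package is installed and you chose pie_crab:

```r
library(lterdatasampler)

# Open the help page for a dataset (context, variables, and source)
help("pie_crab", package = "lterdatasampler")

# Shorthand that does the same thing once the package is loaded
?pie_crab
```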

Step 2: Choose your variables

We will be using a simple linear regression to analyze the data you chose. Thus, there are some stipulations for the variables you can choose. You must choose

  • one numeric variable for the response variable
  • one numeric variable for the explanatory variable

Once you have decided on these two variables, you then need to choose one additional explanatory variable. This additional explanatory variable can be either numerical or categorical.

Write-up

Describe each variable you chose for your analysis. How was the variable measured? What units was the variable measured in? What types of values does the variable take on (this is especially important if you chose a categorical variable)?
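To see what values a variable takes on, a few summary commands can help. The sketch below uses the gapminder dataset as an example, and the choice of lifeExp, gdpPercap, and continent is only an illustration, not a required selection.

```r
# install.packages("gapminder")  # run once if the package is not installed
library(gapminder)

# Numeric variables: range and typical values
summary(gapminder$lifeExp)
summary(gapminder$gdpPercap)

# Categorical variable: the values it takes on and how often each occurs
levels(gapminder$continent)
table(gapminder$continent)
```

These summaries do not replace the dataset's documentation, which is where measurement methods and units are described, but they can help you describe each variable's values accurately.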

3 Step 3: Submitting on Canvas

For the Midterm Project Proposal assignment on Canvas, your proposal is required to consist of both components (dataset and description).

You are allowed to use any text editing software to make your proposal (e.g., Word, Pages, Google Docs), but your submission must be a PDF. If you are unsure how to save your file as a PDF, I recommend using Google!

Caution

Remember, if you chose to use your own dataset, you are required to submit your data as an Excel file (.csv or .xlsx).