Midterm Project Feedback & Machine Learning

STAT 313

Deadline Reminders

  • Lab 4 revisions are due tonight
  • Statistical Critique 1 revisions are due tonight
  • The final version of your Midterm Project is due Sunday at 11:59pm

Deadline Extensions

You cannot request deadline extensions for the final version of your Midterm Project. The assignment portal closes at 11:59pm on Sunday. Do not ride the line.

Midterm Project Review

Help your peers!

  • Are the arguments / sentences easy to understand? Does the report flow?

  • Is the same information included in multiple places?

  • Do the plots have nice axis labels?

  • Can you easily find the regression equations? Do the equations make sense?

  • Do the interpretations / conclusions from the equations make sense?

Peer Feedback

Study Limitations

Based on how the study was designed, what population can you generalize these results to?

Justify which population you believe your analysis can be generalized to.

  • The sample of [possums / professors / crabs]?
  • Some larger population of [possums / professors / crabs]?

Your justification needs to connect with how the researchers collected their data!

Based on how the study was designed, what can you say about the relationships between the variables?


Stating that the study was “observational” doesn’t tell me that you understand what would be required to use cause-and-effect language!

  • What specifically would the researchers have needed to do in order to use causal language?

Machine Learning

“the use and development of computer systems that are able to learn and adapt without following explicit instructions, by using algorithms and statistical models to analyze and draw inferences from patterns in data”


Data is Power

Does your phone recognize your face?

Joy Buolamwini found that she had to put on a white mask for the facial detection program to “see” her face.

How does Google label your images?

Should the cash bail system change?

Will your car be able to drive you?

Lab 6

Forward Selection (by Hand)

  1. Start with the most basic model (one mean)

  2. Decide which one variable to add (based on adjusted \(R^2\))

  3. Decide if you should add another variable

\(\vdots\)

  1. Stop adding variables when adjusted \(R^2\) stops increasing
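The steps above can be sketched in base R. This is a minimal illustration, not the lab's required solution: it assumes the built-in mtcars data with mpg as the response (the lab uses evals_train with score), and the helper names adj_r2() and forward_select() are hypothetical. Recall that adjusted \(R^2 = 1 - (1 - R^2)\frac{n - 1}{n - p - 1}\), where \(p\) is the number of predictors, so it can decrease when an added variable explains too little.

```r
# Adjusted R^2 for a model with the given predictors.
# An empty predictor set gives the one-mean (intercept-only) model.
adj_r2 <- function(predictors, response, data) {
  rhs <- if (length(predictors) == 0) "1" else paste(predictors, collapse = " + ")
  fit <- lm(as.formula(paste(response, "~", rhs)), data = data)
  summary(fit)$adj.r.squared
}

# Forward selection: add one variable at a time, stopping when
# adjusted R^2 no longer increases.
forward_select <- function(response, candidates, data) {
  selected <- character(0)
  best <- adj_r2(selected, response, data)  # Step 1: one-mean model

  repeat {
    remaining <- setdiff(candidates, selected)
    if (length(remaining) == 0) break

    # Step 2: adjusted R^2 for each model that adds one more variable
    scores <- sapply(remaining, function(v) adj_r2(c(selected, v), response, data))

    # Stop when no candidate improves adjusted R^2
    if (max(scores) <= best) break

    selected <- c(selected, names(which.max(scores)))
    best <- max(scores)
  }

  list(selected = selected, adj_r_squared = best)
}

# Illustration with mtcars (an assumption; swap in evals_train and "score")
result <- forward_select("mpg", setdiff(names(mtcars), "mpg"), mtcars)
result$selected       # variables in the order they were added
result$adj_r_squared  # adjusted R^2 of the final model
```

Each pass through the loop corresponds to one "decide if you should add another variable" step, so printing result$selected shows the order in which you would have added variables by hand.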

A More Automated Option

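library(tidyverse)   # %>%, map(), map_df(), select(), pivot_longer(), slice_max()
library(moderndive)  # get_regression_summaries()

# Replace each <...> placeholder with the variable(s) you have already selected.
evals_train %>% 
  # Fit one candidate model per column of evals_train (.x is that column)
  map(.f = ~lm(score ~ .x + <VARIABLES SELECTED>, data = evals_train)) %>% 
  # Pull the adjusted R^2 from each fit (one column per candidate variable)
  map_df(.f = ~get_regression_summaries(.x)$adj_r_squared) %>% 
  # Drop columns that are not candidates: the ID, the response, and
  # variables already in the model
  select(-ID, 
         -score,
         -<VARIABLE 1 SELECTED>,
         -<VARIABLE 2 SELECTED>
         ) %>% 
  # Reshape to one row per candidate variable
  pivot_longer(cols = everything(), 
               names_to = "variable", 
               values_to = "adj_r_sq") %>% 
  # Keep the candidate with the largest adjusted R^2
  slice_max(adj_r_sq)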