STAT 313: Applied Experimental Design and Regression Models

Spring 2023

Instructor: Dr. Allison Theobold – call me Dr. Theobold or Dr. T! I use she / they pronouns.

1 Welcoming Classroom

I value diversity, inclusion and equity in this (and every) class. I hold the fundamental belief that everyone is fully capable of learning and doing statistics. There is more than one way to address a statistical problem, and our learning will be richer by being open to different ideas, rejecting stereotypes, and being aware of—in order to minimize—our biases. I look forward to getting to know you all as individuals and as a learning community.

2 Key information

Class meetings

Room: Baker Science (Building 180), Room 260

Times:

  • Section 01: 12:10 - 2:00pm
  • Section 02: 2:10 - 4:00pm

How to contact me

Email: atheobol@calpoly.edu1

Office: Building 25 (at the top of the Baker lawn) Office 105 (by the Statistics Department Office)

Student Hours

These are hours reserved for you! Student hours can different each week, depending on the assignments we have. Some days it may be you and I having a one-on-one conversation, some days it may be talking in a group about questions you have regarding an assignment, and some days it may be you working with others while I circulate around the room to answer questions that arise.

This quarter I will be holding student hours on Tuesdays and Thursdays at the following times:

Day Time
Tuesdays 10:10am – 11:00am (in-person)
Thursdays 10:10am – 11:00am (in-person)

Personal Meetings

If you would like to talk with me individually, I’ve reserved time on Wednesdays from 2:10pm to 3:00pm for individual appointments. You can make appointments through the following link: https://calendly.com/allisontheobold

I do request that you make appointments at least 1-hour ahead of time, so I don’t miss our meeting!

If you need to meet, but none of the student hours work for you please let me know! It is possible we can communicate asynchronously through Discord or email, but I am happy to schedule a meeting during another time if necessary.


3 Required Materials

For this course we will be using one main textbook, accompanied by additional resources. The textbooks we are using are free, but have the option to obtain a printed copy if you wish.

3.1 Textbooks

Çetinkaya-Rundel and Hardin, Introduction to Modern Statistics. https://openintro-ims.netlify.app/

Ismay & Kim, Modern Dive: Statistical Inference via Data Science. https://moderndive.com

3.2 Required Technology

R is the statistical software we will be using in this course (https://cran.r-project.org/)

RStudio is the most popular way to interact with the R software. We will be interacting with RStudio through Posit Cloud (Posit is the company that owns RStudio). You will join the Stat 313 workspace, and then be able to access the course homework and lab assignments. We will be walking through this in the first week of lab!

For questions of general interest, such as course clarifications or conceptual questions, please use the Class Discord Server. Refer to the Day One Class Setup materials for more information on how to effectively use this server.

4 What is this class?

Catalog Description: Applications of statistics for students not majoring in statistics or mathematics. Analysis of variance including one-way classification, randomized blocks, and factorial designs; multiple regression, model diagnostics, and model comparison. Prerequisite: Stat 217, Stat 218, Stat 221 or Stat 312

4.1 My course goals

Data Visualization & Summarization

  • create visualizations for one and two numerical variables

  • use facets and / or color to include additional variables into a visualization

  • calculate numerical summaries of variables

  • find summaries of variables across multiple groups

Working with Data & Reproducibility

  • select necessary columns from a dataset

  • filter rows from a dataset for numerical and categorical variables

  • modify existing numerical and categorical variables and / or create new variables

  • create professional-looking, reproducible analyses using Posit projects, Quarto documents, and the here package

Linear Models & Model Selection

  • identify which linear model is appropriate for a given research question

  • describe the conditions required to obtain reliable estimates from linear models

  • use visualizations, summary statistics, and critical thinking to evaluate if linear model conditions are violated

  • identify methods to remedy condition violations

  • fit additive and interactive linear models in R

  • interpret the coefficient estimates of a linear model

  • use visualizations and model selection techniques to determine if a specific variable should be included in a linear model

Study Design

  • distinguish between an experiment and an observational study

  • identify sources of variation and describe how to account for them

  • explain differences in sampling methods and contrast the inferences they permit

  • argue what population a given sample is representative of

Fundamentals of Statistical Inference

  • identify the parameter of interest for a given linear model and associated research question

  • outline the null (\(H_0\)) and alternative (\(H_A\)) hypotheses for a given research question

  • describe what a null distribution is and how it is used to obtain a p-value

  • interpret a p-value in the context of a research question

  • use a p-value to make a decision about a hypothesis test and reach a conclusion about a research question

  • distinguish between Type I and Type II errors

  • describe how sample size, significance level, and sampling variability effect Type I and Type II errors

  • outline the strengths and weaknesses of significance testing

  • describe what a bootstrap distribution is and how it is used to obtain a confidence interval

  • interpret a confidence interval in the context of the parameter of interest

  • describe the connection between confidence intervals and hypothesis testing

5 Grade Breakdown

A-level B-level C-level D-level
Reading Guides (12 total) at least 11 completed at least 10 completed at least 8 completed 6 or fewer completed
Concept Quizzes
(15 total)
at least 14 completed at least 12 completed at least 10 completed 9 or fewer completed
Tutorials
(14 total)
at least 13 completed at least 11 completed at least 10 completed 8 or fewer completed
Critiques & Lab Assignments
(10 total)
at least 9 Satisfactory2 at least 8 Satisfactory at least 7 Satisfactory 6 or fewer Satisfactory
Peer Evaluations
(2 total)
2 completed 2 completed at least 1 completed at least 1 completed
Midterm Project Excellent Satisfactory Progressing Not Completed
Final Project Excellent Satisfactory Progressing Not Completed

I will set + / - grades based on how close you are to the next higher (or lower) letter grade. For example, if you meet all the criteria for an A but only completed 8 satisfactory labs and critiques, you would earn an A-. Similarly, if you meet all the criteria for an A, but received a Satisfactory on a project – with more sections receiving an Excellent than Satisfactory – you would earn an A-.

5.1 What assessments will there be?

The main idea of assessment in this course is communication and feedback. There are several different types of assessments or assignments in this class, but they will all allow you to check your own understanding and progress towards the learning goals, get in-depth feedback from me, and let me know where to spend more time or approach something differently.

Each one is described briefly here, grouped into categories by course goal. See Section 5 for an explanation of how these contribute to your final grade. Our class schedule with topics is in Section 10.

5.2 Interpret and use statistical concepts

Readings and videos (every week)

I favor a “flipped classroom,” as it gives you more time to clarify and solidify statistical concepts through hands-on exercises. Each week, you will read the required chapter(s), completing a required reading guide to walk you through the central concepts for each week. You are required to submit your completed reading guides by Tuesday at noon.

Concept quizzes (every week)

Each week there will be a short (~10 questions) quiz over the reading and videos from the week. These quizzes are intended to ensure that you grasped the key concepts from the week’s readings. The quizzes are not timed, so you can feel free to check your answers with the textbook and / or videos if you so wish. The quizzes are marked on completion as complete or incomplete. You are required to complete the concept quiz by Tuesday at noon.

Tutorials (every week)

You can think of the tutorials as an “interactive” textbook, as they interweave statistical ideas alongside examples of how to work with data in R and hands-on exercises writing the R code necessary to complete a given task. Each exercise has hints available if you get stuck!

The tutorials are work at your own pace, so you can complete them all at once or slowly throughout the week. The lab assignments will require for you to put the skills you learned in the tutorials to work, so you are required to complete each week’s tutorial before Thursday’s lab.

The tutorials are marked on completion as complete or incomplete. You will submit a screenshot of the completion page at the end of the tutorial to confirm that you completed the tutorial for the week. You are required to complete the tutorial by Thursdays at noon.

5.3 Critically evaluate the use of Statistics

Statistical critiques (every 4-weeks)

These assignments are case studies in which you will evaluate a data visualization or statistical analysis, determining how well-performed and presented the analysis was and making recommendations for improving or using the analysis. Critiques are due roughly 1-week after they are assigned and should take 1-2 hours. You will receive feedback and a mark of Success or Growing (elaborated more on later), and you will be able to revise based on that feedback. There will be two total critiques.

5.4 Perform statistical analyses to answer research questions

Lab Assignments (every week)

Labs will be assigned on Thursday every week, providing the opportunity to explore the course concepts in the context of real data. Lab assignments will require for you to work through the tutorial for the week, thus the tutorials are due before the start of class on Thursday.

You will complete the lab assignments in the same teams you collaborate with in class. You will access the lab assignment through Posit Cloud, which you will be walked through during the first lab. Your group will be expected to submit your completed lab on Canvas. You will need to submit both the HTML and Quarto documents. The individual assigned as the Report Manager (described below) will submit the team’s completed Lab assignment by Sunday at midnight.

Success / Growing Grading

I expect that you will approach each lab assignment seriously, investing the necessary time and energy to prepare your responses. Different from what you may have experienced, lab assignments are graded for “proficiency” of specific learning targets, which describe what you should be able to do after taking this course. You’ll receive a score for each problem on an assignment according to the Success / Growing rubric below, as well as feedback to help you improve.

Score Justification
Successful The solution to the problem is correct, legible, and easy to follow, with all reasoning provided. Any error is trivial.
Growing The solution shows growth toward mastering the topic; however, it is incorrect and/or incomplete.

Completing Revisions

After the first submission of your lab or critique, you will have the option to retry any problems for which you scored a Growing (G). A written reflection on how your understanding of the problem changed will accompany any revision. If you don’t earn a Success by the second try, you can make an appointment with me to meet during my office hours (or another agreed-upon time) to create a reassessment strategy. You can schedule up to one meeting per week.

You can submit up to one revised Lab Assignment or Statistical Critique per week, until the Sunday before finals week – June 11.

Reflections on Your Learning

Revisions must include a reflection describing how you revised your thinking when completing your revision. It’s not enough to say “[x] was wrong, so I fixed it”—you have to talk about why you got [x] wrong in the first place and what you learned that changed your mind. What do you know now that you didn’t know before? Who or what helped you learn?

  • If your revision does not include reflections, I’ll ask you to add them.
  • See some examples of really good reflections here – they’re (mostly) from an introductory statistics class, but I think you’ll get the idea.

Submit your revision to the same assignment box on Canvas as your original. This helps me keep track of any outstanding revisions each person has.

5.5 Synthesize statistical ideas

Midterm & Final Projects

There will be two projects throughout the quarter, where you will be asked to synthesize the statistical concepts you have learned in a formal statistical report. Your critiques will help guide you toward how you do / don’t want your report to look. Each project will be done independently, and requires you to submit a project proposal and draft report before the final deadline. You are encouraged to use the feedback received on these assignments to improve your final report. The final reports will be graded as Excellent, Satisfactory, Progressing, or No Credit based on a rubric that will be shared with the initial assignment.


6 Classroom community and policies

Weekly expectations

The module for each week will be released on Friday by 5pm, so you can look over the content and see what the plan is for the week.

Getting help

Discord: We will use Discord to manage questions and responses regarding course content. There are channels for the different components of each week (e.g., Week 1 Lab Assignment). Please do not send an email about homework questions or questions about the course material. It is incredibly helpful for others in the course to see the questions you have and the responses to those questions. I will try to answer any questions posted to Discord within 3-4 hours (unless it is posted at midnight). If you think you can answer another student’s question, please respond!

Email: I do my best to reply to emails promptly and helpfully. However, I receive a lot of emails. To help both you and me, here are some specific expectations about emails:

  • Please tell me what course and section (by time or number) you are in!
  • If you email me between 9am and 4pm on Monday through Friday, I’ll try my best to reply to you on the same day.
  • If you email me outside of those hours, I will do my best to respond to you by the next working day. For my own mental health, I do not work on weekends. Thus, if you send an email after 4pm on Friday or during the weekend, you will not receive a response until Monday morning.
  • If your question is much easier to discuss face-to-face, I may ask you to meet with me in my office or on Discord (at a time that works for both of us) rather than answering directly in an email.
  • Include any relevant photos / screenshots / error messages / PDFs / links with your email.

7 Well-being, Access, and Accommodations

7.1 What if I have accommodations or feel that accommodations would be beneficial to my learning?

I enthusiastically support the mission of Disability Resource Center to make education accessible to all. I design all my courses with accessibility at the forefront of my thinking, but if you have any suggestions for ways I can make things more accessible, please let me know. Come talk to me if you need accommodation for your disabilities. I honor self-diagnosis: let’s talk to each other about how we can make the course as accessible as possible. See also the standard syllabus statements, which include more information about formal processes.

7.2 I’m having difficulty paying for food and rent, what can I do?

If you have difficulty affording groceries or accessing sufficient food to eat every day, or if you lack a safe and stable place to live, and you believe this may affect your performance in the course, I urge you to contact the Dean of Students for support. Furthermore, please notify me if you are comfortable in doing so. This will enable me to advocate for you and to connect you with other campus resources.

7.3 My mental health is impairing my ability to engage in my classes, what should I do?

National surveys of college students have consistently found that stress, sleep problems, anxiety, depression, interpersonal concerns, death of a significant other and alcohol use are among the top ten health impediments to academic performance. If you are experiencing any mental health issues, I and Cal Poly are here to help you. Cal Poly’s Counseling Services (805-756-2511) is a free and confidential resource for assistance, support and advocacy.

7.4 Someone is threatening me, what can I do?

I will listen and believe you if someone is threatening you. I will help you get the help you need. I commit to changing campus culture that responds poorly to dating violence and stalking.

7.5 What if I can’t arrange for childcare?

If you are responsible for childcare on short notice, you are welcome to bring children to class with you. If you are a lactating parent, you many take breaks to feed your infant or express milk as needed. If I can support yo in navigating parenting, coursework, and other obligations in any way, please let me know.

8 Attendance, Extensions, and Technology

8.1 What if I need to miss class?

I encourage you to attend every class session, but policies are for narcs. I put a great deal of time into making each class session engaging and worth your time. Attendance in this course is not explicitly required, but it degrades your team’s trust in you when they don’t see you in class.

Here’s what you should do if you do miss a class:

  • Talk to a classmate to figure out what information you missed
  • Check Canvas for any necessary handouts or changes to assignments
  • Email me with any questions you have after reviewing notes and handouts

If you miss a bunch of classes, please come talk to me. I’m working from the assumption that you care and are trying, but something is getting in your way (health issues? depression / anxiety? college stress?); let’s figure out what that is and how I can help.

8.2 What if I need to turn something in late?

Assignments are expected to be submitted on time. However, every student will be permitted to submit up to three individual assignments up to 1-week late, by completing the deadline extension form. When you complete the deadline extension form you will be required to state (1) what assignment you need an extension for, and (2) your proposed new deadline. Your new deadline must be within 1-week of the original deadline. All deadline extensions must be done through the form, so I can keep track of who has used their allotment of extensions. All late work submitted without a deadline extension will receive a penalty of 5% per day for up to four (4) days. This means if you did not submit a deadline extension request, you cannot turn in work more than four days past the deadline.

8.3 Do I need to bring a computer to class?

You are allowed to use technology in the classroom! In fact, we will often do so as part of in-class activities. However, our class is held in a computer laboratory, so bringing a laptop is not required. You are permitted to use the lab computers, but if you would like to take notes on your computer / surface you are welcome to bring it to class.

9 Exceptations, Respect, and Integrity

9.1 How can I expect to be treated in this course?

Following Ihab Hassan, I strive to teach statistics so that people will stop killing each other. In my classroom, diversity and individual differences are a sources of strength. One of the greatest failures of Statistics, historically and in the present, has been the exclusion of voices from the field. Everyone here can learn from each other, and doing so is vital to the structure of the course. Significant portions of this course involve group work and discussion in class. Some discussions will touch on sensitive topics. So that everyone feels comfortable participating in these activities, we must listen to each other and treat each other with respect. Any attitude or belief that espouses the superiority of one group of people over another is not welcome in my classroom. Such beliefs are directly destructive to the sense of community that we strive to create, and will sabotage our ability to learn from each other (and thus sabotage the entire structure of the course).

In summary: Be good to each other.

9.2 Working in teams

Your team will be rotating group roles each week, so that one person does not act as the “team manager” for more than one week. Instead the following roles will circulate each week, so that each member of the group is able to complete each role.

Facilitator

Manages team progress through the task

  • Leads discussion

  • Makes sure everyone understands the task

  • Checks in with group members

  • Keeps the group moving forward

Recorder / Reporter

Manages in-class report

  • Responsible for organizing and recording answers to the assignment during discussions

  • Compiles a summary of the solutions discussed

  • Sends summary to report editor

Report Editor

Manages out-of-class report

  • Asks professor team questions

  • Reviews draft summary provided by reporter

  • Solicits feedback from the team

  • Shares summary with the team

  • Submits final assignment

Team Captain

Manages team participation

  • Encourages participation

  • Enforces norms

  • Brings conversation back if it deviates

  • Substitutes for absent roles

Peer evaluations

There will be confidential peer evaluations completed every four weeks. I will use these to check-in on each group’s dynamics and ensure that everyone feels their voice is being heard.

9.3 What consititutes plagiarism in a statistics class?

Paraphrasing or quoting another’s work without citing the source is a form of academic misconduct. This included the R code produced by someone else! Writing code is like writing a paper, it is obvious if you copied-and-pasted a sentence from someone else into your paper because the way each person writes is different.

Even inadvertent or unintentional misuse or appropriation of another’s work (such as relying heavily on source material that is not expressly acknowledged) is considered plagiarism. If you are struggling with writing the R code for an assignment, please reach out to me. I would prefer that I get to help you rather than you spending hours Googling things and get nowhere!

Any incident of dishonesty, copying or plagiarism will be reported to the Office of Student Rights and Responsibilities. Cheating or plagiarism will earn you a grade of N on the problem or assignment and will remove your ability to submit revisions for that assignment.

If you have any questions about using and citing sources, you are expected to ask for clarification.

For more information about what constitutes cheating and plagiarism, please see https://academicprograms.calpoly.edu/content/academicpolicies/Cheating.

10 Course Organization

This class is organized into six units. The skills learned at the beginning of the course will carry throughout the remainder of the course. I hope that you are able to see the connections between the topics of the different units, since they are all part of one big family—the regression family!

10.1 Unit 1: Foundations of Statistics (Week 1)

This introductory unit has three big tasks, (1) review statistical and data oriented concepts you have (likely) seen before, (2) think critically about why statistics is used in science, and (3) think about how (historically) statistics has been used for inference.

Reading: Chapters 1 and 2 in Introduction to Modern Statistics (IMS), with supplementary articles

10.2 Unit 2: Exploratory Data Analysis (Weeks 2 & 3)

This unit focuses on building skills for working with and visualizing different types of data. First, we will focus on categorical data–creating summary tables and barcharts. Next, we will turn our attention to numerical data–calculating summary statistics, histograms, scatterplots, and linegraphs.

Reading: Chapters 4 & 5 in IMS and Chapter 2 in Modern Dive

10.3 Unit 3: Regression Modeling (Weeks 4 & 5)

In this unit we finally begin exploring statistical methods. You will put the tools you learned for wrangling and visualizing to work in the context of linear regression. We will start in a (likely) familiar context–linear regression. Once we’ve explored the concepts of “simple” / basic regression we will turn up the heat and add some additional explanatory variables with multiple linear regression.

Reading:

  • Chapter 5 in Modern Dive
  • Chapter 6 in Modern Dive

10.4 Unit 4: Model Selection & Inference for Regression (Weeks 6 & 7)

This unit focuses on how we decide what variables should be included in our regression models and what we can say about the final models we obtain. We will explore these ideas using concepts you have seen before: hypothesis tests and confidence intervals. We will visit the ideas of p-values and significance testing, with a emphasis on making (and justifying) sound scientific decisions with the intention of obtaining the best regression model we can.

10.5 Unit 5: Condition Violations (Week 8)

There are occasions where the conditions required for linear regression are violated. Rather than throwing up our hands and saying “Oh, well!”, we can use variable transformations to lessen condition violations. This unit will explore the use of log transformations to remedy non-linear relationships and non-constant variance.

10.6 Unit 6: ANOVA a Categorical Case of Regression (Weeks 9 & 10)

To wrap up the quarter, we will look at a special case of linear regression–ANOVA. In this special case, our regression will include only categorical variables as explanatory variables. We will first review how we compare the means of two groups and then connect with what we learned about categorical variables in multiple linear to conceptualize how we can compare the means of three or more groups.

This unit will explore Chapter 22 in IMS

Footnotes

  1. See Section 6.0.2 for information on what you can expect when you email me.↩︎

  2. A “Satisfactory” lab or critique occurs when every problem has been marked Satisfactory.↩︎