Introduction to Version Control

Common Mishaps

  • Q1: Who collected these data? When were the data collected?

  • Q4: Column titles of 2008 and 2018 are not descriptive!

    • Creating column names that describe the values stored in those columns!
    • The names_prefix = argument to pivot_wider() can help you make better column names!
    • DVS-6: I can create tables which make my summaries clear to the reader
  • Q4: Unless you specify .groups = "drop" within summarize(), your table is still grouped!

    • group_by() + summarize() only drops the last grouping variable.
    • If you have two variables inside group_by(), then the data will still be grouped by the first variable!
  • Q7: The data description contains important information!

    • mc_toddler – Aggregated weekly, full-time median price charged for Center-based Care for toddlers.
    • mhi_2018 – Median household income expressed in 2018 dollars.
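
The two Q4 points above can be sketched with a small made-up table. The tibble `toy_prices` and its values are invented for illustration and are not the lab data; only `pivot_wider()`, `names_prefix`, `group_by()`, `summarize()`, and `.groups` come from the feedback itself.

```r
library(dplyr)
library(tidyr)

# Toy data invented for illustration (not the lab's childcare data).
toy_prices <- tibble(
  region     = c("North", "North", "South", "South"),
  study_year = c(2008, 2018, 2008, 2018),
  price      = c(100, 150, 120, 180)
)

# Without .groups = "drop", summarize() removes only the last grouping
# variable, so the result is still grouped by region.
still_grouped <- toy_prices |>
  group_by(region, study_year) |>
  summarize(median_price = median(price))

group_vars(still_grouped)

# With .groups = "drop", the result is an ungrouped table.
ungrouped <- toy_prices |>
  group_by(region, study_year) |>
  summarize(median_price = median(price), .groups = "drop")

# names_prefix turns the bare year values into descriptive column names.
wide_prices <- ungrouped |>
  pivot_wider(
    names_from   = study_year,
    values_from  = median_price,
    names_prefix = "median_price_"
  )

names(wide_prices)
```

Checking `group_vars()` after a summary is a quick way to confirm whether a table is still grouped before you pass it along.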

Recreating the Plot

DVS-2: I use plot modifications to make my visualizations clearer to the reader

  • Facets ordered based on developmental stage not alphabetically
  • Ordering colors in the legend so they appear in the same order as the lines in the plot.
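
One possible way to make the legend order match the lines is `fct_reorder2()` from the forcats package. The feedback above does not name a function, so treat this as an assumed approach rather than the required one. The toy vectors below are invented for illustration.

```r
library(forcats)

# Toy data invented for illustration: two regions measured in two years.
region     <- factor(c("A", "A", "B", "B"))
study_year <- c(2008, 2018, 2008, 2018)
price      <- c(100, 120, 150, 200)

# fct_reorder2() orders the levels by the y value at the largest x value,
# in descending order, so the region whose line ends highest comes first.
region_ordered <- fct_reorder2(region, .x = study_year, .y = price)

levels(region_ordered)
```

Mapping a factor reordered this way to `color` inside `aes()` would list the regions in the legend in the same top-to-bottom order as the line endings.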

DVS-3: I show creativity in my visualizations

  • Exploring different color themes
    • Personally, I like the "Accent" palette from the RColorBrewer package, but you might like others!
    • Getting 10 colors is hard! I would recommend looking into the colorRampPalette() function to get more colors.

  • Exploring different plot themes
    • Personally, I like theme_bw(), but you might like others!
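
A minimal sketch of the `colorRampPalette()` suggestion above, assuming the RColorBrewer package is installed. The object names `accent_8` and `accent_10` are made up for this example.

```r
library(RColorBrewer)

# The "Accent" palette provides at most 8 colors.
accent_8 <- brewer.pal(n = 8, name = "Accent")

# colorRampPalette() returns a function that interpolates between the
# supplied colors; calling that function with 10 yields 10 hex color codes.
accent_10 <- colorRampPalette(accent_8)(10)

length(accent_10)
```

A vector like `accent_10` could then be supplied to `scale_color_manual(values = accent_10)` in a ggplot. Interpolated colors can be harder to tell apart than the original eight, so check the result by eye.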

Other “Big Picture” Code Feedback

I strongly recommend against nesting functions, because nested calls make it difficult for people to understand what your code is doing. Splitting the work across two lines is no less efficient and is more readable.

mutate(age_group = fct_relevel(fct_recode(age_group,
                                          "Infant" = "mc_infant",
                                          "Toddler" = "mc_toddler",
                                          "Preschool" = "mc_preschool"),
                                "Infant",
                                "Toddler",
                                "Preschool"))
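
For comparison, here is one way the nested call above might be split into two steps. The tibble `toy_childcare` is invented so the sketch runs on its own; in the lab, the `mutate()` would sit inside your existing pipeline.

```r
library(dplyr)
library(forcats)

# Toy data invented for illustration.
toy_childcare <- tibble(
  age_group = c("mc_toddler", "mc_preschool", "mc_infant")
)

toy_childcare <- toy_childcare |>
  mutate(
    # Step 1: give the levels readable names.
    age_group = fct_recode(age_group,
                           "Infant"    = "mc_infant",
                           "Toddler"   = "mc_toddler",
                           "Preschool" = "mc_preschool"),
    # Step 2: put the levels in developmental order.
    age_group = fct_relevel(age_group, "Infant", "Toddler", "Preschool")
  )

levels(toy_childcare$age_group)
```

Each line now does one job, so a reader can see the renaming and the reordering separately.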

Saving Objects That Aren’t Worth Saving

We should only save objects that we need to use later!

lowest_child_care_price_2018 <- ca_childcare |>
  filter(study_year == 2018) |>
  group_by(region) |>   
  summarise(median_infant_price = median(mc_infant)) |> 
  slice_min(order_by = median_infant_price)

lowest_child_care_price_2018
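
If the result is only needed once, the same pipeline can end without an assignment and print directly. The sketch below uses `toy_childcare`, a made-up stand-in for `ca_childcare`, so that it runs on its own.

```r
library(dplyr)

# Toy stand-in for ca_childcare, invented for illustration.
toy_childcare <- tibble(
  study_year = c(2018, 2018, 2018, 2008),
  region     = c("North", "South", "South", "North"),
  mc_infant  = c(300, 250, 270, 200)
)

# No assignment: the pipeline's result prints directly.
toy_childcare |>
  filter(study_year == 2018) |>
  group_by(region) |>
  summarize(median_infant_price = median(mc_infant)) |>
  slice_min(order_by = median_infant_price)
```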

Version Control

A process of tracking changes to a file or set of files over time so that you can recall specific versions later.

Git vs GitHub

knitr::include_graphics("https://bornsql.ca/wp-content/uploads/2022/03/Git-Logo-2Color.png") 

The image is the official logo for Git, a version control system. It consists of a diamond-shaped orange symbol with white lines inside that resemble a branching structure, symbolizing version control and branching in Git. Next to the symbol is the word 'Git' written in bold, black lowercase letters. The logo visually represents Git's core function of managing changes in code through branching and merging.

  • A system for version control that manages a collection of files in a structured way.
  • Uses the command line or a GUI.
  • Git is local.

GitHub's logo, a black circle, with the outline of a cat in white. The cat seems to have a snake-like tail.

  • A cloud-based service that lets you use git across many computers.
  • Basic services are free; advanced services are paid (like RStudio!).
  • GitHub is remote.

Why Learn GitHub?

  1. GitHub provides a structured way for tracking changes to files over the course of a project.
    • Think Google Docs or Dropbox history, but more structured and powerful!
  2. GitHub makes it easy to have multiple people working on the same files at the same time.
  3. You can host fun things (like the class text, these slides, the course website, etc.) at a URL with GitHub Pages.

Git Repositories

Git is based on repositories.

  • Think of a repository (repo) as a directory (folder) for a single project.
    • This directory will likely contain code, documentation, data, to-do lists, etc. associated with the project.
    • You can link a local repo with a remote copy.

A red file folder, with the git logo on it (one large branch with one smaller branch stemming off of it).

Actions in Git

Cloning a Repo


Create an exact copy of a remote repo on your local machine.

A diagram of the process of cloning a repository. At the top of the picture, there is a cloud (representing the internet), with a pink box labeled 'remote' symbolizing the remote GitHub repository. There is a down arrow connecting the cloud to a laptop, mimicking the process of cloning a remote repository onto a local computer. The laptop has a green box labeled 'local' symbolizing the local copy of the remote GitHub repository.

Committing Changes

Tell git you have made changes you want to add to the repo.

  • Also provide a commit message – a short label describing what the changes are and why they exist.

The red line is a change we commit (add) to the repo.

A diagram of the process of committing changes that were made to a document. On the left is a document with four lines of text. The third line is colored red, to symbolize where a change was made, while the other lines are colored black. There is a right arrow connecting the document to a laptop, with the phrase 'git commit' printed above the arrow. The arrow terminates at a green box labeled 'local' on the laptop, symbolizing committing changes made to the document to the local repository.

The log of these changes is called your commit history.

  • You can always go back to old copies!
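
A commit and its history can also be sketched from R code. This example assumes the gert package, which these slides do not name, and everything in it (the temporary folder, the file `notes.txt`, the author identity) is made up for illustration. In class you would more likely use RStudio's Git pane.

```r
library(gert)

# Create a throwaway repository in a temporary folder.
repo <- tempfile("toy-repo-")
dir.create(repo)
git_init(repo)

# Make a change: write a new file inside the repo.
writeLines("Hello, git!", file.path(repo, "notes.txt"))

# A made-up identity so the sketch does not depend on your git settings.
toy_author <- git_signature("Toy Author", "toy@example.com")

# Stage the file, then commit it with a short, informative message.
git_add("notes.txt", repo = repo)
git_commit(
  message   = "Add notes file",
  author    = toy_author,
  committer = toy_author,
  repo      = repo
)

# The commit history is the log of these changes.
history <- git_log(repo = repo)
history$message
```

Each later commit would add another row to `history`, which is what makes it possible to go back to an older copy.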

Commit Tips

  • Use short, but informative commit messages.
  • Commit small blocks of changes – commit every time you accomplish a small task (e.g., one problem in the lab).
    • You’ll have a set of bite-sized changes (with description) to serve as a record of what you’ve done.
    • With frequent commits, it's easier to find the issue if / when you mess up!

Pushing Changes


Update the copy of your repo on GitHub so it has the most recent changes you’ve made on your machine.

A diagram of the process of pushing local changes to the remote repository. There is a laptop with a green box labeled 'local' symbolizing the local copy of the GitHub repository. Above the laptop is a cloud with a pink box labeled 'remote' symbolizing the remote GitHub repository (that lives on the internet). There is an arrow pointing from the laptop to the cloud with the phrase 'git push' next to the arrow, symbolizing the action of pushing the local changes (that have been committed) up to the remote repository.

Pulling Changes


Update the local copy of your repo (the copy on your computer) with the version on GitHub.

A diagram of the process of pulling from the remote repository to update the local repository. There is a laptop with a green box labeled 'local' symbolizing the local copy of the GitHub repository. Above the laptop is a cloud with a pink box labeled 'remote' symbolizing the remote GitHub repository (that lives on the internet). There is an arrow pointing from the cloud to the laptop with the phrase 'git pull' next to the arrow, symbolizing the action of pulling the changes that exist on the remote repository (possibly from a different computer) to update the local repository.

Workflow

When you have an existing local repo:

  1. Pull the repo to make sure you have the most up-to-date version (especially if you are working on different computers).
  2. Make some changes locally.
  3. Commit the changes to git.
  4. Push your changes to GitHub.
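
The four steps above could be wrapped in two small helper functions. This is only a sketch: it assumes the gert package, which these slides do not name, and the helpers `start_session()` and `finish_session()` are hypothetical names invented here. Defining them runs nothing; they only act when called from inside a cloned repo that is connected to GitHub.

```r
# Hypothetical helpers wrapping the workflow steps with gert (an assumption).

# Step 1: pull so the local copy matches GitHub.
start_session <- function(repo = ".") {
  gert::git_pull(repo = repo)
}

# Step 2 happens in your editor: make some changes locally.

# Steps 3 and 4: stage the changed files, commit them, and push to GitHub.
finish_session <- function(files, message, repo = ".") {
  gert::git_add(files, repo = repo)
  gert::git_commit(message, repo = repo)
  gert::git_push(repo = repo)
}
```

A call might look like `finish_session("lab-1.qmd", "Add my name to Lab 1")`. RStudio's Git pane performs the same pull, commit, and push actions with buttons.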

Connect GitHub to RStudio

Previous Steps

You were asked to complete the following steps before coming to class today:

  1. Create a GitHub account
  2. Introduce yourself to git (in RStudio)
  3. Generate a Personal Access Token (PAT)
  4. Store your PAT in RStudio

Verifying Your Connection

Open RStudio and run the following code in your console (lower left pane):


usethis::git_sitrep()

You should see something like:

── GitHub user 
• Default GitHub host: 'https://github.com'
• Personal access token for 'https://github.com': '<discovered>'
• GitHub user: 'atheobold'
• Token scopes: 'admin:org, admin:public_key, delete:packages, delete_repo, gist, notifications, repo, user, workflow, write:packages'
• Email(s): 'atheobol@calpoly.edu (primary)', 'theobold.allison970@gmail.com', '12439090+atheobold@users.noreply.github.com'
ℹ No active usethis project

If that is not the case, Dr. Theobold will help you troubleshoot in 5 minutes!

Accessing Lab 1

Here are step-by-step directions: Copying the Lab Assignment with GitHub Classroom in 11 Steps


Step 1: Open the Lab 1 assignment on GitHub Classroom

Step 2: Open your Lab 1 repository

Step 3: Clone the repository to your computer

Once You’ve Cloned the Repo

Step 4: Open the lab-1.qmd file

Step 5: Change your name

Step 6: Commit your change (with a nice message!)

Step 7: Push your change

Lab 1 & Challenge 1 Instructions

I would highly recommend having these pulled up alongside RStudio while you work!

To do…

  • Lab 1: Introduction to Quarto
    • Due Sunday (9/29) at 11:59pm
  • Challenge 1: Modifying Your Quarto Document
    • Due Sunday (9/29) at 11:59pm
  • Complete the Week 2 Coursework
    • Check-ins 2.1, 2.2, 2.3 due Tuesday (10/1) by the start of class