---
title: "Lab 9: Searching for Efficiency & Making Great Tables"
author: "Your name here!"
format: html
editor: source
embed-resources: true
html-table-processing: none
---

```{r}
#| label: packages

# Load the packages you need for this lab here (e.g., tidyverse, knitr, kableExtra, gt)
```

## Formatting Tables

In this lab, we will also practice making nice, report-worthy tables!

I recommend you think of tables no differently than the visualizations
you've been making. We want **all** aspects of our tables to be clear to the
reader, so the comparisons we want them to make are straightforward. You should
be thinking about:

- Column headers
- Grouping headers
- Order of columns
- Order of rows
- Number of decimals included for numeric entries
- etc.

There are many packages for generating tables, but I recommend either the
`kable()` function from the **knitr** package or the `gt()` function from the
**gt** package, along with their add-ons.

**For simple tables**

-   the `kable()` function from the **knitr** package
-   the `gt()` function from the **gt** package

**For more sophisticated tables**

-   styling functions from the **kableExtra** package (e.g., `kable_styling()`,
`kable_classic()`)
-   add-on functions from the **gt** package (e.g., `cols_label()`, 
`tab_header()`, `fmt_percent()`)
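
To make the comparison concrete, here is a minimal sketch of both approaches on the built-in `iris` data (not lab data). The column labels and title are my own illustrative choices, and the block is shown unevaluated so it does not affect your rendered lab.

```r
library(knitr)
library(gt)

# Small built-in data set used only for illustration
iris_small <- head(iris[, c("Species", "Sepal.Length", "Sepal.Width")], 5)

# A simple kable() table: readable column names, rounding, and a caption
simple_kable <- kable(
  iris_small,
  col.names = c("Species", "Sepal Length (cm)", "Sepal Width (cm)"),
  digits = 1,
  caption = "First five flowers in iris"
)

# The same data as a gt table, using add-on functions from the gt package
simple_gt <- iris_small |>
  gt() |>
  cols_label(
    Sepal.Length = "Sepal Length (cm)",
    Sepal.Width = "Sepal Width (cm)"
  ) |>
  fmt_number(columns = c(Sepal.Length, Sepal.Width), decimals = 1) |>
  tab_header(title = "First five flowers in iris")
```

Printing `simple_kable` or `simple_gt` in a chunk displays the table. Notice that both approaches set display names when the table is built, leaving the data frame's column names alone.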

## The Data

For this week's lab, we will be **revisiting questions from previous lab assignments**, 
with the purpose of using functions from the `map()` family to iterate over certain tasks. To do this, we will need to load in the data from Lab 2, Lab 3, and Lab 7.

**Question 1: Edit the code below to read in the appropriate datasets that you should have saved from the previous labs!**

```{r}
#| label: read-in-old-data

# Data from Lab 2
surveys <- read_csv(here::here("Week 2", "Lab 2", "surveys.csv"))

# Data from Lab 3
evals <- read_csv(here::here("Week 3", "Lab3", "teacher_evals.csv")) |> 
  rename(sex = gender)

# Data from Lab 7
fish <- read_csv(here::here("Week 7", "Lab 7", "BlackfootFish.csv"))
```

## Lab 2

First up, we're going to revisit Question 2 from Lab 2. This question asked:

> What are the data types of the variables in this dataset?

**Question 2: Using `map_chr()`, produce a table of the data type of each variable in the `surveys` dataset. Specifically, the table should have two columns, `Variable` and `Data Type`, with one row for each variable, and be displayed using `kable()`.**

*Hint: You will want to check out the `enframe()` function to help with this task.*
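
As a rough illustration of how `map_chr()` and `enframe()` fit together, here is a sketch on the built-in `iris` data rather than `surveys`. The output column names `variable` and `data_type` are placeholders I chose for the example.

```r
library(purrr)
library(tibble)

# map_chr() applies class() to every column and returns a named character vector
iris_types <- map_chr(iris, class)

# enframe() turns that named vector into a two-column tibble:
# the names go in one column and the values in the other
iris_type_table <- enframe(iris_types, name = "variable", value = "data_type")
```

One caveat: `map_chr()` expects exactly one string per column, so a column whose `class()` has more than one element (a date-time, for example) would need different handling.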

```{r}
#| label: map-data-types-of-surveys


```


**Question 3: Format the table nicely! Your table must use either `kable()` and functions in the kableExtra package or `gt()` and functions from the gt package to produce 
a table with the following qualities:**

- rows are ordered to make the information easy to understand
- include a caption or header
- use bolded column names

*Note that you should assign the column names when creating the table, not by renaming columns in the dataset itself because we hate variable names with spaces in them!*
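
Here is one possible sketch of those three qualities using **kableExtra**, on a small made-up tibble (the values are hypothetical, not lab data). I set `format = "html"` explicitly, on the assumption that the document renders to HTML as the YAML header says.

```r
library(dplyr)
library(knitr)
library(kableExtra)

# Toy summary table (hypothetical values, for illustration only)
type_counts <- tibble::tibble(
  data_type = c("numeric", "character", "factor"),
  n_columns = c(4, 2, 1)
)

bold_table <- type_counts |>
  # Order the rows so the largest group comes first
  arrange(desc(n_columns)) |>
  # Display names are assigned here, not in the data itself
  kable(
    format = "html",
    col.names = c("Data Type", "Number of Columns"),
    caption = "Number of columns of each data type"
  ) |>
  kable_styling(full_width = FALSE) |>
  # Row 0 is the header row, so this bolds the column names
  row_spec(0, bold = TRUE)
```

With **gt**, a comparable effect can be had with `tab_style()`, `cell_text(weight = "bold")`, and `cells_column_labels()`.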

```{r}
#| label: formatted-var-tab


```

## Lab 3

Now, we're on to Lab 3, where we will revisit two questions. 

In the original version of Lab 3, Question 4 asked you to:

> Change data types in whichever way you see fit (e.g., is the instructor ID
> really a numeric data type?)

**Question 4: Using `map_at()` or `map_if()`, convert the `course_id`, `weekday`, `academic_degree`, `time_of_day`, and `sex` columns to factors. In other words, convert all character variables into factors. DO NOT PRINT OUT YOUR NEW DATA FRAME, just show the code.**

*Hint: You will need to use `bind_cols()` to transform the list output back into a data frame.* 
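
To show the general shape of this pattern, here is a sketch on a tiny made-up tibble (the column names and values are hypothetical stand-ins, not the `evals` data).

```r
library(purrr)
library(dplyr)

# Toy data (hypothetical values, for illustration only)
toy <- tibble::tibble(
  id = c(1, 2, 3),
  weekday = c("Mon", "Tue", "Mon"),
  degree = c("PhD", "MS", "PhD")
)

# map_if() transforms only the columns where the predicate is TRUE.
# It returns a list, so bind_cols() reassembles a tibble.
toy_factors_if <- toy |>
  map_if(is.character, as.factor) |>
  bind_cols()

# map_at() does the same for columns selected by name
toy_factors_at <- toy |>
  map_at(c("weekday", "degree"), as.factor) |>
  bind_cols()
```

Both versions leave `id` untouched. `map_if()` is handy when the rule is about type, while `map_at()` is more explicit about which columns change.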

```{r}
#| label: map-to-mutate-columns


```

Next up, we're going to revisit Question 7, which asked:

> What are the demographics of the instructors in this study? Investigate the
> variables `academic_degree`, `seniority`, and `sex` and summarize your
> findings in ~3 complete sentences.

Many people created **multiple** tables of counts for each of these
demographics, but in this exercise we are going to create **one** table with 
every demographic.

**Question 5: Recreate the (mainly unformatted) table shown in the instructions using [one]{.underline} pipeline. It is okay if the rows are not in the same order in your table, but the rest of the table should match mine. This means you will need to rename the variables and their levels.**
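
The target table only appears in the instructions, so the following is just a generic sketch of one way to count several categorical variables in a single pipeline. It uses a made-up tibble with hypothetical values, and stacking the variables with `pivot_longer()` is my assumption about a workable approach, not necessarily the intended one.

```r
library(dplyr)
library(tidyr)

# Toy data (hypothetical values, for illustration only)
toy_people <- tibble::tibble(
  degree = c("PhD", "MS", "PhD", "PhD"),
  sex = c("female", "male", "female", "male")
)

demo_counts <- toy_people |>
  # Stack every column into (variable, level) pairs
  pivot_longer(cols = everything(), names_to = "variable", values_to = "level") |>
  # Count each level within each variable
  count(variable, level, name = "count") |>
  # Convert counts to proportions within each variable
  group_by(variable) |>
  mutate(percent = count / sum(count)) |>
  ungroup()
```

Note that stacking columns this way only works when they share a type, which is why both toy columns are character vectors.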

```{r}
#| label: unformatted-summary-table

```

**Question 6: Now turn your output into a very nice table, like one of the examples shown in the instructions. You can style your table using either functions from the kableExtra package or from the gt package.** 

**Your table does not need to be an exact copy of the example but it should include
[all]{.underline} of the following:**

- Some way of clearly indicating the three variable types as row groups
- Giving nice column names
- Using a column header that spans the `Count` and `%` columns 
- Nicely formatting the % column to include % signs and only 1-2 digits
- Giving the table a title or a caption
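
As a rough guide to which **gt** functions map onto the checklist above, here is a sketch on a small made-up data frame (hypothetical values; the spanner label and title are placeholders of mine).

```r
library(gt)

# Toy summary (hypothetical values, for illustration only)
demo_summary <- data.frame(
  variable = c("Degree", "Degree", "Sex", "Sex"),
  level = c("MS", "PhD", "Female", "Male"),
  count = c(1, 3, 2, 2),
  percent = c(0.25, 0.75, 0.50, 0.50)
)

fancy_gt <- demo_summary |>
  # groupname_col creates row groups; rowname_col moves the levels into the stub
  gt(groupname_col = "variable", rowname_col = "level") |>
  # Nice column names
  cols_label(count = "Count", percent = "%") |>
  # One header spanning both numeric columns
  tab_spanner(label = "Instructors", columns = c(count, percent)) |>
  # Proportions shown as percentages with one decimal place
  fmt_percent(columns = percent, decimals = 1) |>
  # Table title
  tab_header(title = "Demographics of toy instructors")
```

`fmt_percent()` expects proportions (0.25, not 25) by default, so keep the underlying column on the 0 to 1 scale.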

```{r}
#| label: fancy-table

```

## Lab 7

For our last problem, we will revisit a question from the most recent lab. 
Question 1 asked you to use `across()` to make a table that summarized: 

> What variable(s) have missing values present?  
> How many observations have missing values?

**Question 7: Using `map_int()`, produce a nicely formatted table of the number of missing values for each variable in the `fish` data.** 

*Hint: You will want to check out the `enframe()` function to help with this task.*
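
For a sense of how `map_int()` can count missing values, here is a sketch on the built-in `airquality` data instead of `fish`. The output column names are placeholders I picked.

```r
library(purrr)
library(tibble)

# map_int() needs a single integer from each call;
# sum() of a logical vector returns exactly that
missing_counts <- map_int(airquality, \(col) sum(is.na(col)))

# Convert the named integer vector into a two-column tibble
missing_table <- enframe(missing_counts, name = "variable", value = "n_missing")
```

In `airquality`, only `Ozone` and `Solar.R` contain missing values, so the other four rows of `missing_table` are zero.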

```{r}
#| label: map-missing-values-of-fish


```

**Question 8: Using the gt package, now turn your output into a very nice table, like the example shown in the instructions. Specifically, your table should color the cells with 0 missing values green and cells with > 0 missing values red.** 

*Hint: You will find this documentation page helpful: https://gt.rstudio.com/reference/data_color.html* 
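
For comparison, here is a sketch of conditional cell coloring on a made-up table (hypothetical values). Be aware that it deliberately uses `tab_style()` with `cell_fill()` instead of `data_color()`, because selecting rows by a condition is easy to read. The linked page describes the `data_color()` route, and the specific color names below are just my picks.

```r
library(gt)

# Toy table of missing-value counts (hypothetical values, for illustration only)
toy_missing <- data.frame(
  variable = c("length", "weight", "year"),
  n_missing = c(0L, 12L, 0L)
)

colored_gt <- toy_missing |>
  gt() |>
  cols_label(variable = "Variable", n_missing = "Missing Values") |>
  # Green fill for cells with no missing values
  tab_style(
    style = cell_fill(color = "lightgreen"),
    locations = cells_body(columns = n_missing, rows = n_missing == 0)
  ) |>
  # Red fill for cells with at least one missing value
  tab_style(
    style = cell_fill(color = "tomato"),
    locations = cells_body(columns = n_missing, rows = n_missing > 0)
  )
```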

```{r}
#| label: colored-table-missing-values-of-fish


```