sum(1, 2)
[1] 3
A pipe in R
is a way of stitching together many commands to make your code easier for a human to read, and to keep it from running off the pages of the PDF documents you will submit to us.
These two code chunks do exactly the same thing:
sum(1, 2)
[1] 3
1 |>
sum(2)
[1] 3
So the pipe operator |>
passes (or pipes), the number 1
into the sum
function as the first input. In a simple example like this, sum(1, 2)
is definitely the way I would write the code, but as things get more elaborate, you will want to pipe.
Let’s consider the COVID data set from class on 9/3/2024:
library(tidyverse)
<- read_csv("delta.csv") delta
Recalling the information here, be prepared to adjust the file path to match how you have organized your files and folders.
A row in this data set is a person, and we record whether or not that person died from/with COVID, and whether or not they were vaccinated. So there are four categories in all:
The following code creates a nifty lil’ table that tallies up the number of people in each group and calculates the proportion of people that died versus survived within each vaccination group:
|>
delta count(vaccine, outcome) |>
group_by(vaccine) |>
mutate(prop = n / sum(n))
# A tibble: 4 × 4
# Groups: vaccine [2]
vaccine outcome n prop
<chr> <chr> <int> <dbl>
1 Unvaccinated died 250 0.00166
2 Unvaccinated survived 150802 0.998
3 Vaccinated died 477 0.00407
4 Vaccinated survived 116637 0.996
So within each vaccination group, the numbers sum to one. This code is equivalent, but it provides a truly horrific reading experience:
mutate(group_by(count(delta, vaccine, outcome), vaccine), prop = n / sum(n))
# A tibble: 4 × 4
# Groups: vaccine [2]
vaccine outcome n prop
<chr> <chr> <int> <dbl>
1 Unvaccinated died 250 0.00166
2 Unvaccinated survived 150802 0.998
3 Vaccinated died 477 0.00407
4 Vaccinated survived 116637 0.996
The COVID data contains another variable indicating whether or not the person was older or younger than fifty. Write some code (it will be very similar to the code above) that produces a table that breaks things down by vaccination status, outcome, and age. How many rows should this table contain?