Last updated: 11 months ago

Date created: Sep 30, 2020

Unsectioned (16 cards)

Left join

```
left_join(table_1,
          table_2,
          by = "column")
```

Adds the columns of `table_2` to `table_1`, matching rows by the column specified. Every row of `table_1` is kept; rows of `table_1` with no match in `table_2` get `NA`s in the columns that come from `table_2`.
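A minimal sketch of the behavior described above, assuming two small made-up tibbles (the contents of `table_1` and `table_2` and the columns `x` and `y` are illustrative only):

```r
library(dplyr)

# Hypothetical example tables (not from the deck)
table_1 <- tibble(column = c("a", "b", "c"), x = 1:3)
table_2 <- tibble(column = c("a", "b", "d"), y = c(10, 20, 40))

result <- left_join(table_1, table_2, by = "column")

# All three rows of table_1 are kept.
# "c" has no match in table_2, so its y is NA.
# "d" exists only in table_2 and does not appear in the result.
print(result)
```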

`dplyr::bind_rows`

function

Binds data frames, by row, into a tibble
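As a rough illustration, assuming two made-up tibbles `df_a` and `df_b` whose columns only partly overlap: columns are matched by name, and a column missing from one input is filled with `NA`.

```r
library(dplyr)

# Hypothetical inputs: df_b lacks column x, df_a lacks column z
df_a <- tibble(id = 1:2, x = c("a", "b"))
df_b <- tibble(id = 3L, z = TRUE)

stacked <- bind_rows(df_a, df_b)

# Result has 3 rows and the columns id, x, z;
# x is NA for the df_b row, z is NA for the df_a rows.
print(stacked)
```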

`dplyr::setequal`

function

Returns whether the data frames are equal, regardless of row order
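A small sketch, assuming two made-up tibbles that hold the same rows in a different order (`df_1` and `df_2` are hypothetical names):

```r
library(dplyr)

# Same rows, different order
df_1 <- tibble(id = 1:3, x = c("a", "b", "c"))
df_2 <- tibble(id = c(3L, 1L, 2L), x = c("c", "a", "b"))

same_rows  <- dplyr::setequal(df_1, df_2)  # compares the sets of rows
same_exact <- identical(df_1, df_2)        # also compares row order

print(c(same_rows = same_rows, same_exact = same_exact))
```

Here `dplyr::setequal` reports the tables as equal, while `identical` does not, because the row order differs.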

Right join

```
right_join(table_1,
           table_2,
           by = "column")
```

Adds the columns of `table_1` to `table_2`, matching rows by the column specified. Every row of `table_2` is kept; rows of `table_2` with no match in `table_1` get `NA`s in the columns that come from `table_1`.

Semi join

```
semi_join(table_1,
          table_2,
          by = "column")
```

Keeps only the rows of `table_1` that have a match in `table_2` on the column specified. No columns from `table_2` are added.
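A minimal sketch of this filtering behavior, assuming two small made-up tibbles (the data and the columns `x` and `y` are illustrative only):

```r
library(dplyr)

# Hypothetical example tables (not from the deck)
table_1 <- tibble(column = c("a", "b", "c"), x = 1:3)
table_2 <- tibble(column = c("a", "b", "d"), y = c(10, 20, 40))

kept <- semi_join(table_1, table_2, by = "column")

# Only rows "a" and "b" remain; the y column of table_2 is not added.
print(kept)
```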

`dplyr::bind_cols`

function

Binds data frames, by column, into a tibble

Sensitivity

$$\frac{TP}{TP + FN}$$

AKA **true positive rate (TPR)** or **recall**

Probability: \(P(\hat{Y} = 1|Y = 1)\)
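A worked sketch of the formula in R, assuming made-up label vectors where 1 is the positive class (`actual` and `predicted` are hypothetical):

```r
# Hypothetical labels: 1 = positive, 0 = negative
actual    <- c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0)
predicted <- c(1, 1, 1, 0, 1, 1, 0, 0, 0, 0)

tp <- sum(predicted == 1 & actual == 1)  # true positives:  3
fn <- sum(predicted == 0 & actual == 1)  # false negatives: 1

# Share of actual positives that were predicted positive
sensitivity <- tp / (tp + fn)
print(sensitivity)  # 0.75
```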

`dplyr::union`

function

Unique rows that appear in either table (duplicates removed); both tables must have the same column names
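A short sketch, assuming two made-up single-column tibbles that share some rows (`df_1` and `df_2` are hypothetical):

```r
library(dplyr)

# Rows 2 and 3 appear in both tables
df_1 <- tibble(id = 1:3)
df_2 <- tibble(id = 2:4)

all_rows <- dplyr::union(df_1, df_2)

# Four distinct rows (ids 1 to 4); the shared rows appear once.
print(all_rows)
```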

\(F_1\) score

\[\frac{1}{\frac{1}{2}\cdot\left(\frac{1}{\text{recall}} + \frac{1}{\text{precision}}\right)}\]

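Harmonic mean of *recall* and *precision*; equivalently \(\frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}\)

AKA **balanced F-score** or **F-measure** (not the same as *balanced accuracy*, which is the average of sensitivity and specificity)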
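A worked sketch in R, assuming made-up label vectors where 1 is the positive class (`actual` and `predicted` are hypothetical). It evaluates the harmonic-mean form and the equivalent count-based form \(\frac{2TP}{2TP + FP + FN}\):

```r
# Hypothetical labels: 1 = positive, 0 = negative
actual    <- c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0)
predicted <- c(1, 1, 1, 0, 1, 1, 0, 0, 0, 0)

tp <- sum(predicted == 1 & actual == 1)  # 3
fp <- sum(predicted == 1 & actual == 0)  # 2
fn <- sum(predicted == 0 & actual == 1)  # 1

precision <- tp / (tp + fp)  # 3 / 5
recall    <- tp / (tp + fn)  # 3 / 4

# Harmonic mean of recall and precision
f1 <- 1 / (0.5 * (1 / recall + 1 / precision))

# Same quantity written directly in terms of counts
f1_counts <- 2 * tp / (2 * tp + fp + fn)

print(c(f1 = f1, f1_counts = f1_counts))  # both 2/3
```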

Precision

$$\frac{TP}{TP + FP}$$

AKA **positive predictive value (PPV)**

Probability: \(P(Y = 1|\hat{Y} = 1)\)
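A worked sketch of the formula in R, assuming made-up label vectors where 1 is the positive class (`actual` and `predicted` are hypothetical):

```r
# Hypothetical labels: 1 = positive, 0 = negative
actual    <- c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0)
predicted <- c(1, 1, 1, 0, 1, 1, 0, 0, 0, 0)

tp <- sum(predicted == 1 & actual == 1)  # true positives:  3
fp <- sum(predicted == 1 & actual == 0)  # false positives: 2

# Share of predicted positives that are actually positive
precision <- tp / (tp + fp)
print(precision)  # 0.6
```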

Full join

```
full_join(table_1,
          table_2,
          by = "column")
```

Keeps all rows and all columns of both `table_1` and `table_2`, matching rows by the column specified. Rows without a match in the other table get `NA`s in that table's columns.
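A minimal sketch, assuming two small made-up tibbles (the data and the columns `x` and `y` are illustrative only):

```r
library(dplyr)

# Hypothetical example tables (not from the deck)
table_1 <- tibble(column = c("a", "b", "c"), x = 1:3)
table_2 <- tibble(column = c("a", "b", "d"), y = c(10, 20, 40))

combined <- full_join(table_1, table_2, by = "column")

# Four rows: "a", "b", "c", "d".
# "c" (only in table_1) has NA for y; "d" (only in table_2) has NA for x.
print(combined)
```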

Inner join

```
inner_join(table_1,
           table_2,
           by = "column")
```

Keeps only the rows that have a match in both tables on the column specified; the result includes the columns of both tables.

`dplyr::intersect`

function

Rows that appear in both tables; both tables must have the same column names

`dplyr::setdiff`

function

Rows in the first table argument not present in the second table argument
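A short sketch showing that argument order matters, assuming two made-up single-column tibbles (`df_1` and `df_2` are hypothetical):

```r
library(dplyr)

# Rows 2 and 3 appear in both tables
df_1 <- tibble(id = 1:3)
df_2 <- tibble(id = 2:4)

only_in_1 <- dplyr::setdiff(df_1, df_2)  # rows of df_1 absent from df_2
only_in_2 <- dplyr::setdiff(df_2, df_1)  # rows of df_2 absent from df_1

print(only_in_1)  # id 1
print(only_in_2)  # id 4
```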

Anti join

```
anti_join(table_1,
          table_2,
          by = "column")
```

Keeps only the rows of `table_1` that *don't* have a match in `table_2` on the column specified. No columns from `table_2` are added.
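A minimal sketch, assuming two small made-up tibbles (the data and the columns `x` and `y` are illustrative only):

```r
library(dplyr)

# Hypothetical example tables (not from the deck)
table_1 <- tibble(column = c("a", "b", "c"), x = 1:3)
table_2 <- tibble(column = c("a", "b", "d"), y = c(10, 20, 40))

unmatched <- anti_join(table_1, table_2, by = "column")

# Only row "c" remains, since it is the one row of table_1
# with no match in table_2; the y column is not added.
print(unmatched)
```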

Specificity

$$\frac{TN}{TN + FP}$$

AKA **true negative rate (TNR)**

Probability: \(P(\hat{Y} = 0|Y = 0)\)
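A worked sketch of the formula in R, assuming made-up label vectors where 1 is the positive class (`actual` and `predicted` are hypothetical):

```r
# Hypothetical labels: 1 = positive, 0 = negative
actual    <- c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0)
predicted <- c(1, 1, 1, 0, 1, 1, 0, 0, 0, 0)

tn <- sum(predicted == 0 & actual == 0)  # true negatives:  4
fp <- sum(predicted == 1 & actual == 0)  # false positives: 2

# Share of actual negatives that were predicted negative
specificity <- tn / (tn + fp)
print(specificity)  # 4/6, about 0.667
```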