 # Statistics Programming

UCLA STATS 202A

Scott Mueller (lvl 17)
Unsectioned

Last updated

11 months ago

Date created

Sep 30, 2020

## Cards (16)

Unsectioned

(16 cards)

Left join

Front

`left_join(table_1, table_2, by = "column")`

Keeps every row of table_1 and adds the columns of table_2, matching rows by the column specified. Rows of table_1 with no match in table_2 get NA in the columns that come from table_2.
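A minimal sketch of a left join, assuming two small made-up tibbles (the contents of `table_1` and `table_2` are illustrative, not from the course):

```r
library(dplyr)

# Hypothetical example tables; keys 2 and 3 appear in both
table_1 <- tibble(column = c(1, 2, 3), x = c("a", "b", "c"))
table_2 <- tibble(column = c(2, 3, 4), y = c("B", "C", "D"))

# All three rows of table_1 are kept; key 1 has no match, so its y is NA
left_result <- left_join(table_1, table_2, by = "column")
```

Key 4 exists only in `table_2`, so it does not appear in the result.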

Back

dplyr::bind_rows function

Front

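Binds data frames by row, matching columns by name. Columns missing from one input are filled with NA. The result is a tibble when the first input is a tibble.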
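A short sketch of row binding, assuming two made-up tibbles whose columns only partly overlap (all names here are hypothetical):

```r
library(dplyr)

# Hypothetical inputs: both have id, but only one has score and only one has grade
first_half  <- tibble(id = 1:2, score = c(90, 85))
second_half <- tibble(id = 3L, grade = "A")

# Rows are stacked; columns are matched by name and gaps become NA
stacked <- bind_rows(first_half, second_half)
```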

Back

dplyr::setequal function

Front

Returns TRUE if the two data frames contain the same rows, regardless of row order
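As an illustration, the sketch below compares a made-up tibble with a row-shuffled copy of itself (the data and variable names are assumptions):

```r
library(dplyr)

# Hypothetical data: df_b holds the same rows as df_a in a different order
df_a <- tibble(id = 1:3, v = c("a", "b", "c"))
df_b <- df_a[c(3, 1, 2), ]

same_rows  <- dplyr::setequal(df_a, df_b)  # ignores row order
same_order <- identical(df_a, df_b)        # strict comparison, order matters
```

Here `same_rows` is TRUE while the strict `identical()` check is FALSE.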

Back

Right join

Front

`right_join(table_1, table_2, by = "column")`

Keeps every row of table_2 and adds the columns of table_1, matching rows by the column specified. Rows of table_2 with no match in table_1 get NA in the columns that come from table_1.
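A minimal sketch of a right join on two small made-up tibbles (table contents are illustrative assumptions):

```r
library(dplyr)

# Hypothetical example tables; keys 2 and 3 appear in both
table_1 <- tibble(column = c(1, 2, 3), x = c("a", "b", "c"))
table_2 <- tibble(column = c(2, 3, 4), y = c("B", "C", "D"))

# All three rows of table_2 are kept; key 4 has no match, so its x is NA
right_result <- right_join(table_1, table_2, by = "column")
```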

Back

Semi join

Front

`semi_join(table_1, table_2, by = "column")`

Keeps only the rows of table_1 that have a match in table_2 by the column specified. No columns from table_2 are added.
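A small sketch of the filtering behavior, using made-up tibbles (names and values are assumptions for illustration):

```r
library(dplyr)

# Hypothetical example tables; keys 2 and 3 appear in both
table_1 <- tibble(column = c(1, 2, 3), x = c("a", "b", "c"))
table_2 <- tibble(column = c(2, 3, 4), y = c("B", "C", "D"))

# Rows of table_1 with a partner in table_2; the y column is not brought over
semi_result <- semi_join(table_1, table_2, by = "column")
```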

Back

dplyr::bind_cols function

Front

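Binds data frames by column, matching rows by position, so the inputs must have the same number of rows. The result is a tibble when the first input is a tibble.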
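A short sketch of column binding, assuming two made-up tibbles with the same number of rows (names are hypothetical):

```r
library(dplyr)

# Hypothetical inputs with three rows each
ids    <- tibble(id = 1:3)
scores <- tibble(score = c(90, 85, 70))

# Columns are placed side by side; row i of ids pairs with row i of scores
side_by_side <- bind_cols(ids, scores)
```

Because rows are paired by position rather than by key, a join is usually the safer choice when the two tables may be sorted differently.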

Back

Sensitivity

Front

$$\frac{TP}{TP + FN}$$

AKA true positive rate (TPR) or recall

Probability: $$P(\hat{Y} = 1|Y = 1)$$
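A worked example in base R, assuming made-up label vectors `y` (truth) and `y_hat` (prediction) with 1 as the positive class:

```r
# Hypothetical labels: 1 = positive, 0 = negative
y     <- c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0)  # true classes
y_hat <- c(1, 1, 1, 0, 1, 1, 0, 0, 0, 0)  # predicted classes

TP <- sum(y_hat == 1 & y == 1)  # true positives: 3
FN <- sum(y_hat == 0 & y == 1)  # false negatives: 1

sensitivity <- TP / (TP + FN)   # 3 / 4 = 0.75

# Same number read as a conditional probability: P(Y_hat = 1 | Y = 1)
sensitivity_as_prob <- mean(y_hat[y == 1] == 1)
```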

Back

dplyr::union function

Front

Unique rows that appear in either table (duplicates removed); the tables must have the same column names
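A minimal sketch with made-up single-column tibbles (the data is an assumption for illustration):

```r
library(dplyr)

# Hypothetical inputs; the row id = 2 occurs three times across both tables
df_a <- tibble(id = c(1, 2, 2))
df_b <- tibble(id = c(2, 3))

# Rows found in either table, with duplicates dropped
combined <- dplyr::union(df_a, df_b)
```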

Back

$$F_1$$ score

Front

$$F_1 = \frac{1}{\frac{1}{2}\cdot\left(\frac{1}{\text{recall}} + \frac{1}{\text{precision}}\right)} = \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$$

Harmonic mean of recall and precision

AKA F-score or F-measure
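A worked example in base R, assuming made-up label vectors `y` and `y_hat` with 1 as the positive class. It computes the harmonic mean of recall and precision and checks it against the equivalent count form $2TP / (2TP + FP + FN)$:

```r
# Hypothetical labels: 1 = positive, 0 = negative
y     <- c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0)  # true classes
y_hat <- c(1, 1, 1, 0, 1, 1, 0, 0, 0, 0)  # predicted classes

TP <- sum(y_hat == 1 & y == 1)  # 3
FP <- sum(y_hat == 1 & y == 0)  # 2
FN <- sum(y_hat == 0 & y == 1)  # 1

recall    <- TP / (TP + FN)     # 0.75
precision <- TP / (TP + FP)     # 0.6

# Harmonic mean of recall and precision
f1 <- 1 / (0.5 * (1 / recall + 1 / precision))

# Equivalent form written directly in counts
f1_counts <- 2 * TP / (2 * TP + FP + FN)
```

Both expressions give 2/3 for this data.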

Back

Precision

Front

$$\frac{TP}{TP + FP}$$

AKA positive predictive value (PPV)

Probability: $$P(Y = 1|\hat{Y} = 1)$$
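A worked example in base R, assuming made-up label vectors `y` (truth) and `y_hat` (prediction) with 1 as the positive class:

```r
# Hypothetical labels: 1 = positive, 0 = negative
y     <- c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0)  # true classes
y_hat <- c(1, 1, 1, 0, 1, 1, 0, 0, 0, 0)  # predicted classes

TP <- sum(y_hat == 1 & y == 1)  # true positives: 3
FP <- sum(y_hat == 1 & y == 0)  # false positives: 2

precision <- TP / (TP + FP)     # 3 / 5 = 0.6

# Same number read as a conditional probability: P(Y = 1 | Y_hat = 1)
precision_as_prob <- mean(y[y_hat == 1] == 1)
```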

Back

Full join

Front

`full_join(table_1, table_2, by = "column")`

Keeps all rows and all columns of both table_1 and table_2, matching rows by the column specified. Unmatched rows get NA in the columns that come from the other table.
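A minimal sketch of a full join on two small made-up tibbles (table contents are illustrative assumptions):

```r
library(dplyr)

# Hypothetical example tables; keys 2 and 3 appear in both
table_1 <- tibble(column = c(1, 2, 3), x = c("a", "b", "c"))
table_2 <- tibble(column = c(2, 3, 4), y = c("B", "C", "D"))

# Keys 1, 2, 3 and 4 all appear; key 1 has NA for y and key 4 has NA for x
full_result <- full_join(table_1, table_2, by = "column")
```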

Back

Inner join

Front

`inner_join(table_1, table_2, by = "column")`

Keeps only the rows that have a match in both tables by the column specified, with the columns of both tables.
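A minimal sketch of an inner join on two small made-up tibbles (table contents are illustrative assumptions):

```r
library(dplyr)

# Hypothetical example tables; keys 2 and 3 appear in both
table_1 <- tibble(column = c(1, 2, 3), x = c("a", "b", "c"))
table_2 <- tibble(column = c(2, 3, 4), y = c("B", "C", "D"))

# Only keys 2 and 3 survive, so no NA values are introduced
inner_result <- inner_join(table_1, table_2, by = "column")
```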

Back

dplyr::intersect function

Front

Rows that appear in both tables (duplicates removed); the tables must have the same column names
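A minimal sketch with two made-up tibbles that share two identical rows (the data is an assumption for illustration):

```r
library(dplyr)

# Hypothetical inputs with the same column names
df_a <- tibble(id = c(1, 2, 3), v = c("a", "b", "c"))
df_b <- tibble(id = c(2, 3, 4), v = c("b", "c", "d"))

# Whole rows must match, not just one key column
common <- dplyr::intersect(df_a, df_b)
```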

Back

dplyr::setdiff function

Front

Rows in the first table argument not present in the second table argument
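Argument order matters here, which the sketch below shows with two made-up tibbles (the data is an assumption for illustration):

```r
library(dplyr)

# Hypothetical inputs with the same column names
df_a <- tibble(id = c(1, 2, 3), v = c("a", "b", "c"))
df_b <- tibble(id = c(2, 3, 4), v = c("b", "c", "d"))

only_in_a <- dplyr::setdiff(df_a, df_b)  # rows of df_a missing from df_b
only_in_b <- dplyr::setdiff(df_b, df_a)  # rows of df_b missing from df_a
```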

Back

Anti join

Front

`anti_join(table_1, table_2, by = "column")`

Keeps only the rows of table_1 that have no match in table_2 by the column specified. No columns from table_2 are added.
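A small sketch of the filtering behavior, using made-up tibbles (names and values are assumptions for illustration):

```r
library(dplyr)

# Hypothetical example tables; keys 2 and 3 appear in both
table_1 <- tibble(column = c(1, 2, 3), x = c("a", "b", "c"))
table_2 <- tibble(column = c(2, 3, 4), y = c("B", "C", "D"))

# Only key 1 of table_1 has no partner in table_2
anti_result <- anti_join(table_1, table_2, by = "column")
```

This is a common way to find records that failed to match before running another join.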

Back

Specificity

Front

$$\frac{TN}{TN + FP}$$

AKA true negative rate (TNR)

Probability: $$P(\hat{Y} = 0|Y = 0)$$
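A worked example in base R, assuming made-up label vectors `y` (truth) and `y_hat` (prediction) with 0 as the negative class:

```r
# Hypothetical labels: 1 = positive, 0 = negative
y     <- c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0)  # true classes
y_hat <- c(1, 1, 1, 0, 1, 1, 0, 0, 0, 0)  # predicted classes

TN <- sum(y_hat == 0 & y == 0)  # true negatives: 4
FP <- sum(y_hat == 1 & y == 0)  # false positives: 2

specificity <- TN / (TN + FP)   # 4 / 6

# Same number read as a conditional probability: P(Y_hat = 0 | Y = 0)
specificity_as_prob <- mean(y_hat[y == 0] == 0)
```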

Back