Statistics Programming

UCLA STATS 202A

Scott Mueller (lvl 19)
Unsectioned

Last updated

4 years ago

Date created

Sep 30, 2020

Cards (16)

Unsectioned

(16 cards)

Left join

Front
left_join(table_1,
          table_2,
          by = "column")

Keeps every row of table_1 and adds the columns of table_2, matching rows by the column specified. Rows of table_1 with no match in table_2 get NA in the columns that come from table_2.
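
A minimal sketch of a left join, assuming two hypothetical toy tables (the values and the extra columns x and y are illustrative, not from the course):

```r
library(dplyr)

# Hypothetical toy tables; only the key name "column" comes from the card.
table_1 <- tibble(column = c(1, 2, 3), x = c("a", "b", "c"))
table_2 <- tibble(column = c(1, 2, 4), y = c("p", "q", "r"))

# All three rows of table_1 are kept. The row with column == 3 has no
# match in table_2, so its y value is NA.
result <- left_join(table_1, table_2, by = "column")
result
```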

Back

dplyr::bind_rows function

Front

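Stacks data frames on top of each other, by row, matching columns by name. Columns missing from one input are filled with NA. The result has the type of the first input (a tibble if the first input is a tibble).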
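
A short sketch of row binding, assuming two hypothetical tibbles whose columns only partly overlap:

```r
library(dplyr)

# Hypothetical inputs: df_b lacks column x and adds column z.
df_a <- tibble(id = 1:2, x = c("a", "b"))
df_b <- tibble(id = 3L, z = TRUE)

# Rows are stacked; columns are matched by name and gaps become NA.
stacked <- bind_rows(df_a, df_b)
stacked
```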

Back

dplyr::setequal function

Front

Returns TRUE if the two data frames contain the same rows, regardless of row order, and FALSE otherwise
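
As an illustration, assuming hypothetical tibbles that hold the same rows in a different order:

```r
library(dplyr)

# Hypothetical inputs: df_b is df_a with the rows reversed.
df_a <- tibble(id = 1:3, x = c("a", "b", "c"))
df_b <- tibble(id = 3:1, x = c("c", "b", "a"))
df_c <- tibble(id = 1:3, x = c("a", "b", "z"))

same_rows <- dplyr::setequal(df_a, df_b)  # same rows, different order
diff_rows <- dplyr::setequal(df_a, df_c)  # one row differs
```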

Back

Right join

Front
right_join(table_1,
           table_2,
           by = "column")

Keeps every row of table_2 and adds the columns of table_1, matching rows by the column specified. Rows of table_2 with no match in table_1 get NA in the columns that come from table_1.
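
A minimal sketch of a right join, assuming two hypothetical toy tables (values and the columns x and y are illustrative):

```r
library(dplyr)

# Hypothetical toy tables; only the key name "column" comes from the card.
table_1 <- tibble(column = c(1, 2, 3), x = c("a", "b", "c"))
table_2 <- tibble(column = c(1, 2, 4), y = c("p", "q", "r"))

# All three rows of table_2 are kept. The row with column == 4 has no
# match in table_1, so its x value is NA.
result <- right_join(table_1, table_2, by = "column")
result
```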

Back

Semi join

Front
semi_join(table_1,
          table_2,
          by = "column")

Keeps only the rows of table_1 that have a match in table_2 on the column specified. No columns from table_2 are added.
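
A minimal sketch of a semi join, assuming two hypothetical toy tables (values and the columns x and y are illustrative):

```r
library(dplyr)

# Hypothetical toy tables; only the key name "column" comes from the card.
table_1 <- tibble(column = c(1, 2, 3), x = c("a", "b", "c"))
table_2 <- tibble(column = c(1, 2, 4), y = c("p", "q", "r"))

# Rows of table_1 with column 1 and 2 are kept; y is not added.
result <- semi_join(table_1, table_2, by = "column")
result
```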

Back

dplyr::bind_cols function

Front

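Binds data frames side by side, by column. Rows are matched by position, not by a key. The result has the type of the first input (a tibble if the first input is a tibble).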
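
A short sketch of column binding, assuming two hypothetical tibbles with the same number of rows:

```r
library(dplyr)

# Hypothetical inputs with three rows each.
df_a <- tibble(id = 1:3)
df_b <- tibble(score = c(0.2, 0.5, 0.9))

# Columns are placed side by side; row i of df_a pairs with row i of df_b.
combined <- bind_cols(df_a, df_b)
combined
```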

Back

Sensitivity

Front

$$\frac{TP}{TP + FN}$$

AKA true positive rate (TPR) or recall

Probability: \(P(\hat{Y} = 1|Y = 1)\)
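
A worked example in base R, assuming hypothetical label vectors y (truth) and y_hat (prediction) with 1 as the positive class:

```r
# Hypothetical labels: 4 actual positives, 6 actual negatives.
y     <- c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0)
y_hat <- c(1, 1, 1, 0, 1, 1, 0, 0, 0, 0)

tp <- sum(y_hat == 1 & y == 1)  # true positives: 3
fn <- sum(y_hat == 0 & y == 1)  # false negatives: 1

sensitivity <- tp / (tp + fn)   # 3 / 4 = 0.75
```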

Back

dplyr::union function

Front

All unique rows that appear in either table (duplicates removed). The tables must have the same column names.
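
For illustration, assuming two hypothetical tibbles that share one row:

```r
library(dplyr)

# Hypothetical inputs; the row (2, "b") appears in both.
a <- tibble(id = c(1, 2), x = c("a", "b"))
b <- tibble(id = c(2, 3), x = c("b", "c"))

# The shared row appears only once in the result.
all_rows <- dplyr::union(a, b)
all_rows
```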

Back

\(F_1\) score

Front

\[\frac{1}{\frac{1}{2}\cdot\left(\frac{1}{\text{recall}} + \frac{1}{\text{precision}}\right)}\]

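Harmonic mean of recall and precision, equivalent to \(\frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}\)

AKA F-measure or balanced F-score. Balanced accuracy is a different metric: the mean of sensitivity and specificity.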
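
A worked example in base R, assuming hypothetical label vectors y (truth) and y_hat (prediction) with 1 as the positive class:

```r
# Hypothetical labels: 4 actual positives, 6 actual negatives.
y     <- c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0)
y_hat <- c(1, 1, 1, 0, 1, 1, 0, 0, 0, 0)

tp <- sum(y_hat == 1 & y == 1)  # 3
fp <- sum(y_hat == 1 & y == 0)  # 2
fn <- sum(y_hat == 0 & y == 1)  # 1

recall    <- tp / (tp + fn)     # 0.75
precision <- tp / (tp + fp)     # 0.6

# Harmonic mean of recall and precision: 2/3 for this data.
f1 <- 1 / (0.5 * (1 / recall + 1 / precision))
```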

Back

Precision

Front

$$\frac{TP}{TP + FP}$$

AKA positive predictive value (PPV)

Probability: \(P(Y = 1|\hat{Y} = 1)\)
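
A worked example in base R, assuming hypothetical label vectors y (truth) and y_hat (prediction) with 1 as the positive class:

```r
# Hypothetical labels: 5 predicted positives, 3 of them correct.
y     <- c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0)
y_hat <- c(1, 1, 1, 0, 1, 1, 0, 0, 0, 0)

tp <- sum(y_hat == 1 & y == 1)  # true positives: 3
fp <- sum(y_hat == 1 & y == 0)  # false positives: 2

precision <- tp / (tp + fp)     # 3 / 5 = 0.6
```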

Back

Full join

Front
full_join(table_1,
          table_2,
          by = "column")

Keeps all rows and all columns of both table_1 and table_2, matching rows by the column specified. Where a row has no match in the other table, that table's columns are filled with NA.
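
A minimal sketch of a full join, assuming two hypothetical toy tables (values and the columns x and y are illustrative):

```r
library(dplyr)

# Hypothetical toy tables; only the key name "column" comes from the card.
table_1 <- tibble(column = c(1, 2, 3), x = c("a", "b", "c"))
table_2 <- tibble(column = c(1, 2, 4), y = c("p", "q", "r"))

# Keys 1, 2, 3 and 4 all appear. Key 3 has NA for y, key 4 has NA for x.
result <- full_join(table_1, table_2, by = "column")
result
```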

Back

Inner join

Front
inner_join(table_1,
           table_2,
           by = "column")

Keeps only the rows whose value in the column specified appears in both table_1 and table_2, with the columns of both tables.
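
A minimal sketch of an inner join, assuming two hypothetical toy tables (values and the columns x and y are illustrative):

```r
library(dplyr)

# Hypothetical toy tables; only the key name "column" comes from the card.
table_1 <- tibble(column = c(1, 2, 3), x = c("a", "b", "c"))
table_2 <- tibble(column = c(1, 2, 4), y = c("p", "q", "r"))

# Only keys 1 and 2 exist in both tables, so only those rows survive.
result <- inner_join(table_1, table_2, by = "column")
result
```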

Back

dplyr::intersect function

Front

Rows that appear in both tables. The tables must have the same column names.
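
For illustration, assuming two hypothetical tibbles that share one row:

```r
library(dplyr)

# Hypothetical inputs; the row (2, "b") appears in both.
a <- tibble(id = c(1, 2), x = c("a", "b"))
b <- tibble(id = c(2, 3), x = c("b", "c"))

# Only the shared row is returned.
common <- dplyr::intersect(a, b)
common
```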

Back

dplyr::setdiff function

Front

Rows in the first table argument not present in the second table argument
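
For illustration, assuming two hypothetical tibbles that share one row:

```r
library(dplyr)

# Hypothetical inputs; the row (2, "b") appears in both.
a <- tibble(id = c(1, 2), x = c("a", "b"))
b <- tibble(id = c(2, 3), x = c("b", "c"))

# Argument order matters: these two calls return different rows.
only_in_a <- dplyr::setdiff(a, b)  # the row with id 1
only_in_b <- dplyr::setdiff(b, a)  # the row with id 3
```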

Back

Anti join

Front
anti_join(table_1,
          table_2,
          by = "column")

Keeps only the rows of table_1 that have no match in table_2 on the column specified. No columns from table_2 are added.
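
A minimal sketch of an anti join, assuming two hypothetical toy tables (values and the columns x and y are illustrative):

```r
library(dplyr)

# Hypothetical toy tables; only the key name "column" comes from the card.
table_1 <- tibble(column = c(1, 2, 3), x = c("a", "b", "c"))
table_2 <- tibble(column = c(1, 2, 4), y = c("p", "q", "r"))

# Only the row with column == 3 has no partner in table_2.
result <- anti_join(table_1, table_2, by = "column")
result
```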

Back

Specificity

Front

$$\frac{TN}{TN + FP}$$

AKA true negative rate (TNR)

Probability: \(P(\hat{Y} = 0|Y = 0)\)
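
A worked example in base R, assuming hypothetical label vectors y (truth) and y_hat (prediction) with 0 as the negative class:

```r
# Hypothetical labels: 4 actual positives, 6 actual negatives.
y     <- c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0)
y_hat <- c(1, 1, 1, 0, 1, 1, 0, 0, 0, 0)

tn <- sum(y_hat == 0 & y == 0)  # true negatives: 4
fp <- sum(y_hat == 1 & y == 0)  # false positives: 2

specificity <- tn / (tn + fp)   # 4 / 6, about 0.667
```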

Back