R Programming Data Science

Date created

Mar 1, 2020

Cards (114)

Section 1

(50 cards)

paste0( )

Front

concatenates its arguments into a character string with no separator between them; equivalent to paste( ) with sep = ""

Back

factor

Front

a special type of vector for categorical data

Back

data frame

Front

the structure in which datasets are usually stored. A special type of list that is more general than a matrix: a two-dimensional table whose columns are vectors of the same length, but possibly of different types. Can be created using the data.frame( ) command. Variables (columns) can be accessed using $.

Back
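
A minimal sketch of building a data frame and reading a column with $. The pets data and its column names are invented for illustration.

```r
# Hypothetical data, invented for illustration
pets <- data.frame(
  name = c("Foo", "Bar", "Baz"),
  age = c(2, 5, 3),
  vaccinated = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# $ pulls out one column as a vector
ages <- pets$age

# Columns share a length but may differ in type
col_types <- sapply(pets, class)

print(ages)        # 2 5 3
print(col_types)   # character, numeric, logical
```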

cbind( )

Front

combines vectors or matrices side by side as columns; given vectors, it yields a matrix with one column per vector

Back

sum( )

Front

adds up the elements of a vector; applied to a logical condition, it gives the count of the elements meeting the condition

Back
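
A short sketch of both uses of sum( ): counting elements that meet a condition and adding numbers. The scores vector is made up for the example.

```r
# Hypothetical exam scores, invented for illustration
scores <- c(55, 72, 90, 64, 88)

# The comparison yields a logical vector: FALSE TRUE TRUE FALSE TRUE
passed <- scores >= 70

# TRUE counts as 1 and FALSE as 0, so sum() counts the TRUE values
n_passed <- sum(passed)

# On a numeric vector, sum() simply adds the elements
total <- sum(scores)

print(n_passed)   # 3
print(total)      # 369
```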

character vector

Front

the most complex type of atomic vector. Each element is a string, which can be an arbitrary amount of data.

Back

rbind( )

Front

combines vectors or matrices by stacking them as rows; given vectors, it yields a matrix with one row per vector

Back

include=FALSE

Front

chunk option that runs the code but omits both the R code and its results from the knitted document

Back

double vectors

Front

belongs to the numeric class; the default type of number in R.

Back

list

Front

an ordered collection of elements that may have different data types and different lengths; elements can be vectors, functions, or other lists

Back

```{r ChunkName1} ```

Front

chunk header that names the code chunk (here, ChunkName1)

Back

typeof( )

Front

returns the internal storage type of an object (e.g. logical, integer, double, complex, character, list).

Back

exists function

Front

conveys whether an object exists in the workspace

Back

logical vectors

Front

simplest type of atomic vector; three possible values: TRUE, FALSE, or NA

Back

library( )

Front

a function that will load an installed package

Back

recursive vectors

Front

another word for list, meaning that lists can contain other lists

Back

replicability

Front

the same person (or different people) getting the same results using different data

Back

attributes( )

Front

displays the attributes associated with an object

Back

rm command

Front

removes an object in the workspace

Back

*sentence*: The square root of 77 is `r round(sqrt(77), 3)`. Then, with first <- "Foo" and last <- "Fu", *sentence*: The bunny is named `r paste(first, last)`.

Front

how to use inline R code in R Markdown so that the knitted text reads "The square root of 77 is 8.775" and/or "The bunny is named Foo Fu"

Back

data science

Front

a combination of statistics, computer science, and mathematics that focuses on extracting meaningful information from data from a specific domain.

Back

install.packages( )

Front

a function used to download/install packages

Back

vector

Front

a one-dimensional array of items of the same data type (homogeneous)

Back

position

Front

(numerical) where in relation to other things?

Back

object

Front

something stored in R's memory; created using the <- or = assignment

Back

message=FALSE or warning=FALSE

Front

option that suppresses messages/warnings in output

Back

paste( )

Front

concatenates its arguments into a character string, separated by a single space by default (change it with sep =)

Back
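
A quick sketch contrasting paste( ) and paste0( ), reusing the first and last values from the bunny card. The variable names full_name, no_space, and with_dash are only for illustration.

```r
first <- "Foo"
last <- "Fu"

# paste() joins its arguments with a single space by default
full_name <- paste(first, last)

# paste0() joins them with no separator at all
no_space <- paste0(first, last)

# sep = changes the separator used by paste()
with_dash <- paste(first, last, sep = "-")

print(full_name)   # "Foo Fu"
print(no_space)    # "FooFu"
print(with_dash)   # "Foo-Fu"
```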

eval=FALSE

Front

option that does not run a code chunk, but displays the code

Back

reproducibility

Front

the same person (or different people) getting the same results using the same data

Back

results='hide' or fig.show='hide'

Front

options that run a code chunk but suppress part of its output: results='hide' hides printed text output, fig.show='hide' hides plots

Back

unlist( )

Front

a function that flattens (or makes a vector out of) the elements in a list; coerces objects into a common type.

Back
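
A small sketch of how unlist( ) flattens and coerces. The x3 list is the one from the list-creation card in Section 2; the other inputs are made up.

```r
# A list whose elements are all numeric flattens to a numeric vector
nums <- unlist(list(1, 2, c(3, 4)))

# Mixed types are coerced to the most flexible one, here character
mixed <- unlist(list(1, "a", TRUE))

# Nested lists are flattened all the way down
x3 <- list(1, list(2, list(3)))
flat <- unlist(x3)

print(nums)    # 1 2 3 4
print(mixed)   # "1" "a" "TRUE"
print(flat)    # 1 2 3
```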

prompt=TRUE

Front

option that shows > prompt in front of commands

Back

[ ]

Front

extracts a sub-list

Back

angle

Front

(numerical) how wide? parallel to something else?

Back

str( )

Front

provides information about the structure of each object

Back

"Environment" tab

Front

shows the names (and values) of all the objects that exist in the current workspace

Back

length

Front

(numerical) how big (in one dimension)?

Back

collapse=TRUE

Front

chunk option that merges the code and its results into a single block in the chunk output

Back

data.frame

Front

Where data sets are often stored; special type of list that is more general than a matrix. A two-dimensional array with columns of vectors of the same length, but of possibly different types.

Back

Visual cues

Front

1) Position 2) Length 3) Angle 4) Direction 5) Shape 6) Area 7) Volume 8) Color 9) Shade

Back

matrix

Front

a two-dimensional vector; a rectangular object where all entries have the same type; can be created using matrix( ), cbind( ), or rbind( )

Back
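
A sketch of building matrices with cbind( ) and rbind( ), and of the same-type rule. The vectors a and b are invented for the example.

```r
a <- c(1, 2, 3)
b <- c(4, 5, 6)

by_col <- cbind(a, b)   # 3 rows, 2 columns named a and b
by_row <- rbind(a, b)   # 2 rows named a and b, 3 columns

print(dim(by_col))      # 3 2
print(dim(by_row))      # 2 3

# All entries share one type: a character column coerces the numbers too
mixed <- cbind(a, c("x", "y", "z"))
print(typeof(mixed))    # "character"
```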

$

Front

a shorthand for extracting named elements of a list; similar to [[ ]] except you don't need to use quotes.

Back

echo=FALSE

Front

option that omits R code but not results

Back

integer vectors

Front

belongs to the numeric class; made by placing an "L" after the number

Back

class( )

Front

a function that returns the class or classes to which an object belongs.

Back
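
A brief sketch of how class( ) and typeof( ) can differ for the same object. The objects f and df are made up for illustration.

```r
print(typeof(1))      # "double": the default type for numbers
print(typeof(1L))     # "integer": the L suffix makes an integer
print(typeof("a"))    # "character"
print(typeof(TRUE))   # "logical"

f <- factor(c("low", "high", "low"))
print(typeof(f))      # "integer": factors are stored as integer codes
print(class(f))       # "factor"

df <- data.frame(x = 1:2)
print(typeof(df))     # "list": a data frame is a list of columns
print(class(df))      # "data.frame"
```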

comment=NULL

Front

chunk option that removes the ## prefix normally placed in front of each line of output

Back

[[ ]]

Front

extracts a single component from a list

Back
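
A sketch comparing the three list extractors from this deck: single brackets, double brackets, and $. The list x and its element names are hypothetical.

```r
# Hypothetical list mixing types and lengths
x <- list(a = 1:3, b = "text", c = list(TRUE, 2.5))

sub_list <- x["a"]        # single brackets return a list of length 1
component <- x[["a"]]     # double brackets return the element itself
same_thing <- x$a         # $ is shorthand for x[["a"]]
nested <- x[["c"]][[2]]   # chain double brackets to reach into a nested list

print(class(sub_list))    # "list"
print(component)          # 1 2 3
print(nested)             # 2.5
```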

error=TRUE

Front

option that knits document even with code error present

Back

Taxonomy for data graphics

Front

1) visual cues 2) coordinate system 3) scale 4) context

Back

c( ) function

Front

combines (concatenates) its arguments, scalars or vectors, into a single vector

Back

Section 2

(50 cards)

facet_wrap()

Front

used to facet the plot by a single variable

Back

select( )

Front

takes a subset of the columns specified

Back

filter( )

Front

takes a subset of the rows specified

Back

cartesian coordinate system

Front

the familiar (x, y) rectangular coordinate system with two perpendicular axes

Back

shade

Front

(either) to what extent? how severe?

Back

geom_density( )

Front

creates a density curve in ggplot

Back

key =

Front

in gather( ), the name of the new variable in the narrow format that holds the former column names and so identifies which wide-format column each value came from.

Back

ungroup()

Front

returns "focus" back to the entire data frame

Back

coord_quickmap( )

Front

sets the aspect ratio correctly for maps.

Back

categorical scale

Front

may have no ordering (e.g. Democrat, Republican) or be ordinal (e.g. Never a Smoker, Former Smoker, Current Smoker)

Back

color= or label=

Front

conveys information about the data by mapping the aesthetics in the plot to the variables in your dataset

Back

geom_histogram( )

Front

creates a histogram in ggplot

Back

group_by()

Front

change the scope, or unit of analysis, from individual cases to aggregated groups

Back

[!is.na()]

Front

omit NAs

Back

semi_join( )

Front

returns all rows from the first table where there are matching values in the second table, keeping just columns from first table.

Back

geographic coordinate system

Front

used when we have locations on the curved surface of the earth and are trying to represent them on a two-dimensional plane.

Back

geom_point( )

Front

adds a layer of points to the ggplot, which creates a scatterplot

Back

interval( )

Front

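lubridate function that creates a time span between a start and an end date-time, from which the time elapsed can be computed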

Back

ylab(" ")

Front

creates a y axis title in ggplot

Back

full_join( )

Front

returns all rows and all columns from both the first table and the second table. Where there are not matching values, returns NA for the one missing. This is a mutating join.

Back

geom_boxplot( )

Front

creates a boxplot in ggplot

Back

layers

Front

adding a new layer on top of an existing data graphic. This new layer can provide context or comparison, but there's a limit to how many layers humans can reliably parse.

Back

lag(x)

Front

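dplyr function that shifts x back by one position, so each entry holds the previous value (the first becomes NA); useful for computing the change since the previous observation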

Back

geom_smooth( )

Front

adds a smoothed trend line (with a confidence band by default) to the ggplot; method = chooses the smoother, e.g. "lm" or "loess"

Back

numeric scale

Front

a numeric quantity is most commonly set on a linear, logarithmic, or percentage scale.

Back

color

Front

(either) to what extent? how severe?

Back

coord_flip( )

Front

switches the x and y axes.

Back

spread( )

Front

converts a data table from narrow to wide

Back

polar coordinate system

Front

the radial analog of the Cartesian system, with points identified by radius ρ and angle θ

Back

xlab(" ")

Front

creates an x axis title in ggplot

Back

area

Front

(numerical) how big (in two dimensions)?

Back

volume

Front

(numerical) how big (in three dimensions)?

Back

anti_join( )

Front

returns all rows from the first table where there are NOT matching values in the second table, keeping just columns from first table.

Back

context

Front

titles/subtitles; helps the viewer make meaningful comparisons.

Back

geom_polygon()

Front

connects the points in order, joins the end point back to the start point, and fills the resulting closed polygon shape

Back

scale

Front

translates values into visual cues

Back

facets

Front

a single data graphic composed of several small multiples of the same basic plot, with one (discrete) variable changing in each of the small sub-images.

Back

mutate( )

Front

adds new columns or modifies existing ones

Back

direction

Front

(numerical) at what slope? in a time series, going up or down?

Back

piping %>%

Front

allows you to put the output of one function into the input of another

Back
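
A sketch of a pipeline that chains the dplyr verbs from this section. It assumes the dplyr package is installed; the sales data and its column names are invented for illustration.

```r
library(dplyr)

# Hypothetical data, invented for illustration
sales <- data.frame(
  region = c("East", "East", "West", "West", "West"),
  units = c(10, 20, 5, 15, 40),
  price = c(2, 2, 3, 3, 3),
  stringsAsFactors = FALSE
)

summary_tbl <- sales %>%
  filter(units >= 10) %>%               # keep rows meeting the condition
  mutate(revenue = units * price) %>%   # add a new column
  select(region, revenue) %>%           # keep only these columns
  group_by(region) %>%                  # switch to per-region groups
  summarize(total = sum(revenue)) %>%   # collapse to one row per group
  arrange(desc(total))                  # sort, largest total first

print(summary_tbl)   # West 165, then East 60
```

Each %>% passes the result on its left as the first argument of the function on its right, so the steps read top to bottom in the order they run.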

right_join( )

Front

returns all rows of the second table, regardless of whether there's a match in the first table. NAs are inserted into columns where no matching data is found.

Back

summarize( )

Front

aggregates data across rows

Back

arrange( )

Front

sorts the rows by the values of the columns specified

Back

shape

Front

(categorical) belonging to which group?

Back

geom_path()

Front

connects the dots given by x (e.g. longitude) and y (e.g. latitude), in the order they appear in the data, within a given group

Back

geom_bar( )

Front

adds bars to the ggplot, which creates a bar chart

Back

x1 <- list(c(1, 2), c(3, 4))
x2 <- list(list(1, 2), list(3, 4))
x3 <- list(1, list(2, list(3)))

Front

create the following lists

Back

gather( )

Front

converts a data table from wide to narrow

Back

Section 3

(14 cards)

geom_text( )

Front

plots text on the ggplot

Back

left_join( )

Front

returns all rows of the first table, regardless of whether there's a match in the second table. NAs are inserted into columns where no matching data is found.

Back

x[location_start:location_end]

Front

how to extract a consecutive range of entries from a vector (x)

Back

x$variable[ ] or x[[ ]][ ]

Front

how to extract an entry from a specific variable from x

Back

x[c(location_one, location_two)]

Front

how to extract certain entries from a vector (x)

Back

na.rm=TRUE

Front

argument that tells a function such as mean( ) or sum( ) to remove NA values before computing

Back
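
A sketch of the two ways this deck handles missing values: the na.rm argument and subsetting with !is.na( ). The vector x is made up for the example.

```r
x <- c(4, NA, 10, NA, 1)

print(mean(x))                 # NA: missing values propagate by default
print(mean(x, na.rm = TRUE))   # 5
print(sum(x, na.rm = TRUE))    # 15

# Subsetting with !is.na() drops the NAs from the vector itself
x_complete <- x[!is.na(x)]
print(x_complete)              # 4 10 1

# sum() on is.na() counts the missing values
n_missing <- sum(is.na(x))
print(n_missing)               # 2
```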

as_datetime( )

Front

lubridate function that converts an object to a date-time

Back

parse_number( )

Front

readr function that drops non-numeric characters before and after the first numeric value, e.g. "gradrate_2014" -> 2014

Back

which( )

Front

gives index (location) values of entries that satisfy the condition

Back
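
A short sketch of which( ) returning positions rather than values. The vector x is invented for illustration.

```r
x <- c(12, 7, 25, 3, 18)

# which() returns the positions where the condition is TRUE
idx <- which(x > 10)

# Those positions can then be used to pull out the values
vals <- x[idx]

print(idx)    # 1 3 5
print(vals)   # 12 25 18
```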

inner_join( )

Front

returns only the rows that have matching values in both tables, with the columns from both tables.

Back
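
A sketch comparing the six join cards in this deck on two tiny tables. It assumes the dplyr package is installed; the students and grades tables are hypothetical.

```r
library(dplyr)

# Hypothetical tables that share the key column id
students <- data.frame(id = c(1, 2, 3), name = c("Ann", "Bo", "Cy"),
                       stringsAsFactors = FALSE)
grades <- data.frame(id = c(2, 3, 4), grade = c("A", "B", "C"),
                     stringsAsFactors = FALSE)

inner <- inner_join(students, grades, by = "id")   # ids 2 and 3
left  <- left_join(students, grades, by = "id")    # ids 1, 2, 3; grade is NA for 1
right <- right_join(students, grades, by = "id")   # ids 2, 3, 4; name is NA for 4
full  <- full_join(students, grades, by = "id")    # ids 1, 2, 3, 4
semi  <- semi_join(students, grades, by = "id")    # ids 2 and 3, students columns only
anti  <- anti_join(students, grades, by = "id")    # id 1, students columns only
```

The first four are mutating joins, which add columns from the second table. semi_join( ) and anti_join( ) are filtering joins, which only decide which rows of the first table to keep.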

value =

Front

in gather( ), the name of the new variable in the narrow format that holds the values combined from the wide-format columns

Back

x[location_end:location_start]

Front

how to reverse the order of entries in a vector (x)

Back

x[row number, "column name"]

Front

how to extract a specific entry from a data frame or matrix (x)

Back
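
A sketch pulling together the indexing patterns from this section. The vector x and data frame df are made up for the example.

```r
x <- c(10, 20, 30, 40, 50)

print(x[2:4])        # 20 30 40: a consecutive range
print(x[c(1, 5)])    # 10 50: specific positions
print(x[5:1])        # 50 40 30 20 10: reversed order

# Hypothetical data frame
df <- data.frame(name = c("Foo", "Bar"), age = c(2, 5),
                 stringsAsFactors = FALSE)

print(df[2, "age"])      # 5: row number, then column name
print(df$age[1])         # 2: pick the variable, then index into it
print(df[["name"]][2])   # "Bar": the same idea with double brackets
```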