often stores datasets. A special type of list that is more general than a matrix. A two dimensional array with columns of vectors of the same length, but possibly of different types. Can be created using the data.frame( ) command. Variables can be accessed using the $.
Back
cbind( )
Front
yields a matrix object with columns specified
Back
sum( )
Front
adds up the elements of a vector; applied to a logical condition, it gives the count of the elements meeting the condition
Back
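A minimal sketch of counting with sum( ), assuming a made-up numeric vector `x`. Each TRUE in the logical condition counts as 1 and each FALSE as 0.

```r
# Hypothetical data, invented for illustration
x <- c(3, 8, 12, 5, 1)

total <- sum(x)      # ordinary sum of the elements
n_big <- sum(x > 4)  # number of elements greater than 4

total
n_big
```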
character vector
Front
the most complex type of atomic vector. Each element is a string, which can be an arbitrary amount of data.
Back
rbind( )
Front
yields a matrix object with rows specified
Back
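A short sketch contrasting cbind( ) and rbind( ), assuming two made-up vectors `a` and `b` of equal length.

```r
# Hypothetical vectors, invented for illustration
a <- c(1, 2, 3)
b <- c(4, 5, 6)

by_col <- cbind(a, b)  # a and b become the columns of a 3 x 2 matrix
by_row <- rbind(a, b)  # a and b become the rows of a 2 x 3 matrix

dim(by_col)
dim(by_row)
```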
include=FALSE
Front
option that runs the code but omits both the R code and its results from the output
Back
double vectors
Front
belongs to the numeric class; the default type of number in R.
Back
list
Front
an ordered collection of elements that may have different data types and different lengths; a list can even contain other lists
Back
```{r ChunkName1}
```
Front
option that names code chunk
Back
typeof( )
Front
provides information about the underlying data structure of objects (i.e. logical, integer, double, complex, character, list).
Back
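A small sketch of typeof( ) on the kinds of objects named in these cards. The variable names are made up for illustration.

```r
t_lgl  <- typeof(TRUE)           # "logical"
t_int  <- typeof(5L)             # "integer" (note the L suffix)
t_dbl  <- typeof(5)              # "double", the default for numbers
t_chr  <- typeof("five")         # "character"
t_list <- typeof(list(1, "a"))   # "list"

# A data frame is stored as a list, but its class is "data.frame"
t_df <- typeof(data.frame(x = 1))
c_df <- class(data.frame(x = 1))
```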
exists function
Front
conveys whether an object exists in the workspace
Back
logical vectors
Front
simplest type of atomic vector; three possible values: TRUE, FALSE, or NA
Back
library( )
Front
a function that will load an installed package
Back
recursive vectors
Front
another word for list, meaning that lists can contain other lists
Back
replicability
Front
the same person (or different people) getting the same results using different data
Back
attributes( )
Front
displays the attributes associated with an object
Back
rm command
Front
removes an object in the workspace
Back
*sentence*: the square root of 77 is `r sqrt(77)`
first <- "Foo"
last <- "Fu"
*sentence*: the bunny's name is `r paste(first, last)`
Front
How to say "The square root of 77 is 8.775"
and/or
The bunny is named Foo Fu
Back
data science
Front
a combination of statistics, computer science, and mathematics that focuses on extracting meaningful information from data from a specific domain.
Back
install.packages( )
Front
a function used to download/install packages
Back
vector
Front
a one-dimensional array of items of the same data type (homogeneous)
Back
position
Front
(numerical) where in relation to other things?
Back
object
Front
something stored in R's memory; created using the <- or = assignment
Back
message=FALSE or warning=FALSE
Front
option that suppresses messages/warnings in output
Back
paste( )
Front
a function that converts its arguments to character strings and joins them together (separated by a space by default)
Back
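A brief sketch of paste( ), reusing the `first` and `last` values from the inline-code card. The `sep` and `collapse` arguments are standard paste( ) arguments.

```r
first <- "Foo"
last  <- "Fu"

full   <- paste(first, last)                   # "Foo Fu" (default separator is a space)
dashed <- paste(first, last, sep = "-")        # "Foo-Fu"
joined <- paste(c("a", "b", "c"), collapse = "+")  # "a+b+c" (one string from a vector)
```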
eval=FALSE
Front
option that does not run a code chunk, but displays the code
Back
reproducibility
Front
the same person (or different people) getting the same results using the same data
Back
results='hide' or fig.show='hide'
Front
option that runs a code chunk but suppresses output
Back
unlist( )
Front
a function that flattens (or makes a vector out of) the elements in a list; coerces objects into a common type.
Back
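A minimal sketch of the coercion that unlist( ) performs, assuming a made-up list with mixed types.

```r
# Hypothetical list mixing a number, a string, and a logical
mixed <- list(1, "a", TRUE)

flat <- unlist(mixed)  # everything is coerced to the most general type present

flat
typeof(flat)
```

Because one element is a string, every element of the result becomes a character string.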
prompt=TRUE
Front
option that shows > prompt in front of commands
Back
[ ]
Front
extracts a sub-list
Back
angle
Front
(numerical) how wide? parallel to something else?
Back
str( )
Front
provides information about the structure of each object
Back
"Environment" tab
Front
shows the names (and values) of all the objects that exist in the current workspace
Back
length
Front
(numerical) how big (in one dimension)?
Back
collapse=TRUE
Front
option that merges the code and its results into a single block in the chunk output
Back
data.frame
Front
Where data sets are often stored; special type of list that is more general than a matrix. A two-dimensional array with columns of vectors of the same length, but of possibly different types.
Back
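A short sketch of building a data frame and pulling values out of it. The `pets` data set and its column names are invented for illustration.

```r
# Hypothetical data set: columns of equal length but different types
pets <- data.frame(
  name = c("Foo", "Bar", "Baz"),
  age  = c(2, 5, 1),
  stringsAsFactors = FALSE
)

ages       <- pets$age          # whole column as a vector
second_age <- pets$age[2]       # one entry from that column
first_name <- pets[["name"]][1] # same idea using [[ ]]

str(pets)  # compact summary of the structure
```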
Visual cues
Front
1) Position
2) Length
3) Angle
4) Direction
5) Shape
6) Area
7) Volume
8) Color
9) Shade
Back
matrix
Front
two dimensional vectors; rectangular objects where all entries have the same type; can be created using cbind( )
Back
$
Front
a shorthand for extracting named elements of a list; similar to [ [ ] ] except you don't need to use quotes.
Back
echo=FALSE
Front
option that omits R code but not results
Back
integer vectors
Front
belongs to the numeric class; made by placing an "L" after the number
Back
class( )
Front
a function that returns the classes to which an object belongs.
Back
comment=NULL
Front
option that places no comment ## in front of output
Back
[ [ ] ]
Front
extracts a single component from a list
Back
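A small sketch of the three ways to pull things out of a list, assuming a made-up list `pet`.

```r
# Hypothetical list, invented for illustration
pet <- list(name = "Foo Fu", age = 2, toys = c("ball", "carrot"))

sub_list <- pet["name"]     # [ ] keeps the list wrapper: a list of length 1
value    <- pet[["name"]]   # [[ ]] returns the component itself
same     <- pet$name        # $ is shorthand for [["name"]]
```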
error=TRUE
Front
option that knits document even with code error present
Back
Taxonomy for data graphics
Front
1) visual cues
2) coordinate system
3) scale
4) context
Back
c( ) function
Front
combines (concatenates) values into a vector
Back
Section 2
(50 cards)
facet_wrap()
Front
used to facet the plot by a single variable
Back
select( )
Front
takes a subset of the columns specified
Back
filter( )
Front
takes a subset of the rows specified
Back
cartesian coordinate system
Front
the familiar (x, y) rectangular coordinate system with two perpendicular axes
Back
shade
Front
(either) to what extent? how severe?
Back
geom_density( )
Front
creates a density curve in ggplot
Back
key =
Front
in spread( ), the name of the variable in the narrow format that identifies, for each case, which column in the wide format will receive the value.
Back
ungroup()
Front
returns "focus" back to the entire data frame
Back
coord_quickmap( )
Front
sets the aspect ratio correctly for maps.
Back
categorical scale
Front
may have no ordering (e.g. Democrat, Republican) or be ordinal (Never a Smoker, Current Smoker, Former Smoker)
Back
color=
or
label=
Front
conveys information about the data by mapping the aesthetics in the plot to the variables in your dataset
Back
draw the figure
Front
draw this figure
Back
geom_histogram( )
Front
creates a histogram in ggplot
Back
group_by()
Front
change the scope, or unit of analysis, from individual cases to aggregated groups
Back
[!is.na()]
Front
omit NAs
Back
semi_join( )
Front
returns all rows from the first table where there are matching values in the second table, keeping just columns from first table.
Back
geographic coordinate system
Front
used for locations on the curved surface of the Earth, which must be represented on a two-dimensional plane.
Back
geom_point( )
Front
adds a layer of points to the ggplot, which creates a scatterplot
Back
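A minimal sketch that layers several of the ggplot pieces from these cards, assuming the ggplot2 package is installed and using R's built-in `mtcars` data. The axis titles are made up for illustration.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(aes(color = factor(cyl))) +   # scatterplot layer, color mapped to a variable
  geom_smooth(method = "lm", se = FALSE) + # fitted straight line added on top
  facet_wrap(~ am) +                       # one panel per value of am
  xlab("Weight") +
  ylab("Miles per gallon")

p  # printing the object draws the figure
```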
interval( )
Front
creates a time interval between a start and an end date-time (lubridate); used to find the time elapsed
Back
ylab(" ")
Front
creates a y axis title in ggplot
Back
full_join( )
Front
returns all rows and all columns from both the first table and the second table. Where there are not matching values, returns NA for the one missing. This is a mutating join.
Back
geom_boxplot( )
Front
creates a boxplot in ggplot
Back
layers
Front
adding a new layer on top of an existing data graphic. This new layer can provide context or comparison, but there's a limit to how many layers humans can reliably parse.
Back
lag(x)
Front
returns the previous value in x (x shifted by one position); useful for finding the change since the previous observation
Back
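A short sketch of lag( ) as provided by dplyr (assumed installed), using a made-up vector. The first result is NA because the first entry has no previous value.

```r
library(dplyr)

# Hypothetical measurements taken in order
x <- c(10, 13, 19)

prev   <- lag(x)      # NA 10 13
change <- x - lag(x)  # NA  3  6
```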
geom_smooth( )
Front
adds a smoothed trend line (e.g. a loess curve or a regression line) to the ggplot
Back
numeric scale
Front
numeric values are most commonly placed on a linear, logarithmic, or percentage scale.
Back
color
Front
(either) to what extent? how severe?
Back
coord_flip( )
Front
switches the x and y axes.
Back
spread( )
Front
converts a data table from narrow to wide
Back
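A minimal sketch of converting between wide and narrow formats with gather( ) and spread( ) from tidyr (assumed installed). The `wide` table and its column names are invented for illustration.

```r
library(tidyr)

# Hypothetical wide table: one column per year
wide <- data.frame(
  school        = c("A", "B"),
  gradrate_2014 = c(80, 75),
  gradrate_2015 = c(82, 78)
)

# Wide to narrow: key names the column holding the old column names,
# value names the column holding the numbers
narrow <- gather(wide, key = "year", value = "gradrate",
                 gradrate_2014, gradrate_2015)

# Narrow back to wide
back <- spread(narrow, key = year, value = gradrate)
```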
polar coordinate system
Front
the radial analog of the cartesian system with points identified by radius ρ and angle θ
Back
xlab(" ")
Front
creates an x axis title in ggplot
Back
area
Front
(numerical) how big (in two dimensions)?
Back
volume
Front
(numerical) how big (in three dimensions)?
Back
anti_join( )
Front
returns all rows from the first table where there are NOT matching values in the second table, keeping just columns from first table.
Back
context
Front
titles/subtitles; helps the viewer make meaningful comparisons.
Back
geom_polygon()
Front
connects the points in order, joins the end point back to the start point, and fills the resulting closed polygon shape
Back
scale
Front
translates values into visual cues
Back
facets
Front
a single data graphic that can be composed of several small multiples of the same basic plot, with one (discrete) variable changing in each of the small sub-images.
Back
mutate( )
Front
adds new columns or modifies existing columns
Back
direction
Front
(numerical) at what slope? in a time series, going up or down?
Back
piping %>%
Front
allows you to put the output of one function into the input of another
Back
right_join( )
Front
returns all rows of the second table, regardless of whether there's a match in the first table. NAs are inserted into columns where no matching data is found.
Back
summarize( )
Front
aggregates data across rows
Back
arrange( )
Front
sorts the rows specified
Back
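A minimal sketch chaining the dplyr verbs from these cards with the pipe, assuming dplyr is installed. The `grades` data set is made up for illustration.

```r
library(dplyr)

# Hypothetical data set, invented for illustration
grades <- data.frame(
  student = c("A", "B", "C", "D", "E", "F"),
  section = c("x", "x", "y", "y", "y", "x"),
  score   = c(90, 72, 85, 60, 95, 88)
)

result <- grades %>%
  filter(score >= 70) %>%          # keep only rows meeting the condition
  select(section, score) %>%       # keep only these columns
  mutate(pct = score / 100) %>%    # add a new column
  group_by(section) %>%            # unit of analysis becomes the section
  summarize(mean_pct = mean(pct),  # one row per section
            n = n()) %>%
  arrange(desc(mean_pct))          # sort rows, largest mean first

result
```

Each step passes its output to the next function, so the pipeline reads top to bottom in the order the operations happen.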
shape
Front
(categorical) belonging to which group?
Back
geom_path()
Front
connects the points in the order they appear in the data, using x (e.g. longitude) and y (e.g. latitude), within a given group
Back
geom_bar( )
Front
adds bars to the ggplot, which creates a bar chart
Back
left_join( )
Front
returns all rows of the first table, regardless of whether there's a match in the second table. NAs are inserted into columns where no matching data is found.
Back
x[location_start:location_end]
Front
how to extract multiple consecutive entries from a vector (x)
Back
x$variable[ ]
or
x[[ ]][ ]
Front
how to extract an entry from a specific variable from x
Back
x[c(location_one, location_two)]
Front
how to extract certain entries from a vector (x)
Back
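A short sketch of the vector indexing patterns in these cards, assuming a made-up vector `x`. The same colon notation also reverses a vector when the larger position comes first.

```r
# Hypothetical vector, invented for illustration
x <- c(10, 20, 30, 40, 50)

middle   <- x[2:4]        # consecutive entries: 20 30 40
picked   <- x[c(1, 5)]    # specific entries: 10 50
reversed <- x[5:1]        # end position first reverses the order
```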
na.rm=TRUE
Front
argument that removes NA values before a function (e.g. mean( ) or sum( )) computes its result
Back
as_datetime( )
Front
converts an object to a date-time; part of lubridate
Back
parse_number( )
Front
drops non-numerics before/after first numeric value
EX: gradrate_2014 -> 2014
Back
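A brief sketch of parse_number( ) from readr (assumed installed), using the card's own example plus one made-up string.

```r
library(readr)

yr    <- parse_number("gradrate_2014")   # keeps the first number found: 2014
price <- parse_number("$1,234.56")       # drops the currency symbol and comma
```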
which( )
Front
gives index (location) values of entries that satisfy the condition
Back
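A minimal sketch tying together which( ), `!is.na( )`, and `na.rm=TRUE`, assuming a made-up vector containing one missing value.

```r
# Hypothetical data with one missing value
x <- c(3, 8, NA, 12, 5)

idx      <- which(x > 4)          # positions where the condition is TRUE: 2 4 5
no_na    <- x[!is.na(x)]          # drop the missing entry
avg_raw  <- mean(x)               # NA, because one value is missing
avg_good <- mean(x, na.rm = TRUE) # NA removed before averaging
```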
inner_join( )
Front
returns only the rows that have matching values in both tables, with the columns from both tables.
Back
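A short sketch comparing the dplyr joins from these cards, assuming dplyr is installed. The two tables and the shared `id` column are invented for illustration.

```r
library(dplyr)

# Hypothetical tables sharing an id column
students <- data.frame(id = c(1, 2, 3), name = c("Ann", "Bo", "Cy"))
scores   <- data.frame(id = c(2, 3, 4), score = c(88, 92, 75))

inner <- inner_join(students, scores, by = "id")  # ids 2, 3
left  <- left_join(students, scores, by = "id")   # ids 1, 2, 3 (score is NA for id 1)
right <- right_join(students, scores, by = "id")  # ids 2, 3, 4 (name is NA for id 4)
full  <- full_join(students, scores, by = "id")   # ids 1, 2, 3, 4
semi  <- semi_join(students, scores, by = "id")   # students with a match, first table's columns only
anti  <- anti_join(students, scores, by = "id")   # students without a match
```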
value =
Front
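in gather( ), the name of the narrow-format variable that holds the values combined from the wide-format columns; in spread( ), the narrow-format variable whose values are divided among the columns of the wide format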
Back
x[location_end:location_start]
Front
how to reverse the order of entries in a vector (x)