R CODE

68 Terms

New cards

example()

example on how to use the function

New cards

? (help operator)

provides documentation on a specific function or dataset that is part of a package

New cards

<-

assignment operator

— assigning value to an object / variable

New cards

ls()

lists the names of the objects you have created / have available in the current environment

New cards

rm()

removes objects that you don’t need
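A minimal sketch of assigning, listing, and removing objects. The object names x and y are made up for illustration.

```r
x <- 5            # assign the value 5 to the object x
y <- c(1, 2, 3)   # assign a numeric vector to the object y

ls()              # character vector of object names; includes "x" and "y"

rm(x)             # remove x from the environment
"x" %in% ls()     # FALSE: x is gone
"y" %in% ls()     # TRUE: y is still there
```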

New cards

seq()

creates a sequence of numbers

New cards

length.out= argument

sets a desired length of the sequence
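A short sketch of seq() with a fixed step (by=) versus a fixed length (length.out=). The variable names are illustrative only.

```r
a  <- seq(1, 10)                  # 1 2 3 ... 10 (default step of 1)
b  <- seq(0, 1, by = 0.25)        # fix the step size: 0.00 0.25 0.50 0.75 1.00
c5 <- seq(0, 1, length.out = 5)   # fix the length instead: the same five values

length(c5)                        # 5
```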

New cards

library()

loads packages

New cards

every time a new R session is opened

when to use the library() function?

New cards

int

stands for integers

EX]: 1,2,3

New cards

dbl

stands for doubles or real numbers

EX]: -1, 1.5, 4/5

New cards

continuous

measured data

can take infinitely many values within a possible range

EX]: i am 3.1” tall and i weigh 34.16 grams

New cards

discrete

observations can only exist at limited values, often counts

EX]: i have 8 legs and 4 spots

New cards

date

stands for dates

EX]: (01/21/2025)

New cards

dttm

stands for date-times, a date + a time

EX]: (01/21/2025 11:00am)

New cards

fctr

stands for factors (printed as fct in current tibble output)

R uses factors to represent categorical variables with a fixed set of possible values

EX]: freshman, sophomore, junior, senior

New cards

lgl

stands for logical

— vectors that contain only TRUE or FALSE values

New cards

chr

stands for characters

character vectors, i.e. text strings

EX]: “this is a string”
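The column-type abbreviations above correspond to base R types and classes. A minimal sketch for checking them on single values; the variable names are illustrative.

```r
t_int <- typeof(1L)                    # "integer"   (int; the L suffix makes an integer)
t_dbl <- typeof(1.5)                   # "double"    (dbl)
t_lgl <- typeof(TRUE)                  # "logical"   (lgl)
t_chr <- typeof("this is a string")    # "character" (chr)

c_date <- class(as.Date("2025-01-21"))                            # "Date"
c_dttm <- class(as.POSIXct("2025-01-21 11:00:00", tz = "UTC"))    # "POSIXct" "POSIXt"
c_fct  <- class(factor(c("freshman", "sophomore", "junior", "senior")))  # "factor"
```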

New cards

nominal

unordered descriptions

EX]: “i’m a turtle” and “i’m a butterfly”

New cards

ordinal

ordered descriptions

EX]: “i am unhappy” and “i am awesome”
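One way to see the nominal / ordinal difference in R is an unordered versus an ordered factor. This is a sketch; the values and level order are made up for illustration.

```r
# nominal: categories with no inherent order
animal <- factor(c("turtle", "butterfly"))
is.ordered(animal)    # FALSE

# ordinal: categories with a meaningful order, declared through levels
mood <- factor(c("unhappy", "awesome"),
               levels = c("unhappy", "awesome"),
               ordered = TRUE)
is.ordered(mood)      # TRUE
mood[1] < mood[2]     # TRUE: "unhappy" ranks below "awesome"
```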

New cards

binary

only 2 mutually exclusive outcomes

New cards

vector

the simplest data structure in R, which consists of an ordered set of values of the same type (e.g. numeric, character, date, etc.)

New cards

scalar

a vector of length 1
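A quick sketch showing that a single value is just a vector of length 1, and that mixing types in one vector forces a common type. Names are illustrative.

```r
v <- c(10, 20, 30)
length(v)        # 3

s <- 42          # a "scalar" is a vector of length 1
length(s)        # 1
is.vector(s)     # TRUE

mixed <- c(1, "a", TRUE)   # all values are coerced to the same type
typeof(mixed)              # "character"
```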

New cards

data frame / tibble / dataset

data structure that organizes data into a 2-dimensional table of rows and columns, much like a spreadsheet.

New cards

glimpse()

get a sense of all the columns and their content

New cards

str()

get a more detailed sense of the columns and all their contents

New cards

colnames()

get the column names of your dataset as a character vector

New cards

functions to get to know your data

glimpse()
str()
colnames()
?
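A sketch of these functions on the built-in mtcars dataset. glimpse() comes from the dplyr package, so the call is guarded here in case dplyr is not installed (an assumption about your setup).

```r
str(mtcars)                 # structure: type and first values of every column
cols <- colnames(mtcars)    # character vector of column names
cols

# glimpse() is part of dplyr; skip it if the package is missing
if (requireNamespace("dplyr", quietly = TRUE)) {
  dplyr::glimpse(mtcars)
}

# ?mtcars opens the documentation page for the dataset in an interactive session
```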

New cards

the 5 + 1 data manipulations

arrange()
filter()
select()
mutate()
summarize()

+ group_by()

New cards

arrange()

reorder / sort observations / rows

New cards

filter()

keep observations / rows based on conditions

New cards

select()

pick variables / columns

New cards

mutate()

create new variables / columns or update existing ones

New cards

summarize()

produce descriptive statistics

New cards

group_by()

change the unit of analysis by creating groups based on one or more variables / columns
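The 5 + 1 verbs can be sketched together on the built-in mtcars dataset. This assumes the dplyr package is installed; the new column names (wt_lb, mean_mpg, n_cars) are made up for illustration.

```r
library(dplyr)

# filter -> select -> mutate -> arrange, read top to bottom
light_cars <- mtcars %>%
  filter(mpg > 25) %>%             # keep rows that meet a condition
  select(mpg, cyl, wt) %>%         # pick columns
  mutate(wt_lb = wt * 1000) %>%    # new column (wt is recorded in 1000 lbs)
  arrange(desc(mpg))               # sort rows, highest mpg first

# group_by + summarize: one row of statistics per group
by_cyl <- mtcars %>%
  group_by(cyl) %>%
  summarize(mean_mpg = mean(mpg), n_cars = n())
```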

New cards

ends_with()

select function

matches names that end with whatever is in the ()

New cards

starts_with()

select function

matches names that start with whatever is in the ()

New cards

contains()

select function

matches names that contain whatever is in the ()
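A sketch of the three select helpers on the built-in iris dataset, assuming dplyr is installed. The result names are illustrative.

```r
library(dplyr)

sepal <- iris %>% select(starts_with("Sepal"))   # Sepal.Length, Sepal.Width
width <- iris %>% select(ends_with("Width"))     # Sepal.Width, Petal.Width
petal <- iris %>% select(contains("etal"))       # Petal.Length, Petal.Width

colnames(sepal)
```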

New cards

transmute()

computes new columns and keeps only those, dropping all the other columns
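A sketch contrasting mutate() and transmute() on mtcars, assuming dplyr is installed. The column name wt_lb is made up for illustration.

```r
library(dplyr)

with_mutate    <- mtcars %>% mutate(wt_lb = wt * 1000)     # keeps all 11 columns and adds 1
with_transmute <- mtcars %>% transmute(wt_lb = wt * 1000)  # keeps only the new column

ncol(with_mutate)      # 12
ncol(with_transmute)   # 1
```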

New cards

na.rm=T argument

removes missing values (NA) before the calculation; critical if the column you are using for your average contains missing values, because otherwise the result is NA
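A small sketch of why na.rm matters. The scores vector is made-up example data.

```r
scores <- c(80, 90, NA, 100)

mean(scores)                 # NA: one missing value makes the whole mean missing
mean(scores, na.rm = TRUE)   # 90: the NA is dropped before averaging
```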

New cards

pipe operator (%>% or |>)

can be used to rewrite multiple nested operations as a chain that reads top to bottom. think of it like reading “then”
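A sketch of the same calculation written nested and then as a chain. It uses the base R pipe |>, which assumes R 4.1 or later; %>% from dplyr / magrittr reads the same way.

```r
# nested: read from the inside out
nested <- round(mean(sqrt(c(1, 4, 9))), 1)

# piped: take the numbers, THEN take square roots, THEN average, THEN round
piped <- c(1, 4, 9) |>
  sqrt() |>
  mean() |>
  round(1)

identical(nested, piped)   # TRUE
```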

New cards

measures of location

mean() and median()

New cards

n()

counts the number of rows in the current group (used inside summarize())

New cards

mean()

the sum divided by the length

New cards

median()

a value where 50% of the values are above it and 50% are below it

New cards

measures of spread

sd()

New cards

sd()

the standard deviation: the square root of the average squared deviation from the mean (R divides by n - 1)

New cards

measures of rank

min() and max()

New cards

min()

identifies the smallest value

New cards

max()

identifies the largest value
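A sketch of the location, spread, and rank functions on one made-up vector, including a by-hand check of what sd() computes.

```r
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

mean(x)     # 5    (sum divided by length)
median(x)   # 4.5  (middle of the sorted values)
min(x)      # 2
max(x)      # 9

# sd() uses n - 1 in the denominator (sample standard deviation)
by_hand <- sqrt(sum((x - mean(x))^2) / (length(x) - 1))
all.equal(sd(x), by_hand)   # TRUE
```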

New cards

data= argument

specifies the dataset (data frame) to use in the graph, so the data is loaded in the background for every layer

New cards

mapping=aes()

maps what variables we want to visualize on our axes

New cards

geom

determines the visual structure / shape of the chart

New cards

aesthetics

color / fill
size
alpha (transparency)
shape
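The pieces above (data=, mapping=aes(), a geom, and aesthetics) fit together as in this sketch. It assumes the ggplot2 package is installed; the specific color, size, alpha, and shape values are arbitrary choices for illustration.

```r
library(ggplot2)

p <- ggplot(data = mtcars, mapping = aes(x = wt, y = mpg)) +   # dataset and axes
  geom_point(color = "steelblue",   # geom sets the shape of the chart: points
             size = 3,
             alpha = 0.6,           # 0 = fully transparent, 1 = opaque
             shape = 17)            # 17 = filled triangle

# print(p) draws the chart in an interactive session
```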

New cards

choosing the right chart depends on…

the data type of the columns: is the data numerical or categorical?
the objective of the chart: what is trying to be conveyed with your chart?

New cards

distribution chart

shows how values in a dataset are spread out or clustered.

highlights the range of data, concentration of data points, and whether data tends to be skewed towards specific values.

EX]: histograms, boxplots, violin, and density plots

New cards

correlation chart

used to examine the relationship between two (or more) numerical variables.

! correlation does not imply causation

EX]: scatter plot, smoothing lines, 2d charts, heatmaps, and correlograms.

New cards

ranking chart

displays how different categories (categorical variables) compare in terms of a certain measure

EX]: bar charts, lollipop charts, dot plots

New cards

evolution chart / time-series chart

shows how a variable changes over time

highlights trends, patterns, seasonality, or fluctuations over a period

EX]: line chart
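One possible example of each chart family, sketched with ggplot2 (assumed installed). The datasets are mtcars from base R and economics, which ships with ggplot2; the object names are illustrative.

```r
library(ggplot2)

# distribution: how mpg values are spread out
p_dist <- ggplot(mtcars, aes(x = mpg)) + geom_histogram(bins = 10)

# correlation: relationship between two numerical variables
p_corr <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()

# ranking: compare categories on one measure (average mpg per cylinder count)
avg_mpg <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)
p_rank <- ggplot(avg_mpg, aes(x = factor(cyl), y = mpg)) + geom_col()

# evolution: a variable over time
p_time <- ggplot(economics, aes(x = date, y = unemploy)) + geom_line()
```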

New cards

geom_col()

use when you supply two variables (x and y): a categorical variable and a numerical variable that sets the bar height

New cards

geom_bar()

use when you supply only one variable (x or y); the bar height is the count of rows in each category
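A sketch of the geom_bar() / geom_col() difference, assuming ggplot2 is installed. Both charts show the number of cars per cylinder count in mtcars: geom_bar() counts the rows itself, while geom_col() is handed counts that were computed beforehand.

```r
library(ggplot2)

# geom_bar(): one variable, ggplot counts the rows per category
p_bar <- ggplot(mtcars, aes(x = factor(cyl))) + geom_bar()

# geom_col(): two variables, the heights are already in the data
counts <- as.data.frame(table(cyl = mtcars$cyl))   # columns: cyl, Freq
p_col <- ggplot(counts, aes(x = cyl, y = Freq)) + geom_col()
```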

New cards

static / local

if you set an aesthetic to a fixed value in the geom function (outside aes()), it is…

applies that one value to the whole layer; no legend is added

New cards

dynamic / global

if you map an aesthetic to a variable inside mapping=aes(), it is…

ties the aesthetic to a variable in the data and automatically adds a legend
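A sketch of the two cases with the color aesthetic, assuming ggplot2 is installed. The object names are illustrative.

```r
library(ggplot2)

# static: a fixed value set in the geom, every point is blue, no legend
p_static <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(color = "blue")

# dynamic: color mapped to a variable inside aes(), one color per cyl value plus a legend
p_mapped <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point()
```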

New cards

facet_wrap()

splits the chart into panels by a single variable

New cards

facet_grid()

splits the chart into a grid of panels by the combination of two variables
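A sketch of both facet functions on mtcars, assuming ggplot2 is installed. The choice of cyl and am as facet variables is only for illustration.

```r
library(ggplot2)

base <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()

p_wrap <- base + facet_wrap(~ cyl)      # one panel per value of cyl
p_grid <- base + facet_grid(am ~ cyl)   # rows = am, columns = cyl
```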

New cards

linetype= argument

adds different line styles (solid, dashed, etc) to differentiate

New cards

se=F argument

removes the gray area (the confidence interval around the smooth) in the geom_smooth() function; the default, se=TRUE, displays it

New cards

method="lm" argument

fits a straight line (linear model) in a geom_smooth() function
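A sketch combining linetype=, se=, and method= in one chart, assuming ggplot2 is installed. Mapping linetype to the am column is an arbitrary choice for illustration.

```r
library(ggplot2)

p_smooth <- ggplot(mtcars, aes(x = wt, y = mpg, linetype = factor(am))) +
  geom_point() +
  geom_smooth(method = "lm",     # straight-line fit instead of a curve
              formula = y ~ x,   # stated explicitly to avoid the default-formula message
              se = FALSE)        # hide the gray confidence band

# print(p_smooth) draws one solid and one dashed line, one per value of am
```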