These flashcards encompass essential concepts and functions related to data visualization and manipulation in R, aiming to help the student review for their midterm exam.
The function to import comma-separated data in R is __.
read_csv("file.csv")
To export a tibble as CSV, you would use __.
write_csv(df, "out.csv")
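As a minimal sketch of the two cards above, the snippet below writes a small tibble to a temporary CSV file and reads it back. The data and column names are made up for illustration, and it assumes the readr and tibble packages are installed.

```r
library(readr)
library(tibble)

# Hypothetical example data; the column names are illustrative only.
df <- tibble(id = 1:3, score = c(90.5, 82, 77.25))

# Write to a temporary file so nothing is left in the working directory.
path <- tempfile(fileext = ".csv")
write_csv(df, path)

# Read the file back; show_col_types = FALSE silences the column-spec message.
df_back <- read_csv(path, show_col_types = FALSE)
```

Note that read_csv() guesses column types from the file contents, so the types after a round trip may differ from the original (for example, integer columns may come back as doubles).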
The tidy data principle states that each __ corresponds to one column.
variable
Use the pipe %>% to __ multiple steps together.
chain
To check for any missing values in a data frame, you can use __.
any(is.na(df))
In data inspection, the function __ provides a compact structure view.
glimpse(df)
To convert a character variable to a factor, the function used is __.
as_factor(x)
In dplyr, the function __ is used to remove duplicate rows.
distinct()
To sort rows in a data frame, you would use __.
arrange()
The function __ summarizes multiple columns at once.
summarize(across())
To create a new variable based on an existing one, you can use __.
mutate()
The core function to count frequency of values in dplyr is __.
count(var)
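The dplyr verbs from the last few cards can be combined in one short pipeline. This is only a sketch on made-up data (the tibble and its column names are assumptions, not part of the deck), and it assumes dplyr is installed.

```r
library(dplyr)

# Hypothetical data with one exact duplicate row (rows 1 and 2).
df <- tibble(
  group = c("a", "a", "b", "b", "b"),
  x = c(1, 1, 2, 3, 4),
  y = c(10, 10, 20, 30, 40)
)

# Remove duplicate rows, add a derived column, then sort by x descending.
result <- df %>%
  distinct() %>%
  mutate(ratio = y / x) %>%
  arrange(desc(x))

# Summarize several columns at once: the mean of x and of y.
means <- df %>%
  summarize(across(c(x, y), mean))

# Count how often each value of group occurs.
counts <- df %>%
  count(group)
```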
The command __ reshapes data from wide format to long format.
pivot_longer()
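To make the wide-to-long reshape concrete, here is a small sketch using tidyr. The table and the names `question` and `score` are hypothetical. It also applies pivot_wider(), the tidyr function for the opposite direction, to show that the reshape can be undone.

```r
library(tidyr)
library(tibble)

# Hypothetical wide data: one row per id, one column per question.
wide <- tibble(id = c(1, 2), q1 = c(5, 3), q2 = c(4, 2))

# Wide to long: one row per (id, question) pair.
long <- pivot_longer(
  wide,
  cols = c(q1, q2),
  names_to = "question",
  values_to = "score"
)

# Long back to wide, recovering the original layout.
wide_again <- pivot_wider(long, names_from = question, values_from = score)
```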
To fill missing values in a variable, use __.
replace_na(list(var = value))
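A short sketch of checking for and filling missing values, assuming tidyr and tibble are installed. The data and the fill value of 0 are arbitrary choices for illustration.

```r
library(tidyr)
library(tibble)

# Hypothetical data with one missing value.
df <- tibble(name = c("a", "b", "c"), value = c(1, NA, 3))

# TRUE if at least one cell in the data frame is NA.
has_missing <- any(is.na(df))

# Replace NA in the value column with 0; other columns are left untouched.
filled <- replace_na(df, list(value = 0))
```

When used in a pipe, the data frame is passed as the first argument, which is why the card shows only the list: `df %>% replace_na(list(value = 0))`.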
The function __ allows joining two data frames using a common key.
left_join()
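The following sketch shows what a left join keeps and what it fills in. The two tables and the key name `customer_id` are invented for the example, and dplyr is assumed to be installed.

```r
library(dplyr)

# Hypothetical tables sharing the key column customer_id.
orders <- tibble(order_id = 1:3, customer_id = c(10, 20, 30))
customers <- tibble(customer_id = c(10, 20), name = c("Ana", "Ben"))

# Keep every row of orders; unmatched keys get NA in the joined columns.
joined <- left_join(orders, customers, by = "customer_id")
```

Here customer 30 has no match in `customers`, so its `name` is NA while the order row itself is kept.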
A vector data format that stores both geometry and attributes for spatial features is called __.
a shapefile.
In ggplot2, to add a regression line, you use __.
geom_smooth(method = "lm")
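As an illustration, a regression line is usually layered on top of a scatter plot. The sketch below uses R's built-in `mtcars` dataset, which is an assumption chosen for convenience and is not part of the deck. Setting `se = FALSE` hides the confidence band.

```r
library(ggplot2)

# Scatter plot of weight vs. fuel economy with a fitted linear regression line.
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE)

# print(p) would draw the plot in an interactive session.
```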
The projection (coordinate reference system) of spatial data read from a shapefile can be changed using __.
st_transform(object, crs = 4326)
To visualize distribution by category, you would typically use __.
geom_boxplot()
The function __ is used to handle multiple conditions in data transformation.
case_when()
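A brief sketch of case_when() inside mutate(), assuming dplyr is installed. The cutoffs and labels are hypothetical. Conditions are checked in order and the first one that is true determines the result, so the more specific conditions go first.

```r
library(dplyr)

# Hypothetical scores to be binned into labels.
scores <- tibble(score = c(95, 72, 40))

graded <- scores %>%
  mutate(
    grade = case_when(
      score >= 90 ~ "A",     # checked first
      score >= 60 ~ "pass",  # only reached when score < 90
      TRUE ~ "fail"          # fallback for everything else
    )
  )
```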
To select a subset of rows based on conditions, you would use __.
filter()
The function to choose specific columns in a data frame is __.
select()
To perform grouped operations, first use __.
group_by()
To change the name of a column, use __.
rename()
To create a scatter plot in ggplot2, you would use __.
geom_point()
The command __ reshapes data from long format to wide format.
pivot_wider()
To remove rows containing any missing values, use __.
drop_na()
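Several of the verbs above often appear together in one pipeline. The sketch below is one possible combination on invented data (the tibble, its columns, and the age cutoff are all assumptions), with dplyr and tidyr assumed to be installed.

```r
library(dplyr)
library(tidyr)

# Hypothetical data with one missing weight.
df <- tibble(
  species = c("cat", "cat", "dog", "dog", "dog"),
  weight = c(4, NA, 20, 30, 25),
  age = c(2, 3, 5, 1, 4)
)

summary_tbl <- df %>%
  drop_na() %>%                    # removes the row with the missing weight
  filter(age >= 2) %>%             # keeps rows meeting the condition
  select(species, weight) %>%      # keeps only these two columns
  rename(weight_kg = weight) %>%   # new_name = old_name
  group_by(species) %>%            # later operations run per species
  summarize(mean_weight_kg = mean(weight_kg))
```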
To initialize a ggplot object, defining the default dataset and aesthetic mappings, you would use __.
ggplot()
Inside ggplot2, to map variables to visual aesthetics (like x, y, color), the function used is __.
aes()
To separate a plot into subplots based on one or more discrete variables in ggplot2, you can use __ or __.
facet_wrap() or facet_grid()
To add a main title, subtitle, captions, or axis labels to a ggplot visualization, you would use __.
labs()
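The four ggplot2 pieces above fit together as layers joined with `+`. The sketch below again uses the built-in `mtcars` dataset, and the title and axis labels are illustrative choices rather than anything from the deck.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +  # default data and aesthetic mappings
  geom_point() +                             # one point per car
  facet_wrap(~ cyl) +                        # one panel per cylinder count
  labs(
    title = "Fuel economy by weight",
    x = "Weight (1000 lbs)",
    y = "Miles per gallon"
  )

# print(p) would draw the plot in an interactive session.
```

facet_wrap() lays out panels for a single variable in a wrapped grid, while facet_grid() arranges panels by row and column variables, for example `facet_grid(am ~ cyl)`.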
In dplyr, to select rows by their integer position, you would use the function __.
slice()