Flashcards covering key concepts in R programming for data science.
R
An open-source programming language specifically designed for statistical computing and data visualization.
Primary purposes of R in data science
Data manipulation, statistical modeling, and graphical representation.
Atomic vectors
The simplest data structures in R; they are homogeneous, meaning every element has the same type.
Types of atomic vectors
Logical, Integer, Double, Character, Complex, Raw.
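As a minimal sketch of the six types, one vector of each kind can be created and checked with typeof(). The variable names are illustrative only.

```r
# One example vector per atomic type (variable names are illustrative)
lgl <- c(TRUE, FALSE)
int <- c(1L, 2L)            # the L suffix creates integers
dbl <- c(1.5, 2)
chr <- c("a", "b")
cpl <- c(1+2i, 3i)
rw  <- as.raw(c(1, 255))

types <- c(typeof(lgl), typeof(int), typeof(dbl),
           typeof(chr), typeof(cpl), typeof(rw))
print(types)

# Mixing types in c() coerces everything to one common type,
# which is how homogeneity is enforced
mixed <- c(1L, 2.5, "x")
print(typeof(mixed))
```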
List in R
A versatile data structure that can contain elements of different types.
Difference between a list and an atomic vector
Unlike atomic vectors, lists allow for heterogeneous elements.
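The contrast can be sketched with a small list whose elements keep their own types. The list name and its fields are made up for illustration.

```r
# A list keeps each element's own type (names and values are illustrative)
person <- list(name = "Ada", age = 36L, scores = c(90.5, 88))

element_types <- sapply(person, typeof)
print(element_types)

# Elements are extracted with $ or [[ ]]
print(person$name)
print(person[["scores"]])
```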
Data frame in R
A table-like structure in which columns can differ in type from one another, but all columns have the same length.
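A short sketch, using made-up column names and values, shows a data frame with three columns of three different types.

```r
# Column names and values are illustrative
df <- data.frame(
  id    = 1:3,
  name  = c("a", "b", "c"),
  score = c(9.5, 7.0, 8.25),
  stringsAsFactors = FALSE   # explicit, for R versions before 4.0
)

col_types <- sapply(df, class)
print(col_types)
print(dim(df))   # rows, then columns
```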
Matrix in R
A two-dimensional array that can only contain one data type throughout.
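The single-type rule can be seen in a small sketch: binding a character row onto an integer matrix converts every element to character. The values are arbitrary.

```r
# matrix() fills column by column by default
m <- matrix(1:6, nrow = 2, ncol = 3)
print(dim(m))
print(typeof(m))

# Adding a character row coerces the whole matrix to character
m2 <- rbind(m, c("x", "y", "z"))
print(typeof(m2))
```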
Packages in R
Collections of R functions, data, and compiled code bundled together.
How to install and load packages in R
They are installed using the install.packages("package_name") command and loaded using library(package_name).
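One common pattern, sketched here under assumptions, installs a package only when it is missing and then attaches it. The helper name ensure_loaded is hypothetical and not part of R. The example uses "tools", which ships with R, so nothing is downloaded.

```r
# Hypothetical helper: install a package only if it is missing, then attach it
ensure_loaded <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)   # needs network access and a configured CRAN mirror
  }
  # character.only = TRUE lets library() take the name from a variable
  library(pkg, character.only = TRUE)
}

# "tools" is bundled with R, so no installation is triggered here
ensure_loaded("tools")
```

Installation is needed once per R library; library() must be called again in each new session.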
ggplot2
An R package for creating high-quality, customizable visualizations, based on the grammar of graphics.
How ggplot2 improves data visualization
Allows users to construct plots layer by layer (data, aesthetic mappings, geometric objects, labels), so that complex visualizations are composed from simple, readable parts.
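A rough sketch of layering, assuming ggplot2 is installed and using the mtcars dataset that ships with R. The choice of variables and labels is illustrative.

```r
library(ggplot2)

# Each + adds one layer or component to the plot object
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +   # data and aesthetic mappings
  geom_point() +                               # layer 1: scatter points
  geom_smooth(method = "lm", se = FALSE) +     # layer 2: fitted straight line
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon")

# Calling print(p) draws the plot on the active graphics device
```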
dplyr
An R package designed for data manipulation.
Common dplyr functions
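Functions like filter() (keep rows that match a condition), select() (keep columns), mutate() (add or change columns), and summarise() (collapse rows into summary values) help streamline data analysis workflows.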
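A short pipeline can sketch how these functions combine, assuming dplyr is installed and using the built-in mtcars dataset. group_by() is another dplyr function, added here so that the summary is computed per group. The new column names are illustrative.

```r
library(dplyr)

result <- mtcars %>%
  filter(cyl %in% c(4, 6)) %>%           # keep only 4- and 6-cylinder cars
  select(cyl, mpg, wt) %>%               # keep three columns
  mutate(wt_kg = wt * 453.592) %>%       # wt is in units of 1000 lbs
  group_by(cyl) %>%
  summarise(
    mean_mpg   = mean(mpg),
    mean_wt_kg = mean(wt_kg),
    n          = n()                     # number of rows per group
  )

print(result)
```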
RStudio
An integrated development environment (IDE) for R that provides a user-friendly interface.
Features of RStudio
It includes features like a script editor, console, environment viewer, and file browser.
R Markdown
A file format that allows users to combine narrative text, code, and outputs in a single document.
Use of R Markdown
Creating dynamic reports, presentations, and reproducible research.
Tidyverse
A collection of R packages that share an underlying philosophy and common data structures for data science.
Key packages in the Tidyverse
ggplot2, dplyr, tidyr, readr, purrr, and tibble.