STAT 201 MT 1 code

7 Terms

1

calculating the CI from a bootstrap distribution

[image]
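
The code on this card survives only as an image, so the following is a minimal base R sketch of one common approach (the percentile method), not necessarily the card's own code. The data and the names sample_data, boot_means and ci are hypothetical.

```r
set.seed(201)  # fixed seed so the sketch is reproducible

# Hypothetical sample; in practice this is the observed data
sample_data <- rnorm(40, mean = 50, sd = 10)

# Bootstrap distribution: resample with replacement (same size as the
# original sample) and record the mean of each resample
boot_means <- replicate(
  1000,
  mean(sample(sample_data, size = length(sample_data), replace = TRUE))
)

# 95% percentile CI: the cut-offs that bound the middle 95% of the
# bootstrap distribution
ci <- quantile(boot_means, probs = c(0.025, 0.975))
ci
```

For a 90% interval the same call would use probs = c(0.05, 0.95).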

2

plotting a CI on a bootstrap distribution

[image: https://knowt-user-attachments.s3.amazonaws.com/1b909959-1d60-4a8b-b9d0-474abae939a3.png]

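The plotting code is only available as an image, so this is a hedged sketch of one way to do it with ggplot2: draw a histogram of the bootstrap distribution and mark the CI endpoints with vertical lines. The data, the column name boot_mean and the styling choices are assumptions for illustration.

```r
library(ggplot2)

set.seed(201)
sample_data <- rnorm(40, mean = 50, sd = 10)   # hypothetical sample

# Bootstrap distribution of the sample mean (1000 resamples)
boot_dist <- data.frame(
  boot_mean = replicate(1000, mean(sample(sample_data, replace = TRUE)))
)

# 95% percentile CI endpoints
ci <- quantile(boot_dist$boot_mean, probs = c(0.025, 0.975))

# Histogram of the bootstrap distribution with the CI endpoints marked
ci_plot <- ggplot(boot_dist, aes(x = boot_mean)) +
  geom_histogram(bins = 30, colour = "white") +
  geom_vline(xintercept = unname(ci), colour = "red", linetype = "dashed") +
  labs(x = "Bootstrap sample mean", y = "Count")
ci_plot
```
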
3

quantile()

calculates quantiles of a dataset; returns the cut points below which the requested proportions of the data fall (evenly spaced probabilities divide the data into equal-sized groups)

syntax:

quantile(x, probs)

x = a numeric vector containing your data

probs = a numeric vector of probabilities (0-1) for which you want to find the quantiles

[image]

*if you do not specify the probs argument, R returns the minimum, the quartiles, and the maximum by default: 0%, 25%, 50%, 75%, 100%
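
A short usage sketch of quantile() on a small made-up vector (x here is hypothetical data, chosen so the results are easy to check by hand):

```r
x <- 1:9   # hypothetical data

# No probs argument: minimum, quartiles, and maximum
default_q <- quantile(x)

# A single quantile: the 90th percentile
p90 <- quantile(x, probs = 0.9)

# Several quantiles at once, e.g. the cut-offs for a middle-95% interval
middle95 <- quantile(x, probs = c(0.025, 0.975))

default_q
```

The result is a named numeric vector; the names are the probabilities written as percentages.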

4

code for the normal approximation of the sampling distribution

[image: https://knowt-user-attachments.s3.amazonaws.com/a0282335-b122-454e-91d9-a53e4fb959e8.png]

(this is for the sampling distribution of means, in this case sd = sigma/sqrt(n))

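Since the card's code is only an image, here is a small sketch of the sd = sigma/sqrt(n) relationship, checked against a simulation. The values of mu, sigma and n are hypothetical, and the simulation is an added illustration rather than the card's method.

```r
mu    <- 50   # hypothetical population mean
sigma <- 10   # hypothetical population standard deviation
n     <- 25   # sample size

# Standard deviation of the sampling distribution of the mean
std_error <- sigma / sqrt(n)

# Simulation check: draw many samples of size n and record each mean
set.seed(201)
sample_means <- replicate(5000, mean(rnorm(n, mean = mu, sd = sigma)))

std_error          # theoretical value
sd(sample_means)   # simulated value, should be close to std_error
```
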
5

plotting the normal distribution/bell curve

[image: https://knowt-user-attachments.s3.amazonaws.com/c5d09cda-e9e2-476c-a51a-601f618add54.png]
  • geom_line : specifies that we want to add a line

  • data = tibble() : instead of using a pre-existing data frame, this code creates the data for the line (on the fly) using the tibble() function; this tibble has two columns:

    • values: a sequence of numbers that will serve as the x-coordinates for the curve 

    • density: calculates the corresponding y-coordinate for each value in the values column

  • dnorm(…) : calculates the height (probability density) of the normal distribution curve for each point in values

  • aes(values, density) : aesthetic mapping that tells geom_line() which columns to use for the plot: map the values column to the x-axis and the density column to the y-axis

*in short, the code generates a series of (x, y) coordinates for a perfect normal curve with a specific mean and standard deviation, then connects them with a line to visualize the curve

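The pieces described in the bullets above can be assembled roughly as follows. This is a sketch reconstructed from the description, not a copy of the code in the image; the mean, standard deviation, x-range and number of points are assumed values.

```r
library(ggplot2)
library(tibble)

mu    <- 0   # assumed mean of the curve
sigma <- 1   # assumed standard deviation of the curve

normal_plot <- ggplot() +
  geom_line(
    # Data for the line is built on the fly
    data = tibble(
      # x-coordinates: 200 evenly spaced points covering mu +/- 4 sd
      values  = seq(mu - 4 * sigma, mu + 4 * sigma, length.out = 200),
      # y-coordinates: height of the normal curve at each x
      density = dnorm(values, mean = mu, sd = sigma)
    ),
    # Map values to the x-axis and density to the y-axis
    aes(values, density)
  )
normal_plot
```

To overlay the curve on an existing plot, the same geom_line() layer can be added to that plot instead of to an empty ggplot().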
6

qnorm()

the quantile function for the normal distribution: give qnorm() a probability (a value between 0 and 1) and it returns the value below which that proportion of the distribution lies; with the default mean = 0 and sd = 1, this is the corresponding z-score (the value on the x-axis of a standard normal curve)

[image]
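
A few illustrative qnorm() calls; the probabilities and the mean/sd values are chosen only for the example.

```r
# z-score with 97.5% of the standard normal curve to its left (about 1.96)
z <- qnorm(0.975)

# pnorm() goes the other way: from a z-score back to a probability
p <- pnorm(z)

# Both cut-offs for the middle 95% of a standard normal
z_both <- qnorm(c(0.025, 0.975))

# Same quantile on a normal with mean 50 and sd 2: equals 50 + 2 * z
q_scaled <- qnorm(0.975, mean = 50, sd = 2)

z
```
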

7

density vs frequency

use frequency (the default) to see how many observations are in each bin

use density to see the shape of the distribution and to compare it to a theoretical probability distribution

  • when using density (e.g. freq = FALSE in hist()) the y-axis is rescaled so that the total area of all the bars in the histogram equals 1
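
A base R sketch of the difference, using hist() and made-up data (obs is hypothetical). The last lines check the "total area equals 1" claim numerically.

```r
set.seed(201)
obs <- rnorm(500, mean = 50, sd = 10)   # hypothetical data

# Frequency (the default): y-axis shows the number of observations per bin
hist(obs)

# Density: y-axis is rescaled so the bar areas sum to 1, which makes it
# possible to overlay a theoretical density curve on the same scale
hist(obs, freq = FALSE)
curve(dnorm(x, mean = mean(obs), sd = sd(obs)), add = TRUE)

# Numerical check: area of each bar = density * bin width
h <- hist(obs, plot = FALSE)
total_area <- sum(h$density * diff(h$breaks))
total_area
```
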