random variable
a function that assigns a numerical value to each point in the sample space
describes the uncertain outcomes of a random process
denoted by X (uppercase); particular values it takes are written x (lowercase)
probability distribution
lists the possible outcomes (x) for a random variable (X) + their associated probabilities
discrete
a random variable that takes on one of a list of possible values (counts)
continuous
a random variable that takes on any value in an interval
mean
weighted sum of possible values with probabilities as weights
denoted by μ
also referred to as the expected value of X or E(X)
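The weighted-sum definition above can be sketched in a few lines of Python; the distribution here is a made-up example, not from the notes.

```python
# Sketch: E(X) as a probability-weighted sum over a made-up
# discrete distribution (values and probs are illustrative).
values = [0, 1, 2, 3]            # possible outcomes x
probs  = [0.1, 0.2, 0.3, 0.4]    # P(X = x); must sum to 1

mu = sum(x * p for x, p in zip(values, probs))  # E(X)
print(round(mu, 6))  # 0*0.1 + 1*0.2 + 2*0.3 + 3*0.4 = 2.0
```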
standard deviation
σ = SD(X) = √Var(X)
add a constant to X → X ± c
E(X ± c) = E(X) ± c
Var(X ± c) = Var(X)
SD(X ± c) = SD(X)
multiply X by a constant → cX
E(cX) = cE(X)
SD(cX) = |c|SD(X)
Var(cX) = c²Var(X)
addition and multiplication rules
E(a+bX) = a + bE(X)
SD(a + bX) = |b|SD(X)
Var(a + bX) = b²Var(X)
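The three a + bX rules can be checked numerically on any discrete distribution; the distribution and constants below are made up for illustration.

```python
# Sketch: verifying E(a+bX) = a + bE(X) and Var(a+bX) = b²Var(X)
# on a small made-up distribution.
values = [1, 2, 5]
probs  = [0.5, 0.3, 0.2]

def E(vals, ps):
    """Expected value: probability-weighted sum."""
    return sum(v * p for v, p in zip(vals, ps))

def Var(vals, ps):
    """Variance: weighted sum of squared deviations from the mean."""
    mu = E(vals, ps)
    return sum(p * (v - mu) ** 2 for v, p in zip(vals, ps))

a, b = 10, 3
shifted = [a + b * v for v in values]   # distribution of a + bX

print(E(shifted, probs), a + b * E(values, probs))     # same number
print(Var(shifted, probs), b**2 * Var(values, probs))  # same number
```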
joint probabilities
NOT independent events
need joint probability distribution that gives probabilities for events of the form (X=x, Y=y)
independence relationship
two random variables are independent if + only if the joint probability distribution is the product of the marginal distributions → P(x,y) = P(x)P(y) for all x,y
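The factorization test P(x,y) = P(x)P(y) can be applied cell by cell; the joint table below is made up so that it does factor.

```python
# Sketch: checking independence by comparing each joint probability
# with the product of the marginals. Joint table is made up.
joint = {(0, 0): 0.12, (0, 1): 0.28, (1, 0): 0.18, (1, 1): 0.42}

Px = {0: 0.12 + 0.28, 1: 0.18 + 0.42}   # marginal distribution of X
Py = {0: 0.12 + 0.18, 1: 0.28 + 0.42}   # marginal distribution of Y

independent = all(abs(p - Px[x] * Py[y]) < 1e-9
                  for (x, y), p in joint.items())
print(independent)  # True: every joint cell factors into the marginals
```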
expected value relationship
the expected value of a product of independent random variables is the product of their expected values → E(XY) = E(X)E(Y)
variance relationship
the variance of the sum of independent random variables is the sum of their variances → Var(X+Y) = Var(X) + Var(Y)
*without independence, the variance of the sum is not necessarily the sum of the variances
expected value relationship
the expected value of a sum of random variables is the sum of their expected values → E(X+Y) = E(X) + E(Y)
covariance
the covariance between random variables is the expected value of the product of deviations from the means → Cov(X,Y) = E((X−μx)(Y−μy))
correlation
the correlation between two random variables is the covariance divided by the product of standard deviations
Corr(X,Y) = Cov(X,Y)/(σx·σy)
standardized measure of the association between two random variables
denoted by ρ (rho)
always between -1 and 1
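The covariance and correlation definitions above can be computed directly from a joint distribution; the joint table here is a made-up example.

```python
import math

# Sketch: Cov(X,Y) = E((X-μx)(Y-μy)) and Corr(X,Y) = Cov/(σx·σy)
# computed from a small made-up joint probability table.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

EX  = sum(p * x for (x, y), p in joint.items())
EY  = sum(p * y for (x, y), p in joint.items())
cov = sum(p * (x - EX) * (y - EY) for (x, y), p in joint.items())

VarX = sum(p * (x - EX) ** 2 for (x, y), p in joint.items())
VarY = sum(p * (y - EY) ** 2 for (x, y), p in joint.items())
corr = cov / (math.sqrt(VarX) * math.sqrt(VarY))

print(cov, corr)  # positive association, corr between -1 and 1
```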
bernoulli
0 or 1 (failure or success)
E(B) = p // Var(B) = p(1-p)
fixed probability of success (p)
independence
binomial
Y, the sum of n iid bernoulli random variables, is a binomial random variable; fixed number of trials (n), fixed probability of success (p), independent trials
Y = number of successes in n bernoulli trials (each trial with probability of success p)
P(Y=y) = (n!/((n−y)!y!)) · p^y (1−p)^(n−y)
E(Y)=np // Var(Y) = np(1-p)
poisson
number of events in interval
describes the number of events determined by a random process during an interval of time or space (counts)
E(X) = λ // Var(X) = λ
P(X=x) = e^(−λ) · λ^x / x!
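The Poisson pmf and the fact that its mean equals λ can be sketched as below; the rate λ is a made-up value.

```python
from math import exp, factorial

# Sketch: Poisson pmf P(X=x) = e^(-λ) λ^x / x!, with a made-up rate.
def poisson_pmf(x, lam):
    return exp(-lam) * lam**x / factorial(x)

lam = 2.5
probs = [poisson_pmf(x, lam) for x in range(50)]  # truncated support

mean = sum(x * p for x, p in enumerate(probs))
print(mean)  # ≈ λ = 2.5, since E(X) = λ
```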
independently + identically distributed (iid)
random variables that are independent of each other + share a common probability distribution are said to be independent + identically distributed
if n random variables are iid with mean μx and standard deviation σx, their sum has mean nμx and variance nσx² (standard deviation √n·σx)
continuous random variables
counts don’t work for everything
prices, costs, revenue, etc. ($), percent change, other measurement data
uniform random variables
lower bound (a), upper bound (b)
equally likely to be any value in between a + b
E(X) = (a+b)/2
Var(X) = (b−a)²/12
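Both uniform formulas can be checked by simulation; the bounds a and b below are made up.

```python
import random

# Sketch: checking E(X) = (a+b)/2 and Var(X) = (b-a)²/12 for a
# Uniform(a, b) random variable by simulation (a, b made up).
random.seed(0)
a, b = 2.0, 10.0
draws = [random.uniform(a, b) for _ in range(100_000)]

mean = sum(draws) / len(draws)
var  = sum((x - mean) ** 2 for x in draws) / len(draws)

print(mean, (a + b) / 2)          # both ≈ 6
print(var, (b - a) ** 2 / 12)     # both ≈ 5.33
```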
normal random variables
visualizing data: continuous range of values (histogram, no skew)
defined by parameters μ + σ² (smaller σ² = narrower curve)
is continuous + can assume any value in an interval
standard normal
Normal(μ=0, σ²=1); its curve is the probability density function (pdf)
values more likely to be closer to mean
entire area under the curve = 1
z-scores
for any value from the given normal distribution, we can convert it to a value from the standard normal distribution
z= (x-μ)/σ
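The conversion formula above is one line of code; the numbers in the example are made up.

```python
# Sketch: converting a value from a Normal(μ, σ²) distribution to a
# standard-normal z-score, z = (x - μ) / σ. Numbers are made up.
def z_score(x, mu, sigma):
    return (x - mu) / sigma

print(z_score(130, 100, 15))  # 2.0: 130 is two SDs above the mean
```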
multimodaility
more than one mode suggests data comes from distinct groups
skewness=lack of symmetry
outliers = unusual extreme values
parameter
a characteristic of the population
statistic
an observed characteristic of a sample
central limit theorem (CLT)
tells us that if n is sufficiently large, the distribution of sample means is approximately normal, regardless of the shape of the population distribution
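The CLT can be seen in a quick simulation: sample means of a skewed population still come out roughly normal. The population (exponential) and sample size are made-up choices for illustration.

```python
import random
import statistics

# Sketch: CLT in action. Means of samples from a skewed
# (exponential) population look approximately normal.
random.seed(1)
n, reps = 50, 2000
means = [statistics.fmean(random.expovariate(1.0) for _ in range(n))
         for _ in range(reps)]

# Exponential(1) has mean 1 and sd 1, so by the CLT the sample mean
# should be roughly Normal with mean ≈ 1 and sd ≈ 1/√50 ≈ 0.14.
print(statistics.fmean(means), statistics.stdev(means))
```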