Categorical data is
Qualitative: often binary indicators
Ordinal: order is meaningful but differences aren’t
Numerical (Quantitative) data is
Cardinally meaningful, i.e. discrete or continuous
The support of a variable is
The set of values it can take
Multivariate variables are
The list of measured features when we measure several features of an object
Cross-sectional data sets have
One observation per unit
E.g. for data on one attribute measured in N people, a cross-sectional data set would be denoted as

A time series is
A series of data points indexed in time order
A time series on a multivariate variable is
A single observation
The function for discrete variables is a
Probability Mass Function
The function for continuous variables is a
Probability Density Function
A PDF for a continuous distribution with a = 1.5 and b = 2 looks like

A PMF looks like

The formal definition of a PDF is
A function f such that P(a ≤ X ≤ b) = ∫_a^b f(x) dx for any a ≤ b
What are the parameters of the binomial distribution?
N and p
What does the PMF of a binomial distribution with N = 10 and p = 0.5 look like relative to N = 15 and N = 20 (same p)?
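The original answer to this card was a plot. As a rough substitute, the sketch below tabulates the binomial PMF for the three values of N using only the standard library. The helper names `binomial_pmf` and `pmf_table` are illustrative, not from the course material. As N grows with p fixed, the centre of the PMF (the mean, Np) moves to the right and the peak gets lower, because the probability is spread over more values.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def pmf_table(n, p):
    """List of P(X = k) for k = 0, ..., n."""
    return [binomial_pmf(k, n, p) for k in range(n + 1)]

p = 0.5
for n in (10, 15, 20):
    probs = pmf_table(n, p)
    # The probabilities of a PMF must sum to one.
    print(f"N = {n}: mean = {n * p}, peak height = {max(probs):.4f}, "
          f"total = {sum(probs):.4f}")
```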

All PDFs have the properties
They are non-negative: f(x) ≥ 0
The area underneath them is equal to one: ∫ f(x)dx = 1
The PDF for X ~ U(a, b) is
f(x) = 1/(b − a) if a ≤ x ≤ b
f(x) = 0 otherwise
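A minimal sketch of the uniform PDF, checking the two PDF properties numerically. The values a = 1.5 and b = 2 are chosen here purely for illustration, and `area_under` is a hypothetical helper using a simple midpoint rule. Note that the density is 2 on [a, b]: a PDF can exceed 1 as long as the total area is 1.

```python
def uniform_pdf(x, a, b):
    """PDF of X ~ U(a, b): 1/(b - a) on [a, b], zero elsewhere."""
    return 1.0 / (b - a) if a <= x <= b else 0.0

def area_under(pdf, lo, hi, steps=100_000):
    """Midpoint-rule approximation of the integral of pdf over [lo, hi]."""
    width = (hi - lo) / steps
    return sum(pdf(lo + (i + 0.5) * width) for i in range(steps)) * width

a, b = 1.5, 2.0  # illustrative values
height = uniform_pdf(1.75, a, b)
# Integrate over a range wider than [a, b] to include the zero region.
area = area_under(lambda x: uniform_pdf(x, a, b), a - 1.0, b + 1.0)
print(f"height on [a, b] = {height}, approximate area = {area:.4f}")
```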
The cumulative distribution function (CDF) is
A function giving the probability that the variable X is less than or equal to some value x
The CDF is defined mathematically as
F(x) = P(X ≤ x) for every real number x
For a variable X ~ U(a, b) the CDF is
F(x) = 0 if x < a
F(x) = (x − a)/(b − a) if a ≤ x ≤ b
F(x) = 1 if x > b
For any CDF

In terms of the CDF, the 1st quartile is the value x such that
P(X ≤ x) = 0.25
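As a small illustration, the first quartile can be found by inverting a CDF numerically. The sketch below assumes X ~ U(0, 4), whose CDF is (x − a)/(b − a) on [a, b], and uses bisection. The function names and the choice of distribution are illustrative only.

```python
def uniform_cdf(x, a, b):
    """CDF of X ~ U(a, b)."""
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)

def quantile(cdf, prob, lo, hi, tol=1e-10):
    """Bisection search for x in [lo, hi] with cdf(x) = prob.

    Assumes cdf is non-decreasing and cdf(lo) <= prob <= cdf(hi).
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cdf(mid) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

a, b = 0.0, 4.0  # illustrative values
q1 = quantile(lambda x: uniform_cdf(x, a, b), 0.25, a, b)
print(f"first quartile of U({a}, {b}) is approximately {q1:.4f}")
```

For U(0, 4) the result should be close to 1, a quarter of the way along the interval.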
The probability that the value of a Standard Normal variable would lie within ±1.645 of its central location (μ = 0) is
0.9
The probability that the value of a Standard Normal variable would lie within ±1.96 of its central location (μ = 0) is
0.95
The probability that the value of a Standard Normal variable would lie within ±2.58 of its central location (μ = 0) is
0.99
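The three probabilities above can be checked without a statistical table. This sketch writes the Standard Normal CDF in terms of the error function from Python's `math` module. The helper names are illustrative.

```python
from math import erf, sqrt

def std_normal_cdf(z):
    """P(Z <= z) for Z ~ N(0, 1), via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def central_probability(c):
    """P(-c <= Z <= c) for Z ~ N(0, 1)."""
    return std_normal_cdf(c) - std_normal_cdf(-c)

for c in (1.645, 1.96, 2.58):
    print(f"P(|Z| <= {c}) = {central_probability(c):.4f}")
```

The printed values agree with 0.90, 0.95 and 0.99 only to about three decimal places, since the cut-offs 1.645, 1.96 and 2.58 are themselves rounded.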
If X ∼ N(μ, σ^2) and Z = (X−μ)/σ then
Z ∼ N(0, 1)
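Standardisation is what lets a single Standard Normal table serve every Normal distribution: P(X ≤ x) equals P(Z ≤ (x − μ)/σ). A minimal sketch, with μ = 100 and σ = 15 chosen purely as hypothetical example values:

```python
from math import erf, sqrt

def std_normal_cdf(z):
    """P(Z <= z) for Z ~ N(0, 1)."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def normal_cdf(x, mu, sigma):
    """P(X <= x) for X ~ N(mu, sigma^2), computed by standardising first."""
    z = (x - mu) / sigma  # the value one would look up in the table
    return std_normal_cdf(z)

mu, sigma = 100.0, 15.0  # hypothetical example values
prob = normal_cdf(115.0, mu, sigma)  # here z = (115 - 100) / 15 = 1
print(f"P(X <= 115) = {prob:.4f}")
```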
Suppose X ∼ N(μ, σ^2) and further suppose that we want to know P(X ≤ x). What value do we look up in the statistical table?
z = (x − μ)/σ, since P(X ≤ x) = P(Z ≤ (x − μ)/σ)
The PDF of a variable X, given the CDF, is
The derivative of the CDF: f(x) = dF(x)/dx
The joint CDF is
F(x, y) = P(X ≤ x, Y ≤ y)
The joint PMF is
f(x, y) = P(X = x, Y = y)
The joint PDF is
f(x, y) = ∂²F(x, y)/∂x∂y, the mixed second derivative of the joint CDF
The marginal PDF of X, f(x), is
The PDF of X whatever the value of Y, found by integrating the joint PDF over y: f(x) = ∫ f(x, y) dy
If X and Y are independent then the joint PDF is
The product of the marginal PDFs: f(x, y) = f(x) g(y)
If X and Y are independent then the joint CDF is
The product of the marginal CDFs: F(x, y) = F(x) G(y)
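The same factorisation holds for a joint PMF, which is easier to check by hand than the continuous case. The sketch below is a discrete analogue under assumed, made-up probabilities: it builds a joint PMF from two marginals, recovers the marginals by summing over the other variable, and tests whether the joint equals their product. All names and numbers are illustrative.

```python
from itertools import product

# Hypothetical marginal PMFs for two discrete variables X and Y.
f_x = {0: 0.3, 1: 0.7}
g_y = {0: 0.5, 1: 0.2, 2: 0.3}

# Under independence the joint PMF is the product of the marginals.
joint = {(x, y): f_x[x] * g_y[y] for x, y in product(f_x, g_y)}

def marginal(joint_pmf, axis):
    """Sum the joint PMF over the other variable (axis 0 = X, axis 1 = Y)."""
    out = {}
    for pair, prob in joint_pmf.items():
        out[pair[axis]] = out.get(pair[axis], 0.0) + prob
    return out

def is_independent(joint_pmf, tol=1e-12):
    """True if every joint probability equals the product of its marginals."""
    fx = marginal(joint_pmf, 0)
    gy = marginal(joint_pmf, 1)
    return all(abs(prob - fx[x] * gy[y]) <= tol
               for (x, y), prob in joint_pmf.items())

# A dependent example for contrast: Y always equals X.
dependent = {(0, 0): 0.5, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 0.5}

print(is_independent(joint), is_independent(dependent))
```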