Evil evil evil
Pointwise Convergence
X_n (w) -> X (w) for all w
Almost Sure Convergence (P)
Xn -> a.s X iff P(Xn -> X) =1
Almost Sure Convergence (A)
Xn -> a.s X iff Xn(w) -> X(w) for all w in F, for some event F in A with P(F)=1
Convergence in Probability
Xn -> p X iff for all e>0, P(||Xn-X||>e) -> 0
Convergence in Lr for some r>0
Xn -> Lr X iff E[||Xn-X||^r] -> 0
Convergence in Distribution
Xn-> d X iff Fn(x) -> F(x) for all x where F is continuous
Weak Law of Large Numbers
As n -> infinity, Xbar -> p u
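A small simulation can make the weak law concrete. This is only a sketch under assumed choices that are not in the cards: Uniform(0, 1) data (so u = 0.5), a tolerance of 0.05, and the hypothetical helper names `sample_mean` and `prob_far_from_mean`.

```python
import random

def sample_mean(n, rng):
    """Mean of n iid Uniform(0, 1) draws; the true mean is u = 0.5."""
    return sum(rng.random() for _ in range(n)) / n

def prob_far_from_mean(n, eps=0.05, reps=1000, seed=0):
    """Monte Carlo estimate of P(|Xbar_n - u| > eps)."""
    rng = random.Random(seed)
    far = sum(abs(sample_mean(n, rng) - 0.5) > eps for _ in range(reps))
    return far / reps

for n in (10, 100, 1000):
    # The estimated probability should shrink toward 0 as n grows.
    print(n, prob_far_from_mean(n))
```

The printed probabilities falling toward zero is exactly the statement Xbar -> p u for this particular e.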
Subset of convergence in probability
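Xn -> p X iff every component converges: Xn,k -> p Xk for each coordinate k
Subset of convergence almost surely
Xn -> a.s X iff every component converges: Xn,k -> a.s Xk for each coordinate k
Subset of convergence in Lr
Xn -> Lr X iff every component converges: Xn,k -> Lr Xk for each coordinate k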
Difference between convergence in probability and convergence almost surely
"Xn -> a.s X iff for all e>0, P(union over m>=n of {||Xm-X||>e}) -> 0 as n approaches infinity. After some sample size n, all of the later draws are within the epsilon bound; convergence in probability only asks this of each single draw."
Convergence almost surely and convergence in probability
Convergence almost surely implies convergence in probability
Convergence in Lr and convergence in probability
Convergence in Lr implies convergence in probability
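The reason is Markov's inequality applied to ||Xn-X||^r. A one-line derivation, written in LaTeX with the same symbols as the cards:

```latex
P(\|X_n - X\| > \varepsilon)
  = P(\|X_n - X\|^r > \varepsilon^r)
  \le \frac{E\left[\|X_n - X\|^r\right]}{\varepsilon^r} \longrightarrow 0
  \quad \text{for every } \varepsilon > 0 .
```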
Convergence in probability and convergence in distribution
Convergence in probability implies convergence in distribution
Order statistics
"X(n) is the nth (largest) order statistic = max (X1…Xn); X(1) is the smallest order statistic = min (X1…Xn)"
Almost sure convergence (Sum)
If for any e>0, Sum P(||Xn-X||>e) < infinity, then Xn -> a.s X (a sufficient condition by Borel-Cantelli, not an iff)
Convergence in probability and convergence almost surely ( subsequences)
If Xn -> p X then there is a subsequence Xnk that converges to X almost surely
Lebesgue Dominated Convergence (a.s)
Xn -> a.s X , |Xn| <= Y in L1, for all n then lim(E[Xn]) = E[lim(Xn)] =E[X]
Lebesgue Dominated Convergence (p)
Xn -> p X , |Xn| <= Y in L1 for all n then lim(E[Xn]) = E[X]
Convergence in probability (E)
Xn -> p X iff E[||Xn-X||/(1+||Xn-X||)] -> 0
Levy's continuity theorem
"Xn -> d X iff cXn(u) -> cX(u) for all u, where cX(u) = E[exp(i uTX)] is the characteristic function of X. Convergence in distribution is equivalent to pointwise convergence of the characteristic functions."
Cramér-Wold Device
Xn -> d X iff aTXn -> d aTX for all a in Rp
Exception to the Cramér-Wold device
If a is a constant vector and Xn ->d a then that implies Xn -> p a
Rookie mistake (converges to n)
Never write a limit that depends on n, such as Xn -> ab+n: n is changing, so the limit cannot involve n
Central Limit Theorem in R
"If X1, ..., Xn are iid with mean u and finite variance sigma^2 then, sqrt(n) (Xbar-u) -> d N(0,sigma^2)"
Central Limit Theorem in Rp
"If X1, ..., Xn are iid in Rp with mean u and finite covariance matrix Sigma then, sqrt(n) (Xbar-u) -> d Np(0, Sigma)"
Kth moment
If k in N, the kth moment of a random variable X is defined as uk = E[X^k]
Sample Kth moment
uhatk = (1/n) sum(Xi^k)
Slutsky's 1 for almost sure convergence and convergence in probability
"Assume Xn -> p X (or Xn -> a.s X), then
1) if f: Rp -> Rd is measurable and X is, with probability 1, in the set of points where f is continuous then, f(Xn) -> p f(X) (respectively f(Xn) -> a.s f(X))"
Slutsky's 2 for almost sure convergence and convergence in probability
If Yn is another sequence and Xn-Yn -> p 0, then Yn -> p X (the limit of Xn); if Xn -> a.s X and Xn-Yn -> a.s 0, then Yn -> a.s X
Slutsky's 3 for almost sure convergence and convergence in probability
Suppose Zn -> p Z (or Zn -> a.s Z), then (Xn, Zn) -> p (X, Z) (respectively (Xn, Zn) -> a.s (X, Z))
Asymptotically equivalent
Xn-Yn -> p 0
Coefficient of variation
v=sigma/u
Slutsky's 1 for convergence in distribution
"Assume Xn -> d X, then
1) if f: Rp -> Rd is measurable and X is, with probability 1, in the set of points where f is continuous then, f(Xn) -> d f(X)"
Slutsky's 2 for convergence in distribution
If Yn is another sequence and Xn-Yn -> p 0 then Yn -> d X
Slutsky's 3 for convergence in distribution
"If Zn is in Rd and Zn -> d c (a constant) then (Xn, Zn) -> d (X, c). This only holds in general if c is a constant."
Cramer's Delta Method 1D
"If sqrt(n) (Xn-u) -> d Np(0, Sigma) and f is continuously differentiable on a small area around the mean u, then sqrt(n) (f(Xn)-f(u)) -> d N(0, Df(u) Sigma Df(u)T), where Df(u) is the Jacobian matrix of f evaluated at u."
Central Kth moment
u'k= E[(X-u)^k]
Skewness
g1=u'3/sigma^3
Kurtosis
g2= (u'4/sigma^4)-3
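The central moment, skewness, and kurtosis cards translate directly into sample versions. The sketch below uses the 1/n divisor throughout, which is one common convention and an assumption here; the function names are hypothetical.

```python
def central_moment(xs, k):
    """Sample central moment (1/n) * sum((x - xbar)^k)."""
    n = len(xs)
    xbar = sum(xs) / n
    return sum((x - xbar) ** k for x in xs) / n

def skewness(xs):
    """g1 = u'3 / sigma^3, with sigma^2 estimated by the second central moment."""
    return central_moment(xs, 3) / central_moment(xs, 2) ** 1.5

def excess_kurtosis(xs):
    """g2 = u'4 / sigma^4 - 3, so a normal distribution scores 0."""
    return central_moment(xs, 4) / central_moment(xs, 2) ** 2 - 3

data = [1, 2, 3, 4, 5]
print(skewness(data), excess_kurtosis(data))
```

For the symmetric data above, the third central moment is 0, so the skewness is 0; the fourth central moment is 6.8 and the second is 2, giving an excess kurtosis of 6.8/4 - 3 = -1.3.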
Second Order Cramer's Delta Method
"If f is twice continuously differentiable around u, Df(u)=0, and sqrt(n) (Xn-u) -> d X, then n(f(Xn)-f(u)) -> d (XTD^2f(u)X)/2, where D^2f(u) is the Hessian of f evaluated at u."
Wishart Distribution
If X1 … Xn are iid Np(0, Sigma) then X1X1T + … + XnXnT ~ Wishart p(n, Sigma)
Xbar (Sample Mean)
1/n sum(xi)
Sample Median
argmin over theta of sum(|xi-theta|)
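The argmin characterization can be checked by brute force on a small data set. The data values and the grid of candidate theta values below are arbitrary assumptions for illustration.

```python
import statistics

def abs_loss(xs, theta):
    """Sum of absolute deviations sum(|xi - theta|)."""
    return sum(abs(x - theta) for x in xs)

data = [2, 9, 4, 7, 1]
grid = [i / 10 for i in range(0, 101)]  # candidate theta values 0.0, 0.1, ..., 10.0
best = min(grid, key=lambda t: abs_loss(data, t))
print(best, statistics.median(data))
```

With n odd the minimizer is unique and coincides with the middle order statistic; with n even, every value between the two middle order statistics minimizes the loss.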
Interpoint distance
The distance between any two sample points xi and xj: |xi-xj|, or squared, (xi-xj)^2
Distributions of S^2 and xbar
Assume X1 … Xn iid with finite mean u, variance sigma^2, and fourth central moment u'4, then E[Xbar]=u, var(Xbar)=sigma^2/n, E[S^2]=sigma^2, var(S^2)=(1/n)(u'4-((n-3)/(n-1))sigma^4)
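Exponential Family
"A pdf or pmf belongs to the exponential family if f(y;theta) = h(y)exp{b(theta)a(y)+c(theta)}. More generally, with several parameters: f(x;theta) = c(theta) h(x) exp{sum(wi(theta) ti(x))}, where c(theta) is now a multiplicative normalizing factor."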
Sampling Distribution (general)
The distribution of a statistic or estimator theta hat = T(X1, …, Xn) over repeated samples of the xi
Sampling Distribution of Xbar
If X1 … Xn are iid N(mu, sigma^2), Xbar ~ N(mu, sigma^2/n)
Sampling distribution of S^2
"For X1 … Xn iid N(mu, sigma^2):
1) ((n-1)/sigma^2) S^2 ~ Chi^2 n-1
2) E[S^2]=sigma^2
3) var(S^2) = 2sigma^4/(n-1)"
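The mean and variance of S^2 for normal data can be checked by simulation. This is a sketch assuming n = 10 and sigma = 2, so the targets are sigma^2 = 4 and 2sigma^4/(n-1) = 32/9; the helper names are hypothetical.

```python
import random

def sample_variance(xs):
    """S^2 with divisor n - 1."""
    n = len(xs)
    xbar = sum(xs) / n
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)

def simulate_s2(n, sigma, reps, seed=0):
    """Return reps independent values of S^2 from N(0, sigma^2) samples of size n."""
    rng = random.Random(seed)
    return [sample_variance([rng.gauss(0.0, sigma) for _ in range(n)])
            for _ in range(reps)]

n, sigma = 10, 2.0
s2 = simulate_s2(n, sigma, reps=20000)
mean_s2 = sum(s2) / len(s2)
var_s2 = sample_variance(s2)
print(round(mean_s2, 2), sigma ** 2)                       # simulated vs. sigma^2
print(round(var_s2, 2), round(2 * sigma ** 4 / (n - 1), 2))  # simulated vs. 2sigma^4/(n-1)
```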
Chi^2 distribution
The sum of k independent standard normal random variables squared: Z1^2 + … + Zk^2 ~ Chi^2 k
Chi^2 pdf
f(x) = 1/(2^(k/2) Gamma(k/2)) x^((k/2)-1) exp{-x/2} for x > 0
tn distribution
"if Z is a standard normal random variable, X is distributed Chi^2 n, and Z and X are independent, then Z/sqrt(X/n) ~ tn"
tn pdf
f(t) = Gamma((n+1)/2)/(sqrt(n pi) Gamma(n/2)) (1+t^2/n)^(-(n+1)/2)
Fmn distribution
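The ratio of independent chi squared random variables, each divided by its degrees of freedom: (X/m)/(Y/n) with X ~ Chi^2 m and Y ~ Chi^2 n
Fmn pdf
f(x) = Gamma((m+n)/2)/(Gamma(m/2) Gamma(n/2)) (m/n)^(m/2) x^((m/2)-1) (1+mx/n)^(-(m+n)/2) for x > 0, the density of (chi squared m/m)/(chi squared n/n)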
Which distribution is used for inference on sigma^2
Chi squared
Which distribution is used for inference on mu
t
Which distribution is used for inference on comparing two sigma^2
F
Median
The value such that P(X>=median)>=1/2 and P(X<=median)>=1/2
Is the median unique
No
Sample median (n odd)
X((n+1)/2)
Sample median (n even)
(X(n/2)+X(n/2+1))/2
Sample Range
X(n)-X(1)
jth order statistic pdf
fX(j)(x) = n!/((j-1)!(n-j)!) fX(x) FX(x)^(j-1) (1-FX(x))^(n-j)
Beta Distribution pdf
f(x) = Gamma(a+B)/(Gamma(a)Gamma(B)) x^(a-1) (1-x)^(B-1) for 0 < x < 1
Expectation for a beta
a/(a+B)
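The beta cards connect to order statistics: for Uniform(0, 1) data, the jth order statistic X(j) follows a Beta(j, n-j+1) distribution, so its mean is j/(n+1). The sketch below checks that mean by simulation; n = 5, j = 2, and the helper names are assumptions for illustration.

```python
import random

def jth_order_statistic(xs, j):
    """Return X(j), the jth smallest value (j is 1-based)."""
    return sorted(xs)[j - 1]

def mean_order_statistic(n, j, reps, seed=0):
    """Monte Carlo mean of X(j) for n iid Uniform(0, 1) draws."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        total += jth_order_statistic([rng.random() for _ in range(n)], j)
    return total / reps

n, j = 5, 2
# Beta(j, n - j + 1) has mean a/(a+B) = j / (n + 1).
print(round(mean_order_statistic(n, j, reps=20000), 3), j / (n + 1))
```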
Joint pdf for order statistics
fX(i),X(j)(x,y) = n!/((i-1)!(j-i-1)!(n-j)!) fX(x) fX(y) FX(x)^(i-1) (FX(y)-FX(x))^(j-i-1) (1-FX(y))^(n-j) for x < y and i < j
Conditional pdf for order statistics
fX(i)|X(j)(x|y) = fX(i),X(j)(x,y)/fX(j)(y)
Statistic
A statistic is any function of the data that can be computed without knowing any parameters theta
Sufficient Statistic
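T(x) is a sufficient statistic for theta iff the conditional distribution of X given T(x) does not depend on theta.
Sufficiency Principle
Inference about theta should depend on the data only through a sufficient statistic T(x); MLEs and Bayesian estimators obey this because they are functions of sufficient statistics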
Sufficiency with PDFs
If fx(x;theta)/fT(T(x);theta) does not depend on theta, then T(x) is sufficient
Factorization Criteria
If the PDF or PMF of X can be written as h(x)g(T(x);theta) then T(x) is sufficient for theta
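A worked instance of the factorization, assuming X1, …, Xn iid Bernoulli(theta) (an example chosen here, not taken from the cards):

```latex
f(x_1,\dots,x_n;\theta)
  = \prod_{i=1}^{n} \theta^{x_i}(1-\theta)^{1-x_i}
  = \underbrace{1}_{h(x)} \cdot
    \underbrace{\theta^{T(x)}(1-\theta)^{\,n-T(x)}}_{g(T(x);\theta)},
  \qquad T(x) = \sum_{i=1}^{n} x_i .
```

So the number of successes T(x) is sufficient for theta.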
Functions of sufficient statistics
g(T(x)) is also a sufficient statistic if g() is a one to one function
Ancillary Statistic
"A(x) is an ancillary statistic iff its distribution does not depend on theta. A(x)~ F(not theta)"
Sufficient and Ancillary statistics still have to actually be statistics
They can't depend on unknown parameters
Location family
If the CDF is of the form F(x-theta)
Scale family
If the CDF is of the form F(x/theta)
Location-Scale Family
If the CDF is of the form F((x-mu)/sigma)
Likelihood
"If X has a PMF or PDF, the likelihood function is L(theta;x) = fX(x;theta), which equals product(fX(xi;theta)) over all the xi when the Xi are iid"
Likelihood Principle
Data samples with proportional likelihoods should lead to the same inference
MoMs
"1) Find the population moments as functions of the parameters 2) Set them equal to the corresponding sample moments 3) Solve for the parameters"
Are MoMs unique
No
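The method of moments steps can be sketched for one simple model. The Uniform(0, theta) model, the data values, and the function name below are assumptions for illustration: E[X] = theta/2, so matching the first moment to Xbar gives theta hat = 2 Xbar.

```python
def mom_uniform_upper(xs):
    """Method of moments for Uniform(0, theta): solve theta/2 = xbar for theta."""
    return 2 * sum(xs) / len(xs)

print(mom_uniform_upper([1, 2, 3, 6]))  # xbar = 3, so theta hat = 6
print(mom_uniform_upper([1, 1, 1, 9]))  # theta hat = 6 again, below the observed max of 9
```

The second call shows a known weakness: a method of moments estimate can contradict the data, here by placing the upper bound below an observed value.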
MLEs
"1) Find the likelihood 2) Take log(likelihood) 3) Take the derivative with respect to theta and set it to zero 4) Check that the solution is a max"
Can there be more than one MLE
Yes
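The MLE recipe can be traced on one model. The sketch below assumes iid Exponential data with unknown rate (an illustrative choice): the log-likelihood is n log(rate) - rate sum(xi), its derivative n/rate - sum(xi) vanishes at rate = 1/Xbar, and a grid search confirms that point is the maximum. The names and data are hypothetical.

```python
import math

def log_likelihood(rate, xs):
    """Log-likelihood of iid Exponential(rate) data: n*log(rate) - rate*sum(xs)."""
    return len(xs) * math.log(rate) - rate * sum(xs)

def exponential_mle(xs):
    """Closed form from setting the derivative n/rate - sum(xs) to zero."""
    return len(xs) / sum(xs)

data = [0.5, 1.5, 2.0, 1.0]
rate_hat = exponential_mle(data)
grid = [i / 100 for i in range(1, 501)]  # candidate rates 0.01, 0.02, ..., 5.00
best_on_grid = max(grid, key=lambda r: log_likelihood(r, data))
print(rate_hat, best_on_grid)
```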
Effective Likelihood
L*(t;x) = max over {theta : f(theta)=t} of L(theta;x)
Effective Likelihood and MLEs
If theta hat maximizes L(theta;x) then f(theta hat) maximizes L*(t;x)
Posterior Mean
"E(theta|x), where x is fixed (the data)"
Posterior Variance
"var(theta|x), where x is fixed (the data)"
Bias
"Etheta[theta hat]-theta. The difference between the expected value of our estimator and the true value"
Unbiased
"Etheta[theta hat]-theta=0 for all theta, i.e. Bias theta(theta hat) = 0 for all theta"
If our estimator has no moments how do we get bias
"Use the median bias = median theta (theta hat) - theta"
Warnings for bias
"An unbiased estimator may not exist. Unbiased estimators may still not be good."
MSE (scalar)
"MSE(theta hat)=E[(theta hat-theta)^2] =var(theta hat)+Bias theta(theta hat)^2"
MSE (vector)
"MSE(theta hat)=Etheta[||theta hat -theta||^2] =tr(var theta (theta hat))+||Bias theta (theta hat)||^2"
UMVUE (acronym)
Uniformly minimum variance unbiased estimator of theta
UMVUE (math)
"theta hat is a UMVUE of theta iff
1) Bias theta (theta hat) =0 for all theta
2) For any unbiased estimator theta2 of theta, var theta (theta hat) <= var theta (theta2) for all theta"
Does the UMVUE always exist
No, not even if you have an unbiased estimator