Discrete Distributions

Discrete Distributions Study Guide

1. Variance and Standard Deviation

Variance (σ²): Measures how far the values of a numerical random variable spread around the mean.
Standard Deviation (σ): The square root of the variance, so it is expressed in the same units as the data.

Formula for Variance:

Var(X) = E[(X - μ)²] = ∑ (x - μ)² P(X = x)

where μ = E(X) (expected value of X).

Example: Rolling a Fair Die

Mean: E(D) = 3.5
Variance: Var(D) = 35/12 ≈ 2.92
Standard Deviation: σ(D) = √(35/12) ≈ 1.71
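
The die figures can be reproduced directly from the variance formula. The snippet below is a minimal sketch; the helper names `mean` and `variance` and the dictionary representation of a PMF are illustrative choices, not part of the original notes.

```python
from fractions import Fraction
from math import sqrt

def mean(pmf):
    """E(X) for a discrete distribution given as {value: probability}."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Var(X) = sum over x of (x - mu)^2 * P(X = x)."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

# Fair six-sided die: every face has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}

die_mean = mean(die)      # exact value 7/2
die_var = variance(die)   # exact value 35/12
die_sd = sqrt(die_var)    # square root of the variance, as a float

print(float(die_mean), round(float(die_var), 2), round(die_sd, 2))
# prints: 3.5 2.92 1.71
```

Using `Fraction` keeps the arithmetic exact, so rounding only happens when the results are printed.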

Effect of Distribution on Variance

If extreme values are more likely, variance increases.
If extreme values are less likely, variance decreases.
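
To illustrate the two statements above, the sketch below compares a fair die with two hypothetical loaded dice. The weights are made up for this example and chosen so that all three dice keep the same mean of 3.5.

```python
from fractions import Fraction

def variance(pmf):
    """Var(X) for a discrete distribution given as {value: probability}."""
    mu = sum(x * p for x, p in pmf.items())
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

third, twelfth = Fraction(1, 3), Fraction(1, 12)

fair = {face: Fraction(1, 6) for face in range(1, 7)}
# Hypothetical die that favours the extreme faces 1 and 6.
extreme_heavy = {1: third, 2: twelfth, 3: twelfth, 4: twelfth, 5: twelfth, 6: third}
# Hypothetical die that favours the central faces 3 and 4.
centre_heavy = {1: twelfth, 2: twelfth, 3: third, 4: third, 5: twelfth, 6: twelfth}

var_fair = variance(fair)              # 35/12
var_extreme = variance(extreme_heavy)  # 55/12
var_centre = variance(centre_heavy)    # 19/12

print(var_centre < var_fair < var_extreme)  # prints: True
```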

2. Properties of Expected Value and Variance

For random variables X and Y and a constant c:

E(X + c) = E(X) + c
E(cX) = cE(X)
Var(X + c) = Var(X)
Var(cX) = c² Var(X)
Var(X + Y) = Var(X) + Var(Y) (holds when X and Y are independent; it can fail otherwise)
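
These rules can be checked by exact enumeration on small distributions. The following is a sketch under stated assumptions: X is a fair die, Y is a fair coin scored 0 or 1 and independent of X, and c = 3. The helpers `transform` and `sum_independent` are hypothetical names introduced for this example.

```python
from fractions import Fraction
from itertools import product

def mean(pmf):
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

def transform(pmf, f):
    """PMF of f(X), merging probabilities of values that coincide."""
    out = {}
    for x, p in pmf.items():
        out[f(x)] = out.get(f(x), 0) + p
    return out

def sum_independent(pmf_x, pmf_y):
    """PMF of X + Y when X and Y are independent (joint = product)."""
    out = {}
    for (x, px), (y, py) in product(pmf_x.items(), pmf_y.items()):
        out[x + y] = out.get(x + y, 0) + px * py
    return out

die = {face: Fraction(1, 6) for face in range(1, 7)}   # X
coin = {0: Fraction(1, 2), 1: Fraction(1, 2)}          # Y, independent of X
c = 3

checks = {
    "E(X + c) = E(X) + c": mean(transform(die, lambda x: x + c)) == mean(die) + c,
    "E(cX) = cE(X)": mean(transform(die, lambda x: c * x)) == c * mean(die),
    "Var(X + c) = Var(X)": variance(transform(die, lambda x: x + c)) == variance(die),
    "Var(cX) = c^2 Var(X)": variance(transform(die, lambda x: c * x)) == c**2 * variance(die),
    "Var(X + Y) = Var(X) + Var(Y)": variance(sum_independent(die, coin))
    == variance(die) + variance(coin),
}

for rule, holds in checks.items():
    print(f"{rule}: {holds}")
```

Every line should report `True`. A numerical check on one example is not a proof, but it is a quick way to catch a misremembered rule.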

3. Bernoulli Distribution

A Bernoulli trial is a single experiment with two outcomes (success/failure).
Probability mass function (PMF):
P(X = 1) = p, P(X = 0) = 1 - p
Mean: E(X) = p
Variance: Var(X) = p(1 - p)
Standard Deviation: σ(X) = √(p(1 - p))
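
The Bernoulli mean and variance follow from applying the general definitions to the two-point PMF. The sketch below does this for p = 1/4, a value picked purely for illustration; `bernoulli_stats` is a hypothetical helper name.

```python
from fractions import Fraction
from math import sqrt

def bernoulli_stats(p):
    """Return (mean, variance, standard deviation) computed from the PMF."""
    if not 0 <= p <= 1:
        raise ValueError("p must lie in [0, 1]")
    pmf = {1: p, 0: 1 - p}
    mu = sum(x * q for x, q in pmf.items())
    var = sum((x - mu) ** 2 * q for x, q in pmf.items())
    return mu, var, sqrt(var)

p = Fraction(1, 4)  # illustrative success probability
mu, var, sd = bernoulli_stats(p)

print(mu == p, var == p * (1 - p))  # prints: True True
```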

4. Binomial Distribution

Models the number of successes in n independent Bernoulli trials.
PMF:
P(X = k) = (n choose k) p^k (1 - p)^(n - k)
where (n choose k) = n! / (k!(n - k)!)
Mean: E(X) = np
Variance: Var(X) = np(1 - p)
Standard Deviation: σ(X) = √(np(1 - p))

Example: Flipping a Biased Coin 20 Times (p = 2/3)

E(X) = 20 * (2/3) ≈ 13.33
Var(X) = 20 * (2/3) * (1/3) ≈ 4.44
σ(X) = √(4.44) ≈ 2.11
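
The same numbers can be recovered from the PMF itself rather than from the shortcut formulas. This is a minimal sketch using exact fractions; `binomial_pmf` is an illustrative helper name.

```python
from fractions import Fraction
from math import comb, sqrt

def binomial_pmf(n, p):
    """P(X = k) = C(n, k) * p^k * (1 - p)^(n - k) for k = 0..n."""
    return {k: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}

n, p = 20, Fraction(2, 3)
pmf = binomial_pmf(n, p)

total = sum(pmf.values())                                  # probabilities sum to 1
mean = sum(k * q for k, q in pmf.items())                  # equals n * p
var = sum((k - mean) ** 2 * q for k, q in pmf.items())     # equals n * p * (1 - p)
sd = sqrt(var)

print(round(float(mean), 2), round(float(var), 2), round(sd, 2))
# prints: 13.33 4.44 2.11
```

Summing over all 21 outcomes gives exactly np = 40/3 and np(1 - p) = 40/9, which is why the shortcut formulas are safe to use.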

5. Zipf Distribution (Inverse Power Law Distribution)

Describes data where a few values occur very frequently, and many values occur very rarely.
PMF:
P(X = k) ∝ (k + d)⁻α
where d is an offset and α is the exponent.
Common in natural language processing, web analysis, wealth distributions.

Example: Word Frequency in the British National Corpus

Rank 1 ("the"): 6.2 million occurrences
Rank 2 ("of"): 2.9 million occurrences
Rank 3 ("and"): 2.67 million occurrences
The frequency y of the word at rank r follows y = c (r + 1)⁻α, where α ≈ 1.08
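
The "∝" in the PMF means the weights must be divided by their sum to become probabilities. The sketch below does this for a finite set of ranks, using offset 1 and exponent 1.08 from the fitted curve above. The vocabulary size of 100,000 ranks is a hypothetical value chosen for illustration, so the printed shares are not estimates for the British National Corpus.

```python
def zipf_pmf(n_ranks, alpha, d):
    """Normalised P(X = k) proportional to (k + d)^(-alpha), for k = 1..n_ranks."""
    weights = [(k + d) ** -alpha for k in range(1, n_ranks + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def head_share(pmf, top):
    """Total probability carried by the `top` most frequent ranks."""
    return sum(pmf[:top])

N_RANKS = 100_000  # hypothetical vocabulary size, not taken from the notes
ALPHA = 1.08       # exponent from the fitted curve
OFFSET = 1         # offset d, matching (r + 1) in the fitted curve

pmf = zipf_pmf(N_RANKS, ALPHA, OFFSET)

for top in (10, 100, 1000):
    print(f"top {top:>4} ranks carry {head_share(pmf, top):.1%} of the probability")
```

The exact shares depend on the assumed vocabulary size, but the pattern of a small head carrying a large share of the mass is what the power law produces.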

Implications of Zipf's Law

Fat Head: A few values dominate (e.g., the top 175 words account for 50% of tokens).
Long Tail: Very many distinct values are individually rare (e.g., words occurring only once make up 0.5% of all tokens).
Issues for AI:
- Easy to capture common cases, but difficult to cover rare cases.
- Requires large datasets for accurate modeling.

6. Key Takeaways

Variance and standard deviation measure the spread of a distribution.
Bernoulli distribution models single-trial success/failure.
Binomial distribution models multiple independent trials.
Zipf distribution models power-law relationships in data.
AI applications struggle with the long tail of rare events.

7. Practice Questions

True or False: The variance of a binomial distribution is always less than its mean.
If a fair die is rolled 30 times, what is the expected number of times it lands on 6?
In a language corpus, the most common word appears 5 million times. The second most common appears 2.5 million times. Estimate how many times the 10th most common word appears using Zipf's law.
