Randomness and Simulation

Why is it important to understand randomness?

  • Modeling Real-World Phenomena: Many natural and social processes involve randomness.

  • Statistical Inference: Drawing conclusions about a population from a sample relies on the assumption that the sample was selected at random.

  • Avoiding Overfitting: The goal is to learn from available data and predict new data, so a model should capture the underlying pattern rather than the random noise in one sample.

  • Simulation and Monte Carlo Methods: Many data science techniques, such as Monte Carlo simulations, rely on randomness to explore different possible outcomes of a process.

Mutually exclusive events cannot happen at the same time: P(A and B) = 0.

Independent events do not influence each other: knowing that one occurred does not change the probability of the other, so P(A and B) = P(A)P(B).
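Both rules can be checked by simulation. The sketch below is only an illustration, assuming repeated rolls of one fair six-sided die; the events A, B, and C, the number of rolls, and the seed are choices made for this example and do not come from the notes.

```r
set.seed(1)                      # arbitrary seed so the run is repeatable
n <- 100000                      # number of simulated rolls (assumed)
roll <- sample(1:6, n, replace = TRUE)

A <- roll %% 2 == 0              # event A: the roll is even
B <- roll <= 2                   # event B: the roll is 1 or 2
C <- roll == 1                   # event C: the roll is 1

# Mutually exclusive: a roll cannot be both even and equal to 1
p_A_and_C <- mean(A & C)

# Independent: the estimate of P(A and B) should be close to P(A) * P(B)
p_A_and_B <- mean(A & B)
p_A_times_p_B <- mean(A) * mean(B)

print(c(p_A_and_C, p_A_and_B, p_A_times_p_B))
```

Here A and B are independent because P(A) = 1/2, P(B) = 1/3, and only the roll 2 is in both, giving P(A and B) = 1/6. A and C are mutually exclusive, and they are not independent: knowing C occurred tells you A did not.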

A random variable is a mapping from an outcome to a number.

EX: Randomly select an American and define X = age.

Discrete random variables

  • May only assume a countable number of possible values

  • Probabilities are defined by a probability function

  • Flip a coin and let X=0 if tails and 1 if heads

    • Probability function: p(x) = .5 for x=0 or 1; p(x) = 0 otherwise

  • Probabilities are assigned to specific values, e.g. P(X = 1) = .5

  • This is an example of a Bernoulli random variable, which is a special case of a binomial random variable (a binomial with a single trial, n = 1)
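The coin flip can be simulated in R. This is a minimal sketch, assuming a fair coin; it uses rbinom with size = 1 to produce Bernoulli draws, and the seed, sample size, and variable names are illustrative choices.

```r
set.seed(2)                             # arbitrary seed so the run is repeatable
n <- 10000                              # number of simulated flips (assumed)

# size = 1 means one trial per draw, i.e. Bernoulli: 0 = tails, 1 = heads
x <- rbinom(n, size = 1, prob = 0.5)

# Relative frequency of each value estimates the probability function p(x)
p_hat <- table(x) / n
print(p_hat)

# A binomial count is a sum of Bernoulli trials: number of heads in 10 flips
heads_in_10 <- sum(rbinom(10, size = 1, prob = 0.5))
print(heads_in_10)
```

With many flips, both relative frequencies should settle near .5, matching p(x) above.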

Continuous random variables

  • May assume any number in an interval

  • Every value in the interval is a possibility, including fractions and irrational numbers

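  • Probabilities are defined by a probability density function: the probability of an interval is the area under the density over that interval, and the probability of any single value is 0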
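For a continuous random variable, a probability is an area under the density curve. The sketch below illustrates this, assuming X is Uniform(0, 10) and asking for P(2 <= X <= 5); the distribution and the interval are example choices, not taken from the notes.

```r
# Area under the Uniform(0, 10) density between 2 and 5.
# The density has constant height 1/10, so the area is 3 * (1/10).
area <- integrate(dunif, lower = 2, upper = 5, min = 0, max = 10)$value

# The same probability from the cumulative distribution function
cdf_diff <- punif(5, min = 0, max = 10) - punif(2, min = 0, max = 10)

print(c(area, cdf_diff))
```

Both approaches give the same probability. A single point has zero width and therefore zero area, which is why P(X = x) = 0 for any individual value.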

runif(n, min, max)    Uniform (continuous)

rnorm (normal) and rbinom (binomial) also exist.
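A short usage sketch of the three generators follows. The parameter values, the seed, and the variable names are assumptions chosen for illustration.

```r
set.seed(3)                              # arbitrary seed so the run is repeatable

u <- runif(5, min = 0, max = 1)          # 5 draws from Uniform(0, 1)
z <- rnorm(5, mean = 0, sd = 1)          # 5 draws from Normal(mean 0, sd 1)
k <- rbinom(5, size = 10, prob = 0.5)    # 5 counts of heads in 10 fair flips

print(u)
print(z)
print(k)
```

In each function the first argument is the number of values to generate; the remaining arguments are the parameters of the distribution.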