Randomness and Simulation
Why is it important to understand randomness?
Modeling Real-World Phenomena: Many natural and social processes involve randomness.
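Statistical Inference: Inferential methods assume the data come from a random process, such as a random sample.
Avoiding Overfitting: The goal is to look at available data and predict new data, so a model should capture the underlying pattern rather than the random noise in one particular sample.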
Simulation and Monte Carlo Methods: Many data science techniques, such as Monte Carlo simulations, rely on randomness to explore different possible outcomes of a process.
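As a minimal sketch of the Monte Carlo idea, the R code below estimates the probability that two fair dice sum to 7 by simulating many rolls and comparing the result with the exact value 6/36. The dice scenario, the seed, and the variable names are illustrative assumptions, not part of the original notes.

```r
# Monte Carlo sketch: estimate P(sum of two fair dice == 7).
set.seed(1)        # arbitrary seed, only so the run is reproducible
n_sims <- 100000   # number of simulated rolls (illustrative choice)

die1 <- sample(1:6, n_sims, replace = TRUE)
die2 <- sample(1:6, n_sims, replace = TRUE)

# The proportion of simulated rolls that sum to 7 estimates the probability.
estimate <- mean(die1 + die2 == 7)
exact <- 6 / 36    # 6 of the 36 equally likely pairs sum to 7

print(c(estimate = estimate, exact = exact))
```

With more simulated rolls, the estimate tends to settle closer to the exact value.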
Mutually exclusive events cannot happen at the same time: P(A and B) = 0
Independent events do not influence each other: P(A and B) = P(A)P(B)
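The two definitions can be checked by exact enumeration on a small example. The sketch below assumes a single roll of a fair die. The events A (even), B (odd), and C (roll of 1 or 2) and the helper p() are hypothetical names chosen for illustration.

```r
# One roll of a fair die: six equally likely outcomes.
outcomes <- 1:6
p <- function(event) mean(event)   # probability = share of outcomes in the event

A <- outcomes %% 2 == 0   # even roll: 2, 4, 6
B <- outcomes %% 2 == 1   # odd roll: 1, 3, 5
C <- outcomes <= 2        # roll of 1 or 2

# A and B are mutually exclusive: no outcome is both even and odd.
p_A_and_B <- p(A & B)

# A and C are independent: P(A and C) equals P(A) * P(C).
p_A_and_C <- p(A & C)            # only the outcome 2 qualifies
p_A_times_p_C <- p(A) * p(C)     # (1/2) * (1/3)

print(c(p_A_and_B = p_A_and_B,
        p_A_and_C = p_A_and_C,
        p_A_times_p_C = p_A_times_p_C))
```

Note that A and B are mutually exclusive but not independent: knowing the roll is even tells you it is certainly not odd.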
A random variable is a mapping from an outcome to a number.
EX: randomly select an American and define X=age
Discrete random variables
May only assume a countable number of possible values
Probabilities are defined by a probability function
Flip a coin and let X=0 if tails and 1 if heads
Probability function: p(x) = .5 for x=0 or 1; p(x) = 0 otherwise
Probabilities are assigned to specific values, e.g. P(X = 1) = .5
This is an example of a Bernoulli random variable, which is a special case of a binomial random variable (the case of a single trial, n = 1)
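The coin flip can be simulated in R with rbinom() by setting size = 1, which is one way to draw Bernoulli values. This is a sketch: the seed and the number of flips are arbitrary choices.

```r
# Simulate fair coin flips as Bernoulli draws: 0 = tails, 1 = heads.
set.seed(42)                       # arbitrary seed for reproducibility
n_flips <- 10000                   # illustrative sample size
flips <- rbinom(n_flips, size = 1, prob = 0.5)

# The sample proportion of heads should be close to p(1) = 0.5.
prop_heads <- mean(flips)

# dbinom() with size = 1 gives the probability function p(x) for x = 0 and 1.
p_x <- dbinom(c(0, 1), size = 1, prob = 0.5)

print(table(flips))
print(prop_heads)
print(p_x)
```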
Continuous random variables
May assume any number in an interval
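Every real number in the interval is a possible value, not only fractions
Probabilities are defined by a probability density function: the probability of an interval is the area under the density over that interval, and P(X = x) = 0 for any single value x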
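To illustrate how a density defines probabilities, the sketch below takes X to be Uniform(0, 1) and computes P(0.2 < X < 0.5) in two ways: as the area under the density dunif() and as a difference of the cumulative distribution function punif(). The distribution and the endpoints are assumptions chosen for the example.

```r
# X ~ Uniform(0, 1); dunif() and punif() default to min = 0, max = 1.
a <- 0.2
b <- 0.5

# Probability of an interval = area under the density over that interval.
area <- integrate(dunif, lower = a, upper = b)$value

# The same probability from the cumulative distribution function.
cdf_diff <- punif(b) - punif(a)

# A single point is an interval of zero width, so its probability is 0.
point_prob <- punif(0.3) - punif(0.3)

print(c(area = area, cdf_diff = cdf_diff, point_prob = point_prob))
```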
runif(n, min, max) draws n random values from a continuous Uniform(min, max) distribution
rnorm (normal) and rbinom (binomial) also exist
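A short sketch of the three generators side by side. The seed, sample sizes, and parameter values are arbitrary choices for illustration.

```r
set.seed(123)   # arbitrary seed so the draws are reproducible

# Five draws from a continuous Uniform(0, 10) distribution.
u <- runif(5, min = 0, max = 10)

# Five draws from a Normal distribution with mean 0 and standard deviation 1.
z <- rnorm(5, mean = 0, sd = 1)

# Five draws from a Binomial distribution: number of successes in 10 trials.
k <- rbinom(5, size = 10, prob = 0.5)

print(u)
print(z)
print(k)
```

In each function the first argument is the number of values to generate, and the remaining arguments are the parameters of the distribution.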