Random Variables and Probability Distributions Notes
Conceptual Foundations of Random Variables
A random variable is defined as a function that assigns a numerical value to each possible outcome of a random experiment.
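To make the "function on outcomes" idea concrete, here is a minimal sketch. The experiment (three coin tosses) and the name `number_of_heads` are assumptions chosen for illustration, not taken from the notes.

```python
from itertools import product

# Hypothetical experiment: three tosses of a coin (the count is an assumption).
# Each outcome is a tuple such as ('H', 'T', 'H').
sample_space = list(product("HT", repeat=3))

def number_of_heads(outcome):
    """Random variable X: maps one outcome to a numerical value."""
    return outcome.count("H")

# The set of values the random variable can take.
values = sorted({number_of_heads(o) for o in sample_space})
print(len(sample_space), values)
```

The random variable itself is not random: it is a fixed rule. The randomness comes from which outcome the experiment produces.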
The statistical framework discussed is based on the theory and practices in statistical modeling by H. M. Samadhi Chaturanga Rathnayake.
Two primary frameworks are used to represent random phenomena:
* Discrete: Mapping to exact, observable, and countable states (represented by a staircase metaphor).
* Continuous: Mapping to fluid, spectrum-based transitions (represented by a ramp metaphor).
Discrete Probability Distributions
Probability Mass Function (PMF): A function that gives the exact probability that a discrete random variable takes each specific value. The probabilities are non-negative and sum to 1 over all possible values.
Examples of Discrete Scenarios:
* Tossing a coin $n$ times results in a finite sample space: the number of heads is one of $0, 1, \dots, n$.
* Inspecting a batch of light bulbs for defects results in a whole-number count of defective bulbs, from $0$ up to the batch size.
Expected Value ($E[X]$): Determined for discrete variables by multiplying each possible outcome by its probability and summing the results ($E[X] = \sum_x x \, P(X = x)$).
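The multiply-and-sum rule above can be sketched in a few lines. The fair six-sided die is a hypothetical example, and `expected_value` is an illustrative helper name. Exact fractions are used so that the check that the probabilities sum to 1 is not disturbed by rounding.

```python
from fractions import Fraction

def expected_value(pmf):
    """Sum of outcome * probability over a dict {outcome: probability}."""
    if sum(pmf.values()) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(x * p for x, p in pmf.items())

# Hypothetical example: a fair six-sided die, each face with probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}
mean = expected_value(die)
print(mean)  # 7/2
```

Note that the expected value (3.5 here) need not be a value the variable can actually take.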
Discrete Uniform Distribution: Occurs when all outcomes have equal probability, such as each of the $n$ values in a set having probability $1/n$.
Geometric Distribution: Used to model the probability that the first success occurs on a specific trial in a sequence of independent trials (e.g., the first successful milk sample test occurs on exactly the $k$-th test).
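As a sketch of the geometric calculation: the first success lands on trial $k$ only if the first $k - 1$ trials fail and trial $k$ succeeds, so $P(X = k) = (1 - p)^{k-1} p$. The values `p = 0.2` and `k = 3` below are assumptions chosen for illustration.

```python
def geometric_pmf(k, p):
    """P(first success occurs on trial k) = (1 - p)**(k - 1) * p."""
    if k < 1:
        raise ValueError("k must be a positive trial number")
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    return (1 - p) ** (k - 1) * p

# Hypothetical numbers: each test succeeds with probability 0.2,
# and we ask for the first success on exactly the 3rd test.
prob = geometric_pmf(3, 0.2)
print(round(prob, 3))  # 0.128
```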
Negative Binomial Distribution: Used to find the probability that a specified number of successes is reached on a specific trial in a sequence of independent trials (e.g., the $r$-th success occurring on exactly the $n$-th trial).
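A minimal sketch of this calculation, using the "trial of the $r$-th success" form of the negative binomial: exactly $r - 1$ successes must fall somewhere in the first $n - 1$ trials, and trial $n$ must itself be a success. The function name and the example numbers are assumptions for illustration.

```python
from math import comb

def negative_binomial_pmf(n, r, p):
    """P(r-th success occurs on trial n) = C(n-1, r-1) * p**r * (1-p)**(n-r)."""
    if r < 1:
        raise ValueError("r must be at least 1")
    if n < r:
        return 0.0  # cannot have r successes in fewer than r trials
    return comb(n - 1, r - 1) * p**r * (1 - p) ** (n - r)

# Hypothetical numbers: probability that the 2nd success occurs on the 3rd trial
# when each trial succeeds with probability 0.5.
prob = negative_binomial_pmf(3, 2, 0.5)
print(prob)  # 0.25
```

With `r = 1` the formula reduces to the geometric distribution, which is a quick sanity check on any implementation.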
Poisson Distribution: A framework used to approximate binomial probabilities when the number of trials is very large (large $n$) and the probability of success is very small (small $p$), using the rate $\lambda = np$ and the mathematical constant $e$.
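The quality of the approximation can be checked numerically. The sketch below compares the binomial PMF with the Poisson PMF $P(X = k) = e^{-\lambda} \lambda^k / k!$ where $\lambda = np$. The values `n = 1000` and `p = 0.002` are assumptions chosen to represent "large n, small p".

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """Exact probability of k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """Poisson probability of k events when the mean count is lam."""
    return exp(-lam) * lam**k / factorial(k)

# Hypothetical setting: many trials, rare success.
n, p = 1000, 0.002
lam = n * p  # Poisson rate matched to the binomial mean

# Largest pointwise difference between the two PMFs for k = 0..10.
max_gap = max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(11))
print(max_gap)
```

Rerunning with a larger `p` (for example `n = 10, p = 0.2`, which gives the same `lam`) shows the gap growing, which is why the approximation is reserved for rare events.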
Continuous Probability Distributions
Probability Density Function (PDF): Measures probability over an interval using integration (calculus) rather than pinpointing exact values; the probability of any single exact value is $0$.
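The "probability is an area" idea can be sketched with a simple numerical integral. The density $f(x) = 2x$ on $[0, 1]$ is a hypothetical example, and the midpoint rule is just one convenient way to approximate the integral.

```python
def probability_between(pdf, a, b, steps=10_000):
    """Approximate the integral of pdf from a to b with the midpoint rule."""
    width = (b - a) / steps
    return sum(pdf(a + (i + 0.5) * width) for i in range(steps)) * width

def triangle_pdf(x):
    """Hypothetical density: f(x) = 2x on [0, 1], zero elsewhere."""
    return 2 * x if 0 <= x <= 1 else 0.0

p_interval = probability_between(triangle_pdf, 0.25, 0.75)  # area over an interval
p_point = probability_between(triangle_pdf, 0.5, 0.5)       # zero-width interval
total = probability_between(triangle_pdf, 0.0, 1.0)         # whole range
print(p_interval, p_point, total)
```

The zero-width interval has zero area even though the density there is positive, which is the sense in which a single exact value has probability 0.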
Examples of Continuous Variables: Student height, mango weight, and time, each of which can take any value within a range rather than only isolated values.
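Continuous Uniform Distribution:
* Models scenarios such as waiting for an elevator that is equally likely to arrive at any time between a minimum $a$ and a maximum $b$.
* Variance is calculated using the formula $\sigma^2 = (b - a)^2 / 12$, so it depends only on the width of the range.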
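A minimal sketch of the continuous uniform calculations, using the standard results mean $= (a + b)/2$ and variance $= (b - a)^2 / 12$. The waiting window of 0 to 6 minutes is an assumption for illustration, not a value from the notes.

```python
def uniform_mean(a, b):
    """Mean of a continuous uniform distribution on [a, b]."""
    return (a + b) / 2

def uniform_variance(a, b):
    """Variance of a continuous uniform distribution on [a, b]."""
    return (b - a) ** 2 / 12

def uniform_prob(a, b, lo, hi):
    """P(lo <= X <= hi): overlap of [lo, hi] with [a, b] divided by the width."""
    overlap = max(0.0, min(hi, b) - max(lo, a))
    return overlap / (b - a)

# Hypothetical elevator wait, uniform between 0 and 6 minutes.
a, b = 0.0, 6.0
print(uniform_mean(a, b), uniform_variance(a, b), uniform_prob(a, b, 0.0, 2.0))
```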
Exponential Distribution: Used to find probabilities regarding the waiting time between independent events, such as a highway patrol officer waiting for speeding cars that arrive at a known average rate per unit of time ($\lambda$).
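Waiting-time probabilities follow from the exponential CDF, $P(T \le t) = 1 - e^{-\lambda t}$. The sketch below assumes a hypothetical rate of 0.5 cars per minute; the rate used in the original example is not reproduced here.

```python
from math import exp

def exponential_cdf(t, rate):
    """P(T <= t) for an exponential waiting time with the given event rate."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return 1 - exp(-rate * t) if t >= 0 else 0.0

# Hypothetical rate: on average 0.5 speeding cars per minute.
rate = 0.5
mean_wait = 1 / rate                        # expected wait, in minutes
p_within_3 = exponential_cdf(3.0, rate)     # next car arrives within 3 minutes
p_longer_than_3 = 1 - p_within_3            # officer waits more than 3 minutes
print(mean_wait, p_within_3, p_longer_than_3)
```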
Normal Distribution (Bell Curve):
* Characterized by perfect symmetry where the mean ($\mu$), median, and mode are aligned at the highest peak.
* Empirical Rule: States that approximately 68%, 95%, and 99.7% of a population fall within 1, 2, and 3 standard deviations ($\sigma$) of the mean, respectively.
* Z-scores: A standardization method, $z = (x - \mu) / \sigma$, that translates data into units of standard deviations, allowing for scale-independent probability calculations using areas under the curve.
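Z-scores and areas under the normal curve can be sketched with the standard library alone. The standard normal CDF is written via the error function, $\Phi(z) = \tfrac{1}{2}\left(1 + \operatorname{erf}(z / \sqrt{2})\right)$. The example values (mean 170, standard deviation 5) are assumptions for illustration.

```python
from math import erf, sqrt

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma

def standard_normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def prob_within(k):
    """P(mean - k*sigma <= X <= mean + k*sigma) for any normal distribution."""
    return standard_normal_cdf(k) - standard_normal_cdf(-k)

# Hypothetical data: heights with mean 170 and standard deviation 5.
z = z_score(180, mu=170, sigma=5)
print(z, [round(prob_within(k), 4) for k in (1, 2, 3)])
```

Because the calculation runs entirely on z-scores, the same three areas apply to any normal distribution regardless of its units or scale.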
Questions & Discussion
Foundational Framework Debate: A core disagreement exists regarding whether discrete or continuous modeling is more fundamental.
* The discrete argument emphasizes the certainty of counting and physical occurrences (such as Planck lengths and atoms).
* The continuous argument emphasizes that reality is a seamless, analog spectrum and that discrete models are human-invented approximations.
Interplay of Models: The participants agree that both models are interconnected; for instance, the continuous waiting time in an exponential distribution is triggered by discrete events (passing cars).
Convergence on Statistical Metrics: Both frameworks share the same goal of identifying the center (Expected Value) and the spread (Variance) of data.
Closing Conclusion: A probability distribution is fundamentally an arrangement that bridges inputs and outputs; both the discrete staircase and the continuous ramp are considered vital for navigating the mathematics of reality.