Deterministic Phenomena: Outcomes are consistent under identical conditions; the process is predictable.
Example: A perfectly controlled physical experiment where measurements yield identical results repeatedly.
Stochastic Phenomena: Outcomes vary under identical conditions due to inherent randomness and unpredictability.
Example: Measurements affected by various uncontrollable noise factors that produce scatter in results.
Key Questions in Stochastic Systems:
How can we identify the true value amidst noise?
How can we model trends or relationships revealed by varying data?
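The contrast between the two kinds of phenomena, and the first key question, can be sketched with a small simulation. This is only an illustration: the true value, the noise level, and the Gaussian noise model are assumptions made for the example, not part of the notes above.

```python
import random
import statistics

TRUE_VALUE = 10.0  # hypothetical quantity being measured
NOISE_SD = 0.5     # hypothetical noise level (assumed Gaussian)

def deterministic_measurement():
    """Same conditions, same result every time."""
    return TRUE_VALUE

def stochastic_measurement(rng):
    """Same conditions, but each reading is disturbed by random noise."""
    return TRUE_VALUE + rng.gauss(0.0, NOISE_SD)

rng = random.Random(42)  # fixed seed so the sketch is repeatable
readings = [stochastic_measurement(rng) for _ in range(10_000)]

# Averaging many noisy readings gives an estimate of the true value.
estimate = statistics.fmean(readings)
print(f"first readings: {[round(r, 2) for r in readings[:5]]}")
print(f"estimate of true value: {estimate:.3f}")
```

Individual readings scatter around the true value, while their average settles close to it as the number of readings grows.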
Statistics: The science of collecting, analyzing, interpreting, and presenting empirical data.
Inferential Statistics: Drawing conclusions about populations from sample data.
Requires understanding the uncertainty introduced by sampling and estimating population parameters from the collected samples.
Types of Data Sets:
Univariate: Observations on a single variable (e.g., battery lifetimes).
Bivariate: Observations on two variables simultaneously (e.g., height and weight).
Multivariate: Observations on multiple variables, allowing the study of relationships between them.
Methods:
Retrospective Studies: Analyzing historical data.
Observational Studies: Observing behavior without interference.
Designed Experiments: Controlled experiments where variables are deliberately manipulated.
Example of Observational Study:
Examining stress levels among different smoking habit groups.
Dotplots: Each data point represented by a dot; good for small datasets.
Histograms: Bars represent the frequency of data values in each interval; useful for visualizing distribution shapes (e.g., unimodal, bimodal).
Time Series Plots: Visualize how data points vary over time, revealing trends and patterns.
Scatter Plots: Used for bivariate data to show potential relationships between two variables.
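As a rough illustration of what a histogram computes, the sketch below groups values into equal-width bins and prints one row of stars per bin. The data values, the bin width, and the function names are made up for the example; a plotting library would normally draw the bars.

```python
from collections import Counter

def bin_counts(values, bin_width):
    """Count how many values fall in each bin [start, start + bin_width)."""
    return dict(Counter((v // bin_width) * bin_width for v in values))

def text_histogram(values, bin_width):
    """Return one text row per bin, including empty bins in between."""
    counts = bin_counts(values, bin_width)
    lines = []
    for start in range(min(counts), max(counts) + bin_width, bin_width):
        bar = "*" * counts.get(start, 0)
        lines.append(f"{start:>3} to {start + bin_width:<3}| {bar}")
    return lines

# Hypothetical integer-valued observations.
lifetimes = [12, 15, 11, 19, 22, 14, 13, 27, 16, 12]
for line in text_histogram(lifetimes, 5):
    print(line)
```

The longest row marks the most frequent interval, so the shape of the distribution (here a single peak with a tail to the right) can be read directly from the rows.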
Mean: The average of a data set, sensitive to outliers.
Median: The middle value in a numerically ordered list, more robust to outliers.
Comparison: In symmetric distributions, the mean and median are close; in skewed distributions, the mean is pulled toward the longer tail and can differ substantially from the median.
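The sensitivity of the mean to outliers can be seen in a small sketch using Python's standard `statistics` module. The data values are hypothetical.

```python
import statistics

# Hypothetical measurements, then the same data with one extreme value added.
values = [10, 12, 11, 13, 12]
with_outlier = values + [100]

mean_clean = statistics.mean(values)
median_clean = statistics.median(values)
mean_outlier = statistics.mean(with_outlier)
median_outlier = statistics.median(with_outlier)

print(f"without outlier: mean={mean_clean:.2f}, median={median_clean}")
print(f"with outlier:    mean={mean_outlier:.2f}, median={median_outlier}")
```

A single extreme value more than doubles the mean in this example, while the median stays at 12.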
Variance: A measure of how far data points spread around the mean, calculated as the average of the squared deviations (the sample variance divides by n - 1 rather than n).
Standard Deviation: The square root of variance; gives a measure of spread in the same units as the data.
Interquartile Range (IQR): Difference between the upper and lower quartiles; a robust measure of spread, often used to flag outliers (e.g., points more than 1.5 × IQR beyond the quartiles).
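The three measures of spread can be computed as in the sketch below, again with the standard `statistics` module. The data are hypothetical, the helper name `iqr_outliers` is invented for the example, and the 1.5 × IQR cutoff is a common convention rather than the only possible rule. Quartile values also depend on the interpolation method; this uses the default of `statistics.quantiles`.

```python
import statistics

def iqr_outliers(values):
    """Return values lying more than 1.5 * IQR beyond the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations

pop_var = statistics.pvariance(data)    # divides by n
sample_var = statistics.variance(data)  # divides by n - 1
sample_sd = statistics.stdev(data)      # square root of the sample variance

print(f"population variance: {pop_var}")
print(f"sample variance:     {sample_var:.3f}")
print(f"sample std dev:      {sample_sd:.3f}")
print(f"outliers:            {iqr_outliers(data + [30])}")
```

The two variance functions differ only in the divisor, and the standard deviation is expressed in the same units as the data.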
Correlation Coefficient (r): Measures the strength and direction of the linear relationship between two variables (-1 to 1 scale).
r > 0 implies positive correlation; r < 0 implies negative correlation; r near 0 implies no linear correlation, though a nonlinear relationship may still exist.
Cauchy–Bunyakovsky–Schwarz Inequality: Bounds the magnitude of the covariance of two variables by the product of their standard deviations, which guarantees that r lies between -1 and 1.
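One common way to write the inequality for paired data, and the bound on r that follows from it, is shown below. The symbols S_xx, S_yy, and S_xy are notation introduced here for the sums of squared deviations and cross-products; they are not taken from the notes above.

```latex
% Paired observations (x_i, y_i), i = 1..n, with sample means \bar{x}, \bar{y}
S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2, \qquad
S_{yy} = \sum_{i=1}^{n} (y_i - \bar{y})^2, \qquad
S_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})

% Cauchy--Bunyakovsky--Schwarz inequality applied to the deviations
S_{xy}^{2} \le S_{xx}\, S_{yy}

% Consequence for the correlation coefficient (assuming S_{xx}, S_{yy} > 0)
r = \frac{S_{xy}}{\sqrt{S_{xx}\, S_{yy}}}
\quad\Longrightarrow\quad -1 \le r \le 1
```

Equality holds exactly when the deviations of y are proportional to the deviations of x, that is, when all points lie on a straight line.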
Practical Example of Correlation: In measuring the relationship between pull strength and wire length, correlation coefficients quantify the strength of linear relationships.
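A minimal sketch of the calculation follows. The wire length and pull strength numbers are invented purely for illustration and are not measurements from any study; `pearson_r` is a name chosen for this example.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired observations (not real data).
wire_length = [2, 4, 6, 8, 10]
pull_strength = [10.1, 14.8, 20.3, 24.9, 30.2]
r_linear = pearson_r(wire_length, pull_strength)

# A perfect but nonlinear relationship (y = x^2) can still give r = 0.
xs = [-2, -1, 0, 1, 2]
r_curved = pearson_r(xs, [x ** 2 for x in xs])

print(f"nearly linear data: r = {r_linear:.4f}")
print(f"quadratic data:     r = {r_curved:.4f}")
```

The second case is a reminder that r measures linear association only, so a scatter plot should be inspected alongside the coefficient.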
Conclusion: Understanding these statistical principles allows for better data interpretation and improved decision making in engineering processes.