GES 500: Engineering Statistics - Key Concepts

Chapter 1: Descriptive Statistics

1.1 Deterministic vs. Stochastic Phenomena
  • Deterministic Phenomena: Outcomes are consistent under identical conditions; the process is predictable.

    • Example: A perfectly controlled physical experiment where measurements yield identical results repeatedly.

  • Stochastic Phenomena: Outcomes vary under identical conditions due to inherent randomness and unpredictability.

    • Example: Measurements affected by various uncontrollable noise factors that produce scatter in results.

  • Key Questions in Stochastic Systems:

    • How can we identify the true value amidst noise?

    • How can we model trends or relationships revealed by varying data?
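
The contrast between the two kinds of phenomena, and the first key question, can be illustrated with a small simulation. This is only a sketch under stated assumptions: the true value (10.0), the Gaussian noise model, and the function name `simulate_measurements` are all hypothetical choices made for illustration, not part of the course material.

```python
import random
import statistics


def simulate_measurements(true_value, noise_sd, n, seed=0):
    """Return n readings of true_value corrupted by Gaussian noise.

    Assumption: noise is additive and normally distributed. With
    noise_sd == 0 the process is deterministic (identical readings).
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [true_value + rng.gauss(0.0, noise_sd) for _ in range(n)]


# Deterministic case: every reading is the same.
deterministic = simulate_measurements(true_value=10.0, noise_sd=0.0, n=5)

# Stochastic case: readings scatter around the true value.
readings = simulate_measurements(true_value=10.0, noise_sd=0.5, n=1000)

# Averaging many noisy readings is one simple way to estimate the true value.
estimate = statistics.mean(readings)
print(f"distinct deterministic readings: {len(set(deterministic))}")
print(f"estimate from noisy readings: {estimate:.3f}")
```

Averaging works here because the simulated noise has zero mean; a systematic bias in the measurements would not be removed this way.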

1.2 Descriptive and Inferential Statistics
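  • Statistics: The science of collecting, analyzing, interpreting, and presenting empirical data.

  • Descriptive Statistics: Summarizing and presenting the data at hand with numerical measures and plots, without generalizing beyond the sample.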

  • Inferential Statistics: Drawing conclusions about populations from sample data:

    • Requires understanding the uncertainty in sampling and predicting population parameters based on collected samples.

  • Types of Data Sets:

    • Univariate: Observations on a single variable (e.g., battery lifetimes).

    • Bivariate: Observations on two variables simultaneously (e.g., height and weight).

    • Multivariate: Observations on multiple variables, allowing the study of relationships between them.

1.3 Collection of Data
  • Methods:

    1. Retrospective Studies: Analyzing historical data.

    2. Observational Studies: Observing behavior without interference.

    3. Designed Experiments: Controlled experiments where variables are deliberately manipulated.

  • Example of Observational Study:

    • Examining stress levels among different smoking habit groups.

1.4 Presentation of Data
  • Dotplots: Each data point represented by a dot; good for small datasets.

  • Histograms: Bars represent the frequency of data values falling in each interval (bin); useful for visualizing distribution shapes (e.g., unimodal, bimodal).

  • Time Series Plots: Visualize how data points vary over time, revealing trends and patterns.

  • Scatter Plots: Used for bivariate data to show potential relationships between two variables.
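
The binning step behind a histogram can be sketched in a few lines. The battery-lifetime values below are made-up illustrative numbers, and the helper names `histogram_counts` and `render_histogram` are hypothetical; a plotting library would normally do this work.

```python
def histogram_counts(data, bin_width, start):
    """Count observations per bin [start + k*w, start + (k+1)*w)."""
    counts = {}
    for x in data:
        k = int((x - start) // bin_width)  # index of the bin containing x
        counts[k] = counts.get(k, 0) + 1
    return counts


def render_histogram(counts, bin_width, start):
    """Draw one row of asterisks per bin, including empty bins in between."""
    if not counts:
        return ""
    lines = []
    for k in range(min(counts), max(counts) + 1):
        low = start + k * bin_width
        bar = "*" * counts.get(k, 0)
        lines.append(f"[{low:.1f}, {low + bin_width:.1f}) {bar}")
    return "\n".join(lines)


# Made-up battery lifetimes (hours), for illustration only.
lifetimes = [4.1, 4.5, 5.2, 5.8, 5.9, 6.1, 6.3, 6.4, 7.0, 8.7]
counts = histogram_counts(lifetimes, bin_width=1.0, start=4.0)
print(render_histogram(counts, bin_width=1.0, start=4.0))
```

The choice of bin width and starting point is a judgment call; different choices can make the same data look more or less peaked.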

1.5 Data Analysis: Location of Data
  • Mean: The average of a data set, sensitive to outliers.

  • Median: The middle value in a numerically ordered list, more robust to outliers.

  • Comparison: In symmetric distributions the mean and median are close. In skewed distributions the mean is pulled toward the longer tail, so it typically exceeds the median under right skew and falls below it under left skew.
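
The difference in outlier sensitivity can be checked with Python's standard `statistics` module. The numbers are made up for illustration; the single value 100 plays the role of an outlier.

```python
import statistics

# Made-up sample, then the same sample with one extreme value appended.
data = [10, 12, 11, 13, 12]
with_outlier = data + [100]

mean_clean = statistics.mean(data)
median_clean = statistics.median(data)
mean_outlier = statistics.mean(with_outlier)
median_outlier = statistics.median(with_outlier)

# The mean shifts substantially; the median does not move.
print(f"clean:        mean={mean_clean:.2f}, median={median_clean}")
print(f"with outlier: mean={mean_outlier:.2f}, median={median_outlier}")
```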

1.6 Data Analysis: Variability of Data
  • Variance: A measure of how far data points spread from the mean. The population variance is the average of the squared deviations; the sample variance divides the sum of squared deviations by n - 1 instead of n.

  • Standard Deviation: The square root of variance; gives a measure of spread in the same units as the data.

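  • Interquartile Range (IQR): Difference between the upper and lower quartiles; a spread measure that is robust to extreme values and is commonly used to flag outliers (e.g., points more than 1.5 × IQR beyond the quartiles).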
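
These three measures can be computed with the standard `statistics` module (Python 3.8 or later for `statistics.quantiles`). This is a sketch on made-up data; the helper `iqr_outliers` is a hypothetical name, the 1.5 × IQR rule is one common convention, and quartile definitions differ between textbooks and software, so IQR values may not match other tools exactly.

```python
import statistics

# Made-up sample for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

pop_var = statistics.pvariance(data)   # divides by n
samp_var = statistics.variance(data)   # divides by n - 1
samp_sd = statistics.stdev(data)       # square root of the sample variance

# quantiles(..., n=4) returns the three quartile cut points (Q1, Q2, Q3).
q1, _, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1


def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the quartiles."""
    lower_q, _, upper_q = statistics.quantiles(values, n=4)
    spread = upper_q - lower_q
    low, high = lower_q - k * spread, upper_q + k * spread
    return [v for v in values if v < low or v > high]


print(f"population variance: {pop_var}")
print(f"sample variance:     {samp_var:.3f}")
print(f"sample std. dev.:    {samp_sd:.3f}")
print(f"IQR:                 {iqr}")
print(f"flagged outliers:    {iqr_outliers(data + [20])}")
```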

1.7 Data Analysis: Relationships in Data
  • Correlation Coefficient (r): Measures the strength and direction of the linear relationship between two variables (-1 to 1 scale).

    • r > 0 implies positive correlation; r < 0 implies negative correlation; r near 0 implies no linear correlation, although a nonlinear relationship may still exist.

  • Cauchy–Bunyakovsky–Schwarz Inequality: States that the squared covariance of two variables cannot exceed the product of their variances; this is what guarantees that r always lies between -1 and 1.

  • Practical Example of Correlation: In measuring the relationship between pull strength and wire length, correlation coefficients quantify the strength of linear relationships.

  • Conclusion: Understanding these statistical principles allows for better data interpretation and improved decision making in engineering processes.
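
The correlation coefficient from this section can be sketched directly from its definition: the sum of cross-deviations divided by the square root of the product of the sums of squared deviations. The function name `pearson_r` and all data values are hypothetical and for illustration only.

```python
import math


def pearson_r(xs, ys):
    """Sample correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable is constant")
    # Cauchy-Schwarz gives sxy**2 <= sxx * syy, so the result is in [-1, 1]
    # (up to floating-point rounding).
    return sxy / math.sqrt(sxx * syy)


# Made-up data: an increasing line, a decreasing line, and a parabola.
r_up = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
r_down = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])
r_curve = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])

print(f"increasing line: r = {r_up:.3f}")
print(f"decreasing line: r = {r_down:.3f}")
print(f"parabola:        r = {r_curve:.3f}")
```

The parabola case shows why r near 0 only rules out a linear relationship: y is completely determined by x there, yet the linear correlation vanishes.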