Summary of Statistics Concepts
What is Statistics?
- Statistics: The science of collecting, analyzing, presenting, and interpreting data.
- Types of data:
- Quantitative: Measures how much or how many.
- Qualitative: Labels for categories of like items.
Types of Statistics
Descriptive Statistics: Organizes and summarizes data.
- Uses tables, graphs, and summary measures (e.g., averages) to describe data.
- Descriptive values:
- Parameter: Descriptive value for a population.
- Statistic: Descriptive value for a sample.
Inferential Statistics: Uses sample data to make inferences about populations.
- Conclusions carry uncertainty because a sample gives only limited information about the population.
- Examples: Hypothesis testing, Regression Analysis.
Population and Sample
- Population: The entire group of individuals of interest (e.g., all freshmen, in a study of class size and academic performance).
- Sample: A subset of the population selected for study, ideally representative of it.
Variables in Statistics
- Variable: Any characteristic that can be measured or counted (e.g., age, income).
- Types of Quantitative Variables:
- Discrete: Countable values in separate, indivisible units (e.g., class size).
- Continuous: Infinitely divisible (e.g., time, weight).
Levels of Measurement
- Nominal: Categorizes and labels variables without order.
- Ordinal: Ranks categories in order (e.g., satisfaction levels).
- Interval: Equal intervals between measurements but no true zero (e.g., temperature in °C or °F).
- Ratio: All properties of interval plus true zero (e.g., weight, height).
Sample Size
- Sample size is the number of members of the population included in the study.
- Use Slovin's formula:
n = \frac{N}{1 + Ne^2}
- Where:
- n = Sample Size
- N = Total Population
- e = Margin of Error
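A minimal sketch of Slovin's formula in Python. The function name `slovin_sample_size` and the choice to round up to the next whole respondent are assumptions made for illustration; the notes do not specify a rounding rule.

```python
import math


def slovin_sample_size(population: int, margin_of_error: float) -> int:
    """Slovin's formula n = N / (1 + N * e^2), rounded up (rounding is an assumption)."""
    if population <= 0:
        raise ValueError("population must be positive")
    if not 0 < margin_of_error < 1:
        raise ValueError("margin_of_error must be between 0 and 1")
    return math.ceil(population / (1 + population * margin_of_error ** 2))


# Example: N = 1000, e = 0.05 gives 1000 / 3.5 = 285.71..., rounded up to 286.
n = slovin_sample_size(1000, 0.05)
print(n)
```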
Probability Sampling Techniques
- Simple Random Sampling: Each member has an equal chance of being included.
- Systematic Sampling: Every k-th member is selected from a list after a random start, where k = N/n.
- Stratified Sampling: Population divided into strata, samples taken from each stratum.
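The systematic and stratified techniques above can be sketched as follows. The helper names are hypothetical, the interval k is floored when N is not a multiple of n, and proportional allocation across strata is an assumption (the notes do not say how many members to draw from each stratum).

```python
import random


def systematic_sample(members, n, rng):
    """Pick every k-th member after a random start, with k = N // n."""
    k = len(members) // n          # sampling interval (floored)
    start = rng.randrange(k)       # random start within the first interval
    return [members[start + i * k] for i in range(n)]


def stratified_sample(strata, n, rng):
    """Draw a simple random sample from each stratum, proportional to its size.

    Because each share is rounded, the total may differ slightly from n.
    """
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        share = round(n * len(members) / total)
        sample[name] = rng.sample(members, share)
    return sample


rng = random.Random(0)  # fixed seed so the sketch is repeatable
population = list(range(100))
picked = systematic_sample(population, 10, rng)

strata = {"first_year": list(range(60)), "second_year": list(range(60, 100))}
by_stratum = stratified_sample(strata, 10, rng)
```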
Non-Probability Sampling Techniques
- Includes quota, purposive, and convenience sampling.
Frequency Distribution
- Shows how often each distinct value (or class interval) occurs in a data set.
- Grouped and Ungrouped types exist.
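A short sketch of both types, using made-up scores. The class width of 10 and the helper name `grouped_frequencies` are illustrative assumptions, and the interval labels assume integer-valued data.

```python
from collections import Counter

scores = [72, 85, 85, 90, 72, 85, 61, 90, 78, 85]  # made-up data

# Ungrouped: one count per distinct value.
ungrouped = Counter(scores)


def grouped_frequencies(values, width):
    """Count values per class interval [low, low + width - 1] (integer data assumed)."""
    counts = Counter((v // width) * width for v in values)
    return {(low, low + width - 1): counts[low] for low in sorted(counts)}


# Grouped: counts per class interval of width 10.
grouped = grouped_frequencies(scores, 10)

for (low, high), freq in grouped.items():
    print(f"{low}-{high}: {freq}")
```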
Graphical Presentations
- Present data in bar graphs, pie charts, etc.
Descriptive Statistics Measures
- Mean: Arithmetic average (sum of the values divided by the number of values).
- Median: Middle value when the scores are ordered.
- Mode: Most frequent score.
- Range: Difference between largest and smallest values.
- Standard Deviation: Indicates variability around the mean.
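These measures can be computed with Python's standard `statistics` module. The data below are made up; both the sample standard deviation (divides by n - 1) and the population standard deviation (divides by n) are shown, since the notes do not say which one is meant.

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data

mean = statistics.mean(scores)            # sum 40 over 8 values
median = statistics.median(scores)        # average of the two middle values
mode = statistics.mode(scores)            # most frequent value
value_range = max(scores) - min(scores)   # largest minus smallest
sample_sd = statistics.stdev(scores)      # sample SD, divides by n - 1
population_sd = statistics.pstdev(scores) # population SD, divides by n

print(mean, median, mode, value_range, sample_sd, population_sd)
```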
Normal Distribution
- Bell-shaped curve where most data cluster around the mean.
- Characteristics: Symmetric, unimodal, mean=median=mode.
Z-Scores
- Measures how many standard deviations a score is from the mean.
- Formula: z = \frac{x - \bar{x}}{s}
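A minimal sketch of the z-score formula, assuming the sample mean and the sample standard deviation s (n - 1 in the denominator), as the symbols in the formula suggest. The function name `z_scores` is illustrative.

```python
import statistics


def z_scores(values):
    """Return z = (x - mean) / s for every x, using the sample standard deviation."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)
    return [(x - mean) / s for x in values]


# Mean is 20 and s is 10, so the scores sit at -1, 0, and +1 standard deviations.
z = z_scores([10, 20, 30])
print(z)
```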
Correlation
- Measures the strength and direction of the relationship between two variables.
- Pearson's r: Measures linear correlation.
r = \frac{n(\Sigma xy) - (\Sigma x)(\Sigma y)}{\sqrt{[n(\Sigma x^2) - (\Sigma x)^2][n(\Sigma y^2) - (\Sigma y)^2]}}
- Spearman's rank: Non-parametric measure of correlation.
\rho = 1 - \frac{6 \Sigma d^2}{n(n^2 - 1)}
- Where d is the difference between the two ranks of each pair and n is the number of pairs.
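Both formulas can be sketched directly from their definitions. The function names are illustrative, the data are made up, and the simple `ranks` helper assumes there are no tied values (the d^2 shortcut for Spearman's rho is exact only in that case).

```python
import math


def pearson_r(x, y):
    """Pearson's r from the computational (sums) formula."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


def ranks(values):
    """Rank values from 1 (smallest) to n; assumes no ties."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def spearman_rho(x, y):
    """Spearman's rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), with d the rank difference."""
    n = len(x)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]            # perfectly linear in x
r = pearson_r(x, y)
rho = spearman_rho(x, y)
rho_reversed = spearman_rho(x, y[::-1])
print(r, rho, rho_reversed)
```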