Definition: Statistics is the study of data, including how to collect, analyze, and interpret information to make informed decisions.
Datasets: Composed of cases (subjects or units) which can be people, animals, or objects.
Variables: Characteristics of cases that can take on various values. Examples include:
Age
Gender
IQ Scores
Test Scores
Labels: Special types of variables used to uniquely identify cases (e.g., participant numbers), which hold no substantive meaning.
Categorical/Qualitative Variables:
Nominal: No order or measurable unit (e.g., ethnicity, gender).
Ordinal: Have an order but no measurable unit (e.g., family position: youngest, middle, oldest).
Quantitative Variables:
Interval: Ordered values with equal spacing between them but no true zero point (e.g., IQ).
Ratio: Ordered, equally spaced values with a meaningful zero point (e.g., age).
Precision of Measurement Levels: Nominal is the least precise; ratio is the most precise.
Identifying Variables:
Who?: Define the subjects or cases.
What?: Determine measurable characteristics (variables).
Why?: Understand the reasoning behind data collection.
Data Distribution: After identifying variables, examine their distribution.
Methods for Categorical Variables:
Pie Chart: Shows categories and their relative frequencies (the slices should sum to 100%).
Bar Graph: Displays the categories on the x-axis and the frequency of each on the y-axis.
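Example: the frequencies and relative frequencies that a pie chart or bar graph displays can be tabulated directly. This is a minimal sketch in Python; the eye-colour data and the variable names are made up for illustration.

```python
from collections import Counter

# Hypothetical nominal data: eye colour recorded for 10 cases.
eye_colors = ["brown", "blue", "brown", "green", "brown",
              "blue", "brown", "blue", "green", "brown"]

counts = Counter(eye_colors)      # frequency of each category (bar heights)
total = sum(counts.values())      # number of cases

# Relative frequency of each category as a percentage (pie slices).
relative = {category: 100 * n / total for category, n in counts.items()}

for category, pct in relative.items():
    print(f"{category}: {counts[category]} cases, {pct:.0f}%")
```

The percentages add up to 100, which is the check mentioned for pie charts.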
Methods for Quantitative Variables:
Histogram: Groups values into intervals and draws one bar per interval, with the bar height showing the frequency.
Stem Plot: Displays the original data values, each split into a stem (leading digits) and a leaf (final digit).
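Example: a stem plot can be built by hand or with a few lines of code. The sketch below assumes non-negative whole numbers, uses the tens digit(s) as the stem and the units digit as the leaf, and works on made-up test scores; the function name stem_plot is hypothetical.

```python
from collections import defaultdict

def stem_plot(values):
    """Return stem-and-leaf lines for non-negative integers (stem = tens, leaf = units)."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    lines = []
    # Include every stem between the smallest and largest, even if it has no leaves.
    for stem in range(min(stems), max(stems) + 1):
        leaves = "".join(str(leaf) for leaf in stems.get(stem, []))
        lines.append(f"{stem} | {leaves}")
    return lines

scores = [62, 75, 78, 81, 84, 84, 89, 93, 97]  # hypothetical test scores
plot_lines = stem_plot(scores)
print("\n".join(plot_lines))
```

Each row of the plot also works like a histogram bar with an interval width of 10, since the number of leaves is the frequency for that stem.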
Measures of Center:
Mean: Sum of the values divided by the number of values (meaningful only for interval and ratio variables).
Median: Middle value of the ordered data; if the number of values is even, average the two middle values.
Mode: Most frequently occurring value.
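Example: the three measures of center can be checked with Python's standard statistics module. The scores below are made up for illustration; the list has an even number of values, so the median is the average of the two middle ones.

```python
import statistics

scores = [60, 70, 80, 90, 90, 96]  # hypothetical test scores

mean = statistics.mean(scores)      # sum of values / number of values
median = statistics.median(scores)  # even count: average of 80 and 90
mode = statistics.mode(scores)      # most frequently occurring value

print(mean, median, mode)
```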
Quartiles:
Q1: First quartile (about 25% of the data fall at or below it).
Q2: Median (about 50% of the data fall at or below it).
Q3: Third quartile (about 75% of the data fall at or below it).
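Example: quartiles can be computed with statistics.quantiles from the Python standard library. Textbooks and software use slightly different conventions for quartiles, so results can differ a little between sources; this sketch assumes the "inclusive" method and uses made-up ages.

```python
import statistics

ages = [21, 23, 24, 27, 30, 31, 35, 38, 42]  # hypothetical ages

# n=4 splits the ordered data into four parts and returns the three cut points.
q1, q2, q3 = statistics.quantiles(ages, n=4, method="inclusive")

print(q1, q2, q3)
```

Q2 from this call matches statistics.median(ages), as expected from the definition.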
Measures of Spread:
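Variance: Average of the squared differences from the mean (for a sample, the sum of squared differences is divided by n - 1 rather than n).
Standard Deviation: Square root of the variance, indicating the typical distance of values from the mean in the original units.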
Range: Difference between maximum and minimum scores.
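Example: the measures of spread can be computed with the Python statistics module. The data are made up; the sketch shows both the population versions (divide by n) and the sample variance (divide by n - 1), since which one applies depends on whether the data are a whole population or a sample.

```python
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores, mean = 5

pop_variance = statistics.pvariance(values)    # sum of squared differences / n
pop_sd = statistics.pstdev(values)             # square root of pop_variance
sample_variance = statistics.variance(values)  # sum of squared differences / (n - 1)
value_range = max(values) - min(values)        # maximum minus minimum

print(pop_variance, pop_sd, sample_variance, value_range)
```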
Five-Number Summary Components:
Minimum
First Quartile (Q1)
Median (Q2)
Third Quartile (Q3)
Maximum
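Example: the five values listed above can be collected in one step. This is a sketch with made-up weights; the helper name five_number_summary is hypothetical, and the quartiles again assume the "inclusive" convention of statistics.quantiles.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (min(values), q1, median, q3, max(values))

weights = [48, 52, 55, 60, 61, 67, 70, 74, 90]  # hypothetical weights in kg
summary = five_number_summary(weights)
print(summary)
```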
Properties of Density Curves:
Describes the overall pattern of the distribution of a quantitative variable.
Area under the curve equals 1 (representing total probability).
Normal Distribution Characteristics:
Symmetrical shape
Single peak, located at the mean
Bell-shaped curve
Notation: Denoted as N(µ, σ), where µ is the mean and σ is the standard deviation.
Relevance: Many real-world variables are approximately normal, so the distribution can be used to compute probabilities and support statistical conclusions.
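Example: probabilities under a normal curve can be computed with statistics.NormalDist from the Python standard library. The sketch assumes IQ follows N(100, 15), a common convention used here only for illustration. It also checks numerically that the area under the density curve is 1.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)  # assumed parameters for illustration

# Probability of a value within one standard deviation of the mean.
within_one_sd = iq.cdf(115) - iq.cdf(85)

# Probability of a value above 130 (two standard deviations above the mean).
above_130 = 1 - iq.cdf(130)

# Approximate the total area under the density curve with the trapezoid rule,
# covering 8 standard deviations on each side of the mean.
lo = iq.mean - 8 * iq.stdev
hi = iq.mean + 8 * iq.stdev
steps = 10_000
width = (hi - lo) / steps
area = sum(
    (iq.pdf(lo + i * width) + iq.pdf(lo + (i + 1) * width)) / 2 * width
    for i in range(steps)
)

print(round(within_one_sd, 4), round(above_130, 4), round(area, 4))
```

Because the curve is symmetric about the mean, the probability below 85 equals the probability above 115.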