Math 1020 Final Exam Study Guide

Overview of Key Concepts in Statistics

The study guide outlines foundational concepts applicable to statistics and probability, specifically tailored for Math 1020 and Stat 2010 courses.

Chapter 1: Concepts and Applications

Major themes of statistical analysis, including:
- Parameters vs. Statistics: Distinction between population parameters (true values) and sample statistics (estimates from samples).
- Populations vs. Samples: Definitions and importance of samples in relation to populations.

1.1 Random Sampling and Types of Variables

Random Sampling: Importance and techniques for obtaining representative samples.
Types of Variables:
- Quantitative: Numerical values, can be measured.
- Qualitative: Categorical values, can be classified.

1.2 The Scientific Method and Experimental Design

Scientific Method: Steps include hypothesis formulation, experimentation, data collection, and analysis.
Good Experimental Design: Key aspects include control groups, randomization, and replication.
Alternatives to Experiments: Observational studies and case-control studies.

1.3 Data Visualization

Histograms:
- Construction: Steps to create histograms, including bin width selection.
- Types and Shapes: Characteristics of normal, skewed, and uniform distributions.
Graphs:
- Types: Bar graphs, line graphs, circle (pie) graphs, and Pareto bar graphs.
- When to use each type for effective data presentation.
Stem-and-Leaf Plots: Method for displaying quantitative data while retaining original values.

1.4 Descriptive Statistics

Methods for calculating:
- Central Tendency:
  - Mean: Average of a data set.
  - Median: Midpoint value when data is ordered.
  - Mode: Most frequent value in the data set.
  - Weighted Average: Average that accounts for varying weights.
- Dispersion:
  - Range: Difference between maximum and minimum values.
  - Variance: Measure of how much values differ from the mean, defined as:
    $Variance = \frac{\text{Sum of } (x - \bar{x})^2}{N-1}$
  - Standard Deviation (SD): Square root of variance, indicating variability.
  - Coefficient of Variation (CV): Ratio of the standard deviation to the mean, expressed as a percentage.
  - Interquartile Range (IQR): Difference between the third quartile (Q3) and the first quartile (Q1), that is, Q3 - Q1.
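
As a rough illustration of the measures listed above, the following sketch computes them with Python's standard `statistics` module. The data set, the scores, and the weights are made up for this example. Quartile conventions differ between textbooks, so the `"inclusive"` method used here is an assumption and may not match the method taught in the course.

```python
import statistics

# Hypothetical data set, chosen only for illustration.
data = [4, 8, 6, 5, 7]

mean = statistics.mean(data)
median = statistics.median(data)
data_range = max(data) - min(data)

# Sample variance and standard deviation (divide by N - 1).
variance = statistics.variance(data)
sd = statistics.stdev(data)

# Coefficient of variation, expressed as a percentage.
cv_percent = sd / mean * 100

# Quartiles: the "inclusive" method is one of several conventions.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Weighted average with hypothetical scores and weights.
scores = [80, 90]
weights = [0.4, 0.6]
weighted_avg = sum(w * s for w, s in zip(weights, scores)) / sum(weights)

print(mean, median, data_range, variance, round(sd, 4), iqr, weighted_avg)
```

For this data set the mean and median are both 6, the range is 4, and the sample variance is 2.5.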

1.5 Quartiles and Percentiles

Quartiles: Division of a data set into four equal parts, with Q1, Q2 (median), Q3 as key points.
Percentiles: Values that divide a data set into 100 equal parts. Useful for describing where a value ranks relative to the rest of the data.
Boxplots: Constructing and interpreting boxplots to visualize data dispersion.

1.6 Outliers and Chebychev's Inequality

Outliers: Definitions and impact on statistical analyses.
Chebychev's Inequality: States that for any distribution, the proportion of values that lie within k standard deviations of the mean is at least $1 - \frac{1}{k^2}$ for k > 1.
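
The bound in Chebychev's Inequality is simple to evaluate directly. This is a minimal sketch; the function name `chebyshev_lower_bound` is a hypothetical helper, not part of any library.

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("The bound is only informative for k > 1")
    return 1 - 1 / k**2

# Example values of k, chosen for illustration.
for k in (1.5, 3):
    print(k, round(chebyshev_lower_bound(k), 4))
```

For k = 3 the bound is 8/9, or about 88.9%, which holds for any distribution. For a normal distribution the actual proportion within three standard deviations is much higher (about 99.7%).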

1.7 Correlation and Coefficient of Determination

Understanding Correlation: Measures the strength and direction of a linear relationship between two variables.
Coefficient of Determination (R²): Proportion of variance in the dependent variable predictable from the independent variable.

1.8 Scatterplots and Regression

Scatterplots: Graphical representation of the relationship between two quantitative variables.
Regression Lines: Fitting a line through data points; the least squares method is commonly used.
- Interpolation: Estimating values within the range of data.
- Extrapolation: Estimating values outside the range of the data, with greater risk of error.
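
The sketch below shows one way to compute the correlation coefficient, the coefficient of determination, and a least squares regression line by hand. The five (x, y) pairs are hypothetical, and `predict` is an illustrative helper name.

```python
import math

# Hypothetical paired data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squares and cross-products about the means.
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)

r = sxy / math.sqrt(sxx * syy)   # correlation coefficient
r_squared = r ** 2               # coefficient of determination

# Least squares line: y = intercept + slope * x
slope = sxy / sxx
intercept = y_bar - slope * x_bar

def predict(x_new):
    """Predicted y for a given x using the fitted line."""
    return intercept + slope * x_new

print(round(r, 4), round(r_squared, 2), round(slope, 2), round(intercept, 2))
print(round(predict(3.5), 2))   # interpolation: 3.5 lies inside the x range
```

Here the fitted line is y = 2.2 + 0.6x with r of about 0.77. Calling `predict(20)` would be extrapolation, because 20 is far outside the observed x values of 1 to 5, so the result should be treated with caution.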

1.9 Probability Foundations

Sample Spaces and Events: Definitions and examples of outcomes, events, and simple events.
Long Run Relative Frequency: Concept that probability can be understood as the limit of the relative frequency of an event as the number of trials approaches infinity.
- Defined terms:
  - Impossible Event: Probability of 0.
  - Certain Event: Probability of 1.

1.10 Probability Rules

Equally Likely Rule: Probability of an event is the ratio of favorable outcomes to total outcomes.
Not Rule / And Rule / Or Rule:
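- These apply to complementary events (not A), joint events (A and B), and unions of events (A or B), respectively: $P(\text{not } A) = 1 - P(A)$; $P(A \text{ and } B) = P(A) \times P(B|A)$; $P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)$.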
Law of Large Numbers: Convergence of relative frequencies toward probabilities as trials increase.
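
To make the Equally Likely, Not, And, and Or rules concrete, here is a small sketch using one roll of a fair six-sided die. The events A and B are chosen only for illustration, and `prob` is a hypothetical helper.

```python
from fractions import Fraction

# Sample space for one roll of a fair die (equally likely outcomes).
sample_space = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Equally Likely Rule: favorable outcomes divided by total outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))

A = {2, 4, 6}   # roll is even
B = {5, 6}      # roll is greater than 4

p_not_a = 1 - prob(A)                          # Not Rule
p_a_and_b = prob(A & B)                        # And: outcomes in both events
p_a_or_b = prob(A) + prob(B) - prob(A & B)     # Or Rule

# The Or Rule agrees with counting the union directly.
assert p_a_or_b == prob(A | B)

print(p_not_a, p_a_and_b, p_a_or_b)
```

Subtracting P(A and B) in the Or Rule avoids counting the outcome 6 twice, since it belongs to both events.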

1.11 Conditional Probability and Independence

Contingency Tables: Organization of joint frequencies of two qualitative variables.
Conditional Probability: Probability of an event given another event has occurred. Mathematically represented as:
$P(A|B) = \frac{P(A \text{ and } B)}{P(B)}$
Independence: Events A and B are independent if $P(A|B) = P(A)$ .
Mutually Exclusive: Events that cannot occur at the same time (e.g., in a single trial).
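
The following sketch works through conditional probability and an independence check from a 2x2 contingency table. The counts are invented for this example and do not come from the course materials.

```python
from fractions import Fraction

# Hypothetical contingency table of joint counts (100 observations).
counts = {
    ("A", "B"): 30,
    ("A", "not B"): 20,
    ("not A", "B"): 10,
    ("not A", "not B"): 40,
}
total = sum(counts.values())

p_a = Fraction(counts[("A", "B")] + counts[("A", "not B")], total)
p_b = Fraction(counts[("A", "B")] + counts[("not A", "B")], total)
p_a_and_b = Fraction(counts[("A", "B")], total)

# Conditional probability: P(A|B) = P(A and B) / P(B)
p_a_given_b = p_a_and_b / p_b

# A and B are independent only if P(A|B) equals P(A).
independent = p_a_given_b == p_a

print(p_a, p_b, p_a_and_b, p_a_given_b, independent)
```

In this table P(A|B) = 3/4 while P(A) = 1/2, so knowing that B occurred changes the probability of A and the events are not independent.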

1.12 Counting Principles

Combinations: Selection of items without regard to the order, mathematically expressed as:
$C(n, k) = \frac{n!}{k!(n-k)!}$
Permutations: Arrangement of items with regard to order, expressed as:
$P(n, k) = \frac{n!}{(n-k)!}$
Factorials: Definitions and usage of n!, where n! is the product of all positive integers up to n.
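
Python's `math` module (version 3.8 or later) provides these counting functions directly. The sketch below also checks them against the factorial formulas; the values n = 6 and k = 3 are arbitrary.

```python
import math

n, k = 6, 3   # arbitrary example values

combinations = math.comb(n, k)   # order does not matter
permutations = math.perm(n, k)   # order matters

# Check against the factorial formulas.
assert combinations == math.factorial(n) // (math.factorial(k) * math.factorial(n - k))
assert permutations == math.factorial(n) // math.factorial(n - k)

print(combinations, permutations, math.factorial(4))
```

There are 20 ways to choose 3 items from 6 when order is ignored, and 120 ways when order matters. The ratio is 3! = 6, the number of ways to arrange each chosen group.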

1.13 Probability Distributions

Discrete vs. Continuous Random Variables: Types of random variables and their corresponding probability distributions.
Probability Mass Function (PMF): Function that gives the probability that a discrete random variable is equal to a specific value.
Expected Value: Calculated as: $E(X) = \text{Sum of } [x_i \times P(x_i)]$ for all possible values $x_i$.
- Standard Deviation of Discrete Random Variables: Square root of the variance, where the variance is $\text{Sum of } [(x_i - E(X))^2 \times P(x_i)]$.
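
A short sketch of the expected value and standard deviation of a discrete random variable. The PMF used here (number of heads in two flips of a fair coin) is an assumed example.

```python
import math

# Assumed PMF: number of heads in two flips of a fair coin.
pmf = {0: 0.25, 1: 0.5, 2: 0.25}

# A valid PMF must sum to 1.
assert math.isclose(sum(pmf.values()), 1.0)

expected_value = sum(x * p for x, p in pmf.items())
variance = sum((x - expected_value) ** 2 * p for x, p in pmf.items())
sd = math.sqrt(variance)

print(expected_value, variance, round(sd, 4))
```

Each squared deviation is weighted by its probability, not divided by N - 1, because the PMF describes the whole distribution, not a sample.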

1.14 Binomial Distribution

Bernoulli Trials: Trials with two possible outcomes (success or failure).
Binomial Distribution: A discrete probability distribution that describes the number of successes in n Bernoulli trials.
- Parameters include:
  - Number of trials (n)
  - Probability of success (p)
- Calculation of mean and standard deviation for a binomial random variable.
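
The binomial probability of exactly k successes is $C(n, k) \times p^k \times (1-p)^{n-k}$, with mean $np$ and standard deviation $\sqrt{np(1-p)}$. The sketch below applies these standard formulas; the helper name `binomial_pmf` and the values n = 5, p = 0.5 are chosen for illustration.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n Bernoulli trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 5, 0.5   # illustrative parameters

prob_two = binomial_pmf(2, n, p)
mean = n * p
sd = math.sqrt(n * p * (1 - p))

# The probabilities over all possible counts must sum to 1.
total = sum(binomial_pmf(k, n, p) for k in range(n + 1))

print(prob_two, mean, round(sd, 4), total)
```

With these parameters the probability of exactly 2 successes is 10/32 = 0.3125.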

1.15 The Normal Distribution

Understanding Normal Distribution: The bell-shaped curve, defined by its mean ($\mu$) and standard deviation ($\sigma$).
- Empirical Rule: States that approximately 68% of data falls within one standard deviation of the mean, 95% falls within two, and 99.7% falls within three.
- Z Scores: Standardization of scores relative to the mean, defined as:
  $Z = \frac{X - \bar{X}}{\text{SD}}$
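
As a sketch, the Z score formula and the Empirical Rule percentages can be checked with `statistics.NormalDist` from the Python standard library (3.8 or later). The score, mean, and standard deviation passed to `z_score` are hypothetical.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

# Hypothetical score of 62 where the mean is 50 and the SD is 8.
z = z_score(62, 50, 8)

# Proportion of a normal distribution within k standard deviations.
standard_normal = NormalDist()   # mean 0, standard deviation 1
within = {k: standard_normal.cdf(k) - standard_normal.cdf(-k) for k in (1, 2, 3)}

print(z)
for k, proportion in within.items():
    print(k, round(proportion, 4))
```

The computed proportions are about 0.6827, 0.9545, and 0.9973, which the Empirical Rule rounds to 68%, 95%, and 99.7%.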

1.16 Finding Probabilities

For a Normal random variable: calculating probabilities for given values, and solving Inverse Normal problems, where a given probability is used to work backward to the corresponding value.
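
Both directions can be sketched with `statistics.NormalDist`: `cdf` goes from a value to a probability, and `inv_cdf` goes from a probability back to a value. The mean of 100 and standard deviation of 15 are assumed for illustration.

```python
from statistics import NormalDist

# Assumed normal distribution: mean 100, standard deviation 15.
dist = NormalDist(mu=100, sigma=15)

# Forward problem: probability that X falls below, or between, given values.
p_below_115 = dist.cdf(115)
p_between_85_and_115 = dist.cdf(115) - dist.cdf(85)

# Inverse problem: the value below which 97.5% of the distribution lies.
cutoff = dist.inv_cdf(0.975)

print(round(p_below_115, 4), round(p_between_85_and_115, 4), round(cutoff, 1))
```

Here 115 is one standard deviation above the mean, so P(X < 115) is about 0.8413, and the 97.5th percentile is about 129.4 (roughly 1.96 standard deviations above the mean).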

1.17 Sampling Distributions and the Central Limit Theorem

Sampling Distributions: The distribution of a statistic, such as the sample mean, across all possible samples of a given size drawn from a population.
Central Limit Theorem (CLT): States that the sampling distribution of the sample mean approaches a normal distribution as the sample size becomes large (usually n > 30).
Solving problems using the Central Limit Theorem and the normal approximation to the Binomial distribution, including continuity corrections.
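
The sketch below compares an exact binomial probability with its normal approximation using a continuity correction. The values n = 100, p = 0.5, and the cutoff of 55 are chosen only for illustration.

```python
import math
from statistics import NormalDist

n, p = 100, 0.5   # illustrative binomial parameters

mean = n * p
sd = math.sqrt(n * p * (1 - p))

# Exact P(X <= 55), summing the binomial probabilities for 0 through 55.
exact = sum(math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(56))

# Normal approximation with continuity correction: use 55.5 instead of 55.
approx = NormalDist(mean, sd).cdf(55.5)

print(round(exact, 4), round(approx, 4))
```

The continuity correction extends the cutoff by 0.5 because the discrete value 55 corresponds to the interval from 54.5 to 55.5 on the continuous normal scale. For these values the two results agree closely.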

1.18 Confidence Intervals

Definition: Interval around a sample estimate that is likely to include the population parameter.
Factors Affecting CI Length: Sample size and confidence level impact the width of confidence intervals.
- Constructing Confidence Intervals: Process for various types of data, including sample proportions.
- Sample Size Calculations: Required to achieve a desired level of confidence and margin of error.
- Assumptions for Confidence Intervals: Normality assumption for sample means, independence of observations.
- Key Terms: Point estimate, margin of error, confidence level, degrees of freedom, proportions, etc.
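
As one possible illustration, the sketch below builds a confidence interval for a sample proportion and estimates the sample size needed for a target margin of error. It assumes the usual large-sample normal (z) method; the function names and the survey numbers are hypothetical.

```python
import math
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Large-sample z interval for a population proportion."""
    p_hat = successes / n                                  # point estimate
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)     # critical value
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)        # margin of error
    return p_hat - margin, p_hat + margin

def sample_size_for_proportion(margin, confidence=0.95, p_guess=0.5):
    """Smallest n giving the desired margin of error (p_guess = 0.5 is conservative)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z**2 * p_guess * (1 - p_guess) / margin**2)

# Hypothetical survey: 60 successes out of 100.
low, high = proportion_ci(60, 100)
needed = sample_size_for_proportion(0.03)

print(round(low, 3), round(high, 3), needed)
```

For this hypothetical survey the 95% interval is roughly 0.504 to 0.696, and a margin of error of 0.03 would need about 1068 observations. The sample size is rounded up because rounding down would give a margin slightly larger than requested.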

1.19 Drawing Conclusions from Confidence Intervals

Understanding how to make informed decisions based on the intervals constructed from data and their implications for the population parameter.

1.20 Hypothesis Testing

Null and Alternative Hypothesis: Formulations of hypotheses to be tested.
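- Symbols: α (alpha), β (beta), 1-α, 1-β, significance levels, and test statistics.
- P-values: Probability, assuming the null hypothesis is true, of observing a result at least as extreme as the one obtained. Smaller p-values indicate stronger evidence against the null hypothesis.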
Hypothesis Test Types: 1-sided vs. 2-sided tests explained, including their applications.
Bonferroni Correction: Adjusting the significance level when conducting multiple comparisons (dividing it by the number of comparisons) to control the overall Type I error rate.
Type I Error: Incorrectly rejecting a true null hypothesis.
Type II Error: Failing to reject a false null hypothesis.
Controlling Errors: Strategies for minimizing the occurrence of errors in hypothesis testing.

1.21 Conducting Hypothesis Tests

Step-by-step approach for conducting hypothesis tests, including setting significance levels, collecting data, calculating test statistics, and interpreting results.
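
The steps can be sketched for one common case, a two-sided z test for a single proportion. This is only one of several tests the course may cover; the function name and the data (60 successes in 100 trials, null value 0.5) are hypothetical.

```python
import math
from statistics import NormalDist

def one_proportion_z_test(successes, n, p0, alpha=0.05):
    """Two-sided z test of H0: p = p0 against Ha: p != p0."""
    p_hat = successes / n
    # The standard error uses p0 because H0 is assumed true.
    se = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se
    # Two-sided p-value: area in both tails beyond |z|.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    reject = p_value < alpha
    return z, p_value, reject

# Hypothetical data: 60 successes in 100 trials, testing H0: p = 0.5.
z, p_value, reject = one_proportion_z_test(60, 100, 0.5)
print(round(z, 2), round(p_value, 4), reject)

# Bonferroni correction for 5 comparisons: test at 0.05 / 5 = 0.01.
_, _, reject_bonferroni = one_proportion_z_test(60, 100, 0.5, alpha=0.05 / 5)
print(reject_bonferroni)
```

With a test statistic of about 2.0 and a p-value of about 0.0455, the null hypothesis is rejected at the 0.05 level but not at the stricter Bonferroni-adjusted level of 0.01.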

Math 1020 Final Exam Practice Test

Instructions: Answer all questions, showing your work where applicable. Refer to the study guide for relevant formulas and concepts.

Part 1: Multiple Choice (Select the best answer)

Which of the following describes a numerical value that summarizes a population?
a. Statistic
b. Parameter
c. Sample
d. Variable
A researcher wants to study the effect of a new fertilizer on plant growth. They divide plants into two groups: one receiving the fertilizer and another receiving no fertilizer. Which experimental design principle does the group receiving no fertilizer represent?
a. Randomization
b. Replication
c. Control Group
d. Blinding
Which type of graph is best suited for displaying the proportion of a whole, such as market share by product category?
a. Histogram
b. Line Graph
c. Circle (Pie) Graph
d. Scatterplot
If a dataset is skewed to the right, which relationship between the measures of central tendency is most likely true?
a. Mean < Median < Mode
b. Mode < Median < Mean
c. Mean = Median = Mode
d. Median < Mode < Mean
According to the Empirical Rule, approximately what percentage of data falls within two standard deviations of the mean in a normal distribution?
a. 68%
b. 95%
c. 99.7%
d. 50%
Which of the following is used to measure the strength and direction of a linear relationship between two quantitative variables?
a. Coefficient of Variation
b. Correlation Coefficient
c. Interquartile Range
d. Z-score
An event with a probability of 0 is called a(n):
a. Certain Event
b. Impossible Event
c. Simple Event
d. Compound Event
If $P(A) = 0.5$, $P(B) = 0.4$, and $P(A \text{ and } B) = 0.2$, are events A and B independent?
a. Yes, because $P(A \text{ and } B) = P(A) \times P(B)$ .
b. No, because $P(A \text{ and } B) \ne P(A) \times P(B)$ .
c. Yes, because $P(A|B) = P(A)$ .
d. More information is needed.
Which counting method is appropriate for finding the probability of selecting 3 defective items from a batch of 10 items, where 5 are defective and 5 are not, if the order of selection does not matter?
a. Uses permutations
b. Uses combinations
c. Uses factorials
d. Cannot be determined
The Central Limit Theorem states that as the sample size becomes large (n > 30), the sampling distribution of the sample mean approaches a:
a. Binomial distribution
b. Exponential distribution
c. Normal distribution
d. Uniform distribution

Part 2: Short Answer & Calculation

Variables: Identify whether the following variables are Quantitative or Qualitative:
a. Number of children in a household
b. Favorite color
c. Temperature in Celsius
d. Country of origin
Descriptive Statistics: Consider the dataset: $[12, 15, 18, 20, 25]$
a. Calculate the Mean.
b. Calculate the Median.
c. Calculate the Range.
d. Calculate the Variance. ( $Variance = \frac{\text{Sum of } (x - \bar{x})^2}{N-1}$ )
Chebychev's Inequality: For a distribution with a mean of $50$ and a standard deviation of $5$ , what is the minimum proportion of values that lie within $2$ standard deviations of the mean? ( $1 - \frac{1}{k^2}$ )
Conditional Probability: In a survey, $40\%$ of students like Math ( $M$ ), $30\%$ like English ( $E$ ), and $10\%$ like both. What is the probability that a student likes Math given they like English? ( $P(A|B) = \frac{P(A \text{ and } B)}{P(B)}$ )
Counting Principles:
a. How many different ways can the letters A, B, C be arranged in order?
b. In how many ways can you choose 2 students from a group of 5 to form a committee (order doesn't matter)?
Binomial Distribution: A biased coin lands on heads with a probability of $0.6$ . If the coin is flipped $4$ times, what is the probability of getting exactly $3$ heads? (You can set up the formula without calculating the final value).
Z-Scores: A student scores $85$ on an exam with a mean of $70$ and a standard deviation of $10$ . Calculate the Z-score for this student. ( $Z = \frac{X - \bar{X}}{\text{SD}}$ )
Confidence Intervals: Explain how increasing the sample size affects the length of a confidence interval, assuming the confidence level remains the same.
Hypothesis Testing:
a. Define Type I Error.
b. Define Type II Error.