AP Stats
Key concepts
- Frequency Table
- Stem and Leaf Plot
- Two-way frequency table
- Mosaic Plots
- Marginal and conditional distribution
- Graphs
- Pie chart
- Bar graph
- Histogram
- Segmented Bar Graph
- Dot plot
- Linear Regression
- Residuals
- Line
- Correlation(r)
- r^2
- Y = a + bx
- High Leverage point
- Normal Distribution
- Empirical Rule
- 5 number summary
- Box plot
- Z score
- Standard Deviation
- Setting Up experiment
- Experimental unit
- Treatment
- Sampling
- Simple Random Sampling
- Stratified Random Sampling
- Cluster Random Sampling
- Convenience Sampling
- Voluntary Response Sampling
- Bias
- Under-coverage Bias
- Voluntary Response Bias
- Blocking
- Statistical Inference
A data set contains information about a number of individuals, which may be people, animals, or things. Each individual is measured on one or more variables.
There are two types of variables:
- Categorical (e.g., the country a person is from)
- Quantitative (e.g., a person's age)
Different Bar Graphs and Pie charts are used to create a visual representation of data.
All individuals must be included in graphs (even if they have no value)
Two things that make graphs deceptive: an axis that does not start at zero, and bars drawn on different scales. It is also not recommended to use pictographs as bars.
Marginal Distribution:
The marginal distribution of one of the categorical variables in a two-way table of counts is the distribution of values of that variable among all individuals described by the table.
Conditional Distribution:
A conditional distribution of a variable describes the values of that variable among individuals who have a specific value of another variable. There is a separate conditional distribution for each value of the other variable.
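A minimal sketch of the two definitions above, computed from a small two-way table. The counts and the variables (grade level, preferred pet) are made up for illustration.

```python
# Hypothetical two-way table of counts: rows = grade, columns = preferred pet.
table = {
    "junior": {"cat": 20, "dog": 30},
    "senior": {"cat": 10, "dog": 40},
}

def marginal_distribution(table):
    """Distribution of the column variable among ALL individuals in the table."""
    grand_total = sum(sum(row.values()) for row in table.values())
    totals = {}
    for row in table.values():
        for category, count in row.items():
            totals[category] = totals.get(category, 0) + count
    return {category: count / grand_total for category, count in totals.items()}

def conditional_distribution(table, given_row):
    """Distribution of the column variable among the individuals in one row only."""
    row = table[given_row]
    row_total = sum(row.values())
    return {category: count / row_total for category, count in row.items()}

marginal = marginal_distribution(table)               # cat 30/100, dog 70/100
seniors = conditional_distribution(table, "senior")   # cat 10/50, dog 40/50
```

There is one conditional distribution per row, so calling `conditional_distribution(table, "junior")` gives a different answer than the call for seniors.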
Sometimes even a strong association between two categorical variables can be influenced by other variables.
Count = Frequency
Percent = Relative Frequency
Examining the graph of a distribution:
- Shape (symmetrical, skewed left, skewed right)
- When you describe a distribution's shape, concentrate on the main features. Look for major peaks, not for minor ups and downs in the graph. Look for clusters of values and obvious gaps. Look for potential outliers, not just for the smallest and largest observations. Look for rough symmetry or clear skewness.
- Center (midpoint)
- Spread (range)
An important kind of departure from the overall pattern is an outlier: an individual value that falls outside that pattern.
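A rough sketch of describing center, spread, and potential outliers in Python. The data are made up, and the 1.5 x IQR rule used to flag outliers is a common convention that is an assumption here, not something stated in these notes.

```python
import statistics

def describe(values):
    """Return center (median), spread (range), and values flagged as potential outliers."""
    data = sorted(values)
    q1, _, q3 = statistics.quantiles(data, n=4)   # quartile conventions vary slightly
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr                    # assumed 1.5 x IQR rule
    high_fence = q3 + 1.5 * iqr
    return {
        "center": statistics.median(data),
        "spread": data[-1] - data[0],
        "outliers": [x for x in data if x < low_fence or x > high_fence],
    }

# Hypothetical quiz scores with one value far from the rest.
summary = describe([2, 3, 3, 4, 4, 4, 5, 5, 6, 15])
```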
Correlation ≠ Causation
Identify the type of study:
- A population consists of all items or subjects of interest.
- A sample is a subset of the population that is selected for study
Observational study: treatments are not imposed. Investigators examine current or past data for a sample of individuals, or follow a sample of individuals into the future, collecting data in order to investigate a topic of interest about the population.
Experimental study: different treatments are assigned to experimental units.
Simple Random Sample (SRS) - a sample in which every group of a given size has an equal chance of being chosen. This method is the basis for many types of sampling mechanisms.
- Random number generator
- Table of random digits
- Drawing a card from a deck without replacement
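The random number generator option can be sketched with Python's standard `random` module. The class roster and the seed are made up; the seed is only there so the example is repeatable.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Choose n individuals so that every group of size n is equally likely."""
    rng = random.Random(seed)           # seed only makes the example repeatable
    return rng.sample(population, n)    # draws without replacement

students = [f"student_{i}" for i in range(1, 31)]   # hypothetical class of 30
chosen = simple_random_sample(students, 5, seed=1)
```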
Stratified Random Sample - involves dividing a population into separate groups, called strata, based on shared attributes or characteristics. Within each stratum a simple random sample is selected, and the selected units are combined to form the sample.
Cluster Sample - involves dividing a population into smaller groups, called clusters. Ideally each cluster is heterogeneous, like a small version of the population. Clusters are selected at random, and every individual in the chosen clusters is included in the sample.
Systematic Random Sample - a method in which sample members are selected from a population according to a random starting point and a fixed interval.
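A sketch of the stratified, cluster, and systematic methods side by side. The school data (grades, homerooms, roster) and all function names are hypothetical, and the cluster function keeps everyone in each chosen cluster.

```python
import random

def stratified_sample(strata, n_per_stratum, rng):
    """Take an SRS of n_per_stratum individuals from every stratum, then combine."""
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole clusters and keep everyone in them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [person for name in chosen for person in clusters[name]]

def systematic_sample(population, interval, rng):
    """Pick a random starting point, then take every interval-th individual."""
    start = rng.randrange(interval)
    return population[start::interval]

rng = random.Random(0)   # seeded only so the example is repeatable
grades = {g: [f"{g}_{i}" for i in range(20)]
          for g in ("freshman", "sophomore", "junior", "senior")}
homerooms = {f"room_{r}": [f"room_{r}_s{i}" for i in range(10)] for r in range(6)}
roster = [f"id_{i}" for i in range(100)]

strat = stratified_sample(grades, 3, rng)     # 3 students from each of 4 grades
clust = cluster_sample(homerooms, 2, rng)     # everyone in 2 homerooms
syst = systematic_sample(roster, 10, rng)     # every 10th id on the roster
```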
Non-random sampling methods:
Voluntary Response Sample: consists of people who choose themselves by responding to a general appeal.
Convenience Sample: choosing the individuals who are easiest to reach.
There are advantages and disadvantages to each sampling method, depending on the question that is to be answered and the population from which the sample will be drawn.
Potential Problems with Sampling
Bias: Occurs when certain responses are systematically favored over others. Bias often results in the sample not representing the population.
- Voluntary Response Bias
- Under-coverage bias
- Nonresponse Bias
- Question wording Bias
- Self reported Response bias
*The Goal is to have unbiased results with low variation.
Components of Experiment
- Experimental Units
- Explanatory Variable
- Response Variable
- Confounding Variable
- Treatment
Elements of well designed Experiment
Statistical Significance:
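Count the number of data points located as far to the left or right of the “mean” as the observed result, and express that count as a percent of all the data points.
If the % ≤ 5%, the result is pretty unlikely to happen by chance alone, so it is probably due to the treatment (the caffeine, in this example).
If the % > 5%, the result could plausibly happen by chance alone, so it is not statistically significant.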
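A sketch of the 5% rule applied to a set of simulated results. The simulated differences are made up, the function names are hypothetical, and counting both tails ("left or right") is an assumption about how the percent is meant to be found.

```python
def percent_as_extreme(simulated, observed, center):
    """Percent of simulated results at least as far from the center as the observed one."""
    distance = abs(observed - center)
    extreme = [x for x in simulated if abs(x - center) >= distance]
    return 100 * len(extreme) / len(simulated)

def is_significant(percent, cutoff=5):
    """5% rule: a result this rare is unlikely to be due to chance alone."""
    return percent <= cutoff

# Hypothetical simulated differences (treatment minus control), centered at 0.
simulated = [-3, -2, -2, -1, -1, -1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 3, 4]

rare = percent_as_extreme(simulated, observed=4, center=0)     # 1 of 20 results
common = percent_as_extreme(simulated, observed=2, center=0)   # 7 of 20 results
```

With these made-up numbers, an observed difference of 4 passes the 5% cutoff and an observed difference of 2 does not.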
- To answer a bias FRQ:
- Identify the population and the sample
- Explain how the individuals in the sample might differ from the general population
- Explain how this might lead to an overestimate or underestimate of the true value.
Probability:
A random process generates results that are determined by chance.
An outcome is the result of a random process.
An event is a collection of outcomes.
Simulation - a way to model random events, such that the simulated outcomes closely match the real world outcomes
Conducting a simulation:
- Use a random number generator
- Conduct the trial many, many times
- The relative frequency of an outcome estimates the probability of that outcome.
or
- State: Ask a question of interest about some chance process.
- Plan: Describe how to use a chance device to imitate one repetition of the process. Tell what you will record at the end of each repetition.
- Do: Perform many repetitions of the simulation.
- Conclude: Use the results of your simulation to answer the question of interest.
Law of Large Numbers: Simulated probabilities tend to get closer to the true probability as the number of trials increases.
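A sketch of a simulation that follows the State / Plan / Do / Conclude steps and also shows the Law of Large Numbers. The question (chance of at least one 6 in four dice rolls) and the function names are made up for illustration.

```python
import random

def estimate_probability(trial, repetitions, rng):
    """Do: repeat the trial many times and return the relative frequency of success."""
    successes = sum(trial(rng) for _ in range(repetitions))
    return successes / repetitions

def at_least_one_six(rng):
    """Plan: one repetition rolls four dice and records whether any shows a 6."""
    return any(rng.randint(1, 6) == 6 for _ in range(4))

rng = random.Random(42)                  # seeded only so the example is repeatable
true_probability = 1 - (5 / 6) ** 4      # exact answer from the complement rule

# Conclude: compare the estimates with the exact value as repetitions grow.
for repetitions in (10, 100, 1_000, 10_000):
    estimate = estimate_probability(at_least_one_six, repetitions, rng)
    print(repetitions, round(estimate, 3), round(true_probability, 3))
```

The estimates from small runs can be far off; the larger runs should settle near the exact value.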
Mutually Exclusive:
Events that cannot occur at the same time; also called disjoint.
Example:
- A number being both odd and even.
- A coin flip being both heads and tails.
Joint probability - the probability of the intersection of two events.
Conditional Probability:
P(B|A) -> the probability that event B will occur given that event A has occurred.
General multiplication rule:
The probability of A and B is P(A ∩ B) = P(A) * P(B|A)
If a simulation samples with replacement, the total number of items to draw from stays the same after every draw, so the probabilities do not change from draw to draw.
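A worked sketch of the general multiplication rule, with and without replacement. The example (drawing two aces from a standard 52-card deck) is chosen for illustration.

```python
from fractions import Fraction

def prob_two_aces(with_replacement):
    """P(first card is an ace AND second card is an ace) from a 52-card deck."""
    p_first = Fraction(4, 52)                     # P(A)
    if with_replacement:
        p_second_given_first = Fraction(4, 52)    # deck is restored, so P(B|A) = P(B)
    else:
        p_second_given_first = Fraction(3, 51)    # one ace and one card are gone
    return p_first * p_second_given_first         # P(A) * P(B|A)

without = prob_two_aces(with_replacement=False)   # 4/52 * 3/51 = 1/221
with_repl = prob_two_aces(with_replacement=True)  # 4/52 * 4/52 = 1/169
```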
Independent Events:
Events A and B are independent if knowing whether or not event A has occurred (or will occur) does not change the probability that event B will occur.
P(A and B) = P(A) * P(B)
and
P(A|B) = P(A)
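A sketch that checks both independence conditions on one roll of a fair die. The events are made up for illustration.

```python
from fractions import Fraction

OUTCOMES = {1, 2, 3, 4, 5, 6}   # one roll of a fair die

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & OUTCOMES), len(OUTCOMES))

def independent(a, b):
    """True when P(A and B) = P(A) * P(B)."""
    return prob(a & b) == prob(a) * prob(b)

def conditional(a, b):
    """P(A|B) = P(A and B) / P(B)."""
    return prob(a & b) / prob(b)

even = {2, 4, 6}
at_most_four = {1, 2, 3, 4}
at_most_three = {1, 2, 3}

check_1 = independent(even, at_most_four)    # 1/3 equals 1/2 * 2/3
check_2 = independent(even, at_most_three)   # 1/6 does not equal 1/2 * 1/2
```

For the independent pair, `conditional(even, at_most_four)` equals `prob(even)`, which is the P(A|B) = P(A) form of the same check.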
Geometric Probability
Example: what is the probability that the first correct one comes on the third try?
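A sketch of the geometric calculation: two failures followed by one success. The success probability of 1/4 (guessing on a four-choice question) is an assumed value, not from the notes.

```python
from fractions import Fraction

def geometric_probability(p, k):
    """P(first success happens on trial k) = (1 - p)^(k - 1) * p."""
    return (1 - p) ** (k - 1) * p

# Assumed p = 1/4: wrong, wrong, then right on the third try.
p_third_try = geometric_probability(Fraction(1, 4), 3)   # 3/4 * 3/4 * 1/4 = 9/64
```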
Binomial Probability
Example: what is the probability that exactly 8 out of 10 are correct?
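A sketch of the binomial calculation: count the ways to place the successes, then multiply by the probability of any one such arrangement. The success probability of 0.5 is an assumed value, not from the notes.

```python
from math import comb

def binomial_probability(n, k, p):
    """P(exactly k successes in n independent trials) = C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Assumed p = 0.5: exactly 8 correct out of 10.
p_eight_of_ten = binomial_probability(10, 8, 0.5)   # 45 / 1024
```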
Sampling Distribution . . .
P - p = p - p
Easy stuff: all the formulas are on the formula sheet.


Sample mean and standard deviation
Refers to the actual mean of the measured quantity itself, in the data's own units.
4 lemon, 0.5 lemons, etc
Sample proportion
Refers to a proportion, a value between 0 and 1 (e.g., 0.01 - 0.99)
Sampling distribution of the sample proportion and sampling distribution of the sample mean
For all random samples of size n = x from this population, the sample mean weights of lemons will have a mean of 4 ounces.
Central Limit Theorem
The sampling distribution of a sample proportion is approximately normal if np and n(1-p) are both at least 10.
The sampling distribution of a sample mean is approximately normal if the sample size is at least 30 or it is stated that the population is normal.
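A simulation sketch of the sampling distribution of the sample mean. The population is assumed to be right-skewed (exponential) with mean 4 ounces, purely for illustration; the sample size of 40 and the function names are also assumptions.

```python
import random
import statistics

POP_MEAN = 4.0   # hypothetical population mean weight, in ounces

def draw_weight(rng):
    """One lemon from an assumed right-skewed (exponential) population with mean 4."""
    return rng.expovariate(1 / POP_MEAN)

def simulate_sample_means(draw, n, repetitions, rng):
    """Take many random samples of size n and record each sample mean."""
    return [
        statistics.fmean([draw(rng) for _ in range(n)])
        for _ in range(repetitions)
    ]

rng = random.Random(3)   # seeded only so the example is repeatable
means = simulate_sample_means(draw_weight, n=40, repetitions=5000, rng=rng)

center = statistics.fmean(means)   # should land near the population mean, 4
spread = statistics.stdev(means)   # should land near sigma / sqrt(n) = 4 / sqrt(40)
```

Even though single weights from this assumed population are strongly skewed, a histogram of `means` should look roughly symmetric and bell-shaped.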
To find the margin of error for a proportion, use z*; to find the margin of error for a mean, use t*.
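A sketch of both margin-of-error calculations using the standard formulas (critical value times standard error). The sample values are made up, and t* is passed in by hand as if read from a t table (2.064 for 24 degrees of freedom at 95% confidence), since the Python standard library has no t distribution.

```python
from math import sqrt
from statistics import NormalDist

def z_star(confidence):
    """Critical z value for a two-sided confidence level, e.g. about 1.96 for 95%."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def margin_of_error_proportion(p_hat, n, confidence=0.95):
    """z* times the standard error of a sample proportion."""
    return z_star(confidence) * sqrt(p_hat * (1 - p_hat) / n)

def margin_of_error_mean(s, n, t_star):
    """t* times the standard error of a sample mean; t* comes from a t table."""
    return t_star * s / sqrt(n)

moe_p = margin_of_error_proportion(0.5, 100)               # about 0.098
moe_m = margin_of_error_mean(s=2.0, n=25, t_star=2.064)    # 2.064 * 2 / 5
```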


Conditions:
- Proportions
  - Random assignment or random sample
  - If random sample without replacement
    - Sample < 10% of the population
  - np and n(1-p) >= 10
- Means
  - Random assignment or random sample
  - If random sample without replacement
    - Sample < 10% of the population
  - Normal
    - Central Limit Theorem
    - Stated in the question
    - Sample shows no strong skewness
- Chi-squared
  - Random assignment or random sample
  - If random sample without replacement
    - Sample < 10% of the population
  - All expected counts are at least 5
- Slope
  - The relationship between x and y is linear
  - Individual observations are independent of each other, or the 10% rule holds
  - Normal: for any given x, the distribution of y is normal
  - Equal variance
  - Random
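A sketch of the numeric conditions listed above: the 10% rule, the large counts check for proportions, and the expected counts check for chi-squared. The function names and the example table are hypothetical, expected counts use the standard (row total * column total) / grand total formula, and the cutoff of at least 5 is the usual convention.

```python
def ten_percent_condition(n, population_size):
    """Sampling without replacement: the sample should be under 10% of the population."""
    return n < 0.10 * population_size

def large_counts_condition(n, p):
    """Proportions: np and n(1-p) should both be at least 10."""
    return n * p >= 10 and n * (1 - p) >= 10

def expected_counts(observed):
    """Expected count for each cell = row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def expected_counts_condition(observed, minimum=5):
    """Chi-squared: every expected count should reach the minimum."""
    return all(e >= minimum for row in expected_counts(observed) for e in row)

observed = [[20, 30], [10, 40]]          # hypothetical two-way table of counts
expected = expected_counts(observed)     # [[15, 35], [15, 35]]
```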