Stats exam 1 study guide
Chapter 1 Objectives
Triola
Understand the definitions in the Definition Block on page 4 of Triola
Data- collections of observations, such as measurements, genders, or survey responses.
Statistics- the science of planning studies and experiments; obtaining data; and organizing, summarizing, presenting, analyzing, and interpreting those data and then drawing conclusions based on them.
Population- the complete collection of all measurements or data that are being considered. Typically, a population is the complete collection of data that we would like to make inferences about.
Census- the collection of data from every member of a population
Sample- a subcollection of members selected from a population
Understand the “Origin of Statistics”
Understand the difference between Practical and Statistical Significance
Statistical: Achieved when the result is very unlikely to occur by chance
Practical: Related to whether common sense suggests that the treatment makes enough of a difference to justify its use
Know the difference between a parameter and a statistic
Parameter: A numerical measurement describing some characteristic of a population
Statistic: A numerical measurement describing some characteristic of a sample
Know the difference between qualitative (categorical) and quantitative data
Qualitative data: Consists of names or labels (NOT numbers representing counts/measurements)
Quantitative data: Consists of numbers representing counts or measurements. Two types:
Discrete
Continuous
(see next question)
Know the difference between discrete and continuous data
Discrete data: Result when the data values are quantitative and the number of values is finite or “countable” (ex. number of coin tosses before getting tails)
Continuous data: Result from infinitely many possible quantitative values, where the collection of values is not countable (ex. lengths of distances from 0 cm to 12 cm)
Be able to differentiate between nominal, ordinal, interval, and ratio data
Nominal: characterized by data that consist of names, labels, or categories only, and the data cannot be arranged in some order (such as low to high). Ex. Yes, No, Undecided (categories only)
Ordinal: involves data that can be arranged in some order, but differences (obtained by subtraction) between data values either cannot be determined or are meaningless. Ex. Grades- A, B, C, D, F (categories with some order)
Interval: involves data that can be arranged in order, and the differences between data values can be found and are meaningful. However, there is no natural zero starting point at which none of the quantity is present. Ex. Years 1000, 2000, 1776, and 1492 (differences but no natural zero point)
Ratio: data can be arranged in order, differences can be found and are meaningful, and there is a natural zero starting point (where zero indicates that none of the quantity is present). Differences and ratios are both meaningful. Ex. Time- 100 minutes, 50 minutes (differences and a natural zero point)
Know what is meant by “Big Data” and “Data Science”
Big data: Ahem A hefty boy quantity of data
More formally: Sets of data that are so large/complex their analysis is beyond the capabilities of traditional software tools and may take many different computers running simultaneously
Data science: Involves smart boi stuff such as
Stats
Computer science
Software engineering
Other stuff (sociology or finance)
Be able to explain the basic design of experiments
Replication: repetition of an experiment on more than one individual
Blinding: Subject doesn't know whether he or she is receiving a treatment or a placebo (gets around the placebo effect)
Double-Blind: Two levels (Pills experiment)
The subject doesn't know if he or she is receiving treatment or a placebo
The experimenter does not know whether he or she is administering the treatment or placebo
Randomization: Subjects are assigned to groups through a process of random selection, so group membership is based on chance
Be able to differentiate between the sampling methods on p 27
Simple Random Sample: a sample of n subjects is selected so that every sample of the same size n has the same chance of being selected.
Systematic Sample: Select some starting point, then select every kth subject (k could be 2, 3, or any number)
Convenience Sample: Using the data that are easiest to get
Stratified Sample: Dividing the population into strata (groups) whose members share the same characteristics, then randomly sampling within each stratum.
Cluster Sample: Partitioning the population into clusters, randomly selecting some clusters, then choosing all members of the selected clusters.
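The sampling methods above can be sketched in a few lines of Python. This is only an illustration: the population of 20 numbered subjects, the "low"/"high" strata, and the function names are all made up for the example.

```python
import random

def simple_random_sample(population, n, rng):
    # Every possible sample of size n is equally likely.
    return rng.sample(population, n)

def systematic_sample(population, k, start=0):
    # Pick a starting point, then take every kth subject.
    return population[start::k]

def stratified_sample(strata, n_per_stratum, rng):
    # strata: dict mapping a stratum name to its list of members.
    # Draw a random sample from within every stratum.
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    # Randomly choose whole clusters, then keep ALL members of each one.
    chosen = rng.sample(clusters, n_clusters)
    return [member for cluster in chosen for member in cluster]

rng = random.Random(1)            # fixed seed so the sketch is repeatable
population = list(range(1, 21))   # hypothetical subjects numbered 1 to 20

srs = simple_random_sample(population, 5, rng)
every_4th = systematic_sample(population, 4, start=3)  # subjects 4, 8, 12, 16, 20
strata = {"low": population[:10], "high": population[10:]}
stratified = stratified_sample(strata, 2, rng)
clusters = [population[i:i + 5] for i in range(0, 20, 5)]
clustered = cluster_sample(clusters, 2, rng)
```

Note the difference between the last two: stratified sampling takes a few members from every group, while cluster sampling takes every member from a few groups.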
Know the difference between a cross-sectional study, a retrospective study, and a longitudinal study
Know that there are sampling errors
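Sampling error: a discrepancy between a sample result and the true population result caused by chance sample fluctuations; unavoidable even with a perfect plan and execution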
Take notes on the in-class examples of using Excel for statistics
REVIEW THIS
Chapter 2 & 3 Objectives
Chapter 2
Be able to read and construct a frequency table
Be able to read, construct and use a relative frequency distribution
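A rough Python sketch of building a frequency table and its relative frequencies (relative frequency = class frequency / total number of values). The exam scores, the class width of 10, and the function name are invented for the example, and the upper class limits assume whole-number data.

```python
from collections import Counter

def frequency_table(values, class_width, lower_limit):
    # Assign each value to a class index, then count values per class.
    counts = Counter((v - lower_limit) // class_width for v in values)
    table = []
    for i in range(max(counts) + 1):
        lower = lower_limit + i * class_width
        upper = lower + class_width - 1   # upper class limit for integer data
        freq = counts.get(i, 0)
        table.append((lower, upper, freq, freq / len(values)))
    return table

scores = [52, 67, 71, 74, 78, 81, 83, 85, 88, 94]  # made-up exam scores
for lower, upper, freq, rel in frequency_table(scores, 10, 50):
    print(f"{lower}-{upper}: frequency {freq}, relative frequency {rel:.0%}")
```

The relative frequencies should always add up to 1 (100%), which is a quick check on a table built by hand.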
Understand the following when it comes to histograms: bell-shape, uniform, skew to the right and left. Be able to construct a histogram by hand and with Excel
Know what skew right and skew left mean
Be able to interpret
Histograms
Dotplots
Stemplots
Time-series graph
Bar Graphs
Pareto Charts
Pie Charts
Understand how a nonzero vertical axis and pictographs can deceive
Chapter 3
Know how to calculate the four measures of center: mean, median, mode, midrange. Plus know what “resistant” means and when to use each.
Know the round off rules on
Be able to find the mean, median and mode using Excel
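A small Python sketch of the measures of center listed above, using made-up data with one extreme value to show what “resistant” means. In Excel, AVERAGE, MEDIAN, and MODE.SNGL cover the mean, median, and mode.

```python
import statistics

data = [2, 3, 3, 5, 7, 100]  # made-up values; 100 is an extreme value

mean = statistics.mean(data)         # sum of values / number of values
median = statistics.median(data)     # middle value of the sorted data
modes = statistics.multimode(data)   # most frequent value(s)
midrange = (max(data) + min(data)) / 2

# Resistance check: drop the extreme value and see which measures move.
trimmed = data[:-1]
mean_trimmed = statistics.mean(trimmed)
median_trimmed = statistics.median(trimmed)

print(mean, median, modes, midrange)
print(mean_trimmed, median_trimmed)
```

Here the mean drops from 20 to 4 when the extreme value is removed, while the median only moves from 4 to 3. The median is resistant; the mean and midrange are not.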
Know the measures of variation and what variation means: range, standard deviation
Be able to calculate the range: (maximum data value) - (minimum data value)
Be able to calculate the standard deviation and variance using Excel.
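For checking Excel results by hand, here is a minimal Python sketch of the sample variance and sample standard deviation (dividing by n - 1). The three data values are made up so the arithmetic stays small. In Excel, STDEV.S and VAR.S give the sample versions.

```python
import math

def sample_variance(values):
    # s^2 = sum of (x - mean)^2, divided by (n - 1)
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

def sample_std_dev(values):
    # s is the square root of the sample variance
    return math.sqrt(sample_variance(values))

data = [1, 3, 14]                    # made-up sample, mean = 6
data_range = max(data) - min(data)   # 14 - 1 = 13
s2 = sample_variance(data)           # (25 + 9 + 64) / 2 = 49
s = sample_std_dev(data)             # sqrt(49) = 7
```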
Know the notation for standard deviation and variance on
Be able to calculate a z score from memory
Know what a z score allows you to compare
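A quick sketch of the z score formula, z = (x - mean) / standard deviation. The two exams and their means and standard deviations are hypothetical; the point is that z scores put values from different data sets on the same scale.

```python
def z_score(x, mean, std_dev):
    # Number of standard deviations that x sits above (+) or below (-) the mean.
    return (x - mean) / std_dev

# Hypothetical: which score is better relative to its own class?
z_exam_a = z_score(85, mean=75, std_dev=5)   # 2.0
z_exam_b = z_score(90, mean=82, std_dev=8)   # 1.0
```

The 85 on exam A is the stronger result relative to its class even though 90 is the larger raw score.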
Be able to interpret a percentile and a quartile
Be able to build a 5-Number Summary
Be able to identify skewness and an outlier in a box plot
Be able to build a box plot using Excel
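A sketch of a 5-number summary and an outlier check in Python. Two assumptions here: quartiles are located with the rule L = (k/100) * n (average two values when L is whole, otherwise round up), and outliers are flagged with the common 1.5 * IQR rule. Excel and other software can use slightly different quartile conventions, so small differences are normal. The data values are made up.

```python
import math

def percentile_value(sorted_values, k):
    # Locator L = (k / 100) * n, with positions counted from 1.
    # Whole L: average the Lth and (L+1)th values. Otherwise round L up.
    n = len(sorted_values)
    locator = k * n / 100
    if locator == int(locator):
        i = int(locator)
        return (sorted_values[i - 1] + sorted_values[i]) / 2
    return sorted_values[math.ceil(locator) - 1]

def five_number_summary(values):
    data = sorted(values)
    q1, q2, q3 = (percentile_value(data, k) for k in (25, 50, 75))
    return data[0], q1, q2, q3, data[-1]

def outliers(values):
    # Flag values more than 1.5 * IQR below Q1 or above Q3.
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    return [v for v in values if v < q1 - 1.5 * iqr or v > q3 + 1.5 * iqr]

data = [1, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 40]  # made-up values
summary = five_number_summary(data)   # (min, Q1, median, Q3, max)
flagged = outliers(data)
```

For this data the summary is 1, 4.5, 7.5, 10.5, 40 and the value 40 is flagged. In a box plot the long right whisker (or the isolated point at 40) is the sign of skew to the right.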
Chapter 4 Objectives
Be able to interpret probability values
Understand that statisticians reject explanations (such as chance) based on very low probabilities
Given the three common approaches to finding the probability of an event, be able to use each approach.
Given the three common approaches to finding the probability of an event, be able to determine which of the three approaches is appropriate to use.
Note: Simulations, mentioned on page 157, are covered in the Operations Analysis class
Note: plan to round probabilities to three significant digits
Understand the Law of Large Numbers
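The Law of Large Numbers can be seen with a small coin-toss simulation: as the number of tosses grows, the relative frequency of heads tends to settle near the classical probability of 1/2. This is a sketch only; the fair coin, the seed, and the function name are assumptions for the example.

```python
import random

def relative_frequency_of_heads(n_tosses, rng):
    # Relative frequency approach:
    # P(A) is approximated by (times A occurred) / (number of trials).
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses

rng = random.Random(42)  # fixed seed so repeated runs match
for n in (10, 100, 10_000, 100_000):
    print(n, relative_frequency_of_heads(n, rng))
```

With only 10 tosses the estimate can be far from 0.5. With 100,000 tosses it is usually very close, which is why a relative frequency estimate needs a large number of trials.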
Given the Cautions on p 157, understand how they apply to your life
Understand the Caution at the bottom of page 159
Know what the complement of an event is
Be able to state statistically when we declare there to be a significantly high or low number of successes (i.e., the Rare Event Rule).
Know that racetracks use odds while statisticians use probabilities
Realize that the odds are stacked in favor of the house in gambling. This has to do with actual odds versus payoff odds
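A worked example of actual odds versus payoff odds, assuming a bet on a single number in American roulette (38 slots, and the casino pays 35:1). The $5 bet and the variable names are just for illustration.

```python
from fractions import Fraction

def actual_odds_against(p):
    # Actual odds against A = P(not A) / P(A), usually written "a:b".
    return (1 - p) / p

p_win = Fraction(1, 38)                    # one winning slot out of 38
odds_against = actual_odds_against(p_win)  # 37, meaning 37:1
payoff_odds = Fraction(35, 1)              # what the casino actually pays

bet = 5
fair_profit = bet * odds_against   # profit if the payoff matched the actual odds
real_profit = bet * payoff_odds    # profit the casino really pays on a win
```

The actual odds against winning are 37:1, so a fair payoff on a $5 bet would be $185. The casino pays $175, and that gap is the house advantage.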
Know what it means for events to be independent or dependent as well as what the term replacement means
Know which rule each of the words “and” and “or” is associated with
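A short sketch tying these ideas together: "and" goes with the multiplication rule, "or" goes with the addition rule, and replacement decides whether draws are independent. The jar of marbles and the die events are hypothetical examples.

```python
from fractions import Fraction

# Hypothetical jar: 3 red and 2 blue marbles.
red, blue = 3, 2
total = red + blue

# "AND" uses the multiplication rule: P(A and B) = P(A) * P(B | A).
# With replacement the draws are independent, so P(B | A) = P(B).
p_two_reds_with = Fraction(red, total) * Fraction(red, total)
# Without replacement the draws are dependent: one red marble is gone.
p_two_reds_without = Fraction(red, total) * Fraction(red - 1, total - 1)

# "OR" uses the addition rule: P(A or B) = P(A) + P(B) - P(A and B).
# One roll of a fair die: A = even number, B = number greater than 4.
p_even = Fraction(3, 6)
p_gt4 = Fraction(2, 6)
p_even_and_gt4 = Fraction(1, 6)   # only the outcome 6 is in both events
p_even_or_gt4 = p_even + p_gt4 - p_even_and_gt4

# Complement: P(not A) = 1 - P(A).
p_not_even = 1 - p_even
```

Subtracting P(A and B) in the addition rule keeps the outcome 6 from being counted twice.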
Given the Multiplication Counting Rule and the Factorial Rule, be able to use both to solve a problem. Both of these rules help you determine the total number of possibilities from some sequence of events. Understand how the Multiplication Rule can be used instead of the Factorial Rule.
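A small Python sketch of both counting rules, including how the Multiplication Counting Rule reproduces the Factorial Rule. The 4-digit PIN and the four items A, B, C, D are made-up examples.

```python
import math
from itertools import permutations

def multiplication_counting_rule(*choices_per_event):
    # Total outcomes = product of the number of choices at each step.
    return math.prod(choices_per_event)

# Hypothetical 4-digit PIN, each digit 0 through 9: 10 * 10 * 10 * 10.
pin_count = multiplication_counting_rule(10, 10, 10, 10)

# Factorial rule: n different items can be arranged in n! ways.
arrangements = math.factorial(4)

# Same answer from the multiplication rule: 4 choices for the first
# position, 3 for the second, 2 for the third, 1 for the last.
same_by_multiplication = multiplication_counting_rule(4, 3, 2, 1)

# Brute-force check: list every ordering of four items and count them.
brute_force = len(list(permutations("ABCD")))
```

The Factorial Rule is just the multiplication rule applied to a sequence where each choice uses up one item.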