Sampling variability
the observed value of a statistic depending on the particular sample selected from the population and it will vary from sample to sample
Standard error of the mean
indicates how different the population means is likely to be from a sample mean.
The central limit theorem
states that the distribution of sample means approximates a normal distribution as the sample size gets larger.
Standard deviation
a number that is equal to the square root of the variance and measures how far data values are from their mean
Variance
average of the squares of the deviation
Symmetrical
a figure or shape that can be divided into two equal parts by a line
Skewed to the left
the graph is pulled out to the left.
Skewed to the right
the graph is pulled out to the right.
Mean
a number that measures the central tendency of the data; a common name it is 'average.'
Median
a number that separates ordered data into halves
The sample mean
The average number of the sample
Population mean
The average number of the population.
Mode
The value that appears most often in a set of data.
The Law of Large Numbers
if you take samples of larger and larger size from any population, then the mean x of the sample is very likely to get closer and closer to µ.
Sampling distribution
a probability distribution of a statistic that comes from choosing random samples of a given population
Relative frequency distribution
the ratio of the number of times a value of the data occurs in the set of all outcomes to the number of all outcomes
Relative frequency table
a data representation in which grouped data is displayed along with the corresponding frequencies
Box plots
a graph that gives a quick picture of the middle 50% of the data
x
the number of data values counting from the bottom of the data list up to but not including the data value for which you want to find the percentile.
y
the number of data values equal to the data value for which you want to find the percentile.
n
the total number of data.
Formula for percentile
x + (0.5y/n) (100)
k
the kth percentile. It may or may not be part of the data.
i
the index (ranking or position of a data value)
n
the total number of data
kth percentile
i = (k/100)(n + 1)
Quartiles
the numbers that separate the data into quarters; may or may not be part of the data
Percentiles
a number that divides ordered data into hundredths
First quartile
the value that is the median of the lower half of the ordered data set
Interquartile range
is the range of the middle 50 percent of the data values; found by subtracting the first quartile from the third quartile.
IQR
Q3 - Q1
Smaller outlier
Q1 - IQR(1.5)
Larger outlier
Q2 - IQR(1.5)
Histogram
a graphical representation in the x-y form of the distribution of data in a data set; x represents the data and y represents the frequency or relative frequency. The graph consists of contiguous rectangles.
Frequency Polygons
looks like a line graph but uses intervals to display ranges of large amounts of data
Convenient starting point
a lower value carried out to one more decimal place than the value with the most decimal places
Discrete data
type of data that includes whole, concrete numbers with specific and fixed data values determined by counting
Paired data set
two data sets that have a one-to-one relationship so that
Stem-and-leaf graph or stemplot
easy to compute the median and other quantiles. Each data point is converted into stem and leaf, e.g., 438 (stem
Outlier
an observation that does not fit the rest of the data
Line graph
A graph used to show changes over time
X-axis
this is the explanatory variable
Data values
the content that fills a space in a record
Y-axis
the response variable
Frequency
the number of times a value of the data occurs
Bar graphs
used to display grouped data or categorical data; the identity of the sample points within the respective groups is lost