(ch 1-4)
The science of collecting, organizing, presenting, analyzing, and interpreting data to assist in making more effective decisions.
Statistics
2 types of Statistics
descriptive and inferential
…can be used to organize data into a meaningful form
Descriptive statistics
Methods of organizing, summarizing, and presenting data in an informative way.
descriptive statistics
The methods used to estimate a property of a population on the basis of a sample.
inferential statistics
The entire set of individuals or objects of interest or the measurements obtained from all individuals or objects of interest.
population
A portion or part of the population of interest.
sample
There are two basic types of variables
qualitative variable and quantitative variable
An object or individual is observed and recorded as a non-numeric characteristic or attribute. Examples: gender, state of birth, eye color
qualitative variable
A variable that is reported numerically.
Examples: balance in your checking account, the life of a car battery, the number of people employed by a company
quantitative variable
…variables can be discrete or continuous
Quantitative variables
…variables are typically the result of counting
-Examples: the number of bedrooms in a house (1, 2, 3, 4, etc.), the number of students in a statistics course (326, 421, etc.)
Discrete variables
…variables are usually the result of measuring something. Can assume any value within a specific range
- Examples: Duration of flights from Orlando to San Diego (5.25 hours), grade point average (3.258)
continuous
There are four levels of measurement
Nominal, ordinal, interval, and ratio
The level of measurement determines the type of …
statistical analysis that can be performed
… is the lowest level of measurement because it provides the least information of the four levels.
nominal
Data recorded at the … level of measurement is represented as labels or names. They have no order. They can only be classified and counted.
-Examples: classifying M&M candies by color, identifying students at a football game by gender
nominal
…adds ranking (e.g., small, medium, large)
-The rankings are known, but not the magnitude of differences between groups
ordinal
Data recorded at the … level of measurement is based on a relative ranking or rating of items based on a defined attribute or qualitative variable. Variables based on this level of measurement are only ranked and counted.
-Examples: the list of top ten states for best business climate, student ratings of professors
ordinal
-This data has all the characteristics of ordinal level data, plus the differences between the values are meaningful
-There is no natural 0 point; a zero does not represent the absence of the condition
interval level
For data recorded at the … level of measurement, the … or the distance between values is meaningful. The … level of measurement is based on a scale with a known unit of measurement.
-Examples: the Fahrenheit temperature scale, credit scores (300- 850), SAT scores (400-1600)
interval
-The data has all the characteristics of the interval scale, and … between numbers are meaningful
-The 0 point represents the absence of the characteristic
ratio
Data recorded at the … level of measurement are based on a scale with a known unit of measurement and a meaningful interpretation of zero on the scale.
-Additionally, these variables have zero measurements representing a lack of the attribute. For example, zero kilograms indicates a lack of weight.
-Examples: wages, changes in stock prices, and height
ratio
Practice … with integrity and honesty when collecting, organizing, summarizing, analyzing, and interpreting numerical information
statistics
… is used to process and analyze data and information to support a story or narrative of a company
Business Analytics
chapter 2 is next
…A grouping of qualitative data into mutually exclusive and collectively exhaustive classes showing the number of observations in each class.
frequency table
Each observation is in only one class.
mutually exclusive
There is a class for each value
Collectively exhaustive
Constructing Frequency Tables
1. Sort the data into classes
2. Count the number in each class
3. Report as the class frequency
-Example: Car sales by location
…is just a tally of how many times something happened.
a frequency
…tells you how significant that number is compared to the entire group.
relative frequency
To find …, take the class frequency and divide by the total number of observations.
relative frequencies
Pro-tip: In a Relative Frequency table, the sum of the relative frequency column should always equal … If it doesn't, someone missed a car!
1.00 (or 100%)
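The three construction steps and the relative-frequency calculation can be sketched in a few lines of Python. This is a minimal illustration, not the textbook's worked example: the `sales` list and its location names are made-up data.

```python
from collections import Counter

# Hypothetical data: the location of each car sale (not from the source).
sales = ["North", "South", "North", "East", "South", "North", "West", "North"]

# Steps 1-2: sort the observations into classes and count each class.
frequency = Counter(sales)

# Step 3 plus relative frequency: class frequency / total observations.
total = sum(frequency.values())
relative_frequency = {loc: count / total for loc, count in frequency.items()}

for loc, count in frequency.most_common():
    print(f"{loc:<6} {count:>3} {relative_frequency[loc]:.3f}")

# The relative frequencies should add up to 1.00 (allowing for rounding).
print(round(sum(relative_frequency.values()), 2))
```

Because every sale falls into exactly one location and every location has a class, the classes are mutually exclusive and collectively exhaustive, which is why the last line prints 1.0.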
…A graph that shows the qualitative classes on the horizontal axis and the class frequencies on the vertical axis.
bar chart
the bar chart is the most common graphic to present a…
qualitative variable
…is the class with the highest frequency
the mode
A chart that shows the proportion or percentage that each class represents of the total number of frequencies
pie chart
A grouping of quantitative data into mutually exclusive and collectively exhaustive classes showing the number of observations in each class
-shows the pattern and the peaks and the gaps
-describes the pattern
frequency distribution
A graph in which the quantitative classes are marked on the horizontal axis and the class frequencies on the vertical axis.
histogram
Consists of line segments connecting the points formed by the intersections of the class midpoints and the class frequencies.
Gives a quick picture of the main characteristics of the data.
Good to use when comparing two or more distributions.
frequency polygon
Add each frequency to the frequencies before it
Cumulative frequency distribution
Divide the cumulative frequencies by the total number of observations
Cumulative relative frequency distribution
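A short sketch of both cumulative calculations, assuming a list of class frequencies already in class order. The numbers in `frequencies` are invented for illustration.

```python
from itertools import accumulate

# Hypothetical class frequencies, listed in class order (not from the source).
frequencies = [8, 11, 23, 38]

# Cumulative frequency: add each frequency to the frequencies before it.
cumulative = list(accumulate(frequencies))

# Cumulative relative frequency: divide by the total number of observations.
total = cumulative[-1]
cumulative_relative = [c / total for c in cumulative]

for freq, cum, rel in zip(frequencies, cumulative, cumulative_relative):
    print(f"{freq:>3} {cum:>3} {rel:.4f}")
```

The last cumulative frequency always equals the total number of observations, so the last cumulative relative frequency is 1.0.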
chapter 3 is next
A measure of …is a value used to describe the central tendency of a set of data.
location
Common measures of location:
Mean
Median
Mode
The …is the most widely reported measure of location
arithmetic mean
The mean is both a …
population parameter and sample statistic
-An interval or ratio scale of measurement is required
-All the data values are used in the calculation
-The .. is unique
-The sum of the deviations from the … equals zero
-A weakness of the … is that it is affected by extreme values
(large or small).
sample mean
For data containing extreme values, the .. may not
fairly represent the central location
mean
The midpoint of the values after they have
been ordered from the minimum to the maximum
values
median
-The … is the value in the middle of a set of ordered
data.
-At least the ordinal scale of measurement is required.
-Because you need a meaningful ranking of the data values
to define what the “middle” is
-Extreme values do not influence it.
-Fifty percent of the observations are larger than the
...
-Fifty percent of the observations are smaller than the
….
-It is unique to a set of data.
median
The value of the observation that occurs most
frequently
mode
The … can be found for nominal level data. The mode is
meaningful because you can still say which category occurs
most often
-A set of data can have more than one ....
-A set of data could have no ...
mode
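The three measures of location can be compared on one small data set using Python's standard `statistics` module. The data are made up, and the value 100 is included on purpose to show how an extreme value affects the mean but not the median.

```python
import statistics

# Hypothetical data with one extreme value (100); not from the source.
data = [2, 3, 3, 4, 5, 100]

mean = statistics.mean(data)        # uses every value, so 100 pulls it upward
median = statistics.median(data)    # middle of the ordered values
modes = statistics.multimode(data)  # a list, since a data set can have several modes

# The deviations from the mean sum to zero.
deviation_sum = sum(x - mean for x in data)

print(mean, median, modes, deviation_sum)
```

Note that `multimode` returns every value when no value repeats, so "no mode" has to be recognized by checking whether all values are returned.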
-The .. is found by multiplying each observation by its corresponding weight, summing the products, and dividing by the sum of the weights.
-A convenient way to compute the mean when there are
several observations with the same value.
weighted mean
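As a sketch, the weighted mean is the sum of each value times its weight, divided by the sum of the weights. The function name `weighted_mean` and the sample numbers are illustrative assumptions.

```python
def weighted_mean(values, weights):
    """Return sum(w * x) / sum(w) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight

# Hypothetical example: the value 10 occurs once, 20 twice, and 30 twice.
print(weighted_mean([10, 20, 30], [1, 2, 2]))
```

Here the weights are counts of repeated observations, which gives the same result as taking the ordinary mean of 10, 20, 20, 30, 30.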
Measures of location only describe the center, so we need… to see how scattered the data are around the center
dispersion
Measures of dispersion include:
-Range.
-Variance.
-Standard Deviation.
The simplest measure of dispersion is…
range
=max value - min value
range
range is influenced by…
extreme values
The variance measures how much the values vary from their ….
mean
The … is used to compare the spread of two or more sets of observations
-It is the square root of the variance
Small: The values are close to the mean
Large: The values are widely scattered about the mean
standard deviation
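The three measures of dispersion can be computed with the standard `statistics` module. The cards above do not say whether the population or the sample formula is meant, so this sketch shows both. The data are invented.

```python
import statistics

# Hypothetical data (not from the source).
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Range: maximum value minus minimum value.
data_range = max(data) - min(data)

# Population formulas divide by N; sample formulas divide by n - 1.
pop_variance = statistics.pvariance(data)
pop_sd = statistics.pstdev(data)
sample_variance = statistics.variance(data)
sample_sd = statistics.stdev(data)

print(data_range, pop_variance, pop_sd)
print(round(sample_variance, 3), round(sample_sd, 3))
```

In both cases the standard deviation is the square root of the corresponding variance, so it is expressed in the same units as the data.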
…applies to any set of values regardless of the shape of the distribution
-one size fits all
-use if the problem says unknown distribution
Chebyshev’s theorem
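The card does not state the bound itself. The standard statement of Chebyshev's theorem is that, for any k greater than 1, at least 1 - 1/k² of the values lie within k standard deviations of the mean. A minimal sketch, with a hypothetical function name:

```python
def chebyshev_min_proportion(k):
    """Minimum share of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem is only informative for k > 1")
    return 1 - 1 / k**2

for k in (1.5, 2, 3):
    print(k, round(chebyshev_min_proportion(k), 3))
```

These are lower bounds that hold for any distribution, which is why they are looser than the percentages given for bell-shaped data.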
The Empirical Rule or Normal Rule provides an approximation.
-symmetric,bell-shaped curve
-use if problem says normally distributed
• 1 standard deviation of the mean: about …of values.
• 2 standard deviations of the mean: about … of values.
• 3 standard deviations of the mean: about … of values.
68%
95%
99.7%
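For a symmetric, bell-shaped distribution the usual figures are about 68%, 95%, and 99.7%. The sketch below turns a mean and standard deviation into the three intervals. The function name and the values 100 and 15 are assumptions chosen for illustration.

```python
def empirical_rule_intervals(mean, sd):
    """Return {k: (low, high, approximate share)} for k = 1, 2, 3."""
    shares = {1: "about 68%", 2: "about 95%", 3: "about 99.7%"}
    return {k: (mean - k * sd, mean + k * sd, share) for k, share in shares.items()}

# Hypothetical bell-shaped variable with mean 100 and standard deviation 15.
intervals = empirical_rule_intervals(100, 15)
for k, (low, high, share) in intervals.items():
    print(f"within {k} sd: {low} to {high} ({share} of values)")
```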
chapter 4 is next
Summarizes the distribution of one variable by stacking dots as points on a number line that shows all the values.
-If there are identical values, the dots are “piled” on top of each other.
dot plot
-The standard deviation is the most widely used measure of
dispersion.
-We can also determine the location of values that divide a
set of observations into parts.
quartiles
Q1=
Q2=
Q3=
25
50 (median)
75
Divide a set of observations into 10 equal parts
deciles
Divide a set of observations into 100 equal parts
percentiles
Excludes the minimum and maximum when calculating quartiles
-10-20, 20-30, 30-40
exclusive method
Includes the minimum (0th percentile) and maximum (100th percentile) when calculating quartiles.
-10-19, 20-29, 30-39
Formula for the position of the pth percentile:
inclusive method
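Python's standard library offers both conventions through `statistics.quantiles`, which takes `method="exclusive"` or `method="inclusive"`. Whether these match the course's formulas exactly is an assumption. The sketch only shows that the two methods can give different quartiles for the same made-up data.

```python
import statistics

# Hypothetical ordered data (not from the source).
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# n=4 asks for the three cut points Q1, Q2, Q3.
exclusive = statistics.quantiles(data, n=4, method="exclusive")
inclusive = statistics.quantiles(data, n=4, method="inclusive")

print("exclusive:", exclusive)
print("inclusive:", inclusive)
```

Both methods agree on the median here, but the exclusive method places Q1 and Q3 further from the center because it does not treat the sample minimum and maximum as the 0th and 100th percentiles.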
…A graphic display that shows the general shape
of a variable’s distribution.
-It is based on five descriptive statistics: the maximum and
minimum values, the first and third quartiles, and the
median
box plot
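The five statistics behind a box plot can be collected as follows. This is a sketch with invented data. It uses the default (exclusive) method of `statistics.quantiles`, which may differ slightly from the quartile method used in class.

```python
import statistics

# Hypothetical data (not from the source).
data = [13, 15, 18, 19, 22, 27, 30, 33, 41]

q1, q2, q3 = statistics.quantiles(data, n=4)  # default method is "exclusive"

five_number_summary = {
    "minimum": min(data),
    "Q1": q1,
    "median": statistics.median(data),
    "Q3": q3,
    "maximum": max(data),
}
print(five_number_summary)
```

The box spans Q1 to Q3 with a line at the median, and the whiskers extend toward the minimum and maximum.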
A data point that is unusually far from the others
outlier
four common skewness shapes
symmetric
positively skewed
negatively skewed
bimodal
-The median is roughly centered in the box
-the whiskers (lines extending from the box) are about the
same length
-Q1 → Median → Q3 are evenly spaced.
Symmetric Distribution
-The median is closer to Q1
-The right whisker (upper tail) is longer
-Q3 – Median > Median – Q1
Right-Skewed Distribution (Positively
Skewed)
-The median is closer to Q3
-The left whisker (lower tail) is longer
-Median – Q1 > Q3 – Median
Left-Skewed Distribution (Negatively
Skewed)
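The quartile comparisons on these cards can be turned into a rough classifier. This is only a sketch: the function name and the `tolerance` parameter (how unequal the two halves of the box may be before the shape is called skewed) are assumptions, and whisker lengths are ignored.

```python
def box_plot_shape(q1, median, q3, tolerance=0.1):
    """Classify skewness from where the median sits inside the box."""
    lower_half = median - q1   # distance from Q1 to the median
    upper_half = q3 - median   # distance from the median to Q3
    box_width = q3 - q1
    if box_width == 0:
        return "undetermined"
    if abs(upper_half - lower_half) <= tolerance * box_width:
        return "roughly symmetric"
    if upper_half > lower_half:
        return "right-skewed (positive)"  # median is closer to Q1
    return "left-skewed (negative)"       # median is closer to Q3

print(box_plot_shape(10, 20, 30))
print(box_plot_shape(10, 12, 30))
print(box_plot_shape(10, 28, 30))
```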
Graphical technique used to show the relationship between two variables measured with interval or ratio scales
-we use to see if two different things are related (correlation)
scatter diagram
…Measures the direction and strength of the relationship
-Ranges from −1.0 to +1.0
-The closer the coefficient is to −1.0 or +1.0, the stronger the
relationship
-If r is close to 0.0, there is little or no linear relationship between the variables
-Positive indicates a positive relationship
-Negative indicates a negative relationship
Correlation coefficient
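The card gives the properties of r but not how it is computed. A minimal sketch of the usual Pearson formula, the sum of the products of deviations divided by the square root of the product of the summed squared deviations, is given here with made-up data and a hypothetical function name.

```python
import math

def correlation(xs, ys):
    """Pearson correlation coefficient r for paired observations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable does not vary")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: a perfect positive and a perfect negative relationship.
x = [1, 2, 3, 4, 5]
print(correlation(x, [2, 4, 6, 8, 10]))
print(correlation(x, [10, 8, 6, 4, 2]))
```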
…A table used to classify observations according to two identifiable
characteristics
-It is a cross-tabulation that simultaneously summarizes two variables of interest
-Both variables need only be nominal or ordinal
contingency table
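A contingency table is a count of how often each pair of categories occurs together. A sketch using only the standard library follows. The two variables (location and profit level) and all observations are invented.

```python
from collections import Counter

# Hypothetical observations: (location, profit level) for each sale.
observations = [
    ("North", "High"), ("North", "Low"), ("South", "High"),
    ("North", "High"), ("South", "Low"), ("South", "Low"),
]

# Count each (row category, column category) pair.
table = Counter(observations)

rows = sorted({location for location, _ in observations})
cols = sorted({level for _, level in observations})

print("      ", cols)
for row in rows:
    # Counter returns 0 for any pair that never occurs.
    print(f"{row:<6}", [table[(row, col)] for col in cols])
```

Both variables here are nominal, which is all a contingency table requires.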