Statistics (according to Webster Dictionary)
2 parallel definitions
The science of data (collecting, organizing and analyzing data)
Plural of the word “statistic”
Parameter
Characteristic of a population
Usually a numerical characteristic
Ex. population mean, population variance, population standard deviation, etc.
Statistic
Characteristic of a sample
Ex. sample mean, sample standard deviation
Data
Collection of information
2 types:
Categorical (qualitative)
Nominal- according to name
Ex. Data with names, genders, race, etc.
Numerical (quantitative)
Numerical (Quantitative) Data
According to a ratio scale
A value of zero is an inherent (true) zero → the number zero signifies that nothing is there
Ex. Data with heights, weights, time durations, grades, etc.
According to the interval scale
Zero is not an inherent zero → the number zero is just another value on the scale, not the absence of the quantity
Ex. Data containing temperatures in °C or °F (0° does not mean “no temperature”)
Two types:
Discrete dataset
Continuous dataset
Discrete Dataset
One where measurements take a countable set of isolated values
Ex. The number of___
Continuous Dataset
Measurements take any real value within a certain range
Ex. amount of rainfall in Charlotte in Jan. of last 30 years; length of customer waiting times at a local bank
Graphical Descriptive Statistics
Describes set of data via graphs
Ex. bar graphs, pie charts, histograms, scatter plots, etc
Numerical Descriptive Statistics
Measures describing center of data- mean (arithmetic average), median, mode and weighted mean
Measures describing spread/dispersion of data- range, variance and standard deviation of data
Measures of location- tell where a particular measurement stands compared to the rest of the data (ex. z-score and percentile ranking)
Measures describing shape of data- (ex. skewness and kurtosis)
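A minimal sketch of the center, spread and location measures listed above, using Python's standard `statistics` module. The `scores` values are made up for illustration and are treated as a sample, so the variance and standard deviation divide by n - 1.

```python
import statistics

# Hypothetical sample of exam scores (made-up values for illustration).
scores = [70, 85, 85, 90, 100]

# Measures of center
mean = statistics.mean(scores)      # arithmetic average
median = statistics.median(scores)  # middle value of the sorted data
mode = statistics.mode(scores)      # most frequent value

# Measures of spread/dispersion
data_range = max(scores) - min(scores)  # largest value minus smallest value
variance = statistics.variance(scores)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(scores)      # sample standard deviation

# Measure of location: z-score of the score 100, i.e. how many
# standard deviations it lies above the mean.
z_score = (100 - mean) / std_dev

print(mean, median, mode, data_range, variance, round(z_score, 2))
```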
Inferential Statistics
Process of using one or more random samples to gain insight about the population from which the samples were selected
Sample
Part/portion of a population
Population
Set of all individuals, objects or measurements
Variable
characteristic of population of interest
How does a data analyst use inferential statistics?
Data analyst selects one or more samples from the population of interest → performs statistical analysis
Once the sample characteristics (statistics) are calculated, the data analyst uses them to draw conclusions about the population
3 Main Methods of Inferential Statistics
Constructing confidence intervals- estimates population parameter to be within 2 limits (a lower limit and an upper limit)
Performing hypothesis testing- uses sample evidence to support or reject hypotheses or claims about a population
Modeling or testing relationships between data sets
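As an illustration of the first method, here is a rough sketch of a confidence interval for a population mean. It assumes the normal approximation (a z critical value); for small samples a t critical value is usually preferred. The function name `mean_confidence_interval` and the waiting-time data are hypothetical.

```python
import math
import statistics

def mean_confidence_interval(sample, confidence=0.95):
    """Approximate confidence interval for the population mean.

    Assumption: uses a normal (z) critical value, which is only a
    rough approximation when the sample is small.
    """
    n = len(sample)
    x_bar = statistics.mean(sample)   # sample mean
    s = statistics.stdev(sample)      # sample standard deviation
    # Critical value: for 95% confidence this is about 1.96.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * s / math.sqrt(n)     # margin of error
    return x_bar - margin, x_bar + margin

# Hypothetical customer waiting times in minutes (made-up values).
waiting_times = [4.0, 5.5, 6.0, 7.0, 7.5, 5.0, 6.5, 8.0]
lower, upper = mean_confidence_interval(waiting_times)
print(f"95% CI for the mean: ({lower:.2f}, {upper:.2f})")
```

The two returned values are the lower limit and the upper limit between which the population mean is estimated to lie.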
Sample
Subset of elements from set of individuals with one or more common features (a population) that have been selected for a study
n= number of elements in a sample
N= number of elements in the population
Use samples to learn about populations because in many real-world situations it is impossible or impractical to measure a characteristic from every member of a population
Random Sampling
Subset of a population that’s chosen in a manner where every member of the population has an equal chance of being selected for the subset/sample → allows the sample to represent the population it was selected from
n= number of members of a random sample
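A small sketch of drawing a simple random sample with Python's standard `random` module. The population of 100 member IDs and the seed value are assumptions chosen only to make the example reproducible.

```python
import random

# Hypothetical population: member IDs 1 through 100 (N = 100).
population = list(range(1, 101))
N = len(population)
n = 10  # desired sample size

# A fixed seed (arbitrary value) makes the draw repeatable for study purposes.
rng = random.Random(42)

# sample() draws without replacement, and every member of the
# population has the same chance of ending up in the sample.
sample = rng.sample(population, n)

print(f"N = {N}, n = {len(sample)}")
print(sorted(sample))
```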
Statistical Estimation
Type of inferential statistical analysis where statistics calculated from sample data are used to estimate population parameters
Also used to quantify uncertainty in estimates
Ex. Predict % of voters who will vote for a specific candidate
Conduct a poll where a random sample is selected
Ask each member who they plan to vote for and how likely they are to vote in the election
Use the info to calculate percentage of voters who will vote for a specific candidate (called a sample statistic)
Use the sample statistic to estimate percentage of entire population who will vote for a candidate
The result is called a statistical estimate of the population parameter
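The polling example can be sketched as follows. The responses are invented (54 of 100 respondents favoring the candidate), and the standard error shown is the usual formula for a sample proportion, included as one way to quantify the uncertainty in the estimate.

```python
import math

# Hypothetical poll of a random sample of 100 voters (made-up data):
# True = plans to vote for the candidate, False = does not.
responses = [True] * 54 + [False] * 46

n = len(responses)

# Sample statistic: proportion of the sample favoring the candidate.
p_hat = sum(responses) / n

# Standard error of the sample proportion, a measure of how much
# p_hat would vary from one random sample to another.
standard_error = math.sqrt(p_hat * (1 - p_hat) / n)

# p_hat serves as the estimate of the population parameter
# (the share of all voters who will vote for the candidate).
print(f"Estimated share: {p_hat:.0%} (standard error {standard_error:.3f})")
```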
Statistical Notation
x̄ (x-bar)- sample mean
μ (mu)- population mean
s²- variance of a sample
σ² (sigma squared)- variance of a population
s- standard deviation of a sample
σ (sigma)- standard deviation of a population
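The sample and population symbols correspond to slightly different calculations: the sample variance (s²) divides by n - 1, while the population variance (sigma squared) divides by N. A brief sketch with made-up data, using the standard `statistics` module:

```python
import statistics

# Made-up data set used only to contrast the two formulas.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)  # same formula for x-bar and mu

# Treating the data as a sample: divide by n - 1.
s_squared = statistics.variance(data)  # sample variance, s squared
s = statistics.stdev(data)             # sample standard deviation, s

# Treating the data as the whole population: divide by N.
sigma_squared = statistics.pvariance(data)  # population variance
sigma = statistics.pstdev(data)             # population standard deviation

print(mean, round(s_squared, 3), round(s, 3), sigma_squared, sigma)
```

For the same numbers, the sample versions are slightly larger than the population versions because of the smaller divisor.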