statistics is the science of
variability
saying: what was
compared
saying: who’s not
here
saying: incorporate
ish-ness
data is
a representation of someone or something
tidy data is a way of
mapping the real world to a dataset
observations are the
things we are interested in
attributes
the pieces of information we are interested in
measures are the way
we collect information about observations
quantitative data
values of an attribute for an observation are numbers representing a quantity of something
categorical data
values for an attribute for an observation are selected from a set of different category labels
rating scale data
special type of categorical data, values of an attribute for an observation are selected from a predetermined scale
time series data
the values of an attribute for an observation indicate a moment in time
reliability refers to
the extent to which the data you collect from a measure truly represents and reflects the real world
data validation
the act of ensuring the values collected from each observation for each attribute are reliable
variation is
universal
statistical modeling/statistical testing
building a model based on what you know, considering the predictions the model makes and comparing those predictions to what actually is happening
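A minimal sketch of the model / prediction / comparison idea, assuming a hypothetical coin-flip example that is not part of the original cards. The model is "the coin is fair", the prediction is the number of heads that model expects, and the comparison asks how often a fair coin would land at least as far from that prediction as what was observed. The function name and the numbers are illustrative assumptions.

```python
from math import comb

def prob_at_least_as_extreme(n_flips, observed_heads):
    """Under a fair-coin model, return the probability of a head count
    at least as far from n_flips / 2 as the observed count."""
    expected = n_flips / 2
    gap = abs(observed_heads - expected)
    total = 0
    for k in range(n_flips + 1):
        if abs(k - expected) >= gap:
            total += comb(n_flips, k)  # number of flip sequences with exactly k heads
    return total / 2 ** n_flips

# Hypothetical data: 100 flips, 62 heads observed.
n_flips = 100
observed_heads = 62

predicted_heads = n_flips / 2  # what the fair-coin model predicts
p = prob_at_least_as_extreme(n_flips, observed_heads)

print(f"model predicts {predicted_heads:.0f} heads, observed {observed_heads}")
print(f"probability of a gap this large under the model: {p:.4f}")
```

A small probability suggests that what is actually happening does not match what the model predicts.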
statistical investigative cycle
problem, plan, data, analysis, conclusion
problem -
identify a statistical question
plan -
choose a sample design, study design, and measures
data -
collect and process data
analysis -
look for patterns with summary tables, graphs, and statistical models
conclusion -
interpret the results and generate new questions about the real world context
data is a representation
of someone or something
tidy data is a way of mapping the real world to a
dataset
attributes are the
pieces of information we are interested in
in tidy datasets, there is one row for
each observation
one observation
per row
one attribute
per column
rows are the
things we are interested in
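A minimal sketch of the tidy layout described in the cards above, using only plain Python. The patient data and the helper names are hypothetical, chosen for illustration: each dict is one row (one observation) and each key is one column (one attribute).

```python
# Hypothetical example data: one dict per row, one row per observation (a patient).
# Each key is a column, i.e. one attribute recorded for every observation.
patients = [
    {"patient_id": 1, "blood_type": "A", "temperature_c": 36.8},
    {"patient_id": 2, "blood_type": "O", "temperature_c": 38.1},
    {"patient_id": 3, "blood_type": "AB", "temperature_c": 37.2},
]

def has_consistent_columns(rows):
    """Return True if every row records exactly the same set of attributes."""
    if not rows:
        return True
    columns = set(rows[0])
    return all(set(row) == columns for row in rows)

def column(rows, attribute):
    """Collect one attribute (one column) across all observations (rows)."""
    return [row[attribute] for row in rows]

print(has_consistent_columns(patients))
print(column(patients, "blood_type"))
```

Keeping one observation per row and one attribute per column means a whole attribute can be pulled out with a single lookup per row.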
types of data
quantitative, categorical, rating scale, text, time series
if someone is writing in a value then it is
text data
data cleaning
removing invalid values from a dataset
prospective data cleaning
build checks into the survey - preferred method
retrospective data cleaning
clean the data after it has been collected by removing invalid values
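A minimal sketch contrasting the two approaches, assuming a hypothetical survey question answered on a 1 to 5 pain scale. The function names and the sample responses are illustrative assumptions, not taken from the cards.

```python
VALID_PAIN_SCORES = {1, 2, 3, 4, 5}  # assumed rating scale from 1 to 5

def accept_response(value):
    """Prospective cleaning: the survey refuses an invalid value at entry time."""
    if value not in VALID_PAIN_SCORES:
        raise ValueError(
            f"pain score must be one of {sorted(VALID_PAIN_SCORES)}, got {value!r}"
        )
    return value

def clean_after_collection(values):
    """Retrospective cleaning: drop invalid values from data already collected."""
    return [v for v in values if v in VALID_PAIN_SCORES]

collected = [3, 7, 1, -2, 5]  # hypothetical raw responses, two of them invalid
cleaned = clean_after_collection(collected)
print(cleaned)
```

With the prospective check, the invalid answers never enter the dataset. With the retrospective filter, those observations are lost after the fact, which is one reason the prospective method is preferred.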
quantitative examples
temperature, blood pressure
categorical examples
hair color, blood type, country of origin
rating scale example
how much pain are you in, on a scale from 1 to 5?
text examples
customer reviews, social media posts
time series examples
date of birth, year, financial quarter
type of measure: quantitative
key giveaway: “how many”, values with decimals, scales
type of measure: categorical
key giveaway: selected from a drop down list, no order to the values
type of measure: rating scale
key giveaway: clear order to the values
type of measure: text
key giveaway: write-in data
type of measure: time series
key giveaway: “when?”
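A minimal sketch of one observation that carries an attribute of each measure type listed above. The record layout, field names, and values are hypothetical, chosen to mirror the examples on the cards.

```python
from dataclasses import dataclass
from datetime import date

BLOOD_TYPES = {"A", "B", "AB", "O"}  # categorical: labels with no order
PAIN_SCALE = (1, 2, 3, 4, 5)         # rating scale: predetermined, clearly ordered

@dataclass
class PatientRecord:
    temperature_c: float  # quantitative: a number representing a quantity
    blood_type: str       # categorical: selected from a set of labels
    pain_level: int       # rating scale: selected from a predetermined scale
    review: str           # text: write-in data
    date_of_birth: date   # time series: answers "when?"

# Hypothetical observation.
record = PatientRecord(
    temperature_c=37.2,
    blood_type="O",
    pain_level=2,
    review="Staff were friendly.",
    date_of_birth=date(1990, 5, 17),
)

print(record.blood_type in BLOOD_TYPES)
print(record.pain_level in PAIN_SCALE)
```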