Data
Facts and statistics collected together for reference or analysis
What do we do with data?
collect, store, measure, analyze, visualize
What skill sets do Data analysts need?
R or SAS, SQL, statistical analysis, database management & reporting, and data analysis
Goal of data analysis
to organize and summarize information in order to make evidence-based inferences about a priority population. Take raw data and use it to solve problems.
data analysis cycle: Ask
Ask a question (stats, case studies)
data analysis cycle: Obtain
Gather data from relevant sources (database, files, web api)
data analysis cycle: Scrub
Clean data to appropriate formats (Split, merge or extract columns)
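Example (a minimal sketch of the Scrub step, not a prescribed method): the records and column names below are hypothetical, chosen only to show one column being split and two columns being merged.

```python
# Hypothetical raw records; the column names are illustrative assumptions.
raw_rows = [
    {"full_name": "Ada Lovelace", "city": "London", "country": "UK"},
    {"full_name": "Alan Turing", "city": "Wilmslow", "country": "UK"},
]

def scrub(row):
    """Return a cleaned copy of one record."""
    # Split: one "full_name" column becomes two columns.
    first, last = row["full_name"].split(" ", 1)
    # Merge: "city" and "country" become a single "location" column.
    location = f"{row['city']}, {row['country']}"
    return {"first_name": first, "last_name": last, "location": location}

clean_rows = [scrub(row) for row in raw_rows]
print(clean_rows[0])
```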
data analysis cycle: Explore
Find patterns and trends via statistics (Type of data, descriptive stats, correlation)
data analysis cycle: Model
Construct models for prediction (Regression for prediction)
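Example (a sketch of the Model step, assuming simple linear regression fitted by ordinary least squares): the data points are made up for illustration, and `predict` is a hypothetical helper name.

```python
# Hypothetical observations: x is the predictor, y is the outcome.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

x_mean = sum(xs) / len(xs)
y_mean = sum(ys) / len(ys)

# Least-squares estimates: slope = Sxy / Sxx, intercept = y_mean - slope * x_mean.
sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
sxx = sum((x - x_mean) ** 2 for x in xs)
slope = sxy / sxx
intercept = y_mean - slope * x_mean

def predict(x):
    """Predict y for a new x using the fitted line."""
    return intercept + slope * x

print(f"y = {intercept:.2f} + {slope:.2f} * x")
print(f"prediction at x=6: {predict(6):.2f}")
```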
data analysis cycle: Interpret
Use the results for decision making (visualize results with domain knowledge)
Statistics
the collection and classification of data that are in the form of numbers. It involves data collection, analysis, interpretation, and presentation.
Data Collection
capturing and gathering all data necessary to complete the processing of transactions
Data Analysis
The process of compiling, analyzing, and interpreting the results of primary and secondary data collection.
data interpretation
Making sense of and analyzing data to find patterns and trends.
Data presentation
an organized display of data.
Population
the entire collection of people, objects, or events that needs to be analyzed
Sample
a subset of the population
descriptive statistics
Uses data to describe the population parameters through numbers and graphics
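Example (a minimal sketch using Python's standard `statistics` module): the scores are hypothetical and are treated here as the full population, which is why the population standard deviation (`pstdev`) is used.

```python
import statistics

# Hypothetical data, treated as the entire population of interest.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(scores)       # central tendency
median = statistics.median(scores)   # middle value of the sorted data
spread = statistics.pstdev(scores)   # population standard deviation

print(f"mean={mean}, median={median}, std dev={spread}")
```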
inferential statistics
use data collected from a sample of a population to make inferences and predictions about the entire population
Probability
likelihood that a particular event will occur
random experiment
An experiment or process for which the outcome cannot be predicted with certainty
sample space
the set of all possible outcomes
event
One or more outcomes of an experiment
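Example (a sketch tying together sample space, event, and probability): it assumes a single roll of a fair six-sided die, so every outcome is equally likely and the probability of an event is the number of outcomes in it divided by the size of the sample space.

```python
from fractions import Fraction

# Sample space: all possible outcomes of one roll of a fair die (assumed).
sample_space = {1, 2, 3, 4, 5, 6}

# Event: the roll is even (a subset of the sample space).
event_even = {outcome for outcome in sample_space if outcome % 2 == 0}

# With equally likely outcomes, P(event) = |event| / |sample space|.
p_even = Fraction(len(event_even), len(sample_space))
print(p_even)  # 1/2
```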
confidence interval
a range of values so defined that there is a specified probability that the value of a parameter lies within it.
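Example (a rough sketch of a 95% confidence interval for a mean): the sample is hypothetical, and the interval uses a normal approximation. For a sample this small, a t-distribution critical value would usually be preferred, so treat the result as illustrative only.

```python
import math
import statistics

# Hypothetical sample of measurements.
sample = [4.8, 5.1, 5.0, 4.9, 5.3, 5.2, 4.7, 5.0, 5.1, 4.9]

n = len(sample)
sample_mean = statistics.mean(sample)
# Standard error of the mean, using the sample standard deviation.
std_error = statistics.stdev(sample) / math.sqrt(n)

# Critical value for 95% confidence under the normal approximation (about 1.96).
z = statistics.NormalDist().inv_cdf(0.975)

lower = sample_mean - z * std_error
upper = sample_mean + z * std_error
print(f"95% CI for the mean: ({lower:.3f}, {upper:.3f})")
```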
Types of Investigations
Descriptive, Relational, experimental
descriptive investigation (field studies or interviews)
constructing an accurate description of what is happening.
Relational investigations (Observations or surveys)
used to identify relations between multiple factors. Rarely determines the causal relationship between multiple factors.
experimental research (controlled experiments)
used to establish a causal relationship
hypotheses
A precise problem statement that can be directly tested through an empirical investigation.
null hypothesis
there is no difference between experimental treatments.
alternative hypothesis
a statement that is mutually exclusive with the null hypothesis
Goal of an experiment
to find statistical evidence to refute the null hypothesis in order to support the alternative hypothesis.
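Example (a sketch of testing a null hypothesis, assuming a two-sided permutation test as the method): the measurements, the group names, and the 0.05 cutoff are illustrative assumptions. Under the null hypothesis the group labels are interchangeable, so shuffling them shows how often a difference at least as large as the observed one arises by chance.

```python
import random
from statistics import mean

# Hypothetical measurements from two experimental treatments.
control = [12, 15, 11, 14, 13, 16, 12, 15]
treatment = [18, 20, 17, 21, 19, 22, 18, 20]

observed = mean(treatment) - mean(control)

rng = random.Random(0)  # fixed seed so the run is repeatable
pooled = control + treatment
n_treat = len(treatment)
n_shuffles = 5000
extreme = 0

for _ in range(n_shuffles):
    rng.shuffle(pooled)  # reassign group labels at random
    diff = mean(pooled[:n_treat]) - mean(pooled[n_treat:])
    if abs(diff) >= abs(observed):
        extreme += 1

# Estimated probability of a difference this large if the null were true.
p_value = extreme / n_shuffles
print(f"observed difference = {observed}, p = {p_value}")
if p_value < 0.05:
    print("Reject the null hypothesis at the 0.05 level.")
else:
    print("Fail to reject the null hypothesis.")
```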
independent variable
the variable that is manipulated in an experiment to modify the subjects' conditions.
Type 1 error (worse than type 2)
Rejecting null hypothesis when it is true. Ex. False alarm
type 2 error
failing to reject a false null hypothesis. Ex. Missed detection
Low threshold
many type 1 errors
high threshold
many type 2 errors
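Example (a sketch of the threshold trade-off, under assumed distributions): the measurement is modeled as Normal(0, 1) when the null hypothesis is true and Normal(2, 1) when it is false, and the null is rejected whenever the measurement exceeds the threshold. The distributions and the `error_rates` helper are hypothetical.

```python
from statistics import NormalDist

# Assumed distributions of the measured value.
null_true = NormalDist(mu=0.0, sigma=1.0)   # nothing to detect
null_false = NormalDist(mu=2.0, sigma=1.0)  # a real effect is present

def error_rates(threshold):
    """Return (type 1 rate, type 2 rate) when rejecting the null above threshold."""
    type1 = 1 - null_true.cdf(threshold)  # false alarm: reject a true null
    type2 = null_false.cdf(threshold)     # missed detection: keep a false null
    return type1, type2

for threshold in (0.0, 1.0, 2.0, 3.0):
    t1, t2 = error_rates(threshold)
    print(f"threshold={threshold:.1f}  type 1={t1:.3f}  type 2={t2:.3f}")
```

Raising the threshold lowers the type 1 rate and raises the type 2 rate, which matches the low-threshold and high-threshold cards above.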
Dependent variable
the variable that is measured and is affected by changes in the independent variable
factorial design (between/within, split plot)
a study in which there are two or more independent variables, or factors. Researchers identify interactions between variables.
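Example (a sketch of enumerating the conditions in a factorial design): the two factors and their levels are hypothetical. A 2 x 3 design crosses every level of one factor with every level of the other, giving 6 conditions.

```python
from itertools import product

# Hypothetical factors (independent variables) and their levels.
factors = {
    "font_size": ["small", "large"],
    "device": ["phone", "tablet", "desktop"],
}

# Each condition is one combination of levels, one level per factor.
conditions = list(product(*factors.values()))

for condition in conditions:
    print(dict(zip(factors.keys(), condition)))
print(f"{len(conditions)} conditions in total")
```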