Statistics
The branch of mathematics that deals with the collection, organization, analysis, and interpretation of numerical data. Specifically used to draw general conclusions about a set of data from a sample of the data.
Population
Entire group of "entities" that we want information about. "The who"
Parameter
A number that describes a population. "The what"
Sample
A part of the population that we actually examine in order to gather information.
Statistic
A number that describes a sample. "What is calculated"
Voluntary response sample
A sample where people voluntarily offer to provide data. Biased because people with strong opinions are most likely to respond.
Randomly selected sample
A sample where people are randomly chosen from the population for sampling.
Bias
A systematic inaccuracy in data due to the characteristics of the process used to create, collect, manipulate, and present the data, or due to a faulty sample design or estimating technique. Systematically favors certain outcomes.
Reduce this by randomizing the sample and by guarding against each specific type of bias (undercoverage, nonresponse, response).
Undercoverage Bias
Occurs when a part of the population is not represented in the sample due to methodology
Nonresponse Bias
Occurs when a selected unit cannot be contacted or refuses to participate/cooperate.
Response Bias
Occurs when the behavior of the respondent or of the interviewer causes untruthful responses. Factors such as question wording, race, gender, and time-recall issues affect this.
Simple Random Sample (SRS)
Consists of n individuals from a given population chosen in such a way that every set of n individuals has an equal chance of being the sample selected. As a consequence, every unit has an equal chance of being selected.
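A minimal sketch of drawing an SRS, assuming the population can be listed in full. The function name, the population of 100 labeled units, and the seed are illustrative assumptions, not part of the card.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct units without replacement.

    random.sample picks each possible subset of size n with equal
    probability, which is the defining property of an SRS.
    """
    rng = random.Random(seed)
    return rng.sample(list(population), n)

# Hypothetical population: units labeled 1..100
population = list(range(1, 101))
sample = simple_random_sample(population, 10, seed=1)
print(sample)
```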
Stratified Random Sample
The population is divided into strata based on similar features, then a separate SRS is picked from each of the strata. Afterwards all of these are combined to make the full sample.
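A rough sketch of the two steps described here (one SRS per stratum, then combine). The strata names, stratum sizes, and per-stratum sample sizes are made up for illustration.

```python
import random

def stratified_sample(strata, sizes, seed=None):
    """Take a separate SRS from each stratum and combine the results.

    strata: dict mapping stratum name -> list of units
    sizes:  dict mapping stratum name -> number of units to draw
    """
    rng = random.Random(seed)
    combined = []
    for name, units in strata.items():
        # SRS within this stratum only
        combined.extend(rng.sample(units, sizes[name]))
    return combined

# Hypothetical strata: students grouped by class year
strata = {
    "freshman": [f"F{i}" for i in range(40)],
    "senior": [f"S{i}" for i in range(20)],
}
full_sample = stratified_sample(strata, {"freshman": 4, "senior": 2}, seed=3)
print(full_sample)
```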
Multistage Sampling Design
Selection for sampling is done in several stages, for example by first sampling large groups and then sampling units within the selected groups.
Capture-Recapture sampling design
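Used mainly in biological studies. Has 3 steps: 1) Tag the creatures caught in a first SRS of size n1. 2) Wait a period of time (washout period). 3) Take a second SRS of size n2 and count how many of its creatures were tagged in the first sample. Formula: n1/N = (tagged in sample 2)/n2, so N ≈ (n1 × n2)/(tagged in sample 2).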
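The proportion on this card can be solved for the population size N; that solution is commonly called the Lincoln-Petersen estimator. A small sketch follows, with made-up counts (50 tagged, 40 recaptured, 10 of them tagged) used purely as an example.

```python
def estimate_population(n1, n2, tagged_in_second):
    """Estimate population size N from n1/N = tagged_in_second/n2.

    n1: number tagged in the first sample
    n2: size of the second sample
    tagged_in_second: tagged creatures found in the second sample
    """
    if tagged_in_second == 0:
        raise ValueError("no tagged creatures recaptured; N cannot be estimated")
    return n1 * n2 / tagged_in_second

# Hypothetical counts
estimate = estimate_population(n1=50, n2=40, tagged_in_second=10)
print(estimate)  # 200.0
```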
Anecdotal evidence
Evidence based on individual cases or personal accounts, which may not be representative. Often provided by open-ended questions.
Observational Study
Observes individuals and measures variables of interest without imposing a treatment. A survey is one example.
Experiment
Deliberately imposes a treatment on individuals in order to observe their responses.
Experimental Units
The individuals on which the experiment is done. If they are human, they are known as subjects.
Confounding
(x and z => y) Two variables are ______ when their effects on a response variable cannot be distinguished from each other.
Factors
The explanatory variable(s) in an experiment
Levels
The different values that factors can have in experiments
Treatment
Experimental condition applied to the units
Response Variable
What is measured for each unit
Lurking Variables
A variable that is not amongst the explanatory or response variables in a study and yet may influence the interpretation of relationships among these variables.
Placebo effect
A favorable response to simply receiving attention or a dummy treatment, whether or not the treatment itself is actually helpful.
Lack of Realism
When studies do not realistically duplicate the conditions we really want to study. Examples include not having enough units in the sample and using unrelated subject matter in experimentation to draw conclusions on another subject.
Principles of Experimental Design
Control Group, Randomization, Replication
Completely randomized design
All experimental units are allocated at random among the different treatments.
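One way to carry out this allocation is to shuffle the units and deal them out to the treatments in turn. This is only a sketch; the unit labels and treatment names below are hypothetical.

```python
import random

def completely_randomized(units, treatments, seed=None):
    """Randomly allocate every unit to exactly one treatment.

    Units are shuffled, then dealt round-robin so group sizes
    differ by at most one.
    """
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    assignment = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        assignment[treatments[i % len(treatments)]].append(unit)
    return assignment

# Hypothetical experiment: 12 subjects, 3 treatments
subjects = [f"subject_{i}" for i in range(12)]
groups = completely_randomized(subjects, ["drug_A", "drug_B", "placebo"], seed=7)
for treatment, members in groups.items():
    print(treatment, members)
```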
Randomized block design
Units are separated into blocks based on similarities; random assignment to the treatments is then carried out separately within each block.
Matched Pairs design
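Compares two treatments using pairs of similar units, with the two treatments assigned at random within each pair (or with each unit receiving both treatments in random order). Helps eliminate bias.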
Causation
Only a carefully controlled experiment that follows the principles of experimental design can show this. Otherwise the study must show 1) a strong association between the variables, 2) a consistent association, 3) higher values of one variable associated with higher values of the other, 4) the alleged cause precedes the effect, and 5) the alleged cause is plausible. Causation is rarely a complete explanation of the association between the two variables.
Ethics of experiments involving humans
1) A review board is required to review the study 2) informed consent needs to be given by the participants 3) the results must remain confidential and cannot be linked back to individual subjects.
Ethics of experiments involving animals
1) Replace animal subjects if possible with computer models, cell cultures, microorganisms, or lower phyla species 2) reduce the number of animals necessary by implementing careful experimental design 3) eliminate or reduce any unnecessary pain or distress.
Inference
The process of drawing conclusions from data that are subject to random variation, for example, observational errors or sampling variation. The value of the measured statistic should be as close as possible to the value of the parameter, so aim for low variability and low bias.
Variability
The variability of a statistic is described by the spread of its sampling distribution. "Spread"
Reduce by using a larger sample size
Control Group
The group not receiving the treatment in an experiment.
Centerline
Represents the mean value in a control chart.
Upper Control Limit
= µ + 3 (σ/√n)
Lower Control Limit
= µ - 3 (σ/√n)
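A short sketch of the two limit formulas, with the centerline taken as µ. The process values used in the example (µ = 100, σ = 4, n = 4) are hypothetical.

```python
import math

def control_limits(mu, sigma, n):
    """Return (LCL, centerline, UCL) for samples of size n.

    Both limits sit 3 standard deviations of the sample mean,
    sigma / sqrt(n), away from the centerline mu.
    """
    margin = 3 * sigma / math.sqrt(n)
    return mu - margin, mu, mu + margin

# Hypothetical process: mean 100, standard deviation 4, samples of 4
lcl, center, ucl = control_limits(mu=100, sigma=4, n=4)
print(lcl, center, ucl)  # 94.0 100 106.0
```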
Out of control signals
9 consecutive points on the same side of the centerline OR one point outside the UCL or LCL.
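A sketch of checking both signals on a series of plotted points. It assumes the run rule means 9 consecutive points on the same side of the centerline; the function name and the example limits are illustrative.

```python
def out_of_control_signals(points, center, lcl, ucl, run_length=9):
    """Return a list of (index, reason) pairs for each signal found."""
    signals = []
    run = 0        # length of the current same-side run
    last_side = 0  # +1 above centerline, -1 below, 0 on it
    for i, x in enumerate(points):
        # Signal 1: a single point outside the control limits
        if x > ucl or x < lcl:
            signals.append((i, "outside limits"))
        # Signal 2: a long run on one side of the centerline
        side = 1 if x > center else -1 if x < center else 0
        if side != 0 and side == last_side:
            run += 1
        else:
            run = 1 if side != 0 else 0
        last_side = side
        if run >= run_length:
            signals.append((i, "run"))
    return signals

# Hypothetical chart with centerline 100, LCL 94, UCL 106
run_example = out_of_control_signals([101] * 9, 100, 94, 106)
spike_example = out_of_control_signals([100, 107, 99], 100, 94, 106)
print(run_example)    # [(8, 'run')]
print(spike_example)  # [(1, 'outside limits')]
```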
Empirical Rule
States that nearly all values lie within three standard deviations of the mean in a normal distribution.
68.27% of the values lie within one standard deviation of the mean. Similarly, 95.45% of the values lie within two standard deviations of the mean. Nearly all (99.73%) of the values lie within three standard deviations of the mean.
Also known as the 68-95-99.7 rule.
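The three percentages can be checked numerically. For a normal distribution, the probability of falling within k standard deviations of the mean equals erf(k/√2); the helper name below is just for illustration.

```python
import math

def fraction_within(k):
    """Probability that a normal value lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, round(100 * fraction_within(k), 2))
```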
Central Limit Theorem
States that, given certain conditions, the arithmetic mean of a sufficiently large number of iterates of independent random variables, each with a well-defined expected value and well-defined variance, will be approximately normally distributed.
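A small simulation can illustrate the theorem. This sketch assumes draws from a uniform distribution on [0, 1) (mean 0.5, variance 1/12); the sample size of 30, the 5000 repetitions, and the seed are arbitrary choices. If the sample means are roughly normal, about 95% of them should land within two standard deviations of 0.5.

```python
import math
import random
import statistics

def sample_means(n, reps, seed=0):
    """Return `reps` sample means, each computed from n uniform draws."""
    rng = random.Random(seed)
    return [statistics.fmean([rng.random() for _ in range(n)])
            for _ in range(reps)]

n, reps = 30, 5000
means = sample_means(n, reps)

# Theory: mean 0.5, standard deviation sqrt(1/12) / sqrt(n)
expected_sd = math.sqrt(1 / 12) / math.sqrt(n)
observed_mean = statistics.fmean(means)
observed_sd = statistics.stdev(means)
share_within_2sd = sum(abs(m - 0.5) <= 2 * expected_sd for m in means) / reps

print(observed_mean, observed_sd, expected_sd, share_within_2sd)
```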
Common Response
(z=>(x<->y))
Both x and y change in response to z.