Bias
A deviation from the truth in data collection
tendency to favor the selection of certain members of the population
ALWAYS check for bias when collecting data→ no recovery from a biased sampling method AFTER the sample is collected.
Selection bias
A type of bias that occurs when the sample is not representative of the population.
caused by the technique used for selecting the sample
Types
size bias
voluntary response bias
undercoverage bias (inadequate representation of certain groups)
often arises when using convenience samples or judgment samples (chosen on professional opinion only)
Convenience Bias
Choosing individuals who are easy to reach
Disadvantages:
Data collected tends to be highly unrepresentative of the entire population
human mistake
a careless human mistake
a real concern of researchers
size bias
in sampling: tendency for larger units to be more likely to be selected than smaller units, so the sample over-represents them
Sample Size
What matters most is not the fraction of the population sampled, but rather the actual sample size
“bad sample”
if there is enough bias → sample could be worthless, no matter how large
Proportional Sampling
EACH stratum’s (homogeneous grouping) sample size is DIRECTLY proportional to that stratum’s share of the ENTIRE population
a “ratio”
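A minimal sketch of the “ratio” idea, assuming a made-up school with four strata (the names, sizes, and the helper `proportional_allocation` are hypothetical, for illustration only). Each stratum gets (stratum size / population size) × total sample size.

```python
def proportional_allocation(stratum_sizes, total_sample_size):
    """Return a sample size for each stratum, proportional to its population share."""
    population_size = sum(stratum_sizes.values())
    return {
        name: round(total_sample_size * size / population_size)
        for name, size in stratum_sizes.items()
    }

# Hypothetical population of 1000 students split into 4 strata.
school = {"freshmen": 400, "sophomores": 300, "juniors": 200, "seniors": 100}
allocation = proportional_allocation(school, total_sample_size=50)
print(allocation)
# {'freshmen': 20, 'sophomores': 15, 'juniors': 10, 'seniors': 5}
```

With less tidy numbers the rounded sizes may not add up exactly to the total, so one stratum may need adjusting by hand.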
Convenience Bias
Based on choosing easy-to-reach individuals
Disadvantages:
tend to produce data unrepresentative of the ENTIRE population
voluntary response bias
Individuals CHOOSE whether they want to respond
Example
survey
Result:
Typically gives too much emphasis to those who hold STRONG opinions
undercoverage bias
inadequate representation of certain groups
Response bias
the tendency of participants to answer questions inaccurately
Based on participant response
Types:
non-response bias
questionnaire bias (misleading influence due to non-neutral wording)
incorrect response bias (very obvious what the right answer is)
Impact: distorts research findings, reduces validity of results
Minimize: careful wording, field test survey questions, use randomized response technique, ensure the anonymity of responses
non-response bias
bias that occurs when survey participants are unwilling or unable to respond to a survey question or an entire survey.
questionnaire bias
misleading influence due to non-neutral wording
incorrect response bias
very obvious what the right answer is
Sample Surveys
a poll: questions asked of a small group in hopes of learning something about the whole population
Advantages:
best when a random sample (not chosen by the creator nor volunteered) of people is surveyed.
Cheaper and quicker than experiments
Disadvantages:
tend to produce data unrepresentative of the ENTIRE population
usually subject to bias → very hard to conclude cause & effect, can only suggest relationships
Sampling frame
list of the population units from which the sample is drawn
could be the whole population of interest
Random Sampling
use of chance in selecting a sample from a population
necessary in order to be able to generalise findings to the population
Census
collecting data from the whole population
no inference procedures are required later
a poorly run census→ can provide less info and be less accurate than a well-designed survey
Population
entire group of individuals of interest; does not have to mean everyone in the world
Numerical summary= parameter
“population parameter”
Sample
part/subset of the population
Numerical summary= statistic
“sample statistic”
Sampling variation/error
*variability
natural variation BETWEEN samples
Different samples give different sample statistics as estimates of the same population parameter
can never be eliminated
smaller for LARGER sample size (n)
not a mistake, despite the name “error”
not caused or increased by bias
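A small simulation sketch of sampling variation, assuming a made-up population of 10,000 values (the population, the seed, and the helper `spread_of_sample_means` are all hypothetical). It draws many samples at two sample sizes and compares how much the sample means vary.

```python
import random
import statistics

def spread_of_sample_means(population, n, num_samples, rng):
    """Draw many SRSs of size n and return the standard deviation of their means."""
    means = [statistics.mean(rng.sample(population, n)) for _ in range(num_samples)]
    return statistics.stdev(means)

rng = random.Random(1)  # fixed seed so the run is repeatable
population = [rng.gauss(50, 10) for _ in range(10_000)]  # made-up population

small = spread_of_sample_means(population, n=10, num_samples=500, rng=rng)
large = spread_of_sample_means(population, n=250, num_samples=500, rng=rng)
print(f"spread of sample means with n=10:  {small:.2f}")
print(f"spread of sample means with n=250: {large:.2f}")
```

The spread never reaches zero, but it should be clearly smaller for the larger sample size.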
Randomize
unpredictable chance
Variability
how much a statistic (eg. the mean) VARIES from 1 sample to the next
Strata
homogeneous subgroup
strata must not overlap: each unit belongs to exactly ONE stratum
Good sample design
No personal bias/preference
avoids undercoverage, nonresponse, and response bias
the wording of the question matters
doesn’t reuse old data (date + time) as if it were current
Larger sample → more precise result (less sampling variation)
Simple Random Sample(SRS)
randomly choosing n units from the population
EVERY unit has the SAME probability of being chosen, and every possible sample of size n is equally likely
Requirements: a numbered list of every unit
example:
pick out of a hat
rolling a die
random number generator
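A minimal sketch of the random number generator approach, using Python’s standard `random` module. The list of 100 numbered students and the helper name `simple_random_sample` are assumptions for illustration.

```python
import random

def simple_random_sample(units, n, seed=None):
    """Choose n distinct units so that every group of n units is equally likely."""
    rng = random.Random(seed)
    return rng.sample(units, n)  # sampling WITHOUT replacement

# Hypothetical sampling frame: every student gets a number from 1 to 100.
students = [f"student_{i}" for i in range(1, 101)]
chosen = simple_random_sample(students, n=5, seed=42)
print(chosen)
```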
Stratification (Stratified random sample)
Steps
1. Units divided into strata (homogeneous groups)
2. Do an SRS on EACH grouping (from 1)
3. COMBINE to form a full sample (the stratified random sample)
Advantages:
Convenient, coverage, precision
helps with diversity
easier and more cost-effective than SRS
Notes
units in different strata can have unequal chances of selection
not an SRS overall, but uses an SRS within EACH stratum
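The stratification steps can be sketched as follows, assuming two made-up strata (juniors and seniors) and sample sizes picked by hand. The helper `stratified_sample` is hypothetical.

```python
import random

def stratified_sample(strata, n_per_stratum, seed=None):
    """Take an SRS inside EACH stratum, then combine the results."""
    rng = random.Random(seed)
    sample = []
    for name, units in strata.items():
        sample.extend(rng.sample(units, n_per_stratum[name]))  # SRS within one stratum
    return sample

# Hypothetical strata: 60 juniors and 40 seniors.
strata = {
    "juniors": [f"junior_{i}" for i in range(60)],
    "seniors": [f"senior_{i}" for i in range(40)],
}
sample = stratified_sample(strata, {"juniors": 6, "seniors": 4}, seed=0)
print(sample)
```

Every stratum is guaranteed to appear in the sample, which is what a plain SRS cannot promise.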
Cluster Sampling
1. Units divided into clusters (heterogeneous groups)
2. Do an SRS to choose grouping(s) (from 1)
3. Sample ALL the units in the chosen grouping(s) (from 2)
2 stage cluster sampling
1. Units divided into clusters (heterogeneous groups)
2. Do an SRS to choose grouping(s) (from 1)
3. A SECOND SRS within the chosen groupings (from 2) → choose an individual unit (or a few) in EACH grouping
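A sketch covering both one-stage and 2 stage cluster sampling, assuming five made-up homerooms of 10 students each. The helper `cluster_sample` and its parameter `units_per_cluster` are hypothetical: leaving it as `None` takes ALL units in each chosen cluster, and giving a number runs the second SRS.

```python
import random

def cluster_sample(clusters, num_clusters, units_per_cluster=None, seed=None):
    """Choose whole clusters by SRS, then take all units or a second SRS inside each."""
    rng = random.Random(seed)
    chosen_names = rng.sample(sorted(clusters), num_clusters)  # stage 1: SRS of clusters
    sample = []
    for name in chosen_names:
        units = clusters[name]
        if units_per_cluster is None:
            sample.extend(units)  # one-stage: every unit in the cluster
        else:
            sample.extend(rng.sample(units, units_per_cluster))  # stage 2: SRS within
    return chosen_names, sample

# Hypothetical clusters: 5 homerooms with 10 students each.
clusters = {
    f"homeroom_{c}": [f"homeroom_{c}_student_{i}" for i in range(10)]
    for c in "ABCDE"
}
names_one, one_stage = cluster_sample(clusters, num_clusters=2, seed=3)
names_two, two_stage = cluster_sample(clusters, num_clusters=2, units_per_cluster=3, seed=3)
print(len(one_stage), len(two_stage))  # 20 6
```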
Systematic Sample with random Start (1 in k)
random start, from ___ to ___, then every kth term
reasonable as long as the original order of the list is not related to the variable under consideration
Not a simple random sample (SRS): not every possible group of units can be chosen
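A minimal sketch of a 1 in k systematic sample, assuming a made-up list of 100 numbered units and k = 10. The helper `systematic_sample` is hypothetical.

```python
import random

def systematic_sample(units, k, seed=None):
    """Pick a random start among the first k units, then take every kth unit."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # random position 0..k-1 in the list
    return units[start::k]

# Hypothetical list of units numbered 1 to 100, sampled 1 in 10.
units = list(range(1, 101))
sample = systematic_sample(units, k=10, seed=5)
print(sample)
```

Only k different samples are possible here, one per starting point, which is why this is not an SRS.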
Cluster
heterogeneous groups
Should be representative of the population → EACH cluster should look similar
Depends on the amount of time and money
Observational study
Study where individuals are observed and SPECIFIC variables of interest are measured
No cause-and-effect relationship can be determined
No treatment imposed
Usually MORE cost and time-effective than an experiment
Retrospective
Looking backward, examining old data
Advantages:
tend to be smaller scale, quicker to complete, less expensive
Disadvantages:
Less control due to past record keeping(usually done by others)
subjects’ recall may be inaccurate
possible bias
Prospective
Watch for outcomes, tracking individuals into the future
Advantages:
less susceptible to recall errors from subjects
researchers do OWN record keeping→ can monitor SPECIFIC variables of interest
Disadvantages:
Expensive, time-consuming
follow a large number of subjects for a long time
Experiment
A research method in which one or more variables are manipulated to observe the effect on another variable.
Individuals are placed in particular treatments, then the response is measured
experiments ≠ observation
Examples
clinical trial, randomized comparative experiment
placebo, fake treatment
Statistically significant Experimental results
An observed effect/result too large to be plausibly explained by chance alone
Treatment Group
Group in which the independent variable is CHANGED to see how it affects the results
at least 2
*random assignment MUST be used to determine which experimental units go into which treatment group
Comparison Group
treated “SAME” as treatment group
Control Group
*placebo
No change in the independent variable
may not be necessary(use context)
Good Experimental Design
Control/comparison group → Best to establish a cause-and-effect relationship
Randomization → reduces confounding
Replication→ SAME treatment for DIFFERENT units
Notes
Should have an EQUAL chance of being assigned to a treatment and assigned in a random way
Completely Randomised Design(CRD)
Number units 1 to n
Group size = n/(number of treatments wanted)
Randomly assign the numbered units to treatment groups
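A sketch of a CRD, assuming 12 numbered units and three made-up treatments. It also assumes n divides evenly by the number of treatments (any leftover units would simply be left out). The helper `completely_randomized_design` is hypothetical.

```python
import random

def completely_randomized_design(units, treatments, seed=None):
    """Shuffle all units, then deal them into equal-sized treatment groups."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)  # random order decides who gets which treatment
    group_size = len(shuffled) // len(treatments)  # n / (number of treatments)
    return {
        treatment: shuffled[i * group_size:(i + 1) * group_size]
        for i, treatment in enumerate(treatments)
    }

# Hypothetical experiment: units numbered 1 to 12, three treatments.
units = list(range(1, 13))
groups = completely_randomized_design(units, ["drug", "placebo", "no treatment"], seed=11)
print(groups)
```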
Randomised Paired Comparison Design (matched pairs)
Place 2 homogeneous units in a pair
randomly decide who gets what treatment
A special case of blocking, where EACH pair is a “block”
reduces variability because pairs are already in homogeneous groups
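A sketch of the matched pairs assignment, assuming three made-up pairs of twins and treatments labelled "A" and "B". The helper `matched_pairs_assignment` is hypothetical.

```python
import random

def matched_pairs_assignment(pairs, treatments=("A", "B"), seed=None):
    """Within EACH pair, randomly decide which unit gets which treatment."""
    rng = random.Random(seed)
    assignment = {}
    for first, second in pairs:
        order = list(treatments)
        rng.shuffle(order)  # like a coin flip for this pair only
        assignment[first] = order[0]
        assignment[second] = order[1]
    return assignment

# Hypothetical pairs of similar (homogeneous) units.
pairs = [("twin_1a", "twin_1b"), ("twin_2a", "twin_2b"), ("twin_3a", "twin_3b")]
assignment = matched_pairs_assignment(pairs, seed=2)
print(assignment)
```

Each pair acts as its own block: both treatments always appear exactly once inside it.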
Randomised Block Design
block units based on similarity (homogeneity)
Do a CRD within EACH block
randomization occurs ONLY within each block (homogeneous experimental units)
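A sketch of a randomised block design, assuming two made-up blocks (8 men, 6 women) and two treatments, with block sizes that divide evenly. The helper `randomized_block_design` is hypothetical; it simply repeats the shuffle-and-split idea separately inside each block.

```python
import random

def randomized_block_design(blocks, treatments, seed=None):
    """Randomly assign treatments separately inside EACH block."""
    rng = random.Random(seed)
    design = {}
    for block_name, units in blocks.items():
        shuffled = list(units)
        rng.shuffle(shuffled)  # randomization happens only within this block
        size = len(shuffled) // len(treatments)
        design[block_name] = {
            treatment: shuffled[i * size:(i + 1) * size]
            for i, treatment in enumerate(treatments)
        }
    return design

# Hypothetical blocks formed on a variable thought to affect the response.
blocks = {
    "men": [f"man_{i}" for i in range(8)],
    "women": [f"woman_{i}" for i in range(6)],
}
design = randomized_block_design(blocks, ["new", "standard"], seed=4)
print(design)
```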
Randomized Comparative Design
Compare ONLY 1 or 2 treatments at a time → Controls lurking variables
Replicate experiment → reduce variation and ensure efficiency
Double Blind Study
neither the subjects nor those evaluating the responses know which treatment was received.
Treatments
a specific combination of factor levels that an experimental unit receives
Block what you can and randomize what you cannot.
Factors
explanatory variables that are manipulated (the “title” of each group), eg. height, length
Levels
the specific values or settings of a factor (the groups within each factor)
Experimental Unit
unit that treatment is being applied to
Confounding
causes confusion about what produced the response
uncertainty with regard to which variable causes an effect → variables are confounded → no proper conclusions
Impossible to SEPARATE the variables’ effects on the response
can reduce through random assignment, control groups
*variables considered in the study
Lurking variable
Doesn’t cause confusion in response
affects the outcome, as it drives 2 other variables into a mistaken cause-and-effect relationship
not included in the analysis
*variables not considered in the study
Random assignment(randomisation)
Best solution to reduce confounding because there are then only 2 main causes for a difference in response
Chance
Treatment itself
other sources of variation tend to balance out across groups
not SAME as listing out
subjects are randomly assigned to treatments to even out effects over which we have no control
*random assignment!
Blocking
Block on variables that have a STRONG association with the results
Grouping together homogeneous units
Decreases variation and the effect of lurking variables
make conclusions MORE specific by controlling certain variables and bringing them into the picture
allows us to clearly see a difference caused by treatments
especially when we can’t control certain variables
Random Digit Table
Simulation start: Random place on the table
Assign digits to correspond to things
record results and make a frequency distribution of the number of trials needed until success
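A sketch of this kind of simulation, with a random number generator standing in for reading digits off the table. The setup is an assumption for illustration: digits 0 and 1 count as “success” (a 20% chance), and the helper `trials_until_success` is hypothetical.

```python
import random
from collections import Counter

def trials_until_success(success_digits, rng):
    """Read one random digit at a time and count trials until a success digit appears."""
    trials = 0
    while True:
        trials += 1
        digit = rng.randrange(10)  # stands in for the next digit in the table
        if digit in success_digits:
            return trials

rng = random.Random(7)
# Assumed assignment of digits: 0-1 = success, 2-9 = failure.
results = [trials_until_success({0, 1}, rng) for _ in range(1000)]
frequency = Counter(results)  # frequency distribution of trials needed

for trials_needed in sorted(frequency)[:5]:
    print(trials_needed, frequency[trials_needed])
```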
Self selection
similar to voluntary response, but here individuals choose for themselves what to do (eg. which group to join); it does not have to involve a survey
Examining and goal of Sampling vs Experiments
Sample: Examining the population and its response→ Describe the characteristics of the population
Experiments: Examining the treatment response → Different treatments lead to different responses
Randomisation in Sampling vs Experiments
Sample: Take an SRS from the population
Experiments: Reduce the likelihood of a confounding variable by randomly assigning treatments to available units
Controlling variation in Sampling vs Experiments
Sample: Variation can be controlled through stratification (combining SRSs of strata (homogeneous groups))
Experiments: Variation can be controlled by blocking based on homogeneous units
Threats to inference in Sampling vs Experiments
Sample: 2 biases: selection bias, response bias
Experiments: Confounding variables
Placebo Effect
Many people respond to ANY kind of perceived treatment
best way to minimize the placebo effect
blinding and control groups
Blinding
Subjects don’t know which treatment they’ve received
sometimes it can be impossible in an experiment
eg. people are told not to do something that those in the other treatment group are doing
Double Blinding
BOTH subjects and those evaluating responses don’t know which treatment was received.