Power, Type 1/2 errors and understanding biases and limitations of studies

Power

Parametric tests are said to have more power than non-parametric tests

Power: the likelihood of the test detecting a significant difference when the null hypothesis is false

Several things affect the power of tests:

Type of test - parametric tests are more sensitive
Making more accurate measurements - a tight procedure and a clearly defined and measured dependent variable
Having a one-tailed hypothesis - lowers the critical value required for an equivalent level of significance
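
The effect of two of these factors can be illustrated with a small simulation. This is only a sketch under stated assumptions: it uses a one-sample z-test with a known standard deviation and a null mean of 0, and the function name estimate_power, the effect size of 0.5, the sample size of 20 and the number of trials are all illustrative choices, not values from these notes. It does not compare parametric with non-parametric tests.

```python
import random
from statistics import NormalDist, mean

def estimate_power(effect, n, sd=1.0, alpha=0.05, one_tailed=False,
                   trials=2000, seed=1):
    """Share of simulated studies in which a z-test rejects a false null."""
    rng = random.Random(seed)
    # One-tailed: all of alpha sits in one tail, so the critical z is lower.
    tail_area = alpha if one_tailed else alpha / 2
    critical_z = NormalDist().inv_cdf(1 - tail_area)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(effect, sd) for _ in range(n)]
        # z statistic against a null mean of 0 with known sd (assumed).
        z = mean(sample) / (sd / n ** 0.5)
        if one_tailed:
            rejections += z > critical_z
        else:
            rejections += abs(z) > critical_z
    return rejections / trials

power_two_tailed = estimate_power(effect=0.5, n=20)
power_one_tailed = estimate_power(effect=0.5, n=20, one_tailed=True)
power_less_noise = estimate_power(effect=0.5, n=20, sd=0.7)  # tighter measurement
print(power_two_tailed, power_one_tailed, power_less_noise)
```

In this set-up the one-tailed test and the less noisy measurement should both reject the false null more often than the two-tailed baseline.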

Probability and Significance

probability of events occurring is measured on a scale of 0 to 1

logical probability: the number of ways our predicted outcome can happen divided by the total number of possible outcomes

empirical probability: the number of relevant events which have happened divided by the total number of relevant events
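
As a rough illustration of the two definitions, the sketch below uses the chance of rolling an even number on a six-sided die. The die example, the 10,000 simulated rolls and the variable names are assumptions made for illustration only.

```python
import random
from fractions import Fraction

outcomes = [1, 2, 3, 4, 5, 6]
favourable = [face for face in outcomes if face % 2 == 0]

# Logical probability: ways the predicted outcome can happen / possible outcomes.
logical = Fraction(len(favourable), len(outcomes))

# Empirical probability: relevant events that happened / total events observed.
rng = random.Random(0)
rolls = [rng.choice(outcomes) for _ in range(10_000)]
empirical = sum(roll % 2 == 0 for roll in rolls) / len(rolls)

print(logical, empirical)
```

The logical value is fixed by counting, while the empirical value depends on the rolls actually observed and should settle near the logical value as the number of rolls grows.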

differences/correlations need to be submitted to a test of significance in order for a decision to be made concerning whether the differences are to be counted as showing a genuine effect or dismissed as likely to represent chance fluctuation

reject the null hypothesis when the probability of obtaining results at least this extreme, assuming the null hypothesis is true, drops below 0.05: the 5% significance level

Type 1 error = false positive: null hypothesis is true but has been rejected because p<0.05

Type 2 error = false negative: null hypothesis has been retained because p>0.05 but there is a real underlying effect
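
The two error types can be seen by simulating many studies where the truth is known. This is a hedged sketch, not a standard routine: it assumes a two-tailed z-test with known standard deviation, and the helper names p_value_two_tailed and rejection_rate, the sample size of 25 and the true effect of 0.4 are illustrative assumptions.

```python
import random
from statistics import NormalDist, mean

def p_value_two_tailed(sample, null_mean=0.0, sd=1.0):
    """Two-tailed p value for a z-test with known sd (assumed)."""
    z = (mean(sample) - null_mean) / (sd / len(sample) ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))

def rejection_rate(true_mean, n=25, alpha=0.05, trials=4000, seed=2):
    """Share of simulated studies in which the null (mean = 0) is rejected."""
    rng = random.Random(seed)
    rejected = sum(
        p_value_two_tailed([rng.gauss(true_mean, 1.0) for _ in range(n)]) < alpha
        for _ in range(trials)
    )
    return rejected / trials

# Null is true: every rejection is a false positive (Type 1 error).
type1_rate = rejection_rate(true_mean=0.0)
# Null is false: every retention is a false negative (Type 2 error).
type2_rate = 1 - rejection_rate(true_mean=0.4)
print(type1_rate, type2_rate)
```

With the 5% significance level, the Type 1 rate should sit close to 0.05, while the Type 2 rate depends on the size of the real effect and the sample size.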

Directional hypothesis = one-tailed test of probability

Non-directional hypothesis = two-tailed test of probability

results tested with a one-tailed test are more likely to reach significance, but if the effect is in the opposite direction to that predicted, even if it passes the critical value, the null hypothesis must be retained
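
The decision rule for the two kinds of test can be sketched as follows. The function reject_null is a hypothetical helper, and it assumes a z statistic where the directional hypothesis predicts a positive value.

```python
from statistics import NormalDist

def reject_null(z, alpha=0.05, one_tailed=False):
    """Decision for a z statistic; the one-tailed test predicts a positive z."""
    std_normal = NormalDist()
    if one_tailed:
        # All of alpha in the predicted tail: critical z is about 1.645.
        return z > std_normal.inv_cdf(1 - alpha)
    # alpha split across both tails: critical z is about 1.96.
    return abs(z) > std_normal.inv_cdf(1 - alpha / 2)

print(reject_null(1.8, one_tailed=True))   # past the one-tailed critical value
print(reject_null(1.8))                    # short of the two-tailed critical value
print(reject_null(-2.5, one_tailed=True))  # large effect, but in the wrong direction
print(reject_null(-2.5))                   # two-tailed test accepts either direction
```

A result of z = 1.8 is significant only under the one-tailed test, whereas z = -2.5 is significant only under the two-tailed test because it runs against the predicted direction.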

probability distribution: histogram with columns measuring the likelihood of occurrence of the event they represent.
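
A small example of such a distribution, assuming the event of interest is the number of heads in four tosses of a fair coin (an illustrative choice, as is the function name binomial_distribution), printed as a text histogram:

```python
from math import comb

def binomial_distribution(n, p=0.5):
    """Probability of each possible number of successes in n trials."""
    return {k: comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)}

distribution = binomial_distribution(4)
for heads, probability in distribution.items():
    # One column per event; its length reflects the event's likelihood.
    print(f"{heads} heads | {'#' * round(probability * 32)} {probability:.4f}")
```

The columns together account for every possible outcome, so their probabilities sum to 1.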

Understanding biases and limitations of Studies

reliability: a measure's consistency in producing similar results on different but comparable occasions
validity: whether a measure is really measuring what it was intended to measure
internal validity: whether an effect was genuine or the result of incorrectly applied statistics, sampling biases or extraneous variables unconnected with the IV
external validity: whether an effect generalises from the specific people, place and measures of variables tested to the population, other populations, other places and other measures of variables tested.
population validity: whether an effect can be generalised to all other people in that population or to other populations
ecological validity: whether an effect can be generalised to other settings
construct validity: does your measure of a concept really reflect the breadth of that concept?
standardised procedures reduce variance in people’s performances, exclude bias from different treatment of groups and make replication possible
meta-analysis: statistical review of many tests of the same hypothesis in order to establish the extent of valid replication and to produce objective reviews of results in topic areas

Threats to internal validity

using a low-power statistical test: different tests vary in their sensitivity to detect a difference
violating assumptions of the statistical test used: tests should not be used if the data don't fit the assumptions
capitalising on chance: multiple testing of the same data gives a higher chance of getting a fluke significant result
reliability of measures
reliability of procedures
random errors in the research setting
participant variance
history: events which happen to participants during the research which affect results but aren't linked to the IV
maturation: participants mature during the study (e.g. child development studies)
testing: practice effects or recalling mistakes from earlier testing
selection bias
drop out
imitation of treatment: control participants may know what the treatment groups are doing
rivalry of control group: control participants may resent the treatment or want to do as well as the treatment group
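
The "capitalising on chance" threat in the list above can be put into numbers. The sketch below assumes the tests are independent and each uses the 5% significance level; under that assumption the chance of at least one fluke significant result across k tests is 1 - (1 - alpha)^k. The function name familywise_error_rate is illustrative.

```python
def familywise_error_rate(num_tests, alpha=0.05):
    """Chance of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** num_tests

for num_tests in (1, 5, 10, 20):
    print(num_tests, round(familywise_error_rate(num_tests), 3))
```

Under these assumptions, ten tests on the same data already carry roughly a 40% chance of at least one fluke result.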

Threats to external validity

construct validity
inadequate variable definition: to what extent are the measures used adequately defined
mono-method bias: construct validity is improved by taking a variety of measures of the same concept
hypothesis guessing: treatment participants guess what is required of them during the study
evaluation apprehension: hypothesis guessing may lead to trying to please the experimenter
experimenter expectancy
level of independent variable: the levels may not be far enough apart (e.g. 30s and 40s vs 30s and 1m)
ecological validity

Relative Values of Quantitative vs Qualitative Studies

Quantitative	Qualitative
Information is objective and narrow	Information is subjective and rich
high internal validity	low internal validity
artificial setting	realistic/naturalistic setting
structured design	unstructured design
low realism	high realism
low construct validity	high construct validity
high reliability	low reliability

Sampling

Types of Groups

control group: group which is used as a baseline measure against which the performance of the intervention group is assessed

experiment/treatment group: group who receive values of the IV in an experiment or quasi-experiment

placebo group: group who don't receive the treatment but receive everything else the experimental group receive, and who are sometimes led to believe their treatment will have an effect

Types of Sampling

cluster sampling: sample selected from a specific area as being representative of a population

opportunity sampling: sample selected because they are easily available for testing

systematic sampling: sample selected by taking every nth case

quota sampling: sample selected so that specified groups will appear in numbers proportional to their size in the target population

random sampling: sample selected in which every member of the target population has an equal chance of being selected

self-selecting sampling: sample selected for study on the basis of their own action in arriving at the sampling point

snowball sampling: sample selected for study by asking key figures for people they think will be important or useful to include

stratified sampling: sample selected so that specified groups will appear in numbers proportional to their size in the target population; within each subgroup, cases are selected on a random basis
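
Three of the sampling methods above (random, systematic and stratified) lend themselves to a short sketch. Everything here is illustrative: the function names, the made-up population of 60 students and 40 staff, and the sample size of 10 are assumptions, not part of any particular study.

```python
import random

def simple_random_sample(population, k, rng):
    """Every member has an equal chance of being selected."""
    return rng.sample(population, k)

def systematic_sample(population, step, start=0):
    """Take every nth case, beginning at position `start`."""
    return population[start::step]

def stratified_sample(population, stratum_of, k, rng):
    """Proportional share from each subgroup, chosen at random within it."""
    strata = {}
    for member in population:
        strata.setdefault(stratum_of(member), []).append(member)
    sample = []
    for members in strata.values():
        # Rounding can make the total differ slightly from k in general.
        share = round(k * len(members) / len(population))
        sample.extend(rng.sample(members, share))
    return sample

rng = random.Random(3)
population = [("student", i) for i in range(60)] + [("staff", i) for i in range(40)]

random_pick = simple_random_sample(population, 10, rng)
systematic_pick = systematic_sample(population, step=10)
stratified_pick = stratified_sample(population, lambda member: member[0], 10, rng)
print(random_pick, systematic_pick, stratified_pick, sep="\n")
```

With this population the stratified sample contains 6 students and 4 staff, matching the 60:40 split. By the definitions above, a quota sample would aim for the same proportions without requiring random selection within each group.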