Power, Type 1/2 errors and understanding biases and limitations of studies

Power

Parametric tests are said to have more power than their non-parametric equivalents

Power: the likelihood of the test detecting a significant difference when the null hypothesis is false

Several things affect the power of tests:

  • Type of test - parametric tests are more sensitive
  • Making more accurate measurements - a tight procedure and a clearly defined and measured dependent variable
  • Having a one-tailed hypothesis - lowers the critical value required for an equivalent level of significance
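
The effect of a one-tailed hypothesis on the critical value, and so on power, can be sketched numerically. This is a minimal illustration only: it assumes a one-sample z-test with a known standard deviation, and the effect size (0.4 SD) and sample size (25) are made-up values, not figures from these notes.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution: mean 0, SD 1


def critical_z(alpha: float, one_tailed: bool) -> float:
    """Critical z value for the given significance level."""
    tail_area = alpha if one_tailed else alpha / 2
    return STD_NORMAL.inv_cdf(1 - tail_area)


def power_z_test(effect_size: float, n: int, alpha: float = 0.05,
                 one_tailed: bool = False) -> float:
    """Power of a one-sample z-test when the true effect is effect_size SDs."""
    shift = effect_size * n ** 0.5              # expected z score under the real effect
    crit = critical_z(alpha, one_tailed)
    power = 1 - STD_NORMAL.cdf(crit - shift)    # upper rejection region
    if not one_tailed:
        power += STD_NORMAL.cdf(-crit - shift)  # lower rejection region
    return power


print(round(critical_z(0.05, one_tailed=True), 3))        # about 1.645
print(round(critical_z(0.05, one_tailed=False), 3))       # about 1.960
print(round(power_z_test(0.4, 25, one_tailed=True), 2))   # about 0.64
print(round(power_z_test(0.4, 25, one_tailed=False), 2))  # about 0.52
```

With the same data and the same 5% level, the one-tailed test needs a smaller z score to reach significance, so it detects the assumed effect more often.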

Probability and Significance

probability of events occurring is measured on a scale of 0 to 1

logical probability: ratio of the number of ways our predicted outcome can happen divided by the number of possible outcomes

empirical probability: ratio of the number of relevant events which have happened divided by the total number of relevant events
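
The two definitions can be compared with a small sketch. The example of rolling a six on a fair die, the 6000 simulated rolls and the seed are all assumptions chosen for illustration.

```python
import random
from fractions import Fraction

# Logical probability: count the ways the predicted outcome can happen.
faces = [1, 2, 3, 4, 5, 6]
predicted = [face for face in faces if face == 6]
logical_p = Fraction(len(predicted), len(faces))

# Empirical probability: count how often the outcome actually happened.
rng = random.Random(42)                  # fixed seed so the run is repeatable
rolls = [rng.choice(faces) for _ in range(6000)]
empirical_p = rolls.count(6) / len(rolls)

print(logical_p)     # 1/6
print(empirical_p)   # close to 0.167, but not exactly equal to it
```

The logical value is fixed by reasoning about the die, while the empirical value depends on what was observed and only approaches 1/6 as more rolls are recorded.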

differences/correlations need to be submitted to a test of significance in order for a decision to be made about whether they are to be counted as showing a genuine effect or dismissed as likely to represent chance fluctuation

reject the null hypothesis when the probability of obtaining a result at least this extreme, if the null hypothesis were true, drops below 0.05: the 5% significance level

Type 1 error = false positive: null hypothesis is true but has been rejected because p<0.05

Type 2 error = false negative: null hypothesis has been retained because p>0.05 but there is a real underlying effect
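
Both error types can be made concrete with a small simulation. This is a sketch under stated assumptions: a two-tailed one-sample z-test with known SD of 1, samples of 25, 2000 simulated studies, and a "real effect" of 0.5 SD. None of these numbers come from the notes.

```python
import random
from statistics import NormalDist, mean

STD_NORMAL = NormalDist()


def p_value_two_tailed(sample, null_mean=0.0, sd=1.0):
    """Two-tailed p value of a one-sample z-test with known SD."""
    z = (mean(sample) - null_mean) / (sd / len(sample) ** 0.5)
    return 2 * (1 - STD_NORMAL.cdf(abs(z)))


def rejection_rate(true_mean, n=25, trials=2000, alpha=0.05, seed=1):
    """Proportion of simulated studies in which the null (mean = 0) is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        if p_value_two_tailed(sample) < alpha:
            rejections += 1
    return rejections / trials


# Null hypothesis true: every rejection is a false positive (Type 1 error).
type_1_rate = rejection_rate(true_mean=0.0)

# Real effect present: every retention is a false negative (Type 2 error).
type_2_rate = 1 - rejection_rate(true_mean=0.5)

print(type_1_rate)   # close to the significance level, 0.05
print(type_2_rate)   # 1 minus the power of the test for this effect
```

The Type 1 rate is set by the significance level, whereas the Type 2 rate depends on the size of the real effect, the sample size and the test used.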

Directional hypothesis = one-tailed test of probability

Non-directional hypothesis = two-tailed test of probability

results tested with a one-tailed test are more likely to reach significance, but if the result is in the opposite direction to that predicted, even if it is past the critical value, the null hypothesis must be retained
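
The trade-off can be illustrated with p values from a z score. This is a rough sketch: the z scores 1.8 and -2.5 are invented, and the directional hypothesis is assumed to predict a positive z.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def p_one_tailed(z: float) -> float:
    """p value when the hypothesis predicts a positive z."""
    return 1 - STD_NORMAL.cdf(z)


def p_two_tailed(z: float) -> float:
    """p value when no direction is predicted."""
    return 2 * (1 - STD_NORMAL.cdf(abs(z)))


# Result in the predicted direction.
print(round(p_one_tailed(1.8), 3))    # 0.036, significant at the 5% level
print(round(p_two_tailed(1.8), 3))    # 0.072, not significant

# Large result in the opposite direction to that predicted.
print(round(p_one_tailed(-2.5), 3))   # 0.994, null hypothesis retained
print(round(p_two_tailed(-2.5), 3))   # 0.012, significant
```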

probability distribution: histogram with columns measuring the likelihood of occurrence of the event they represent.
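
As an illustration, the distribution for the number of heads in ten tosses of a fair coin can be printed as a rough text histogram. The coin example is an assumption chosen for simplicity, and each column is drawn sideways as a row of '#' characters.

```python
from math import comb

n_tosses = 10   # assumed example: ten tosses of a fair coin
p_head = 0.5

# Probability of each possible number of heads (binomial distribution).
distribution = {
    k: comb(n_tosses, k) * p_head ** k * (1 - p_head) ** (n_tosses - k)
    for k in range(n_tosses + 1)
}

# Text histogram: bar length is proportional to the probability of the outcome.
for heads, prob in distribution.items():
    print(f"{heads:2d} {'#' * round(prob * 100):<25} {prob:.3f}")
```

The probabilities across all outcomes sum to 1, and the extreme outcomes (0 or 10 heads) sit in the thin tails of the distribution.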

 

Understanding biases and limitations of Studies

  • reliability: a measure's consistency in producing similar results on different but comparable occasions
  • validity: whether a measure is really measuring what it was intended to measure
  • internal validity: whether an effect was genuine or the result of incorrectly applied statistics, sampling biases or extraneous variables unconnected with the independent variable (IV)
  • external validity: whether an effect generalises from the specific people, place and measures of variables tested to the population, other populations, other places and other measures of variables tested.
  • population validity: can it be generalised to all other people in that population/other populations
  • ecological validity: can it be generalised to other settings
  • construct validity: does your measure of a concept really reflect the breadth of that concept?
  • standardised procedures reduce variance in people’s performances, exclude bias from different treatment of groups and make replication possible
  • meta-analysis: statistical review of many tests of the same hypothesis in order to establish the extent of valid replication and to produce objective reviews of results in topic areas

Threats to internal validity

  • using a low power statistical test: different tests have varying sensitivity to detect differences
  • violating assumptions of the statistical test used: tests should not be used if the data don't fit the assumptions
  • capitalising on chance: multiple testing of the same data gives a higher chance of getting a fluke significant result
  • reliability of measures
  • reliability of procedures
  • random errors in the research setting
  • participant variance
  • history: events which happen to participants during the research which affect results but aren't linked to the IV
  • maturation: participants mature during the study (e.g. child development studies)
  • testing: practice effects or recalling mistakes
  • selection bias
  • drop out
  • imitation of treatment: control participants may know what the treatment groups are doing
  • rivalry of control group: control participants may resent the treatment or want to do as well as the treatment group

Threats to external validity

  • construct validity
  • inadequate variable definition: to what extent are the measures used adequately defined
  • mono-method bias: construct validity is improved by taking a variety of measures of the same concept
  • hypothesis guessing: treatment participants guess what is required of them during the study
  • evaluation apprehension: hypothesis guessing may lead to trying to please the experimenter
  • experimenter expectancy
  • levels of the independent variable: may not be far enough apart (e.g. 30s and 40s vs 30s and 1m)
  • ecological validity

Relative Values of Quantitative vs Qualitative Studies

Quantitative                          | Qualitative
Information is objective and narrow   | Information is subjective and rich
high internal validity                | low internal validity
artificial setting                    | realistic/naturalistic setting
structured design                     | unstructured design
low realism                           | high realism
low construct validity                | high construct validity
high reliability                      | low reliability

Sampling

Types of Groups

control group: group which is used as a baseline measure against which the performance of the intervention group is assessed

experimental/treatment group: group who receive values of the IV in an experiment or quasi-experiment

placebo group: group who don't receive the treatment but receive everything else the experimental group receives, and who are sometimes led to believe their treatment will have an effect

Types of Sampling

cluster sampling: sample selected from a specific area as being representative of a population

opportunity sampling: sample selected because they are easily available for testing

systematic sampling: sample selected by taking every nth case

quota sampling: sample selected so that specified groups will appear in numbers proportional to their size in the target population; selection within each group is not random

random sampling: sample selected in which every member of the target population has an equal chance of being selected

self-selecting sampling: sample selected for study on the basis of their own action in arriving at the sampling point

snowball sampling: sample selected for study by asking key figures for people they think will be important or useful to include

stratified sampling: sample selected so that specified groups will appear in numbers proportional to their size in the target population; within each subgroup, cases are selected on a random basis
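
Three of the sampling methods above can be sketched in code. This is a minimal illustration only: the population of 60 first-year and 40 second-year students, the sample size of 10 and the function name stratified_sample are all hypothetical.

```python
import random

rng = random.Random(7)   # fixed seed so the example is repeatable

# Hypothetical target population: 60 first-year and 40 second-year students.
population = [("first", i) for i in range(60)] + [("second", i) for i in range(40)]

# Random sampling: every member has an equal chance of being selected.
random_sample = rng.sample(population, 10)

# Systematic sampling: every nth case, starting from a random position.
n = 10
start = rng.randrange(n)
systematic_sample = population[start::n]


def stratified_sample(pop, sample_size, rng):
    """Random selection within each subgroup, in proportion to subgroup size."""
    strata = {}
    for member in pop:
        strata.setdefault(member[0], []).append(member)
    sample = []
    for members in strata.values():
        quota = round(sample_size * len(members) / len(pop))
        sample.extend(rng.sample(members, quota))
    return sample


strat = stratified_sample(population, 10, rng)
print(sum(1 for group, _ in strat if group == "first"))    # 6
print(sum(1 for group, _ in strat if group == "second"))   # 4
```

The rounding of each quota is a simplification: with awkward group sizes the quotas may not add up exactly to the requested sample size. A quota sample would fill the same 6 and 4 places without the random selection inside each group.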