Reliability vs validity
Reliable = the measure produces the same pattern of results across repeated testing vs Valid = the measure actually captures what it is meant to measure (e.g. how well it correlates with another measure of the same concept). Reliability is necessary but not sufficient for validity
Different reliabilities
Test-retest reliability = consistent scores every time
Inter-rater reliability = consistent scores regardless of who measures the variables
Internal reliability = consistent scores no matter how the question is phrased (across the items of a scale); assessed using Cronbach's alpha (based on the average inter-item correlation and the number of items)
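A minimal sketch of how Cronbach's alpha can be computed, assuming the common variance-based formula alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). The function name `cronbach_alpha` and the response data are made up for illustration.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """item_scores: one row per participant, one column per scale item."""
    k = len(item_scores[0])                      # number of items
    columns = list(zip(*item_scores))            # scores grouped by item
    sum_item_vars = sum(variance(col) for col in columns)
    total_var = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum_item_vars / total_var)

# Hypothetical data: 5 participants answering 3 items on a 1-5 scale
responses = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 4, 5],
    [1, 2, 1],
    [3, 3, 3],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```

Values closer to 1 mean the items hang together more consistently.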
Operationalisation
Ways to define/measure variables, including a definition of the concept, what scope of the concept will be measured, and how (this can involve using past research)
Covariance
Manipulating/creating a difference (e.g. big vs small bowl) AND observing difference in DV/outcome (e.g. amount of pasta eaten)
Need to establish groups/levels of IV e.g. control/placebo or comparison group, so you can observe the difference caused by the IV on the DV
Systematic vs. Unsystematic Variation
Systematic = a confound e.g. all hungry people end up in the large bowl pasta group. Random assignment usually prevents this; however, it can still happen through bad luck in a small sample
Unsystematic variation = not a confound; natural variability within groups
Independent-Groups Design (between-subjects)
Different people at each level of IV
Pro: no order effects. Con: least effective for controlling participant-related variables (selection effects) + need more participants
Pre-test and post-test is not within groups because participants only see one level of IV
Within-Groups Design
Same people go through each level of IV
Concurrent-measures: exposed to all levels at the same time, then give an attitude/behaviour preference
Repeated-measures: measured on DV after exposure to each level of IV
Threats to Internal Validity
Design confounds (another variable varies systematically alongside the IV), selection effects (participants at one level/group of the IV differ systematically from those in other groups; can only occur in independent-groups designs, so fix through random assignment/matching), and order effects (responses are systematically affected by earlier ones; only occur in within-groups designs, so fix through counterbalancing)
Design Confounds
Another variable varies systematically along with the IV (correct by controlling variables) e.g. people who take notes on laptops answer harder questions than people who take handwritten notes
Selection Effects
Systematically different types of participants are in two groups which occurs in independent-groups designs only e.g. when participants are involved in choosing groups or when people are actively seeking out treatment (correct through random assignment or matching) E.g. in study on autism some parents insisted their children should be in intensive treatment group
Order effect
Later responses are systematically affected by earlier ones, which occurs in within-groups designs only e.g. fatigue and practice (correct by counterbalancing) E.g. chocolate tasted more delicious the first time compared to the second time
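A rough sketch of full counterbalancing, assuming you want every possible order of the conditions used equally often. The function name `full_counterbalance`, the participant labels and the condition names are hypothetical.

```python
from itertools import permutations

def full_counterbalance(conditions, participants):
    """Cycle through every possible condition order, one order per participant."""
    orders = list(permutations(conditions))
    return {p: orders[i % len(orders)] for i, p in enumerate(participants)}

# Hypothetical two-condition taste study with four participants
assignment = full_counterbalance(
    ["chocolate A", "chocolate B"],
    ["P1", "P2", "P3", "P4"],
)
for participant, order in assignment.items():
    print(participant, order)
```

With two conditions there are only two orders, so half the participants get A first and half get B first; any order effect is then spread evenly across conditions.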
Maturation
Spontaneous change in behaviour over time (correct by adding control group) e.g. disruptive boys settle down as they get used to camp
History
External event happens to everyone in the treatment group during treatment (correct by adding control group) e.g. dorm residents used less air conditioning because the weather got cooler
Regression to the mean
If results are extreme at time 1 they are less likely to be extreme at time 2, as scores move closer to the true mean (correct by adding a control group) e.g. pre-test depression average is extreme because when participants volunteered for the study they were feeling more depressed than usual
Demand characteristics
Participants guess what the study is about (correct by blind/double blind study) e.g. campers guess that low sugar diet is supposed to make them calmer
Factorial designs
Experimental designs with more than one IV, used to test whether the effect of one factor depends on the level of another factor. This lets you test limits (e.g. how much does the effect of A depend on the level of B?) and test theories. A main effect is the overall effect of one IV, averaging over the levels of the other IV; it is NOT the most important effect. The interaction is when the pattern of results at one level of an IV differs from the pattern at the other level(s) of that IV (a difference in differences); it is the MOST important effect
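A small sketch of reading main effects and the interaction off a 2×2 table of cell means. The factor names (A, B) and every number here are invented for illustration.

```python
# Hypothetical cell means for a 2x2 design: keys are (level of A, level of B)
cell_means = {
    ("a1", "b1"): 10, ("a2", "b1"): 12,
    ("a1", "b2"): 10, ("a2", "b2"): 20,
}

def mean(values):
    return sum(values) / len(values)

# Main effect of A: average a2 cells minus average a1 cells (ignoring B)
main_effect_a = (mean([cell_means[("a2", "b1")], cell_means[("a2", "b2")]])
                 - mean([cell_means[("a1", "b1")], cell_means[("a1", "b2")]]))

# Main effect of B: average b2 cells minus average b1 cells (ignoring A)
main_effect_b = (mean([cell_means[("a1", "b2")], cell_means[("a2", "b2")]])
                 - mean([cell_means[("a1", "b1")], cell_means[("a2", "b1")]]))

# Interaction: effect of A at b2 minus effect of A at b1 (difference in differences)
effect_a_at_b1 = cell_means[("a2", "b1")] - cell_means[("a1", "b1")]
effect_a_at_b2 = cell_means[("a2", "b2")] - cell_means[("a1", "b2")]
interaction = effect_a_at_b2 - effect_a_at_b1

print(main_effect_a, main_effect_b, interaction)
```

Here the effect of A is 2 points at b1 but 10 points at b2, so the size of A's effect depends on B. Whether these differences are statistically significant is a separate question that needs the raw data.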
Deception and debriefing
Deception and not fully informed consent aren't always unethical as long as benefits outweigh risks: without deception, participants may change how they act and skew the results (we want results that occur NATURALLY); however you have to use the LEAST amount of deception needed
If people are upset by the deception they can decide to withdraw their data (which can skew results but it is a rule). HOWEVER, in some studies you can't always debrief in full straight away, because debriefing can affect how participants act in other similar studies, or a participant could tell friends about the study, which could affect how those friends act
Debriefing is required for any deception study and almost all UoN studies
Quasi-experimental designs
Lack random assignment/full experimental control; often used when random assignment is unethical or would negatively affect the experiment. Groups are pre-defined e.g. by age, which gives lower internal validity (meaning more data checking) but possibly high external validity.
Validity in small-N designs
Small-N designs are designs with very few participants, which often sacrifice external validity to improve internal validity. Include stable-baseline, multiple-baseline, and reversal designs.
Avoiding bias in behavioural observers
Blind observers: don't know condition they're observing; (even with bias it will randomly affect control/treatment = unsystematic variation and not bad)
Explicit coding: clear instructions/criteria and high detail definitions to reduce grey areas e.g. smile must go above this line and go for 3 seconds
Positive general, positive specific, negative, neutral, corrective/instructional
Allows us to accurately count stuff
Distraction in car study: combine eye tracking, self-report and behavioural observations to code eye gaze/focus; you then further code
Use multiple observers: must be independent and used to check inter-rater reliability
Representative sampling (probability sampling)
Simple random sampling: everyone in population has equal chance of being selected (often impractical)
Cluster sampling: population is divided into clusters/groups e.g. schools; you randomly sample the schools and include everyone in the selected schools instead of randomly sampling each student, which gives a comparable result for cheaper
Multistage sampling: Two random samples are collected = stage 1- random sample of clusters selected; stage 2- from selected clusters a random sample of people are chosen e.g. select 10 schools and randomly select 15 children from each school (cluster + simple random sampling)
Stratified random sampling: multistage technique where researcher selects specific demographic categories and randomly samples individuals from each category e.g. obtain information on age demographic in population and ensure sample reflects that
helps ensure your sample's demographics match the population's, especially in smaller samples
Oversampling: stratified sampling but you over-represent one or more groups to avoid problems associated with very small numbers in some demographic groups (you then re-adjust in data analysis) e.g. you want members of a rare religion in your sample, so you oversample them to get more precise measurements
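A minimal sketch of proportional stratified random sampling, assuming each person can be tagged with one stratum. The function name `stratified_sample`, the age groups and the population are hypothetical, and the seed is fixed only so the example is repeatable.

```python
import random

def stratified_sample(population, stratum_of, n, seed=0):
    """Randomly sample from each stratum in proportion to its share of the population."""
    rng = random.Random(seed)
    strata = {}
    for person in population:
        strata.setdefault(stratum_of(person), []).append(person)
    sample = []
    for members in strata.values():
        # Rounding can make the total differ slightly from n for awkward proportions
        quota = round(n * len(members) / len(population))
        sample.extend(rng.sample(members, quota))
    return sample

# Hypothetical population: 60 younger and 40 older people
population = [("young", i) for i in range(60)] + [("old", i) for i in range(40)]
sample = stratified_sample(population, stratum_of=lambda person: person[0], n=10)

counts = {group: sum(1 for p in sample if p[0] == group) for group in ("young", "old")}
print(counts)
```

Oversampling would use the same structure but with a deliberately larger quota for the rare group, followed by re-weighting at analysis time.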
Biased sample (non-probability sampling)
Cheaper; however, a biased sample doesn't always mean the study is bad (e.g. studies using psych students) and it is context dependent (how much does external validity matter/how much can you generalise). Self-selection bias: people who opt into a study e.g. course survey or product review. Most research starts with an unrepresentative sample, then further replications/developments use more representative sampling. Types:
Convenience sampling: samples easy to access e.g. psych students
Purposive sampling: studying certain kinds of people so you only recruit those types
Snowball sampling: a kind of purposive sampling where participants are asked to recommend other participants for the study; used for hard-to-reach groups e.g. experts or homeless people
Quota sampling: identify subsets of population -> set target no. (quota) -> use non-random/convenience sampling to fill quota
Bivariate claim
Association between two measured variables with each participant having a value for both variables
Bivariate claims are mostly association claims; however, they can also be causal, and the variables can be manipulated
E.g. height and weight are associated and one participant has both
Statistical validity
How well does the data support the conclusion = what is the effect size, is the correlation statistically significant, are outliers affecting the association, is there restriction of range
Effect size = describes strength of relationship (Cohen's Size of Correlations)
Statistical significance = p is less than the significance level and the Bayes factor is strong = the result is unlikely to be due to chance alone, so you would likely get a similar result if you ran the experiment again
However, statistical significance =/= importance or big effect size because small effect sizes and un-important results can be statistically significant
Importance = how big is the difference and what is the impact/consequences?
Importance and effect size: men are statistically significantly better at mental rotation than women, but the effect size is small, and knowing that only a few men will be better than women is not important. HOWEVER, a small effect size for a heart attack medicine IS IMPORTANT
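A rough sketch of why statistical significance is not the same as a big effect, assuming the standard t statistic for a correlation, t = r·√(n−2)/√(1−r²), and using 1.96 as an approximate two-tailed cutoff for the .05 level (the exact cutoff is slightly higher in small samples). The r and n values are made up.

```python
import math

def t_for_correlation(r, n):
    """t statistic for testing whether a correlation r from n people differs from 0."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

CRITICAL_T = 1.96  # approximate two-tailed cutoff for alpha = .05 (large-sample value)

tiny_r = 0.05  # a very small effect size
t_small_sample = t_for_correlation(tiny_r, n=100)
t_large_sample = t_for_correlation(tiny_r, n=10_000)

print(round(t_small_sample, 2), abs(t_small_sample) > CRITICAL_T)
print(round(t_large_sample, 2), abs(t_large_sample) > CRITICAL_T)
```

The same tiny correlation is not significant with 100 people but is significant with 10,000, even though the effect size has not changed at all.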
Temporal precedence: Longitudinal designs
Longitudinal designs = measuring same variables in same ppl at several points in time
Interpreting results from longitudinal designs = cross-sectional correlations, autocorrelations and cross-lag correlations
Cross sectional correlation = same time different variables (multiple variables measured)
Autocorrelations = same variable measured at different time (one variable measured)
Cross-lag correlation = does the variable measured in one time period correlate with other variable in other time period; this addresses directionality problem and establishes TEMPORAL PRECEDENCE
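A small sketch of the three kinds of correlation in a two-wave longitudinal design. The variables (TV hours and aggression), the scores and the dictionary keys are all hypothetical; `pearson_r` is a plain hand-written Pearson correlation.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical scores for the same six people at time 1 and time 2
tv_t1, agg_t1 = [1, 2, 3, 4, 5, 6], [2, 1, 3, 3, 5, 4]
tv_t2, agg_t2 = [1, 3, 3, 4, 6, 6], [1, 2, 3, 5, 5, 6]

correlations = {
    "cross_sectional_t1": pearson_r(tv_t1, agg_t1),   # same time, different variables
    "autocorrelation_tv": pearson_r(tv_t1, tv_t2),    # same variable, different times
    "cross_lag_tv1_agg2": pearson_r(tv_t1, agg_t2),   # earlier TV with later aggression
    "cross_lag_agg1_tv2": pearson_r(agg_t1, tv_t2),   # earlier aggression with later TV
}
for name, r in correlations.items():
    print(name, round(r, 2))
```

Comparing the two cross-lag correlations is what speaks to which variable comes first.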
Multiple regression
One output variable (DV) e.g. behaviour problems and MORE THAN ONE input variable (IV) e.g. minutes of recess and wealth of school (SES); each IV is a predictor
You get a standardised regression coefficient (beta) for each predictor which controls for other predictors i.e. if beta no. is significant then minutes of recess still has a relationship with behaviour problems even when SES is controlled for
Beta is similar to r: a negative beta = negative relationship and a positive beta = positive relationship
Most studies measure many other input variables, then run a multiple regression that gives a beta for each variable
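A minimal sketch of how a beta "controls for" the other predictor, assuming exactly two predictors and the textbook formula beta1 = (r_y1 − r_y2·r_12)/(1 − r_12²). The three correlations below are invented, not real data.

```python
def standardized_betas(r_y1, r_y2, r_12):
    """Standardised betas for two predictors, from the three pairwise correlations."""
    denom = 1 - r_12 ** 2
    beta_1 = (r_y1 - r_y2 * r_12) / denom
    beta_2 = (r_y2 - r_y1 * r_12) / denom
    return beta_1, beta_2

# Hypothetical correlations:
#   recess minutes with behaviour problems = -0.5
#   school SES with behaviour problems     = -0.4
#   recess minutes with school SES         =  0.3
beta_recess, beta_ses = standardized_betas(r_y1=-0.5, r_y2=-0.4, r_12=0.3)
print(round(beta_recess, 2), round(beta_ses, 2))
```

The beta for recess is smaller in size than its raw correlation because part of that correlation overlaps with SES, but it stays negative once SES is controlled for. If the two predictors were uncorrelated, each beta would simply equal its r.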
Mediators
ask WHY these two variables are correlated; testing a theory
Mediators are good for causal claims as they help understand the mechanism that causes two things to be correlated
Example: deep talk and wellbeing: we have a "how it happens" hypothesis = what if the amount of deep talk was related to your social networks, meaning the amount of deep talk causes well-being but does so through the quality of social ties
Specify time sequence for three variables (temporal precedence) i.e. how are the variables related and which comes 1st
Difference between mediator and third variable: mediator is internal (pathway) to causal variable and interests research vs third variable which is external (something else causes both variables) and is a nuisance
Moderators
ask what the LIMITS/CONDITIONS of the correlation are (when, for whom, or under what conditions are the variables related)
Relationship between two variables changes depending on the level of another variable = moderator
Moderators are a switch that can change the strength of associations and are very good for external validity i.e. who does this association apply to/who can we generalise
External validity: e.g. what if the link between violent TV and aggression was moderated by gender (different for different genders)?
Types of Replication Studies
Direct replication = researchers try to reproduce original experiment as closely as possible to see if same effect occurs (adds certainty to original finding but fails to expand knowledge and risks repeating same limitations of previous study)
Conceptual replication = study uses different operational definition of the relevant variables, different sample, different design, etc; NOT a strict reproduction but can help avoid existing threats to validity
Different operational definitions have different cons/pros so this can help address that aspect
Replication-plus-extension = effect of interest is reproduced (similar to direct replication) BUT further expansions/development e.g. extra variable or group are added to expand knowledge
Combination of direct replication AND conceptual replication (best of both worlds)
Independent replication = a completely independent set of researchers replicates the effect of interest, which has a much more powerful effect; HOWEVER it's rare because most replications are done by the same lab/researcher
Adversarial replication = two independent researchers have opposite theories about same phenomenon then they get together and run one study together which tests both theories
Often when one researcher studies something about their theory it will turn out true (and vice versa for the other researcher), which isn't because they are cheating but just because they only study what they think is important, which differs from what the other researcher studies -> leads to a weird situation where both researchers keep saying they're right when they can't both be
P-hacking
When you keep analysing results until you find a significant effect, which leads to detectable p-value bias (e.g. p-values clustered just below p = 0.05) and inflates the rate of type I errors (false positives: calling an effect significant when nothing is going on). At a 0.05 significance level each test has a 5% chance of coming out significant even when there's nothing going on, so p-hacking just involves continued testing until you get p < 0.05
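A quick sketch of how the chance of at least one false positive grows with the number of tests, assuming the tests are independent and there is truly no effect. The function name `chance_of_false_positive` is made up for illustration.

```python
def chance_of_false_positive(num_tests, alpha=0.05):
    """Probability that at least one of num_tests independent tests is 'significant' by chance."""
    return 1 - (1 - alpha) ** num_tests

for k in (1, 5, 20):
    print(k, round(chance_of_false_positive(k), 2))
```

With a single test the risk is 5%, but after 20 tries it is roughly 64%, which is why repeated analysis until something "works" is so likely to produce a type I error.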