what types of questions can be asked within epidemiology?
descriptive: how many people in the population have an NCD, interested in describing the health of the population, tracking the health of the population for comparison over a time period
analytical/aetiological: what causes and thus can prevent NCDs, looking at association between risk factor and diseases, trying to understand the causality of the relationship and then improve the health outcomes
predictive: given a set of predictors, can we predict who will get a disease.
what is health?
Health is multidimensional (physical, mental, social) with the positive end of health covered (not just diseased/non-diseased)
WHO’s definition: a state of complete physical, mental and social well-being and not merely the absence of disease or infirmity
defined as your ability to adapt and self-manage in the face of social, physical and emotional challenges
precise universal meaning challenging and strict definition for all conditions not possible
what are the differences between binary and continuous measures of health?
a condition may exist on a continuum, with (often arbitrary) cut-offs applied to turn it into a binary measure
e.g psychopathy: difficult to place a practical threshold as the underlying trait exists on a continuum.
binary = yes/no
continuous = symptoms might exist on a continuum
benefits & disadvantages of self-reported measures of health
cheap and easy to obtain
able to obtain info on multiple different aspects of health
difficult to interpret when understanding aetiology
potential for bias (either random/differential)
what are some examples of measures that are normally used as outcomes in health research?
mental health disorders/symptoms
self-rated health (e.g very good - very bad scale)
health records
vital status
benefits and disadvantages of objective measures of health?
e.g blood pressure, BMI
can be more accurate than self-reported equivalents
Hawthorne effect: people may change their behaviour (often to produce more positive results) because they know they are being measured
what are some examples of derived/calculated measures?
life expectancy (year of birth, sex, age-specific death rates)
healthy life expectancy (data on health conditions/disability)
what’s the difference between accuracy and precision?
accuracy: how close to the true value the measurement is
precision: how reproducible/consistent the measurement is
features of using health records as a form of measurement
May only be possible to match a proportion of the population
May only capture those visible to the health service
Typically, only measure binary disease states
how has the model surrounding disability changed over the years?
1800s - medical model (management aimed at curing), directly thought that it is caused by disease
1970s - social model (depends on social environment of person), frames it as a socially created problem, complex collection of conditions
Now - acknowledged combination/interaction, evolving definition
definitions of disability
UK Equality Act 2010: disabled if you have a physical or mental impairment that has a ‘substantial’ and ‘long-term’ negative effect on your ability to do normal daily activities
WHO: difficulties in any 3 areas of functioning (impairments, activity limitations, participation restrictions) due to physical or mental health conditions.
definition of sensitivity (binary trait)
how good a test is at finding something if it is there - probability that the diseased person is correctly identified as diseased
- e.g the % of the results that will be positive when HIV is present
definition of specificity (binary trait)
accuracy against false positives - probability that non-diseased correctly identified as non-diseased
e.g the % of the results that will be negative when HIV is not present
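A minimal sketch of how the sensitivity and specificity definitions turn into arithmetic. The counts and the function name are hypothetical, chosen only for illustration.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from 2x2 test-accuracy counts.

    tp/fn: diseased people who test positive/negative
    tn/fp: non-diseased people who test negative/positive
    """
    sensitivity = tp / (tp + fn)  # diseased correctly identified as diseased
    specificity = tn / (tn + fp)  # non-diseased correctly identified as non-diseased
    return sensitivity, specificity


# Hypothetical counts: 100 people with the disease, 900 without it.
sens, spec = sensitivity_specificity(tp=90, fn=10, tn=855, fp=45)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

With these made-up numbers, 90 of 100 diseased people test positive and 855 of 900 non-diseased people test negative.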
what is medicalisation?
the process by which problems traditionally considered nonmedical come to be defined/treated as medical issues
expansion of medical professional’s influence and authority into the domains of everyday existence
identifying a personal or social condition as a medical issue and subjecting it to medical intervention
what are the underlying factors driving the increase in medicalisation?
appropriate: greater awareness of mental health problems, improved detection by health practitioners
inappropriate: influence of pharma industry to increase profit, medicalisation of problems previously dealt with by non-medical means
what is screening?
identification of unrecognised disease or defect by rapid tests
what are the key considerations when deciding whether it should be implemented?
sensitivity and specificity of the screening test
potential benefits include: earlier detection of disease, reduced ill health and consequent burden
potential risks: false positives
definition of surveillance
systematic and continuous collection, analysis and interpretation of data, closely integrated with the timely and coherent dissemination of results to those who take action.
what is surveillance used for?
use learning to improve public health actions
assess distribution, identify determinants and application of knowledge
estimating magnitude of a problem, determining geographic distribution, detecting epidemics, stimulating research and evaluating control measures
what are some things to consider when looking at surveillance data?
population size has large impacts on raw data and their presentation
some people are not at risk for developing a new onset due to pre-existing infection
denominator important to consider (size of total population at risk needs to be known)
prevalence definition
existing state and number of cases (both new and old cases)
often expressed as % (n.o of diseased/total n.o)
features of prevalence
different types exist: lifetime, period (e.g in 1 year), point (e.g at a single point in time)
how many people are cured/treated or died can affect prevalence (often no longer accounted for within the prevalence pool)
more useful for chronic disease planning (e.g resource allocation)
incidence definition
number of new cases in a specific period of time, usually expressed per 100, 1,000 or 100,000 persons
features of incidence
more useful than prevalence when accounting for acute conditions as onset and resolution of disease is often very short
more useful for understanding risk and aetiology
cumulative incidence calculation
n.o of new cases/total at risk during period of time
how to calculate incidence rate + what’s person-time
n.o of new cases/ total at risk person-time
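person-time: the time each person spends at risk during follow-up; once someone becomes diseased or is lost to follow-up they are no longer at risk, so their time stops being counted
total at risk person-time = sum of each person's time at risk (equals population size x duration of study only if everyone stays at risk for the whole study)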
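A small sketch contrasting cumulative incidence with an incidence rate based on person-time. The follow-up records are hypothetical; each person contributes only the time they were actually at risk.

```python
# Hypothetical follow-up data: (years at risk, became a new case?)
follow_up = [
    (5.0, False),  # followed for the full 5 years, stayed disease-free
    (2.0, True),   # developed the disease after 2 years
    (3.5, False),  # lost to follow-up after 3.5 years
    (5.0, False),
    (1.5, True),
]

new_cases = sum(1 for _, is_case in follow_up if is_case)
person_years = sum(years for years, _ in follow_up)

# New cases divided by the number of people at risk at the start.
cumulative_incidence = new_cases / len(follow_up)
# New cases divided by the total person-time at risk.
incidence_rate = new_cases / person_years

print(f"cumulative incidence: {cumulative_incidence:.2f}")
print(f"incidence rate: {incidence_rate * 1000:.1f} per 1,000 person-years")
```

Here person-time is 17 person-years rather than 5 people x 5 years = 25, because people stop contributing time once they become cases or are lost to follow-up.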
how are incidence and prevalence related?
high incidence, low prevalence (e.g highly contagious infectious disease with very short duration - colds)
low incidence, high prevalence (e.g diseases with long duration & survival - arthritis, diabetes)
what is standardisation?
a technique to remove effect of confounding variables when making comparisons
uses a reference population to standardize the sample to
helps to reduce/correct age-related differences amongst the groups
applies to both incidence and prevalence (can relate to all demographic characteristics)
why does standardisation matter?
without it, comparing crude rates between 2 populations with different age structures is misleading
e.g an older population will tend to have higher crude mortality simply because age is a risk factor
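A sketch of direct age standardisation, assuming hypothetical age bands, rates and a made-up reference population. The two populations are given identical age-specific death rates on purpose, so any difference in crude rates comes from age structure alone.

```python
# Hypothetical age-specific death rates (per person per year); identical in A and B.
rates_a = {"0-39": 0.001, "40-64": 0.005, "65+": 0.040}
rates_b = {"0-39": 0.001, "40-64": 0.005, "65+": 0.040}

# Age structure of each population (proportions summing to 1).
structure_a = {"0-39": 0.60, "40-64": 0.30, "65+": 0.10}  # younger population
structure_b = {"0-39": 0.30, "40-64": 0.40, "65+": 0.30}  # older population

# Reference (standard) population that both are standardised to.
reference = {"0-39": 0.50, "40-64": 0.35, "65+": 0.15}


def weighted_rate(rates, weights):
    """Weighted average of age-specific rates using the given age structure."""
    return sum(rates[age] * weights[age] for age in rates)


crude_a = weighted_rate(rates_a, structure_a)
crude_b = weighted_rate(rates_b, structure_b)
std_a = weighted_rate(rates_a, reference)
std_b = weighted_rate(rates_b, reference)

print(f"crude:        A = {crude_a:.4f}, B = {crude_b:.4f}")
print(f"standardised: A = {std_a:.4f}, B = {std_b:.4f}")
```

The crude rate in B is more than double that in A, yet the standardised rates are equal, which is the misleading comparison standardisation is meant to avoid.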
what is sampling?
the selection of participants from a population
target population: the group of interest of your study
aim of sampling is to generalise our results back to the whole population
difference between probability and non-probability sampling
randomly selected participants vs convenience selection of people
features of probability sampling
can be simple through equal probability or complex through strata weighting
stratified sampling can over-sample certain sub-groups to ensure the inclusion of more individuals who would otherwise be under-represented in research
tend to be more reliably representative of the population
Can “weight” the analysis to recover representativeness - provided the assumptions we make hold, inferences can then be generalised to the whole population
features of non-probability sampling
good for getting quick results, might not be able to confidently generalise back to the entire population
often done within a smaller sample
why would we choose a bigger sample size?
Can be more confident about the conclusions we make when we notice differences
greater statistical power, precision and easier to detect true findings
can do “power analysis” - power is determined by N, effect size and the chosen significance level (statistical certainty)
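A rough sketch of a power calculation for comparing two group means, using a normal approximation. The function name and example numbers are assumptions for illustration; dedicated statistical software would use more exact methods.

```python
from statistics import NormalDist


def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison of means.

    effect_size is the standardised mean difference (difference / SD).
    Normal approximation; the negligible opposite tail is ignored.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)           # critical value for the test
    shift = effect_size * (n_per_group / 2) ** 0.5  # expected test statistic
    return z.cdf(shift - z_crit)


for n in (20, 64, 200):
    print(f"n per group = {n:3d}: power = {approx_power(0.5, n):.2f}")
```

Holding the effect size and significance level fixed, power rises with N, which is the reason for preferring a bigger sample.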
why do we not choose a smaller sample size?
With a small sample, there isn’t enough data to characterise differences confidently - if an effect is found, is it robust/trustworthy? if no effect is found, was this due to low power?
what is generalisability?
“external validity” - can the findings of this study then be generalised to the wider population?
what are some factors that need to be considered within generalising findings to the general population?
Representativeness of general population - WEIRD (Western, Educated, Industrialised, Rich, Democratic) populations
detailed knowledge of causal processes & factors which could modify risk
characteristics of source and population (e.g socioeconomic determinants)
most research populations not representative of global populations, findings may not generalise across cultures, socioeconomic groups or geographies
what are the differences between observational and experimental study designs?
observational (cross-sectional surveys, cohort, case-control studies) and experimental (RCT, quasi-experiments)
In experimental studies the researcher changes the exposure to determine its effect on participants, whereas in observational studies the researcher makes no change to the population.
Quasi-experimental - makes use of natural opportunities where an exposure changes (or can be changed by the researcher) without randomisation
what’s the definition of a cross-sectional study?
Study of health and potential determinant as measured at one point in time, exposure and health measured at same time
benefits and disadvantages of using cross-sectional studies?
Cheap and easy to carry out, if only one cross-section is being taken - good to see if you want to demonstrate basic associations
can’t rule out reverse causation (can’t prove if the factor preceded the disease), nor can you demonstrate temporality between two factors = causal inference challenging
Usually provides prevalence but not incidence - can’t distinguish risk factors for occurrence of disease (incidence) from risk factors for survival with the disease (e.g can’t tell you if a specific factor caused the disease or simply helped the person stay alive once they had it)
how would you carry out a cohort study?
Measure everyone at a baseline timepoint, then follow them up and measure the health outcome at a later timepoint
how would you define a cohort?
group of individuals with shared characteristic - birth cohort/occupational cohort that are followed up over time creating a longitudinal approach and can be prospective/retrospective.
how would you conduct a cohort study?
1st time - exposure and confounders measured - confounders need to be accounted for to isolate the impact of the exposure on the health outcome
2nd time - health outcome measured
benefits and limitations of cohort study
important to demonstrate temporality and to show that the variable came before the effect
loss to follow-up: if those who drop out differ systematically from those who stay = biased results
how would you conduct a case-control study?
Retrospective study that identifies individuals with a specific outcome (cases) and similar individuals without it (controls) to compare their prior exposure to risk factors
Study moves backward in time, from effect (disease) to cause (exposure).
Researchers compare the frequency of exposure in the cases to the frequency of exposure in the controls.
benefits and limitations of case-control study
comparatively quick & easy to conduct, useful in particular when disease is rare
choosing comparison (control) group is difficult - hard to match (so findings could be confounded, even after matching or adjustment), concerns about accuracy of recalling past events (exposures)
what’s a kappa statistic?
evaluating the agreement between the 2 raters when they are classifying items into categories, ranging from -1 to 1, a score of 1 indicates perfect agreement whilst 0 indicates agreement no better than chance and negative values indicate disagreement.
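A minimal sketch of the two-rater (Cohen's) version of kappa, assuming hypothetical yes/no ratings. Kappa is computed as (observed agreement - agreement expected by chance) / (1 - agreement expected by chance).

```python
from collections import Counter


def cohens_kappa(rater_1, rater_2):
    """Cohen's kappa for two raters classifying the same items."""
    n = len(rater_1)
    # Proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_1, rater_2)) / n
    # Agreement expected by chance, from each rater's category frequencies.
    counts_1, counts_2 = Counter(rater_1), Counter(rater_2)
    expected = sum(counts_1[c] * counts_2[c] for c in counts_1) / n**2
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of 10 items by two raters.
rater_1 = ["yes", "yes", "no", "no", "yes", "no", "yes", "no", "no", "yes"]
rater_2 = ["yes", "no", "no", "no", "yes", "no", "yes", "yes", "no", "yes"]

kappa = cohens_kappa(rater_1, rater_2)
print(f"kappa = {kappa:.2f}")
```

The raters agree on 8 of 10 items, but half of that agreement would be expected by chance, so kappa is lower than the raw 80% agreement.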
what is an ecological study?
Unit of observation is a group of people, not individuals (e.g country/city)
e.g., instead of looking at whether Person A smokes and has lung cancer, looks at whether Country A has high smoking rates and lung cancer rates compared to Country B.
benefits and disadvantages of ecological study designs?
Data is often readily/cheaply available
Causal inference difficult given confounding, and drawing inferences about individuals from grouped data risks the ecological fallacy.
cannot link exposure to outcome at the individual level
what is an experimental study design?
Compare treatment with placebo or other treatment which can be randomised (unblinded, single or double-blind)
benefits and disadvantages of experimental study designs?
Potentially robust in terms of causal inference since confounding can be minimised (randomisation)
Concerns about: ethics, practicality, wider generalisability (who takes part voluntarily in trials?), usually limited to short-term follow-up
benefits and disadvantages of study reviews?
Narrative - broad but possibly biased: the main concern is that the author might cherry-pick studies for inclusion, biasing the conclusion
Systematic: usually narrower, less bias and with optional meta-analysis
Combines results, where they can be combined/compared leading to precise estimates (tight confidence interval) but this could still be non-causal = misleading
Useful to test heterogeneity
definition of risk
probability that something will occur, and probabilities can range from 0 to 1, or converted to %
The closer to 1 = greater risk
how would you compare risks?
Subtract to get risk difference (risk in exposed - risk in unexposed)
Divide to get risk ratio (risk in exposed/ risk in unexposed)
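The two comparisons can be sketched with a hypothetical cohort (all counts are made up for illustration):

```python
# Hypothetical cohort: 1,000 exposed and 1,000 unexposed people.
exposed_cases, exposed_total = 30, 1000
unexposed_cases, unexposed_total = 10, 1000

risk_exposed = exposed_cases / exposed_total        # risk in exposed
risk_unexposed = unexposed_cases / unexposed_total  # risk in unexposed

risk_difference = risk_exposed - risk_unexposed  # subtract
risk_ratio = risk_exposed / risk_unexposed       # divide

print(f"risk difference = {risk_difference:.3f}")
print(f"risk ratio = {risk_ratio:.1f}")
```

In this example the exposed have 2 extra cases per 100 people (difference) and three times the risk of the unexposed (ratio).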
how do you interpret the results in risks?
Ratios > 1.0 indicate rate is higher among exposed than unexposed,
=1.0 indicate no association,
<1.0 = rate is lower among exposed than unexposed
what is an odds ratio?
a way to compare the odds of an outcome happening in one group vs another
odds = number of “events” divided by the number of “non-events”
E.g if 1 person is sick and 4 are healthy = odds are 1:4
what do the different values mean in the odds ratio?
OR= 1.0, exposure does not affect the odds of the outcome,
OR> 1.0, exposure is associated with higher odds of the outcome (a risk factor)
OR<1.0, exposure is associated with lower odds of the outcome (protective factor)
how to calculate odds ratio
example: odds of exposure among those with ADHD = (300/500) divided by 1 - (300/500) = 0.6/0.4 = 1.5
odds of exposure among those without ADHD = (503/1000) divided by 1 - (503/1000) = 0.503/0.497 = 1.012
Odds ratio in the case control study: 1.5/1.012 = 1.48
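The ADHD example can be checked with a few lines of code. The helper name `odds` is an illustrative assumption; the counts are the ones used in the example (300 of 500 cases exposed, 503 of 1,000 controls exposed).

```python
def odds(exposed, total):
    """Odds of exposure: proportion exposed divided by proportion unexposed."""
    p = exposed / total
    return p / (1 - p)


odds_cases = odds(300, 500)      # exposure odds among those with ADHD
odds_controls = odds(503, 1000)  # exposure odds among those without ADHD
odds_ratio = odds_cases / odds_controls

# Equivalent cross-product form from the 2x2 table: (a * d) / (b * c)
a, b = 300, 200  # cases: exposed, unexposed
c, d = 503, 497  # controls: exposed, unexposed
cross_product = (a * d) / (b * c)

print(f"odds ratio = {odds_ratio:.2f} (cross-product: {cross_product:.2f})")
```

Both routes give an odds ratio of about 1.48, so exposure is associated with higher odds of the outcome in this example.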
when is the risk or odds ratio used?
prospective studies - risk/rate ratios used
case control studies - odds ratio (since total population in each group is not known)
what is the difference in utilising difference and ratio measures within studies?
Difference measures quantify the potential direct public health benefit of an intervention.
Ratio measures provide an intuitive summary of the magnitude of differences in 2 exposures (tells us the strength and direction of a relationship - helps to provide a sense of proportion)
what is a confidence interval?
Indicates the range in which we are 95% confident the true population measure (e.g risk ratio, or other measure of effect, or prevalence estimate) lies - not completely certain since the data come from a sample, not the whole population
how are difference/ratio measures used in relation to confidence intervals?
Difference measure: does the confidence interval contain 0? if so, it's not statistically significant (the data are compatible with no difference between groups)
Ratio measure: does the confidence interval contain 1? if so, it's not statistically significant (the data are compatible with no association)
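A sketch of checking whether a ratio measure's confidence interval contains 1, using the common log-based approximation for a risk ratio. The counts are hypothetical, and this is one standard method rather than necessarily the one used in the course.

```python
import math


def risk_ratio_ci(cases_exp, n_exp, cases_unexp, n_unexp, z=1.96):
    """Risk ratio with an approximate 95% CI (calculated on the log scale)."""
    rr = (cases_exp / n_exp) / (cases_unexp / n_unexp)
    se_log_rr = math.sqrt(1 / cases_exp - 1 / n_exp + 1 / cases_unexp - 1 / n_unexp)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical study 1: 30/1000 exposed vs 10/1000 unexposed develop the disease.
rr1, lo1, hi1 = risk_ratio_ci(30, 1000, 10, 1000)
# Hypothetical study 2: 12/1000 exposed vs 10/1000 unexposed.
rr2, lo2, hi2 = risk_ratio_ci(12, 1000, 10, 1000)

for rr, lo, hi in ((rr1, lo1, hi1), (rr2, lo2, hi2)):
    verdict = "includes 1" if lo <= 1 <= hi else "excludes 1"
    print(f"RR = {rr:.2f}, 95% CI {lo:.2f} to {hi:.2f} ({verdict})")
```

The first interval excludes 1 (statistically significant); the second includes 1, so it is compatible with no association.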
what is the population attributable risk proportion?
measure of the proportion of the total disease burden associated with exposure
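PARP = (risk in total population - risk in unexposed) / risk in total population
with a 2x2 table (a, b = exposed with/without disease; c, d = unexposed with/without disease): risk in total population = (a+c)/(a+b+c+d), risk in unexposed = c/(c+d)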
what is the attributable proportion?
(risk for exposed group - risk for unexposed group) / risk for exposed group x 100
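A sketch of both attributable measures from a hypothetical 2x2 table. It assumes the standard definitions: the attributable proportion compares risks within the exposed group, while the population attributable risk proportion compares the risk in the whole population with the risk in the unexposed.

```python
# Hypothetical cohort laid out as a 2x2 table:
#              diseased   not diseased
# exposed         a            b
# unexposed       c            d
a, b, c, d = 30, 970, 20, 1980

risk_exposed = a / (a + b)
risk_unexposed = c / (c + d)
risk_total = (a + c) / (a + b + c + d)

# Share of disease among the exposed that is attributable to the exposure (%).
attributable_proportion = (risk_exposed - risk_unexposed) / risk_exposed * 100
# Share of disease in the whole population attributable to the exposure (%).
parp = (risk_total - risk_unexposed) / risk_total * 100

print(f"attributable proportion (exposed): {attributable_proportion:.1f}%")
print(f"population attributable risk proportion: {parp:.1f}%")
```

The population figure is smaller here because only a third of this made-up population is exposed.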
how can you interpret linear regression?
outcome - continuous (e.g BMI), exposure - continuous or categorical
difference between continuous and categorical exposure?
Continuous exposure: mean difference in outcome per 1 unit increase in exposure
Categorical exposure: mean difference in outcome in group 1 compared with group 0 (e.g men vs women)
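A minimal sketch of the two interpretations, fitting a simple least-squares line by hand. The data are hypothetical (BMI as outcome), and `ols` is an illustrative helper rather than a library function.

```python
def ols(x, y):
    """Simple linear regression; returns (intercept, slope)."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Categorical exposure coded 0/1: slope = mean BMI in group 1 minus group 0.
group = [0, 0, 0, 0, 1, 1, 1, 1]
bmi = [22.0, 24.0, 23.0, 25.0, 26.0, 27.0, 25.0, 28.0]
intercept_cat, slope_cat = ols(group, bmi)

# Continuous exposure (hypothetical hours of exercise per week):
# slope = mean difference in BMI per 1 extra hour.
hours = [0, 1, 2, 3, 4]
bmi_by_hours = [28.0, 27.0, 26.0, 25.0, 24.0]
intercept_cont, slope_cont = ols(hours, bmi_by_hours)

print(f"group 1 vs group 0: {slope_cat:+.1f} BMI units")
print(f"per extra hour of exercise: {slope_cont:+.1f} BMI units")
```

With a 0/1 exposure the intercept is the mean outcome in group 0 and the slope is the difference between the two group means.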
what is the difference between deterministic and probabilistic?
Deterministic: occurrences are causally determined by preceding events or natural laws
Probabilistic: of, relating to, or based on probability - considering multiple component causes, but often have not identified all of the possible causal components
explain the concept of counterfactuals in relation to disease causation
Counterfactuals (what would have happened to the same person under the other exposure) can only be conjectured, never observed
can never observe the same person in both exposed and unexposed state simultaneously, reliance on comparison groups as proxies
by only observing 1 version of reality, we have to then make an inference from the data that is available with this imperfection in mind.
how can we understand disease causation?
Causes can be shared and for each individual struggling with a condition, no single exposure is sufficient by itself.
People can accumulate “causes” across life (at one point and/or through slow accumulation)
E.g tobacco smoke in utero, chronic poverty, cigarette smoking starts in adolescence
Disease manifestation takes time, and understanding etiology across life can inform when to intervene (more cost-effective to intervene earlier in life)
what are the impacts of understanding component causes?
identifying shared component causes (e.g unhealthy food environment, community violence, unhealthy social norms around substance use) can drive targeted health policies amongst the population
Different component causes can cause different health outcomes = impact of random chance events.
Different people may develop the same disease through entirely different sufficient cause combinations, which explains individual variation in who gets diseased.
what is the difference between sufficient and necessary component causes?
Sufficient: set of different factors which result in disease: multiple sets
Some component causes may not be sufficient by themselves, and it needs to act along with other causes
Necessary: if all cases of disease require the cause (e.g alcoholism - alcohol consumption necessary)
what is bias?
a systematic error in estimating the true effect of the exposure on the outcome
bias can push estimates towards or away from the null value, making real effects look smaller or larger than they actually are.
what is reverse causality?
Outcome causes the exposure
a pertinent issue in cross-sectional studies, less likely in longitudinal studies
what are some considerations around reverse causality bias?
No straightforward way to identify/account for, need to be able to consider if it is likely and interpret accordingly
(causality can be in both directions - bi directionality of association)
what is the difference between a confounder and mediation factor when thinking about causal inference?
To understand if 1 exposure causes disease or ill health, need to try to rule out other potential causes (confounding)
Interested in mechanism or pathway (mediation) - helps scientific understanding, can lead to identification of new targets for intervention
what is a confounder?
bias of estimated effect due to common cause of exposure and outcome
can influence both the exposure and outcome at the same time
Common to present unadjusted and confounder-adjusted results
difference between unadjusted and confounder-adjusted results
unadjusted: looking at raw data
adjusted: using math to “cancel out” the effect of the confounder (3rd party variable)
what does it mean when unadjusted and confounder-adjusted results are similar or different?
similar: suggests little bias due to confounding - unadjusted & adjusted numbers are almost the same
3rd party factor didn’t really matter with original finding remaining quite solid.
different: suggests confounding bias - estimate may attenuate toward the null value after adjustment
if the link gets weaker after adjustment, the 3rd party factor was doing some or all of the work and the original “link” was at least partly an illusion caused by confounding.
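A sketch of comparing unadjusted and confounder-adjusted estimates, assuming a hypothetical cohort stratified by one confounder (smoking). Adjustment here uses the Mantel-Haenszel pooled risk ratio, which is one common approach and is not named in these notes.

```python
# Hypothetical cohort stratified by a confounder (smoking status).
strata = {
    "smokers": {"exp_cases": 160, "exp_n": 800, "unexp_cases": 40, "unexp_n": 200},
    "non-smokers": {"exp_cases": 10, "exp_n": 200, "unexp_cases": 40, "unexp_n": 800},
}

# Unadjusted (crude) risk ratio: ignore the confounder and pool everyone.
exp_cases = sum(s["exp_cases"] for s in strata.values())
exp_n = sum(s["exp_n"] for s in strata.values())
unexp_cases = sum(s["unexp_cases"] for s in strata.values())
unexp_n = sum(s["unexp_n"] for s in strata.values())
crude_rr = (exp_cases / exp_n) / (unexp_cases / unexp_n)

# Adjusted risk ratio: Mantel-Haenszel weighted average across strata.
mh_num = sum(
    s["exp_cases"] * s["unexp_n"] / (s["exp_n"] + s["unexp_n"]) for s in strata.values()
)
mh_den = sum(
    s["unexp_cases"] * s["exp_n"] / (s["exp_n"] + s["unexp_n"]) for s in strata.values()
)
adjusted_rr = mh_num / mh_den

print(f"unadjusted RR = {crude_rr:.2f}, adjusted RR = {adjusted_rr:.2f}")
```

In this made-up example the crude risk ratio is above 2 but the adjusted one is 1, because smokers are both more likely to be exposed and more likely to get the disease.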
what are some limitations of confounders that need to be considered?
often confounders measured poorly, bias due to confounding can remain when adjusted (residual) or remain as the confounders are unobserved
Hard to measure all the confounders that may be important in an epidemiological study - results can be confounded by unobserved confounders (factors that you don’t think of, or factors that are impossible to measure - e.g genetic predispositions or exact stress levels)
Negative confounding can also happen, masking a true effect which makes a harmful exposure look harmless or even protective.
how can we understand the importance of confounding bias?
Confounding bias can lead to incorrect conclusions and costly consequences
Comparisons can help us understand importance of confounding bias
Comparing unadjusted and confounder-adjusted estimates of association
Comparing results from different study designs (e.g observational vs experimental studies)
Comparing results from contexts with different confounding structures
what is a mediator?
variable on the causal pathway from exposure to outcome
explains why the exposure influences the outcome through the mediator
Common in epidemiology to see unadjusted and mediator-adjusted results
how can results be explained by a mediator?
Results similar - suggests association is not explained by the mediator
Results differ - suggests association is (at least partly) explained by the mediator
Can’t distinguish mediation and confounding statistically - need knowledge of topic
how can measurement impact bias?
measurement can be imperfect; the error can be random (non-differential) or non-random (differential), and both can bias results
Random error often weakens (attenuates) the association (e.g risk ratio of 5 when exposure is measured perfectly vs 3 when exposure is measured with more random error)
Non-random (differential) measurement error can result in either an over- or underestimate of the true result
Missing data on the exposure or the outcome can lead to bias and also reduces statistical power
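A small simulation sketch of attenuation from random measurement error. It uses a regression slope rather than a risk ratio, and all numbers (true slope of 2, error variance equal to exposure variance, seed) are assumptions chosen for illustration.

```python
import random

random.seed(1)  # fixed seed so the illustration is repeatable


def slope(x, y):
    """Least-squares slope of y on x."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


n = 5000
true_exposure = [random.gauss(0, 1) for _ in range(n)]
# Outcome depends on the true exposure with a slope of 2, plus noise.
outcome = [2.0 * x + random.gauss(0, 1) for x in true_exposure]
# Exposure as recorded: true value plus random (non-differential) error.
noisy_exposure = [x + random.gauss(0, 1) for x in true_exposure]

slope_true = slope(true_exposure, outcome)    # close to the true value of 2
slope_noisy = slope(noisy_exposure, outcome)  # pulled toward the null (0)

print(f"perfectly measured exposure: slope = {slope_true:.2f}")
print(f"exposure with random error:  slope = {slope_noisy:.2f}")
```

Because the error is unrelated to the outcome, the estimated association shrinks toward the null rather than changing direction.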
definition of AI
computational systems that can perform cognitive tasks (e.g writing and coding)
what does an LLM do?
processes + generates language
Trained on internet content (series of words through a neural network - input, series of words - output)
Model predicts next word/ ‘token’, can split up words into different parts - surprisingly effective (e.g writing/summarising text, giving code)
Input token -> some algorithm -> output token
definition of agents
LLMs which execute tasks
need to undergo prompt engineering
how can we improve prompt engineering?
be clear/direct, use examples, give a persona, add imperatives - all of these affect the performance of the LLM
how is AI performing against cognitive benchmarks?
benchmark saturation is a concern, which is why we need to create new benchmarks that are not available on the internet so AI can’t be trained on them.
Need to make more challenging benchmarks for AI - a benchmark is not very informative once AI already gets high scores on it.
what are some of the benefits of using AI?
AI can now do cognitive tasks, including tasks that would take humans hours, in just minutes
99% reduction in costs of using LLMs over time - cost of intelligence is trending towards 0, with big implications for fields where cognitive tasks are prevalent (most white collar jobs)
Powerful but intelligence described as “jagged” - great at some questions/domains but terrible at others (e.g good at writing but terrible at memory storage - need to start a new conversation thread etc.)
what is agi?
artificial general intelligence
AI that can do all/most human cognitive work as well as/better than humans
Levels include: No AI - AI as a tool - AI as a Consultant - AI as a Collaborator - AI as an Expert - AI as an Agent
How can AI be used within health research?
Natural language processing in health is a significant application (e.g extracting structured information from clinical notes, electronic health records or patient-reported outcomes at scale, which would otherwise require enormous human resources)
Other examples in health research: AI in medical imaging diagnostics (e.g detecting cancers from scans), drug discovery acceleration (e.g AlphaFold for protein structure prediction) or AI-assisted clinical trial design
what is gpt?
General Purpose Technology - can be used for anything and does not need to take away human autonomy when utilising these tools for research
Could be used very much for admin tasks
Emerging evidence shows that AI can be quite good at summarising abstracts of papers, which could improve efficiency within systematic reviews
what are the pitfalls of utilising ai?
hallucinations - could be improved through training/optimisation, can modify temperature setting (lower value = more deterministic)
sycophancy (tends to agree with you and runs with errors in prompts)
inconsistency (asking the same question twice might give different answers)
bias (may recapitulate biases in training data)
reproducibility issues/reliance on closed source tech (generally can’t send sensitive data via the cloud - e.g regulation of patient data by GDPR)
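A toy sketch of why a lower temperature setting makes output more deterministic. It assumes the usual softmax-with-temperature formulation and three hypothetical next-token scores; real LLM internals are far larger and are not shown here.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw token scores into probabilities, scaled by temperature."""
    scaled = [score / temperature for score in logits]
    biggest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - biggest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate next tokens.
logits = [2.0, 1.0, 0.5]

low = softmax_with_temperature(logits, 0.2)   # low temperature
high = softmax_with_temperature(logits, 2.0)  # high temperature

print("temperature 0.2:", [round(p, 3) for p in low])
print("temperature 2.0:", [round(p, 3) for p in high])
```

At low temperature nearly all probability sits on the top-scoring token, so repeated runs give the same answer; at high temperature the probabilities are spread out, so answers vary more between runs.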
what are the ethical concerns surrounding the use of ai?
ethical concerns surrounding training data - are they exploiting human activity?
Did we/others consent? Will revenue be shared with authors/artists?
AI slop: increase in low quality scientific publications - overwhelming an already stretched publication system
what are the downfalls of using AI within academia?
less need to think = mental atrophy (AI tutoring? - effects on the higher education sector, unclear)
over-reliance for statistical analysis or literature synthesis could erode researchers’ ability to critically evaluate methods or spot errors = can we ensure methodological rigour?
how is ai currently being used within academia ?
Increases access to millions of papers and reduces hallucinations
Bandwidth freed for higher level tasks for humans
Can help to create more ambitious reviews (e.g across disciplines/designs), continual, more informed papers
AI’s creativity is empirically testable but if flawed/limited - still useful (instant & cheap)
AI can help unlock historic (new) data and, with collected data, assist the manual work on variables, so that one research question can be addressed across all studies.
definition of epidemiology
scientific study of distribution, pattern and causes of health and disease
definition of social epidemiology
branch of epidemiology interested in how social structures and institutions impact health and disease risk in a population
definition of socioeconomic position
umbrella term, that captures lots of different measures of social standing, and how advantaged or disadvantaged individuals are based on their social or economic circumstances
E.g income, education, wealth, housing tenure, occupational class
what are some arguments around SEP?
Class has traditionally been defined by occupation, wealth and education, but research argues that this is too simplistic, suggesting that class has 3 dimensions (economic, social and cultural).
SEP is multidimensional and manifests across life; measures of SEP typically overlap despite potential for independent effects on health