wellbeing, society and data


Last updated 11:41 PM on 5/24/26

198 Terms

1
New cards

what types of questions can be asked within epidemiology?

descriptive: how many people in the population have an NCD; interested in describing the health of the population and tracking it for comparison over a time period

analytical/aetiological: what causes and thus can prevent NCDs; looks at associations between risk factors and diseases, trying to understand the causality of the relationship in order to improve health outcomes

predictive: given a set of predictors, can we predict who will get a disease.

2
New cards

what is health?

Health is multidimensional (physical, mental, social) with the positive end of health covered (not just diseased/non-diseased)

  • WHO’s definition: a state of complete physical, mental and social well-being and not merely the absence of disease or infirmity

  • defined as your ability to adapt and self-manage in the face of social, physical and emotional challenges

  • precise universal meaning challenging and strict definition for all conditions not possible

3
New cards

what are the differences between binary and continuous measures of health?

a condition is considered as a continuum with arbitrary cut-offs implemented to ensure it is binary

  • psychopathy makes it difficult to place a practical threshold, as the original axis exists on a continuum.

  • binary = yes/no

  • continuous = symptoms might exist on a continuum

4
New cards

benefits & disadvantages of self-reported measures of health

cheap and easy to obtain

  • able to obtain info on multiple different aspects of health

  • difficult to interpret when understanding aetiology

  • potential for bias (either random/differential)

5
New cards

what are some examples of measures that are normally used as outcomes in health research?

mental health disorders/symptoms

  • self-rated health (e.g very good - very bad scale)

  • health records

  • vital status

6
New cards

benefits and disadvantages of objective measures of health?

e.g blood pressure, BMI

  • can be more accurate than self-reported equivalents

  • Hawthorne effect: people change their behaviour to produce more positive results when they know an objective measure is being taken

7
New cards

what are some examples of derived/calculated measures?

life expectancy (year of birth, sex, age-specific death rates)

  • healthy life expectancy (data on health conditions/disability)

8
New cards

what’s the difference between accuracy and precision?

accuracy: how close to the true value the measurement is

precision: how reproducible/consistent the measurement is

9
New cards

features of using health records as a form of measurement

May only be possible to match a proportion of the population

  • May only capture those visible to the health service

  • Typically, only measure binary disease states

10
New cards

how has the model surrounding disability changed over the years?

1800s - medical model (management aimed at curing): disability thought to be caused directly by disease

1970s - social model (depends on social environment of person), frames it as a socially created problem, complex collection of conditions

Now - acknowledged combination/interaction, evolving definition

11
New cards

definitions of disability

UK Equality Act 2010: disabled if you have a physical or mental impairment that has a ‘substantial’ and ‘long-term’ negative effect on your ability to do normal daily activities

  • WHO: difficulties in any of 3 areas of functioning (impairments, activity limitations, participation restrictions) due to physical or mental health conditions.

12
New cards

definition of sensitivity (binary trait)

how good a test is at finding something if it is there - probability that the diseased person is correctly identified as diseased
- e.g the % of the results that will be positive when HIV is present

13
New cards

definition of specificity (binary trait)

accuracy against false positives - probability that non-diseased correctly identified as non-diseased

  • e.g the % of the results that will be negative when HIV is not present
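As a rough sketch of the two definitions above, sensitivity and specificity can be computed from the four cells of a test-versus-truth table. The function name and the counts are made up for illustration and are not from the course material.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from counts in a 2x2 table.

    tp: diseased, test positive       fn: diseased, test negative
    tn: non-diseased, test negative   fp: non-diseased, test positive
    """
    sensitivity = tp / (tp + fn)  # P(test positive | diseased)
    specificity = tn / (tn + fp)  # P(test negative | non-diseased)
    return sensitivity, specificity


# Hypothetical counts, not real data
sens, spec = sensitivity_specificity(tp=90, fn=10, tn=950, fp=50)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Note that each measure only uses one column of the truth: sensitivity ignores the non-diseased, specificity ignores the diseased.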

14
New cards

what is medicalisation?

the process by which problems traditionally considered nonmedical come to be defined/treated as medical issues

  • expansion of medical professional’s influence and authority into the domains of everyday existence

  • identifying a personal or social condition as a medical issue and subjecting it to medical intervention

15
New cards

what are the underlying factors driving the increase in medicalisation?

appropriate: greater awareness of mental health problems, improved detection by health practitioners

inappropriate: influence of pharma industry to increase profit, medicalisation of problems previously dealt with by non-medical means

16
New cards

what is screening?

identification of unrecognised disease or defect by rapid tests

17
New cards

what are the key considerations when deciding whether screening should be implemented?

sensitivity and specificity of the test

  • potential benefits include: earlier detection of disease, reduced ill health and consequent burden

  • potential risks: false positives

18
New cards

definition of surveillance

systematic and continuous collection, analysis and interpretation of data, closely integrated with the timely and coherent dissemination of results to those who take action.

19
New cards

what is surveillance used for?

use learning to improve public health actions

  • assess distribution, identify determinants and application of knowledge

  • estimating magnitude of a problem, determining geographic distribution, detecting epidemics, stimulating research and evaluating control measures

20
New cards

what are some things to consider when looking at surveillance data?

population size has large impacts on raw data and their presentation

  • some people are not at risk for developing a new onset due to pre-existing infection

  • denominator important to consider (size of total population at risk needs to be known)

21
New cards

prevalence definition

existing state and number of cases (both new and old cases)

  • often expressed as % (n.o of diseased/total n.o)

22
New cards

features of prevalence

different types exist: lifetime, period (e.g in 1 year), point (at a single point in time)

  • how many people are cured/treated or died can affect prevalence (often no longer accounted for within the prevalence pool)

  • more useful for chronic disease planning (e.g resource allocation)

23
New cards

incidence definition

number of new cases in a specific period of time, usually expressed per 100, 1,000 or 100,000 persons

24
New cards

features of incidence

more useful than prevalence when accounting for acute conditions as onset and resolution of disease is often very short

  • more useful for understanding risk and aetiology

25
New cards

cumulative incidence calculation

n.o of new cases/total at risk during period of time

26
New cards

how to calculate incidence rate + what’s person-time

n.o of new cases/ total at risk person-time

  • person-time: when diseased or lost to follow-up, no longer at risk so not counted

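  • total at-risk person-time: sum of the time each person stayed at risk (equals population size x duration of study only when everyone stays at risk for the whole study)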
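A minimal sketch of the two incidence calculations, assuming each person contributes person-time only until they become a case or are lost to follow-up. The cohort below is invented purely to show the arithmetic.

```python
def cumulative_incidence(new_cases, at_risk_at_start):
    """New cases divided by the number at risk at the start of the period."""
    return new_cases / at_risk_at_start


def incidence_rate(follow_up):
    """follow_up: list of (years_at_risk, became_case) tuples, one per person."""
    person_time = sum(years for years, _ in follow_up)
    new_cases = sum(1 for _, became_case in follow_up if became_case)
    return new_cases / person_time


# Hypothetical cohort of 5 people in a 4-year study
cohort = [
    (4.0, False),  # at risk for the whole study
    (2.0, True),   # developed the disease after 2 years
    (4.0, False),
    (1.0, False),  # lost to follow-up after 1 year
    (3.0, True),   # developed the disease after 3 years
]

rate = incidence_rate(cohort)  # 2 cases over 14 person-years
risk = cumulative_incidence(new_cases=2, at_risk_at_start=5)
print(f"incidence rate = {rate * 1000:.1f} per 1,000 person-years")
print(f"cumulative incidence = {risk:.0%}")
```

Using population size x duration here (5 x 4 = 20 person-years) would understate the rate, because three people stopped being at risk early.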

27
New cards

how are incidence and prevalence related?

high incidence, low prevalence (e.g highly contagious infectious disease with very short duration - colds)

  • low incidence, high prevalence (e.g diseases with long duration & survival - arthritis, diabetes)

28
New cards

what is standardisation?

a technique to remove effect of confounding variables when making comparisons

  • uses a reference population to standardise the sample to

  • helps to reduce/correct age-related differences amongst the groups

  • applies to both incidence and prevalence (can relate to all demographic characteristics)

29
New cards

why does standardisation matter?

without it, comparing crude rates between 2 populations with different age structures is misleading

  • e.g an older population will always appear to have higher mortality simply because age is a risk factor
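One common way to do this is direct age standardisation: apply each population's age-specific rates to the same reference population. The sketch below assumes that method; the two populations, the two age bands and the reference population are all made-up numbers.

```python
def crude_rate(deaths, population):
    """Total deaths divided by total population, ignoring age structure."""
    return sum(deaths) / sum(population)


def directly_standardised_rate(deaths, population, reference_population):
    """Weight each age-specific rate by the reference population's age structure."""
    age_specific = [d / n for d, n in zip(deaths, population)]
    weighted = sum(r * w for r, w in zip(age_specific, reference_population))
    return weighted / sum(reference_population)


# Age bands: [younger, older]. Both populations have identical age-specific
# death rates (0.001 and 0.02) but opposite age structures.
pop_a, deaths_a = [8000, 2000], [8, 40]    # mostly younger
pop_b, deaths_b = [2000, 8000], [2, 160]   # mostly older
reference = [5000, 5000]                   # hypothetical reference population

crude_a, crude_b = crude_rate(deaths_a, pop_a), crude_rate(deaths_b, pop_b)
std_a = directly_standardised_rate(deaths_a, pop_a, reference)
std_b = directly_standardised_rate(deaths_b, pop_b, reference)
print(f"crude: A = {crude_a:.4f}, B = {crude_b:.4f}")
print(f"standardised: A = {std_a:.4f}, B = {std_b:.4f}")
```

The crude rates differ more than threefold, yet the standardised rates match, because the only real difference between the two populations is age structure.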

30
New cards

what is sampling?

the selection of participants from a population

  • target population: the group of interest of your study

  • aim of sampling is to generalise our results back to the whole population

31
New cards

difference between probability and non-probability sampling

randomly selected participants vs convenience selection of people

32
New cards

features of probability sampling

can be simple through equal probability or complex through strata weighting

  • stratified sampling can over-sample certain sub-groups to ensure the inclusion of enough individuals who would otherwise be under-represented

  • tend to be more reliably representative of the population

  • Can “weight” the analysis to recover representativeness - provided the assumptions we make hold, results can be generalised to the whole population

33
New cards

features of non-probability sampling

good for getting quick results, might not be able to confidently generalise back to the entire population

  • often done within a smaller sample

34
New cards

why would we choose a bigger sample size?

Can be more confident about the conclusions we make when we notice differences

  • greater statistical power, precision and easier to detect true findings

  • can do “power analysis” - determined by N, effect size and statistical certainty
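To make the link between N, effect size and statistical certainty concrete, here is a sketch of one standard normal-approximation formula for the sample size needed to compare two proportions. The formula choice, function name and example proportions are assumptions for illustration, not something specified in the notes.

```python
from math import ceil
from statistics import NormalDist


def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Approximate participants needed per group to detect p1 vs p2 (two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # statistical certainty
    z_beta = NormalDist().inv_cdf(power)           # desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    effect = p1 - p2                               # effect size
    return ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


n_small_effect = sample_size_two_proportions(0.10, 0.15)
n_large_effect = sample_size_two_proportions(0.10, 0.30)
n_more_power = sample_size_two_proportions(0.10, 0.15, power=0.90)
print(n_small_effect, n_large_effect, n_more_power)
```

Smaller effects and higher power both push the required N up, which is why small studies struggle to detect modest differences.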

35
New cards

why do we not choose a smaller sample size?

A small sample gives imprecise estimates, so differences cannot be characterised with confidence - if an effect is found, is it robust/trustworthy? If no effect is found, was that due to low power?

36
New cards

what is generalisability?

“external validity” - can the findings of this study then be generalised to the wider population?

37
New cards

what are some factors that need to be considered when generalising findings to the general population?

Representativeness of general population - WEIRD populations

  • detailed knowledge of causal processes & factors which could modify risk

  • characteristics of source and population (e.g socioeconomic determinants)

  • most research populations not representative of global populations, findings may not generalise across cultures, socioeconomic groups or geographies

38
New cards

what are the differences between observational and experimental study designs?

observational (cross-sectional surveys, cohort, case-control studies) and experimental (RCT, quasi-experiments)

  • Changes are made within experimental studies to determine the effect on the participants, whereas in observational studies no change is made to the population.

  • Quasi-experimental - looking for natural opportunities where the researcher can implement changes

39
New cards

what’s the definition of a cross-sectional study?

Study of health and potential determinant as measured at one point in time, exposure and health measured at same time 

40
New cards

benefits and disadvantages of using cross-sectional studies?

  • Cheap and easy to carry out, if only one cross-section is being taken - good to see if you want to demonstrate basic associations

  • can’t rule out reverse causation (can’t prove if the factor preceded the disease), nor can you demonstrate temporality between two factors = causal inference challenging

  • Usually provides prevalence but not incidence - can’t distinguish risk factors for occurrence of disease (incidence) from risk factors for survival with the disease (e.g can’t tell you if a specific factor caused the disease or simply helped the person stay alive once they had it)

41
New cards

how would you carry out a cohort study?

Measure everyone at a baseline timepoint, then follow everyone up at a later time to measure the health outcome

42
New cards

how would you define a cohort?

  •  group of individuals with shared characteristic - birth cohort/occupational cohort that are followed up over time creating a longitudinal approach and can be prospective/retrospective. 

43
New cards

how would you conduct a cohort study?

1st time - exposure and confounders measured - variables need to be isolated to see the impact of the exposure itself on the health outcome

2nd time - health outcome measured

44
New cards

benefits and limitations of cohort study

important to demonstrate temporality and to show that the variable came before the effect

  • loss to follow-up: if those who drop out differ systematically from those who stay = biased results

45
New cards

how would you conduct a case-control study?

Retrospective study that identifies individuals with a specific outcome (cases) and similar individuals without it (controls) to compare their prior exposure to risk factors

  • Study moves backward in time, from effect (disease) to cause (exposure).

  • Researchers compare the frequency of exposure in the cases to the frequency of exposure in the controls.

46
New cards

benefits and limitations of case-control study

  • comparatively quick & easy to conduct, useful in particular when disease is rare

  • choosing comparison (control) group is difficult - hard to match (so findings could be confounded, even after matching or adjustment), concerns about accuracy of recalling past events (exposures)

47
New cards

what’s a kappa statistic?

evaluates the agreement between 2 raters classifying items into categories, corrected for chance; ranges from -1 to 1: a score of 1 indicates perfect agreement, 0 indicates agreement no better than chance, and negative values indicate agreement worse than chance.
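A short sketch of Cohen's kappa for two raters, computed as (observed agreement - chance agreement) / (1 - chance agreement). The ratings are invented for illustration.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters over the same items."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    # Agreement expected by chance, from each rater's marginal proportions
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical yes/no classifications of 10 items
rater_a = ["yes"] * 6 + ["no"] * 4
rater_b = ["yes"] * 5 + ["no"] * 4 + ["yes"]

kappa = cohens_kappa(rater_a, rater_b)
print(f"kappa = {kappa:.2f}")
```

Here the raters agree on 8 of 10 items (80%), but because 52% agreement would be expected by chance alone, kappa is noticeably lower than 0.8.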

48
New cards

what is an ecological study?

Unit of observation is a group of people, not individuals (e.g country/city)

  • e.g., instead of looking at whether Person A smokes and has lung cancer, looks at whether Country A has high smoking rates and lung cancer rates compared to Country B.

49
New cards

benefits and disadvantages of ecological study designs?

  • Data is often readily/cheaply available

  • Causal inference difficult given confounding and the ecological fallacy (drawing individual-level inferences from grouped data).

  • cannot link exposure to outcome at the individual level

50
New cards

what is an experimental study design?

Compare treatment with placebo or other treatment which can be randomised (unblinded, single or double-blind) 

51
New cards

benefits and disadvantages of experimental study designs?

  • Potentially robust in terms of causal inference since confounding can be minimised (randomisation)

  • Concerns about: ethics, practicality, wider generalisability (who takes part voluntarily in trials?), usually limited to short-term follow-up

52
New cards

benefits and disadvantages of study reviews?

Narrative - broad but possibly biased, with the main concern being that the author might cherry-pick studies for inclusion and so bias the conclusion

  • Systematic: usually narrower, less bias and with optional meta-analysis

  •  Combines results, where they can be combined/compared leading to precise estimates (tight confidence interval) but this could still be non-causal = misleading

  • Useful to test heterogeneity

53
New cards

definition of risk

probability that something will occur, and probabilities can range from 0 to 1, or converted to %

  • The closer to 1 = greater risk 

54
New cards

how would you compare risks?

  • Subtract to get risk difference  (risk in exposed - risk in unexposed)

  • Divide to get risk ratio (risk in exposed/ risk in unexposed)

55
New cards

how do you interpret the results in risks?

Ratios > 1.0 indicate rate is higher among exposed than unexposed,

  • =1.0 indicate no association,

  • <1.0 = rate is lower among exposed than unexposed 
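The subtraction and division described in these cards can be sketched as follows. The counts are hypothetical.

```python
def risk(cases, total):
    """Probability of the outcome in a group."""
    return cases / total


# Hypothetical groups: 30 of 100 exposed and 10 of 100 unexposed develop the disease
risk_exposed = risk(30, 100)
risk_unexposed = risk(10, 100)

risk_difference = risk_exposed - risk_unexposed  # about 0.20 (20 extra cases per 100)
risk_ratio = risk_exposed / risk_unexposed       # about 3.0 (risk is 3 times higher)

print(f"risk difference = {risk_difference:.2f}")
print(f"risk ratio = {risk_ratio:.1f}")
```

A ratio above 1.0 here indicates higher risk among the exposed, matching the interpretation rules above.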

56
New cards

what is an odds ratio?

a way to compare the odds of an outcome happening in one group vs another

  • odds = number of “events” divided by the number of “non-events”

  • E.g if 1 person is sick and 4 are healthy = odds are 1:4

57
New cards

what do the different values mean in the odds ratio?

OR= 1.0, exposure does not affect the odds of the outcome,

  • OR> 1.0, exposure is associated with higher odds of the outcome (a risk factor)

  • OR<1.0, exposure is associated with lower odds of the outcome (protective factor)

58
New cards

how to calculate odds ratio

example:

  1. Calculate odds of exposure among those with ADHD: (300/500) divided by (1 - 300/500) = 1.5

  2. Calculate odds of exposure among those without ADHD: (503/1000) divided by (1 - 503/1000) = 1.012

  3. Odds ratio in the case-control study: 1.5/1.012 = 1.48
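The same calculation can be checked in a few lines, using the numbers from the card (300 of 500 people with ADHD exposed, 503 of 1000 without ADHD exposed). The helper name is an illustrative choice.

```python
def odds(probability):
    """Convert a probability into odds: p / (1 - p)."""
    return probability / (1 - probability)


odds_cases = odds(300 / 500)      # odds of exposure among those with ADHD
odds_controls = odds(503 / 1000)  # odds of exposure among those without ADHD
odds_ratio = odds_cases / odds_controls

print(f"odds (cases) = {odds_cases:.3f}")
print(f"odds (controls) = {odds_controls:.3f}")
print(f"odds ratio = {odds_ratio:.2f}")
```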

59
New cards

when is the risk or odds ratio used?

prospective studies - risk/rate ratios used

case control studies - odds ratio (since total population in each group is not known)

60
New cards

what is the difference in utilising difference and ratio measures within studies?

Difference measures quantify the potential direct public health benefit of an intervention. 

  • Ratio measures provide an intuitive summary of the magnitude of differences in 2 exposures (tells us the strength and direction of a relationship - helps to provide a sense of proportion)

61
New cards

what is a confidence interval?

Indicates where we are 95% certain that the true population measure (e.g risk ratio, or other measure of effect, or prevalence estimate) is likely to be - not completely certain since the data come from a sample, which is not the whole population

62
New cards

how are difference/ratio measures used in relation to confidence intervals?

  • Difference measure: does the confidence interval contain 0? if so, it’s not statistically significant (the data are compatible with no difference between the groups)

  • Ratio measure: does the confidence interval contain 1? if so, it’s not statistically significant (the data are compatible with no association)
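A sketch of how the "does the interval contain 1?" check works for a risk ratio. It assumes the usual large-sample interval calculated on the log scale, which is one common method rather than the only one; the 2x2 counts are hypothetical.

```python
from math import exp, log, sqrt


def risk_ratio_ci(a, b, c, d, z=1.96):
    """Risk ratio with an approximate 95% CI.

    a: exposed cases      b: exposed non-cases
    c: unexposed cases    d: unexposed non-cases
    """
    rr = (a / (a + b)) / (c / (c + d))
    se_log_rr = sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = exp(log(rr) - z * se_log_rr)
    upper = exp(log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical studies with 100 exposed and 100 unexposed people each
rr1, lo1, hi1 = risk_ratio_ci(30, 70, 10, 90)  # strong association
rr2, lo2, hi2 = risk_ratio_ci(12, 88, 10, 90)  # weak association

for rr, lo, hi in [(rr1, lo1, hi1), (rr2, lo2, hi2)]:
    verdict = "not significant" if lo <= 1 <= hi else "significant"
    print(f"RR = {rr:.2f}, 95% CI {lo:.2f} to {hi:.2f}: {verdict}")
```

The first interval lies entirely above 1, while the second spans 1, so only the first would be called statistically significant.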

63
New cards

what is the population attributable risk proportion?

measure of the proportion of the total disease burden associated with exposure

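  • PARP = [(a+c)/(a+b+c+d) - c/(c+d)] divided by (a+c)/(a+b+c+d), where a = exposed with disease, b = exposed without disease, c = unexposed with disease, d = unexposed without disease

  • (risk in total population - risk in unexposed) / risk in total population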

64
New cards

what is the attributable proportion?

(risk for exposed group - risk for unexposed group) / risk for exposed group x 100
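A sketch contrasting the two attributable measures, assuming the standard textbook definitions: the attributable proportion among the exposed compares exposed with unexposed risk, while the population version compares the risk in the whole population with the unexposed risk. The 2x2 counts are hypothetical.

```python
def attributable_proportion_exposed(risk_exposed, risk_unexposed):
    """Percentage of disease among the exposed that is attributable to the exposure."""
    return (risk_exposed - risk_unexposed) / risk_exposed * 100


def population_attributable_risk_proportion(risk_total, risk_unexposed):
    """Percentage of disease in the whole population attributable to the exposure."""
    return (risk_total - risk_unexposed) / risk_total * 100


# Hypothetical 2x2 table
a, b = 30, 70  # exposed: with disease, without disease
c, d = 10, 90  # unexposed: with disease, without disease

risk_exposed = a / (a + b)
risk_unexposed = c / (c + d)
risk_total = (a + c) / (a + b + c + d)

ap = attributable_proportion_exposed(risk_exposed, risk_unexposed)
parp = population_attributable_risk_proportion(risk_total, risk_unexposed)
print(f"attributable proportion among exposed = {ap:.1f}%")
print(f"population attributable risk proportion = {parp:.1f}%")
```

The population figure is smaller because only half of this hypothetical population is exposed; it shrinks further as exposure becomes rarer.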

65
New cards

how can you interpret linear regression?

outcome - continuous (e.g BMI), exposure - continuous or categorical

66
New cards

difference between continuous and categorical exposure?

  • Continuous exposure: mean difference in outcome per 1 unit increase in exposure

  • Categorical exposure: mean difference in outcome in group 1 compared with group 0 (e.g men vs women)
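To illustrate both interpretations, the sketch below fits a straight line by ordinary least squares, first with a continuous exposure and then with a 0/1 exposure. The data are invented and the helper is a bare-bones fit with no adjustment for other variables.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    covariance = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    variance = sum((xi - x_bar) ** 2 for xi in x)
    slope = covariance / variance
    return slope, y_bar - slope * x_bar


# Continuous exposure (hypothetical): slope = mean change in outcome per 1 unit
units = [1, 2, 3, 4]
outcome = [10, 12, 14, 16]
slope_cont, _ = fit_line(units, outcome)

# Categorical exposure coded 0/1 (hypothetical): slope = difference in group means
group = [0, 0, 0, 1, 1, 1]
bmi = [22, 24, 26, 27, 29, 31]
slope_cat, intercept_cat = fit_line(group, bmi)

print(f"per-unit difference = {slope_cont:.1f}")
print(f"group 1 vs group 0 difference = {slope_cat:.1f} (group 0 mean = {intercept_cat:.1f})")
```

With a 0/1 exposure the intercept is the mean outcome in group 0 and the slope is the mean difference between the groups.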

67
New cards

what is the difference between deterministic and probabilistic?

  • Deterministic:   occurrences are causally determined by preceding events or natural laws

  • Probabilistic: of, relating to, or based on probability - considering multiple component causes, but often have not identified all of the possible causal components

68
New cards

explain the concept of counterfactuals in relation to disease causation

Counterfactuals can only be conjectured, never observed directly

  • can never observe the same person in both exposed and unexposed state simultaneously, reliance on comparison groups as proxies

  • by only observing 1 version of reality, we have to then make an inference from the data that is available with this imperfection in mind.

69
New cards

how can we understand disease causation?

Causes can be shared and for each individual struggling with a condition, no single exposure is sufficient by itself. 

  • People can accumulate “causes” across life (at one point and/or through slow accumulation)

    • E.g tobacco smoke in utero, chronic poverty, cigarette smoking starts in adolescence

    • Disease manifestation takes time, and understanding aetiology across life can inform when to intervene (more cost-effective to intervene earlier in life)

70
New cards

what are the impacts of understanding component causes?

identifying shared component causes (e.g unhealthy food environment, community violence, unhealthy social norms around substance use) can drive targeted health policies amongst the population

  • Different component causes can cause different health outcomes = impact of random chance events. 

  • Different people may develop the same disease through entirely different sufficient cause combinations, which explains individual variation in who gets diseased.

71
New cards

what is the difference between sufficient and necessary component causes?

  • Sufficient:  set of different factors which result in disease: multiple sets

  • Some component causes may not be sufficient by themselves and need to act together with other causes

  • Necessary:  if all cases of disease require the cause (e.g alcoholism - alcohol consumption necessary) 

72
New cards

what is bias?

a mistaken estimation of the true effect of the exposure on the outcome

  • bias can push estimates towards or away from the null value, making real effects look smaller or larger than they actually are.

73
New cards

what is reverse causality?

Outcome causes the exposure

  • a pertinent issue in cross-sectional studies, less likely in longitudinal studies

74
New cards

what are some considerations of bias that need to be considered?

No straightforward way to identify/account for, need to be able to consider if it is likely and interpret accordingly

  • (causality can be in both directions - bi directionality of association)

75
New cards

what is the difference between a confounder and mediation factor when thinking about causal inference?

  • To understand if 1 exposure causes disease or ill health, need to try to rule out other potential causes (confounding)

  • Interested in mechanism or pathway (mediation) - helps scientific understanding, can lead to identification of new targets for intervention

76
New cards

what is a confounder?

bias of estimated effect due to common cause of exposure and outcome

  • can influence both the exposure and outcome at the same time

  • Common to present unadjusted and confounder-adjusted results

77
New cards

difference between unadjusted and confounder-adjusted results

unadjusted: looking at raw data

adjusted: using math to “cancel out” the effect of the confounder (3rd party variable)

78
New cards

what does it mean when unadjusted and confounder-adjusted results are similar or different?

similar: suggests little bias due to confounding - unadjusted & adjusted numbers are almost the same

  • 3rd party factor didn’t really matter with original finding remaining quite solid. 

different: suggests confounding bias - the adjusted estimate may attenuate toward the null value

  • if the link gets weaker after adjustment, the 3rd party variable was doing some or all of the work, and the original “link” may be partly or wholly an illusion caused by confounding.
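One way to see unadjusted and adjusted results side by side is to stratify by the confounder and pool the strata. The sketch below uses the Mantel-Haenszel risk ratio for the pooling, which is a swapped-in illustrative technique (regression adjustment is another common option). The data are constructed so that age fully explains the crude association.

```python
def risk_ratio(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Unadjusted (crude) risk ratio."""
    return (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)


def mantel_haenszel_risk_ratio(strata):
    """Pooled risk ratio across confounder strata.

    strata: list of (exposed_cases, exposed_total, unexposed_cases, unexposed_total)
    """
    numerator = 0.0
    denominator = 0.0
    for a, n1, c, n0 in strata:
        n = n1 + n0
        numerator += a * n0 / n
        denominator += c * n1 / n
    return numerator / denominator


# Hypothetical data: within each age group, risk is identical for exposed and
# unexposed, but older people are both more often exposed and at higher risk.
strata = [
    (10, 100, 40, 400),   # younger: risk 0.10 in both groups
    (160, 400, 40, 100),  # older: risk 0.40 in both groups
]

crude = risk_ratio(
    sum(s[0] for s in strata), sum(s[1] for s in strata),
    sum(s[2] for s in strata), sum(s[3] for s in strata),
)
adjusted = mantel_haenszel_risk_ratio(strata)
print(f"unadjusted RR = {crude:.3f}, age-adjusted RR = {adjusted:.3f}")
```

The crude ratio suggests the exposure doubles the risk, but it attenuates to the null value of 1.0 once age is accounted for.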

79
New cards

what are some limitations of confounders that need to be considered?

  • often confounders measured poorly, bias due to confounding can remain when adjusted (residual) or remain as the confounders are unobserved

  • Hard to measure all the confounders that may be important in an epidemiological study - results can be confounded by unobserved confounders (factors that you don’t think of, or factors that are impossible to measure - e.g genetic predispositions or exact stress levels)

  • Negative confounding can also happen, masking a true effect which makes a harmful exposure look harmless or even protective.

80
New cards

how can we assess the importance of confounding bias in studies?

Confounding bias can lead to incorrect conclusions and costly consequences

  • Comparisons can help us understand importance of confounding bias

  • Comparing unadjusted and confounder-adjusted estimates of association

  • Comparing results from different study designs (e.g observational vs experimental studies)

  • Comparing results from contexts with different confounding structures

81
New cards

what is a mediator?

variable that is part of the causal pathway from exposure to outcome

  • explains why the exposure influences the outcome through the mediator

  • Common in epidemiology to see unadjusted and mediator-adjusted results 

82
New cards

how can results be explained by a mediator?

  • Results similar - suggests not explained by mediator

  • Results differ - suggests explained by mediator

  • Can’t distinguish mediation and confounding statistically - need knowledge of topic 

83
New cards

how can measurement impact bias?

measurement can be imperfect; the error can be random (non-differential) or non-random (differential), and both can bias results

  • With random error the association is often weakened (attenuated) (e.g risk ratio of 5 when exposure is measured perfectly vs 3 when exposure is measured with more random error)

  • Non-random (differential) measurement error can result in either an over- or underestimate of the true result

  • Missing data on the exposure or the outcome can lead to bias and also reduces statistical power

84
New cards

definition of AI

computational systems that can perform cognitive tasks (e.g writing and coding)

85
New cards

what does an LLM do?

processes + generates language 

  • Trained on internet content (series of words through a neural network - input, series of words - output) 

  • Model predicts next word/ ‘token’, can split up words into different parts - surprisingly effective (e.g writing/summarising text, giving code)

  • Input token -> some algorithm -> output token 

86
New cards

definition of agents

LLMs which execute tasks

  • need to undergo prompt engineering

87
New cards

how can we improve prompt engineering?

  • be clear/direct, use examples, give a persona; adding imperatives also affects the performance of the LLM

88
New cards

how is AI performing against cognitive benchmarks?

  • benchmark saturation is a concern and that’s why we need to create new benchmarks that are not available on the internet so AI can’t be trained on them. 

  • Need to make more challenging benchmarks for AI - not super useful if AI can get high scores on benchmarks. 

89
New cards

what are some of the benefits of using AI?

  • AI can now do cognitive tasks, including tasks that would otherwise take hours, in just minutes

  • 99% reduction in costs for using LLMs over time - cost of intelligence is trending towards 0, with big implications in fields where cognitive tasks are prevalent (most white collar jobs)

  • Powerful but intelligence described as “jagged” - great at some questions/domains  but terrible at others (e.g good at writing but terrible at memory storage - need to start a new conversation thread etc.) 

90
New cards

what is agi?

artificial general intelligence

  • AI that can do all/most human cognitive work as well as/better than humans

  • Levels include: No AI - AI as a tool - AI as a Consultant - AI as a Collaborator - AI as an Expert - AI as an Agent

91
New cards

How can AI be used within health research?

  • Natural language processing in health is a significant application (e.g extracting structured information from clinical notes, electronic health records or patient-reported outcomes at scale, which would otherwise require enormous human resources)

  • Other examples in health research: AI in medical imaging diagnostics (e.g detecting cancers from scans), drug discovery acceleration (e.g AlphaFold for protein structure prediction) or AI-assisted clinical trial design

92
New cards

what is gpt?

General Purpose Technology  - can be used for anything and does not need to take away human autonomy when utilising these tools for research

  • Could be used very much for admin tasks

  • Emerging evidence shows that AI can be quite good at summarising abstracts of papers, which could improve efficiency within systematic reviews

93
New cards

what are the pitfalls of utilising ai?

hallucinations - could be improved through training/optimisation, can modify temperature setting (lower value = more deterministic)

  • sycophancy (tend to agree with you and runs with errors in prompts)

  • inconsistency (ask the same question twice, might get different answers)

  • bias (may recapitulate biases in training data)

  • reproducibility issues/reliance on closed source tech (generally can’t send sensitive data via the cloud - e.g regulation of patient data by GDPR)
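The temperature point can be illustrated with a sketch of how temperature is commonly applied when turning a model's next-token scores into probabilities: scores are divided by the temperature before a softmax. The three scores below are invented and do not come from any real model.

```python
from math import exp


def softmax_with_temperature(scores, temperature):
    """Convert raw next-token scores into probabilities at a given temperature."""
    scaled = [s / temperature for s in scores]
    largest = max(scaled)  # subtract the max for numerical stability
    weights = [exp(s - largest) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical scores for three candidate next tokens
scores = [2.0, 1.0, 0.5]

default_probs = softmax_with_temperature(scores, temperature=1.0)
low_temp_probs = softmax_with_temperature(scores, temperature=0.2)

print("temperature 1.0:", [round(p, 3) for p in default_probs])
print("temperature 0.2:", [round(p, 3) for p in low_temp_probs])
```

At the lower temperature almost all of the probability sits on the top-scoring token, so repeated runs are more likely to give the same answer.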

94
New cards

what are the ethical concerns surrounding the use of ai?

ethical concerns surrounding training data - are they exploiting human activity?

  • Did we/others consent? Will revenue be shared with authors/artists?

  • AI slop: increase in low quality scientific publications - overwhelming an already stretched publication system

95
New cards

what are the downfalls of using AI within academia?

  • less need to think = mental atrophy (AI tutoring? - effects on the higher education sector, unclear)

  • over-reliance for statistical analysis or literature synthesis could erode researchers’ ability to critically evaluate methods or spot errors = can we ensure methodological rigour? 

96
New cards

how is ai currently being used within academia ?

  • Increases access to millions of papers and reduces hallucinations

  • Bandwidth freed for higher level tasks for humans

  • Can help to create more ambitious reviews (e.g across disciplines/designs), continual, more informed papers 

  • AI’s creativity is empirically testable but if flawed/limited - still useful (instant & cheap)

  • AI can help unlock historic (new) data; with already-collected data, manual variable work can be AI-assisted, so that one research question can be addressed across all studies.

97
New cards

definition of epidemiology

  •  scientific study of distribution, pattern and causes of health and disease

98
New cards

definition of social epidemiology

  • branch of epidemiology interested in how social structures and institutions impact health and disease risk in a population

99
New cards

definition of socioeconomic position

umbrella term, that captures lots of different measures of social standing, and how advantaged or disadvantaged individuals are based on their social or economic circumstances

  • E.g income, education, wealth, housing tenure, occupational class 

100
New cards

what are some arguments around SEP?

Class has traditionally been defined by occupation, wealth and education. But research argues that this is too simplistic, suggesting that class has 3 dimensions (economic, social and cultural).

  • SEP is multidimensional and manifests across life; measures of SEP typically overlap despite potential for independent effects on health