Risk Assessment Notes

Risk Assessment

What is Risk Assessment?

Risk assessment is prognostic, not diagnostic.
It assesses the likelihood of certain behaviors (e.g., criminal behavior, violence, sexual offending).
It's based on the presence or absence of relevant risk factors, and sometimes protective factors.
It's a combination of multiple risk factors; no single risk factor is determinative.
It may involve:
- Categorizing an individual (e.g., high-risk).
- Estimating the probability of an individual engaging in a particular behavior (e.g., 50% likelihood of reoffending).
A key feature of evidence-based risk assessment is UNCERTAINTY.

Purpose of Risk Assessment

Risk assessment has many possible uses in the criminal justice system:
- To guide decisions about sentencing and release.
- To assist with monitoring and risk/sentence management.
- To determine the need for, and required intensity of, treatment.
- To identify potential treatment targets and approaches to treatment.
- To inform allocation of resources.
Legislation can require risk assessment:
- Parole Act 2002 – Section 28: Direction for release on parole
  - “The Board may give a direction under subsection (1) only if it is satisfied on reasonable grounds that the offender, if released on parole, will not pose an undue risk to the safety of the community or any person or class of persons within the term of the sentence”
- Public Safety (Public Protection Orders) Act 2014 – Section 13: Court may make public protection order
  - Applies when there is a "very high risk of imminent serious sexual or violent offending".
  - Highest stakes in New Zealand.
Ethical questions exist about the use of risk assessment in some areas (e.g., sentencing).
- Should individuals be punished for potential future crimes or only for past offenses?
- Risk assessment may create or exacerbate biases within the system.
- Most defensible when used to inform decisions about treatment/rehabilitation, but ethical concerns still exist.

How Do We Assess Risk?

Option 1: Rely on (Professional) Judgement

Involves interview and file review.
Probability of offending/violence is “inferred from psychological structures and dynamics” (Hanson, 2009).
Typically grounded in the professional standing of the practitioner making the assessment.
TWO BIG PROBLEMS:
- Neither the risk factors nor the method of combining the risk factors are specified in advance.
  - Inherently unreliable.
- Weak predictive validity.
  - Statistical approaches are “clearly superior” in the prediction of violence (Ægisdóttir et al., 2006; Grove et al., 2000).

Option 2: Use “Actuarial” Risk Measures

Lists of empirically-validated risk factors compiled into numerical assessments of risk.
Mostly comprised of static/historical factors (e.g., age, criminal history).
Convenient and reliable to score.
Can often be scored solely through a file review (i.e., without meeting the individual being assessed).
Still requires some use of professional judgement.
STRENGTH: Significantly more accurate than professional judgement.
*Static risk factors can't change.
TWO BIG PROBLEMS:
- “Atheoretical”
  - With no underlying theory, they do not account for why someone offended or why they would reoffend.
- Incapable of measuring change.
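
To make "compiled into numerical assessments of risk" concrete, here is a minimal sketch of the mechanical step in an actuarial measure. The item names, weights, and score bands are invented for illustration and are NOT taken from any real instrument; the point is only that the factors and the rule for combining them are fixed in advance, so the same file always produces the same score.

```python
# Hypothetical static items and weights: NOT from any real risk instrument.
ITEM_WEIGHTS = {
    "under_25_at_release": 1,
    "three_or_more_prior_convictions": 2,
    "first_conviction_before_18": 1,
    "prior_imprisonment": 1,
}

# Hypothetical score bands (inclusive) mapped to category labels.
BANDS = [(0, 1, "low"), (2, 3, "moderate"), (4, 5, "high")]


def total_score(items: dict) -> int:
    """Sum the weights of every item marked present in the file review."""
    return sum(w for name, w in ITEM_WEIGHTS.items() if items.get(name, False))


def risk_band(score: int) -> str:
    """Look up the category for a total score."""
    for low, high, label in BANDS:
        if low <= score <= high:
            return label
    raise ValueError(f"score {score} is outside the defined bands")


# Example file review (hypothetical person).
file_review = {
    "under_25_at_release": True,
    "three_or_more_prior_convictions": True,
    "first_conviction_before_18": False,
    "prior_imprisonment": False,
}
score = total_score(file_review)
print(score, risk_band(score))
```

Because every item here is historical, the score cannot go down after treatment, which is the "incapable of measuring change" problem noted above.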

The RoC*RoI (Bakker et al., 1999)

Actuarial risk measure commonly used in New Zealand.
Computer-generated estimate of likelihood of being reconvicted AND sentenced to imprisonment within 5 years (e.g., 0.5 = 50%).
Algorithm includes age and range of criminal history variables (e.g., age at first conviction, number of prior convictions).
Calculated for all individuals who have been convicted and received a sentence administered by Ara Poutama|Department of Corrections.
Used to inform wide range of decisions within the correctional/criminal justice system.
Predicts recidivism with a high level of accuracy.
Higher scores = higher reimprisonment rates, so it does a good job of predicting.

Option 3: Use Actuarial Dynamic Risk Measures

Lists of empirically-validated risk factors compiled into numerical assessments of risk.
Measures are comprised of dynamic (i.e., changeable) risk factors.
Risk score can change over time.
Likely to require thorough file review and interviews.

Types of dynamic risk factor

Stable
- Change slowly/gradually.
- Related to longer-term risk.
- Target for long-term intervention.
- e.g., pro-violence attitudes, antisocial peers, alcoholism
Acute
- Change rapidly.
- Signal imminent risk.
- Target for short-term intervention.
- e.g., anger, intoxication, employment.

DRAOR

Dynamic Risk Assessment for Offender Re-entry (Serin et al., 2007).
Risk measure that probation officers in New Zealand use when meeting with an offender; items are scored 0, 1, or 2.
STRENGTHS:
- Incremental predictive validity over static measures (Olver & Wong, 2019).
- Can guide treatment and supervision.
LIMITATIONS:
- Less reliable and more time-consuming to score than static measures.

The Violence Risk Scale (Wong & Gordon, 2023)

Includes both static and dynamic factors.
The dynamic factors in the measure are detailed, so it can take hours to score.

Option 4: Use Protective Factors

Relatively new concept – developing area of research.
Came from the maltreatment and resilience literature; also called “promotive factors” or “strengths”.
Some measures include both risk and protective factors (e.g., DRAOR).
Some are standalone tools, to be used alongside risk assessment measures (e.g., SAPROF).
*Many people have been through horrible things but did not reoffend.
STRENGTHS:
- Incremental predictive validity over risk measures (e.g., Burghart et al., 2023).
- Can guide treatment and supervision.
- Could enhance client motivation.
LIMITATIONS:
- Relatively untested empirically.
- Conceptual questions (opposite of risk or something else?).

Option 5: Use Structured Professional Judgement

Non-numeric (non-algorithmic) decision-making process of risk estimation.
Lists of dynamic risk factors but not combined into an overall score.
Instead, form a “judgement” on risk level (e.g., low, moderate, high) based on number/relevance of factors.
Commonly used in practice (e.g., HCR-20).
STRENGTHS:
- More individualised approach.
- Reasonable evidence of predictive validity (Douglas & Shaffer, 2021).
LIMITATIONS:
- Judgement remains unreliable.
- “A regressive step” (Bonta & Wormith, 2013).

The Ten Commandments of Risk Assessment (Bonta & Andrews, 2023)

Use actuarial measures of risk
Risk assessments should demonstrate predictive validity
The assessment instruments should be directly relevant to the business of corrections
Use instruments derived from relevant theory
Assess dynamic risk factors
Use general personality and cognitive tests for the assessment of responsivity
Use multi-method assessment
Use multi-domain sampling
Exercise professional and ethical responsibility
Adhere to the least restrictive alternative

How Accurate is Risk Assessment?

The importance of accuracy:
- Overestimation can lead to infringement on human rights and liberties.
- Underestimation can lead to opportunities to continue offending and further victims/harm.
  *Reading: risk assessment measures are criterion-referenced rather than norm-referenced.
  *Norm-referenced measures compare people across a population, so how the items relate to each other matters; criterion-referenced measures predict a specific outcome, so it is better if the items/questions are unrelated, because each one then adds value.

Assessing Accuracy

True positive: Predicted would reoffend and did.
True negative: Predicted wouldn’t reoffend and didn’t.
False negative: Predicted wouldn’t reoffend but did (concerning one).
False positive: Predicted would reoffend but didn’t.
These are diagnostic accuracy criteria: they treat something as present or absent, which can then be tested against the true outcome. Risk assessment is predictive, and there is never a clear cut-off point on a risk assessment score.
Example:
- Men: 109 reconvicted, 345 not reconvicted.
- Women: 3 reconvicted, 59 not reconvicted.
- Predicting that all men will be reconvicted and all women will not does not work: the number of false positives is very high.
Recidivism rate for men = 24% (109/454) vs. rate for women = 4.8% (3/62).
Correctly predicted 97% (109/112) of those reconvicted BUT…
Only 32.6% (168/516) correct overall.
76% (345/454) of those predicted to be reconvicted were not (false positives)!
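
The arithmetic in this example can be checked with a short sketch. It uses only the four counts given above; the variable names are my own. Note that the 76% figure is the share of positive predictions that were wrong (345/454). The false positive rate as usually defined in diagnostic statistics, FP / (FP + TN), comes out higher for these counts.

```python
# Counts from the example above.
men_reconvicted, men_not_reconvicted = 109, 345
women_reconvicted, women_not_reconvicted = 3, 59

# Rule being tested: predict "reconvicted" for every man
# and "not reconvicted" for every woman.
tp = men_reconvicted          # predicted would reoffend and did
fp = men_not_reconvicted      # predicted would reoffend but didn't
fn = women_reconvicted        # predicted wouldn't reoffend but did
tn = women_not_reconvicted    # predicted wouldn't reoffend and didn't
total = tp + fp + fn + tn

results = {
    "recidivism rate, men": men_reconvicted / (men_reconvicted + men_not_reconvicted),
    "recidivism rate, women": women_reconvicted / (women_reconvicted + women_not_reconvicted),
    "reconvicted people correctly predicted": tp / (tp + fn),
    "correct overall": (tp + tn) / total,
    "positive predictions that were wrong": fp / (tp + fp),
    "false positive rate, FP / (FP + TN)": fp / (fp + tn),
}

for name, value in results.items():
    print(f"{name}: {value:.1%}")
```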

Area Under the Curve (AUC)

One useful measure for assessing accuracy.
Between 0 and 1.
Over 0.5 = better than chance.
Interpretation = the probability that a random recidivist will have a higher score than a random non-recidivist (Helmus & Babchishin, 2017).
An AUC of 0.5 is chance level (50%); anything above that is better than chance.
Tells us how well the risk measure orders people from higher to lower risk.
Structured actuarial risk measures show AUCs ~ .65 - .75 in general (Campbell et al., 2009; Hanson & Morton-Bourgon, 2009; Olver et al., 2014; Singh et al., 2011; Yang et al., 2010).
Accuracy differs depending on measure, sample, outcome, follow-up period, etc.
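
The interpretation of the AUC given above (the probability that a random recidivist scores higher than a random non-recidivist) can be computed directly by comparing every recidivist with every non-recidivist. This is a minimal sketch; the scores are invented for illustration and ties are counted as half, which is the usual convention.

```python
from itertools import product


def auc(recidivist_scores, non_recidivist_scores):
    """Share of recidivist/non-recidivist pairs in which the recidivist
    has the higher risk score (tied pairs count as half)."""
    pairs = list(product(recidivist_scores, non_recidivist_scores))
    wins = sum(1.0 if r > n else 0.5 if r == n else 0.0 for r, n in pairs)
    return wins / len(pairs)


# Hypothetical risk scores, invented for this example.
recidivists = [7, 5, 6, 3]
non_recidivists = [2, 4, 5, 1, 3]

print(auc(recidivists, non_recidivists))  # 17 of 20 pairs -> 0.85
```

The AUC depends only on how people are ordered, not on where any cut-off is placed, which is why it suits risk scores that have no clear cut-off point.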

Accuracy Across Groups

Major issue in risk assessment.
Recent large meta-analysis (Olver et al., 2024) focused on accuracy across ethnicity.
“Most measures demonstrated moderate predictive validity but often had significant ethno-racial differences, particularly for static measures”.
*See the reading for the difference between discrimination and calibration.
*There is an issue with how accurately measures predict reoffending for Indigenous vs non-Indigenous groups; another large meta-analysis has been done on this.
*Discrimination for Indigenous groups was moderate, but slightly worse than in the other groups.
*Calibration: among higher-risk individuals, results for Indigenous and non-Indigenous groups were about the same (moderate), but lower-risk individuals had higher recidivism, potentially due to bias (e.g., different treatment for lower-level offences).
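
A rough sketch of the difference between the two ideas: discrimination asks whether higher scores go with more reoffending (the AUC), while calibration asks whether the observed rate in each risk band matches the rate the tool leads us to expect. All counts and expected rates below are invented for illustration and are NOT data from Olver et al. (2024); the group and band names are placeholders.

```python
# Invented data. Each band: (expected rate, number assessed, number who reoffended).
GROUPS = {
    "group_a": {"low": (0.10, 100, 10), "high": (0.50, 100, 50)},
    "group_b": {"low": (0.10, 100, 20), "high": (0.50, 100, 50)},
}
BAND_ORDER = ["low", "high"]  # from lowest to highest risk


def calibration_gap(bands):
    """Observed minus expected recidivism rate in each band (0 = well calibrated)."""
    return {b: n_re / n - expected for b, (expected, n, n_re) in bands.items()}


def grouped_auc(bands):
    """AUC from banded counts; pairs in the same band count as ties (half)."""
    wins = 0.0
    non_recidivists_below = 0
    total_recidivists = total_non_recidivists = 0
    for band in BAND_ORDER:
        _, n, n_re = bands[band]
        n_non = n - n_re
        wins += n_re * (non_recidivists_below + 0.5 * n_non)
        non_recidivists_below += n_non
        total_recidivists += n_re
        total_non_recidivists += n_non
    return wins / (total_recidivists * total_non_recidivists)


for name, bands in GROUPS.items():
    gaps = {b: round(g, 2) for b, g in calibration_gap(bands).items()}
    print(name, "AUC:", round(grouped_auc(bands), 3), "calibration gaps:", gaps)
```

In this made-up example the tool still discriminates in both groups, but in the second group the low band reoffends at twice the expected rate, so the tool's estimates are miscalibrated for that band.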
Key question: what is the alternative?
- “Although there are disparities in prediction magnitudes between Indigenous and non-Indigenous samples, depending on the tool and outcome, generally speaking, we concur with Gutierrez et al. (2016) that “abandoning their use is not defensible, unless they are replaced with a method empirically demonstrated to have superior accuracy”” (Olver et al., 2024).
The alternative is unstructured judgement, which arguably leaves more room for bias.

How Do We Communicate Risk?

Different ways to communicate risk:
- “Mr Smith is high risk of engaging in further violent offending”.
- “Mr Smith has a 50% chance of violent reoffending after release”.
- “Mr Smith is three times as likely as the average offender to engage in further offending”.
  *1st: classification into categories (high, moderate, or low risk); the complication is that everyone interprets the categories differently.
  *2nd: giving a number is called an absolute risk estimate; it states the likelihood as a probability.
  *3rd: a relative risk estimate; it shows how someone compares to the average.
Consistent with research, people prefer the second option (concrete answer).
Different options have different strengths (Lehmann et al., 2016):
- Risk categories:
  - Useful for resource allocation.
  - Easy to calculate.
- Absolute recidivism estimates:
  - Very useful for threshold decisions.
  - Easy to interpret.
- Other relative risk metrics (e.g., ratios, percentiles):
  - Reliable across samples (Helmus et al., 2012).
  - Some (e.g., percentiles) are relatively easy to interpret.
Problems with all available options (Lehmann et al., 2016):
- Risk categories:
  - Different measures use different numbers of categories.
  - Same categories (e.g., high-risk) mean different things (Singh et al., 2014).
- Absolute recidivism estimates:
  - Accuracy is misleading (estimates are at group level, not individual level).
  - Estimates not consistent across samples (Helmus et al., 2012).
- Other relative risk metrics (e.g., ratios, percentiles):
  - Can be misinterpreted because most people are bad at fractions (Varela et al., 2014).
  - Limited without information about base rates.
Hanson et al.’s (2017) Common Risk Language: A way forward?
*Proposes a common risk language that can be applied to any type of offending.

Suggested Viewing

Dr Andrew Brankley
- Risk Assessment Tool Types (6:47)
  - https://www.youtube.com/watch?v=kp4kxG6n87w
- What is Risk: A Simple Answer (8:39)
  - https://www.youtube.com/watch?v=EKSN51tb4Lk
- Common Risk Language, Part I: Why do we need to standardize risk communication? (10:06)
  - https://www.youtube.com/watch?v=5DT6Juw0Epk&t=79s

Key References

Ægisdóttir, S., White, M. J., Spengler, P. M., Maugherman, A. S., Anderson, L. A., Cook, R. S., … & Rush, J. D. (2006). The meta-analysis of clinical judgment project: Fifty-six years of accumulated research on clinical versus statistical prediction. The Counseling Psychologist, 34(3), 341-382. https://doi.org/10.1177/0011000005285875
Bonta, J. & Andrews, D. (2023). The Psychology of Criminal Conduct (7th Edition). Taylor & Francis Group.
Grove, W. M., Zald, D. H., Lebow, B. S., Snitz, B. E., & Nelson, C. (2000). Clinical versus mechanical prediction: A meta-analysis. Psychological Assessment, 12(1), 19-30. https://doi.org/10.1037/1040-3590.12.1.19
Hanson, R. K. (2009). The psychological assessment of risk for crime and violence. Canadian Psychology/Psychologie canadienne, 50(3), 172. https://doi.org/10.1037/a0015726
Hanson, R. K., Bourgon, G., McGrath, R. J., Kroner, D., D’Amora, D. A., Thomas, S. S., & Tavarez, L. P. (2017). A five-level risk and needs system: Maximizing assessment results in corrections through the development of a common language. Council of State Governments Justice Center.
Helmus, L. M., & Babchishin, K. M. (2017). Primer on risk assessment and the statistics used to evaluate its accuracy. Criminal Justice and Behavior, 44(1), 8-25. https://doi.org/10.1177/0093854816678898
Monahan, J., & Skeem, J. L. (2016). Risk assessment in criminal sentencing. Annual Review of Clinical Psychology, 12, 489-513. https://doi.org/10.1146/annurev-clinpsy-021815-092945
Olver, M. E., Stockdale, K. C., Helmus, L. M., Woods, P., Termeer, J., & Prince, J. (2024). Too risky to use, or too risky not to? Lessons learned from over 30 years of research on forensic risk assessment with Indigenous persons. Psychological Bulletin. Advance online publication. https://doi.org/10.1037/bul0000414