Many of the controversial uses of AI have been connected with policing and the criminal justice system. The use of facial recognition for video surveillance has been a major issue, both for its absolute impact on privacy and personal freedom, and for its potential for racial bias given differing accuracy across ethnic groups.
Another use has been various forms of predictive policing. For the police, this is often non-specific, using data analysis to identify high-risk areas for increased police presence rather than Minority Report-style identification of particular individuals and crimes. However, similar techniques have been used extensively in the US to help judges in parole decisions, where AI systems use individual demographic details and criminal records to predict the likelihood of recidivism, that is, whether a convicted felon is likely to reoffend if released early.
COMPAS is one such system. It was developed explicitly to address weaknesses identified by criminologists in earlier systems and to incorporate the best empirical methods [BD18].
However, despite these worthy aims, COMPAS has been the subject of particular scrutiny due to a ProPublica article that highlighted significant racial disparity in its recidivism risk scores [AL16,LM16]. The analysis focused on African-American vs white subjects and found that while the absolute accuracy was roughly similar for both groups (around 60%), the form of the errors differed markedly between them. Crucially, the article included the following table, which emphasised the disparity.
| | WHITE | AFRICAN AMERICAN |
|---|---|---|
| Labeled Higher Risk, But Didn't Re-Offend | 23.5% | 44.9% |
| Labeled Lower Risk, Yet Did Re-Offend | 47.7% | 28.0% |
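The two rows are the false positive and false negative rates for each group: among those who didn't re-offend, the share labeled higher risk; and among those who did re-offend, the share labeled lower risk. A minimal sketch of the calculation, using illustrative confusion-matrix counts chosen to reproduce the published rates rather than quoted from the ProPublica dataset:

```python
# A minimal sketch of how the table's numbers arise from a confusion matrix.
# The counts below are illustrative, chosen to reproduce the published rates;
# they are not quoted from the ProPublica dataset itself.

counts = {
    # group: {label_outcome: number of defendants}
    "white": {"high_no": 349, "high_yes": 505, "low_no": 1139, "low_yes": 461},
    "black": {"high_no": 805, "high_yes": 1369, "low_no": 990, "low_yes": 532},
}

for group, c in counts.items():
    # False positive rate: P(labeled higher risk | didn't re-offend)
    fpr = c["high_no"] / (c["high_no"] + c["low_no"])
    # False negative rate: P(labeled lower risk | did re-offend)
    fnr = c["low_yes"] / (c["low_yes"] + c["high_yes"])
    print(f"{group}: higher risk but didn't re-offend {fpr:.1%}, "
          f"lower risk yet did re-offend {fnr:.1%}")

# white: 23.5% / 47.7%; black: 44.8% / 28.0% (matching the table up to rounding)
```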
It should be noted that the COMPAS training and use did not include explicit markers for race, but several of the features used by the algorithm [BW09], such as poverty or living in a 'high crime neighborhood', are likely to correlate with race, and so act as proxy variables. Brennan, one of the developers of COMPAS, is reported as saying that "If those [correlated features] are omitted from your risk assessment, accuracy goes down" [AL16].
The article has been very influential in highlighting the issues of bias in these kinds of algorithms. However, it has also been subject to critique, leading to academic study [KM16,Ch17] and a follow-up ProPublica article [AL16b], as well as a piece in the Washington Post on the mathematics of fairness [CE16]. The key issues are that:
- there are multiple definitions of fairness, which are often fundamentally irreconcilable [KM16]; and
- there are different base rates for recidivism, that is, the proportion of people who reoffend varies across ethnic groups and between genders (see the sketch after this list).
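The interaction between these two issues can be made precise. Chouldechova [Ch17] shows that a classifier's error rates and predictive values are tied together by the base rate, so that groups with different base rates cannot be treated 'fairly' by every definition at once. A minimal sketch of that identity, with illustrative base rates and predictive values (assumptions chosen only to mimic the rough shape of the COMPAS figures):

```python
# Sketch of the trade-off studied in [KM16,Ch17]: for any classifier, the
# false positive rate (FPR), false negative rate (FNR), positive predictive
# value (PPV) and base rate p satisfy the identity
#     FPR = p/(1-p) * (1-PPV)/PPV * (1-FNR)
# so if two groups with different base rates share the same PPV and FNR,
# their FPRs are forced to differ.

def implied_fpr(p: float, ppv: float, fnr: float) -> float:
    """False positive rate forced by base rate p, given PPV and FNR."""
    return (p / (1 - p)) * ((1 - ppv) / ppv) * (1 - fnr)

# Illustrative assumptions, not the actual COMPAS figures: both groups get
# the same predictive value and miss rate, but re-offending base rates differ.
ppv, fnr = 0.6, 0.35
for group, base_rate in [("group A", 0.39), ("group B", 0.51)]:
    print(f"{group}: base rate {base_rate:.0%} -> implied FPR "
          f"{implied_fpr(base_rate, ppv, fnr):.1%}")

# group A: base rate 39% -> implied FPR 27.7%
# group B: base rate 51% -> implied FPR 45.1%
```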
A recent Medium article by Prathamesh Patalay [Pa23] has a very accessible explanation of the issues.
The discussions were not helped by the fact that the rows in the original table would have been better described as "Didn't Re-Offend, But Labeled Higher Risk" and "Did Re-Offend, Yet Labeled Lower Risk". The table corresponding to the labels as actually worded shows a very slight reverse bias! This conflict between fairness definitions is also evident in the gender analysis in ProPublica's detailed write-up, where the Female:Male statistics have a similar shape to the White:African-American ones, in large part due to the higher recidivism rate for men [LM16].
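Read literally, the original labels name probabilities conditioned on the label rather than on the outcome, that is P(didn't re-offend | labeled higher risk) and P(did re-offend | labeled lower risk). A sketch of that reversed calculation, reusing the same illustrative counts as before:

```python
# Sketch: reading the row labels literally conditions on the label rather
# than the outcome, giving the false discovery and false omission rates.
# Same illustrative counts as the earlier sketch.

counts = {
    "white": {"high_no": 349, "high_yes": 505, "low_no": 1139, "low_yes": 461},
    "black": {"high_no": 805, "high_yes": 1369, "low_no": 990, "low_yes": 532},
}

for group, c in counts.items():
    # P(didn't re-offend | labeled higher risk) -- false discovery rate
    didnt_given_high = c["high_no"] / (c["high_no"] + c["high_yes"])
    # P(did re-offend | labeled lower risk) -- false omission rate
    did_given_low = c["low_yes"] / (c["low_yes"] + c["low_no"])
    print(f"{group}: didn't re-offend but labeled higher {didnt_given_high:.1%}, "
          f"did re-offend yet labeled lower {did_given_low:.1%}")

# white: 40.9% / 28.8%; black: 37.0% / 35.0% -- the large gap disappears,
# and the first row slightly reverses direction
```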
However, this focus on the different mathematical fairness definitions can obscure deeper questions.
- The COMPAS algorithm training and ProPublica analysis are based on historic data showing actual recidivism. However, the value of this as a ‘gold standard’ depends on the fairness of past policing practices and court processes, which are themselves often suspected of bias.
- Why do the base rates for recidivism differ so much between gender and ethnicity? Correlated features such as poverty point to deeper structural racism outwith the criminal justice system. AI is a mirror reflecting underlying problems in society.
- COMPAS provides a series of detailed scores against different risk factors. Instead of using it to determine whether a prisoner is released early, it could instead be used to target help once they are released. Indeed, this appears to have been one of the original aims for the tool.
References
[AL16] Angwin, Julia, Jeff Larson, Surya Mattu, and Lauren Kirchner (2016). Machine Bias: There's software used across the country to predict future criminals. And it's biased against blacks. ProPublica (23 May 2016). https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing
[AL16b] Angwin, Julia, and Jeff Larson (2016). Bias in Criminal Risk Scores Is Mathematically Inevitable, Researchers Say. ProPublica (30 Dec. 2016). https://www.propublica.org/article/bias-in-criminal-risk-scores-is-mathematically-inevitable-researchers-say
[BW09] Brennan, Tim, William Dieterich, and Beate Ehret (2009). Evaluating the predictive validity of the COMPAS risk and needs assessment system. Criminal Justice and Behavior 36(1):21–40. https://doi.org/10.1177/0093854808326545
[BD18] Brennan, Tim, and William Dieterich (2018). Correctional Offender Management Profiles for Alternative Sanctions (COMPAS). Chapter 3 in Handbook of Recidivism Risk/Needs Assessment Tools, pp. 49–75. https://doi.org/10.1002/9781119184256.ch3
[Ch17] Chouldechova, Alexandra (2017). Fair prediction with disparate impact: A study of bias in recidivism prediction instruments. Big Data 5(2):153–163. https://doi.org/10.48550/arXiv.1610.07524
[CE16] Corbett-Davies, Sam, Emma Pierson, Avi Feller, and Sharad Goel (2016). A computer program used for bail and sentencing decisions was labeled biased against blacks. It's actually not that clear. Washington Post (17 Oct. 2016). https://www.washingtonpost.com/news/monkey-cage/wp/2016/10/17/can-an-algorithm-be-racist-our-analysis-is-more-cautious-than-propublicas/
[KM16] Kleinberg, Jon, Sendhil Mullainathan, and Manish Raghavan (2016). Inherent trade-offs in the fair determination of risk scores. arXiv preprint. https://doi.org/10.48550/arXiv.1609.05807
[LM16] Larson, Jeff, Surya Mattu, Lauren Kirchner, and Julia Angwin (2016). How We Analyzed the COMPAS Recidivism Algorithm. ProPublica (23 May 2016). https://www.propublica.org/article/how-we-analyzed-the-compas-recidivism-algorithm
[Pa23] Patalay, Prathamesh (2023). COMPAS: Unfair Algorithm? Medium (21 Nov. 2023). https://medium.com/@lamdaa/compas-unfair-algorithm-812702ed6a6a