My contribution to this paper on predicting life outcomes was a model that predicted the “material hardship” variable and performed better than the baseline model even though, and this was the point, the selected predictor variables made no theoretical sense: they had no plausible causal connection to the outcome, they just happened to correlate. (See also the idea of the “crud factor”: in behavioral research everything correlates with everything else.) Filiz Garip has a nice commentary on the paper here.
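To illustrate the point about prediction without explanation, here is a minimal sketch on synthetic data (not the paper's actual pipeline and not the Fragile Families data): predictors picked purely because they correlate with the outcome, none of them causal, can still beat a predict-the-mean baseline out of sample.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)

# Synthetic stand-in for survey data: a shared latent factor induces weak
# correlations among many variables, none of which causes the outcome.
n, p = 2000, 500
latent = rng.normal(size=(n, 1))
X = 0.3 * latent + rng.normal(size=(n, p))
y = 0.5 * latent[:, 0] + rng.normal(size=n)  # outcome driven only by the latent factor

# Train/holdout split
X_tr, X_te, y_tr, y_te = X[:1000], X[1000:], y[:1000], y[1000:]

# "Atheoretical" feature selection: keep the 20 predictors most correlated
# with the outcome in the training data, regardless of plausibility.
corrs = np.abs([np.corrcoef(X_tr[:, j], y_tr)[0, 1] for j in range(p)])
top = np.argsort(corrs)[-20:]

model = Ridge(alpha=1.0).fit(X_tr[:, top], y_tr)

# Baseline: predict the training-set mean for everyone.
baseline_pred = np.full_like(y_te, y_tr.mean())
print("baseline R2:", r2_score(y_te, baseline_pred))                      # about 0
print("correlation-picked model R2:",
      r2_score(y_te, model.predict(X_te[:, top])))                        # clearly above 0
```

The picked variables predict the outcome only because they share a common cause with it; nothing in the fitted model explains why anyone experiences hardship.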
In light of this, I offered some thoughts in response to an inquiry from AlgorithmWatch for a story on predictive policing in Switzerland (some quotes are included in the final article):
Q: Do the findings of your paper apply to the predictability of individual outcomes in the field of recidivism, or violence against women?
Yes. These outcomes are social and psychological, so their predictability is extremely limited, as the paper demonstrates. Note the difference between prediction and explanation: if you can explain something, you can predict it given the right data; but being able to predict something does not mean you can explain it.
The outcomes in the Fragile Families data analyzed in the paper included three binary variables, eviction, job training, and layoff, which are the same kind of question the examples above refer to (“will this person commit another crime, yes or no?”), and there is no reason to expect that predicting recidivism or violence against women would work any differently.
Q: Several pieces of software are in use in the police/justice sector in Switzerland (Precobs for burglaries, ROS for recidivism, Dyrias-Intimpartner for violence against women and others). Is it fair to say that the findings of your paper cast doubt on the assumptions that underlie such software (i.e. that individual outcomes are predictable)?
Yes: that individual outcomes or behavior are accurately predictable is highly doubtful; some aggregate measures may be predictable. So in general, something like Precobs, which highlights a neighborhood to look at, will tend to work better than an algorithm that claims to tell you whether a specific person will commit a crime (where false positives are very detrimental). But even the first kind cannot be carelessly employed.
There may be a sound mechanism derived from empirical (and thus retrospective) or simulation research, but once it is implemented in an algorithmic system, the output is wrapped in a “cloak of objectivity”: the context and qualifications of the initial model are lost in the process of making it “actionable” and subsequently go unquestioned.
Systems like Precobs can have a desirable deterrent effect in that they lead to fewer burglary attempts rather than more arrests (I am not sure whether this is empirically the case). However, I see an underappreciated risk here: systems based on algorithmic selection can also produce broader chilling effects, deterring people from exercising their rights and prompting them to modify their behavior. This is a form of anticipatory obedience: aware of the possibility of being (unjustly) “caught” by these algorithms, people may increase their conformity with perceived societal norms, and self-expression and alternative lifestyles are suppressed.
Overall, even if these systems were perfect, I see a broader problem in the search for technological solutions to structural social problems. For example, people with real alternatives will not resort to theft when in financial trouble, so rather than preventing theft on the spot, investments in reducing social inequities are warranted.