This repository contains the replication materials for the paper "To Predict and Serve?," published in volume 13, issue 5 of Significance and authored by Kristian Lum and William Isaac of the Human Rights Data Analysis Group (HRDAG).
Abstract: Police departments are increasingly relying on predictive policing tools to predict the locations of crime hotspots, with the aim of allocating scarce policing resources more efficiently. But virtually all police forecasting models are trained using police-recorded instances of crime, which are neither complete nor representative. Not all crimes are contained within the police records, and these records systematically over-represent certain demographic groups. As a result, predictions made on the basis of this data are vulnerable to these same biases, and police action directed by these predictions will disparately affect historically over-policed communities.
We demonstrate this effect by applying a recent predictive policing model to publicly available data on drug crimes in the city of Oakland from 2009 to 2011. We find that despite very small substantive differences in drug use within the city of Oakland, drug-related arrests are highly concentrated in areas with higher proportions of low-income and non-white residents. When the predictive policing model is applied to this data, additional targeted policing is directed primarily to the non-white and low-income neighborhoods, despite the fact that drug crimes in these areas are no more common than in more affluent, white neighborhoods.
This README file provides an overview of the files in the repository. The overview is divided into two sections. The Data section describes the datasets required for the analysis, as well as those generated as outputs of the analysis. Please note that to access the data files you will need to install Git Large File Storage (git-lfs). The Code section summarizes the purpose of each R or Python script.
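If the repository is cloned without git-lfs, the data files are typically checked out as small text "pointer" files rather than the real data, and running `git lfs install` followed by `git lfs pull` inside the clone should fetch the actual contents. The following is a minimal Python sketch for checking whether that has happened. The helper names `is_lfs_pointer` and `find_unfetched` are hypothetical and are not part of this repository, and the sketch assumes the data files use the `.csv` and `.rds` extensions.

```python
from pathlib import Path

# First line of a Git LFS pointer file, as defined by the Git LFS pointer spec.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if `path` is an un-fetched Git LFS pointer, not real data."""
    with open(path, "rb") as handle:
        head = handle.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


def find_unfetched(data_dir="."):
    """List .csv and .rds files under `data_dir` that are still LFS pointers.

    The extensions are an assumption based on the file names in this README.
    """
    root = Path(data_dir)
    candidates = list(root.rglob("*.csv")) + list(root.rglob("*.rds"))
    return sorted(p for p in candidates if is_lfs_pointer(p))


if __name__ == "__main__":
    for pointer in find_unfetched("."):
        print(f"Not yet fetched via git-lfs: {pointer}")
```

Run from the root of the clone, the script prints any data file that still needs to be fetched and prints nothing if all files contain real data.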
Do we need to explain our file system??
- drug_crimes_with_bins.csv -- Data collected by OpenOakland.
- oakland_outline.rds --
- oakland_grid_data.rds --