Predictive policing is a broadly defined term, and for many people, one that evokes a distinct feeling of unease. Movies like “Minority Report” and recent stories about how Palantir is secretly operating in New Orleans do little to ease those fears.
In an effort to improve discourse on this topic, the team here at PredPol has decided to tell you a little about how we do predictive policing and our philosophy around it. We hope you come away with a better understanding of PredPol’s goal to use data to make cities safer, while at the same time preserving the civil rights and privacy of the residents of those cities.
PredPol has a pretty narrow definition of predictive policing. For us and our customers, it is the practice of identifying the locations where specific crimes are most likely to occur, then patrolling those areas to prevent those crimes from occurring. Put simply, it’s about reducing victimization.
Reducing victimization is a hollow victory, however, if it means people have to give up their civil rights or privacy protections along the way. That’s why we have consciously worked to ensure that the data and algorithms we use are as objective and transparent as possible.
Let’s start with the source of our data. In the US, we use only records management systems (RMS) incident data from our law enforcement partners. RMS incident data is collected and managed according to guidelines prepared by the US Department of Justice. The records in the RMS are based on incident reports filed by a patrol officer, which are then reviewed and vetted by a supervisor before being entered in the system. It is the most accurate and objective data to which we have access.
The kind of data we use is also important. We make our predictions based on victimization information, i.e. crimes that have been reported to police. This information is anonymized; no personally identifiable information is ever collected or used. The only data items we collect and use are:
- Incident type (residential burglary, robbery, assault, etc.)
- Incident location (address or latitude/longitude)
- Incident date and time (or a time range if the exact time is not known)
- Case ID, docket number, or other unique identifier
Notice that we’re not collecting anything else about the victim, and nothing at all about an alleged perpetrator. We do not use demographic, economic, or any kind of potential “profiling” information about the location or neighborhood where the incident took place. The what-where-when information from the RMS is all our model needs. We have found these three data points to be the most effective predictors of future crime activity in time and space. Not only do we believe that using personally identifiable information is inappropriate, we also stand behind our research showing that it is *more* effective to exclude it. Companies that use personal information are not only perpetuating concerns in communities, they are also building less effective models for reducing victimization.
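To make the what-where-when idea concrete, here is a minimal sketch of how an incident record could be reduced to just those fields. It is an illustration only, not PredPol’s actual code: the class name, the function name, and the field names are all hypothetical, and it assumes the location arrives as latitude/longitude and the times as ISO 8601 strings.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class IncidentRecord:
    """Hypothetical what-where-when record; field names are assumptions."""
    case_id: str                          # unique identifier for the incident
    incident_type: str                    # what: e.g. "residential burglary"
    latitude: float                       # where
    longitude: float                      # where
    start_time: datetime                  # when (or start of a time range)
    end_time: Optional[datetime] = None   # set only when the exact time is unknown


def to_incident_record(raw: Mapping[str, Any]) -> IncidentRecord:
    """Copy only the what-where-when fields; any other key in `raw` is ignored."""
    end = raw.get("end_time")
    return IncidentRecord(
        case_id=str(raw["case_id"]),
        incident_type=str(raw["incident_type"]),
        latitude=float(raw["latitude"]),
        longitude=float(raw["longitude"]),
        start_time=datetime.fromisoformat(raw["start_time"]),
        end_time=datetime.fromisoformat(end) if end else None,
    )
```

Because the record type has no slot for a name, demographic attribute, or any other personal detail, such fields cannot travel downstream even if they are present in the raw export.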
Given the concerns some people have expressed around the idea of using algorithms to direct public policy, we’ve also decided to publish the algorithm we use. In fact, we did this years ago in a peer-reviewed article in the Journal of the American Statistical Association. You can read it here:
It’s worthwhile talking about algorithms themselves for a moment. Algorithms are simply the instructions used (generally by a computer) to process information and arrive at an answer. Essentially, any information that is processed by a computer is following the steps described by an algorithm.
So what’s the value of using an algorithm to recommend patrol locations for police officers? Well, whether we recognize it or not, we are all subject to underlying biases and habits in how we see the world. These biases can influence how we perceive and understand things in ways we don’t even recognize. This concept of “cognitive bias” was introduced by Amos Tversky and Daniel Kahneman; Kahneman later won the Nobel Prize in Economics for his work on this topic.
Cognitive bias works against good policing in a couple of ways. First of all, the perception that officers are unfairly targeting certain neighborhoods can justifiably undermine trust between the community and police. It’s also an inefficient use of patrol resources. You want your officers to be patrolling the areas where crimes are most likely to occur – using an objective set of transparent criteria – rather than where they think crimes are going to occur. You want officers developing personal and community relationships with traditionally underserved and disadvantaged populations, not patrolling an area based on perception alone. We have found that our model does not simply guide patrol operations into areas that officers (and even community leaders) may perceive as having the most crime, but instead objectively helps guide patrol operations to the areas where victimization is most likely to occur at that time.
We know that no model is perfect and we know that there is no magic wand that can eliminate victimization. But we’re proud of the work we do here at PredPol. Our goal from the beginning has been to help law enforcement agencies reduce rates of victimization by identifying and patrolling those areas where victimization is most likely to occur. At the same time, we believe that protecting the privacy and civil rights of the residents of our communities is as important as protecting them from crime. The data we use, the technologies we apply, and the transparency we embrace all support this mission. We have found that our work has reduced victimization in the communities where it has been deployed, and we are proud of the fact that we seek to be a community partner. It’s important to us that we continue to be as transparent as possible, and we appreciate your taking the time to learn a little bit more about what we do.