Predictive policing

Hotspots

Crime tends to concentrate at places, so we find those places and direct policing there. A very straightforward, intervention-oriented approach.

Crime concentration

Andresen, M. A., Linning, S. J., & Malleson, N. (2017). Crime at Places and Spatial Concentrations: Exploring the Spatial Stability of Property Crime in Vancouver BC, 2003-2013. Journal of Quantitative Criminology, 33(2), 255–275. doi:10.1007/s10940-016-9295-8

The core hypothesis, that crime is concentrated at small places, tends to come from statistics like these: “Property crime in Vancouver is highly concentrated in a small percentage of street segments and intersections, as few as 5% of street segments and intersections in 2013 depending on the crime type”.

However, 5% is less impressive when you realize there were 18,445 street segments and intersections on which crime could occur, and only 1700 or so burglaries in a given year, so even a completely uniform spread of burglaries could touch at most 9% or so of the map.
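
To make that arithmetic concrete, here is a quick sketch. It takes the two counts quoted above and assumes, purely for illustration, that each crime lands on a place chosen uniformly at random and independently of the others; the function name is mine, not the paper's.

```python
def expected_fraction_hit(n_crimes: int, n_places: int) -> float:
    """Expected share of places with at least one crime when each crime
    lands on a place chosen uniformly at random, independently."""
    p_untouched = (1 - 1 / n_places) ** n_crimes
    return 1 - p_untouched


n_places = 18_445  # street segments and intersections (from the text)
n_crimes = 1_700   # approximate burglaries per year (from the text)

# Hard ceiling: every crime on a different place.
upper_bound = n_crimes / n_places
# Random spread: a few crimes share a place by chance, so slightly lower.
random_spread = expected_fraction_hit(n_crimes, n_places)

print(f"upper bound: {upper_bound:.1%}, random spread: {random_spread:.1%}")
```

Both numbers come out near 9%, so a headline figure of 5% has to be judged against that baseline rather than against 100%.
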
Weisburd, D., Bushway, S., Lum, C., & Yang, S.-M. (2004). Trajectories of crime at places: A longitudinal study of street segments in the city of Seattle. Criminology, 42(2), 283–322. doi:10.1111/j.1745-9125.2004.tb00521.x

The classic source. This shows an interesting trajectory analysis over several years, and the fundamental crime concentration claim comes from 29,849 street segments and around 100,000 crimes per year, 50% of which fall in maybe 5% of the segments. This is interesting, but doesn’t establish whether crime is more concentrated than we’d expect from simple population density and mapping reasons (e.g. some street segments never experience crime because they’re interstate on-ramps or small access roads).
Hipp, J. R., & Kim, Y.-A. (2016). Measuring Crime Concentration Across Cities of Varying Sizes: Complications Based on the Spatial and Temporal Scale Employed. Journal of Quantitative Criminology, 1–38. doi:10.1007/s10940-016-9328-3

Discusses the issue of random variation causing concentration and the possibility of measuring concentration relative to the concentration we’d expect just from a uniform spread of crime across the map. Instead of proposing a measure which does so, however, they propose metrics which try to avoid upward-biased estimates of concentration: for example, pick the top cells from last year and see what fraction of this year’s crime falls in them, which tries to smooth out random variation in concentration. They claim, by linear regression against the number of crimes and the number of possible locations for crime, that this measure accounts for most of the concentration we expect from having few crimes in a large city, but I don’t find this terribly convincing.
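
A minimal sketch of the "last year's top cells" idea, as I read it. The function names, the 5% default, and the toy counts are my own assumptions, not the paper's notation or data.

```python
def holdout_concentration(last_year, this_year, top_fraction=0.05):
    """Share of this year's crime that falls in the cells ranked highest
    by last year's counts."""
    if len(last_year) != len(this_year):
        raise ValueError("both years must cover the same cells")
    n_top = max(1, round(top_fraction * len(last_year)))
    ranked = sorted(range(len(last_year)),
                    key=lambda i: last_year[i], reverse=True)
    total = sum(this_year)
    if total == 0:
        return 0.0
    return sum(this_year[i] for i in ranked[:n_top]) / total


def naive_concentration(counts, top_fraction=0.05):
    """Same-year version: rank and score on the same counts, which is
    biased upward when crimes are sparse relative to cells."""
    return holdout_concentration(counts, counts, top_fraction)


# Toy counts for ten cells in two consecutive years (made up).
last_year = [5, 0, 1, 0, 0, 2, 0, 0, 0, 0]
this_year = [3, 1, 0, 0, 2, 0, 0, 0, 0, 0]
print(holdout_concentration(last_year, this_year, top_fraction=0.2))
print(naive_concentration(this_year, top_fraction=0.2))
```

In the toy data the same-year figure is higher than the holdout figure, which is the direction of bias the holdout version is meant to reduce.
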
Mohler, G. O., Brantingham, P. J., Carter, J., & Short, M. B. (2019). Reducing bias in estimates for the law of crime concentration. Journal of Quantitative Criminology. doi:10.1007/s10940-019-09404-1

Points out the problem of having fewer crimes than locations, and suggests a method that assumes each location’s crimes occur from a Poisson distribution whose mean is drawn from a Gamma distribution (making the crime counts negative binomial overall). Crime concentration can be read from the shape of the Gamma, which can be estimated from the data.
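
A rough sketch of how concentration can be read off a Gamma shape. This is not the paper's estimator: I use a method-of-moments fit as a stand-in and a Monte Carlo draw to turn the shape into a "share of crime in the top 5% of places" figure. All names and the toy counts are my own assumptions.

```python
import random
import statistics


def gamma_shape_from_counts(counts):
    """Method-of-moments shape k for negative binomial counts, using
    Var = mean + mean**2 / k, so k = mean**2 / (Var - mean)."""
    mean = statistics.fmean(counts)
    var = statistics.pvariance(counts)
    if var <= mean:
        raise ValueError("no overdispersion: counts look Poisson or tighter")
    return mean ** 2 / (var - mean)


def share_in_top(shape, top_fraction=0.05, n_draws=100_000, seed=0):
    """Monte Carlo share of expected crime in the top `top_fraction` of
    places when place-level rates follow Gamma(shape). The Gamma scale
    cancels in the ratio, so it is fixed at 1."""
    rng = random.Random(seed)
    rates = sorted((rng.gammavariate(shape, 1.0) for _ in range(n_draws)),
                   reverse=True)
    n_top = max(1, int(top_fraction * n_draws))
    return sum(rates[:n_top]) / sum(rates)


# Toy counts (made up): 90 places with no crime, 10 places with 10 each.
counts = [0] * 90 + [10] * 10
k = gamma_shape_from_counts(counts)
print(k, share_in_top(k))
```

A smaller shape means a more skewed distribution of place-level rates and hence more concentration. The point is that the shape describes the underlying rates, not the noisy counts.
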

Finding hotspots

Usually clustering methods or kernel densities: pick the areas with clusters or the highest crime density. There are conflicting results on what works best, but I don’t like the metrics anyway; the PAI (prediction accuracy index) and RRI (recapture rate index) don’t seem to measure useful quantities, particularly when you arbitrarily choose your threshold for defining “hotspot” and don’t compare across a range of thresholds, ROC-style.
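
For reference, the PAI is the share of crimes falling inside the flagged hotspots divided by the share of the study area they cover. The sketch below computes it over a range of thresholds rather than a single one. It assumes equal-area grid cells, and the function name and toy numbers are mine.

```python
def pai_curve(scores, counts, area_fractions):
    """For each area fraction, flag the highest-scoring cells as hotspots
    and return (area fraction used, hit rate, PAI), where the hit rate is
    the share of later crimes that fall in the flagged cells.
    Assumes every cell has the same area."""
    n_cells = len(scores)
    total = sum(counts)
    ranked = sorted(range(n_cells), key=lambda i: scores[i], reverse=True)
    rows = []
    for frac in area_fractions:
        n_hot = max(1, round(frac * n_cells))
        hit_rate = sum(counts[i] for i in ranked[:n_hot]) / total
        area_used = n_hot / n_cells
        rows.append((area_used, hit_rate, hit_rate / area_used))
    return rows


# Toy example (made up): predicted risk per cell, then crimes observed later.
scores = [0.9, 0.1, 0.5, 0.3]
counts = [6, 0, 3, 1]
for area, hit, pai in pai_curve(scores, counts, [0.25, 0.5, 1.0]):
    print(f"area {area:.0%}: hit rate {hit:.0%}, PAI {pai:.2f}")
```

In the toy data the PAI falls from about 2.4 to 1.8 to 1.0 as the flagged area grows, so two methods compared at different thresholds are not really being compared at all.
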

Chainey, S., Tompson, L., & Uhlig, S. (2008). The Utility of Hotspot Mapping for Predicting Spatial Patterns of Crime. Security Journal, 21(1-2), 4–28. doi:10.1057/palgrave.sj.8350066
Levine, N. (2008). The "Hottest" Part of a Hotspot: Comments on "The Utility of Hotspot Mapping for Predicting Spatial Patterns of Crime". Security Journal, 21(4), 295–302. doi:10.1057/sj.2008.5
Drawve, G. (2016). A Metric Comparison of Predictive Hot Spot Techniques and RTM. Justice Quarterly, 33(3), 369–397. doi:10.1080/07418825.2014.904393
Drawve, G., Moak, S. C., & Berthelot, E. R. (2016). Predictability of gun crimes: a comparison of hot spot and risk terrain modelling techniques. Policing and Society, 26(3), 312–331. doi:10.1080/10439463.2014.942851

For evaluation metrics:

Adepeju, M., Rosser, G., & Cheng, T. (2016). Novel evaluation metrics for sparse spatio-temporal point process hotspot predictions - a crime case study. International Journal of Geographical Information Science, 30(11), 2133–2154. doi:10.1080/13658816.2016.1159684

On top of PAI, adds measures of compactness of hotspots, their consistency from one time period to the next, and the difference in prediction between different methods.

Explaining hotspots

He, L., Páez, A., & Liu, D. (2016). Persistence of Crime Hot Spots: An Ordered Probit Analysis. Geographical Analysis, 1–20. doi:10.1111/gean.12107

Counts how frequently each block is a hotspot in a spatial scan statistic method, then tries to use covariates to predict this.

Experimental trials

Various experiments have tested whether directing patrols to hotspots reduces crime, with generally positive results.

Sherman, L. W., & Weisburd, D. (1995). General deterrent effects of police patrol in crime "hot spots": A randomized, controlled trial. Justice Quarterly, 12(4), 625–648. doi:10.1080/07418829500096221
Ratcliffe, J. H., Taniguchi, T., Groff, E. R., & Wood, J. D. (2011). The Philadelphia foot patrol experiment: A randomized controlled trial of police patrol effectiveness in violent crime hotspots. Criminology, 49(3), 795–831. doi:10.1111/j.1745-9125.2011.00240.x
Braga, A. A., Papachristos, A. V., & Hureau, D. M. (2014). The Effects of Hot Spots Policing on Crime: An Updated Systematic Review and Meta-Analysis. Justice Quarterly, 31(4), 633–663. doi:10.1080/07418825.2012.673632

But trying to solve community problems may be better than just saturation patrol:

Taylor, B., Koper, C. S., & Woods, D. J. (2011). A randomized controlled trial of different policing strategies at hot spots of violent crime. Journal of Experimental Criminology, 7(2), 149–181. doi:10.1007/s11292-010-9120-6
Groff, E. R., Ratcliffe, J. H., Haberman, C. P., Sorg, E. T., Joyce, N. M., & Taylor, R. B. (2014). Does what police do at hot spots matter? The Philadelphia policing tactics experiment. Criminology, 53(1), 23–53. doi:10.1111/1745-9125.12055

One curious trial finds increases in crime when hotspot patrols are predictable: [To read] Ariel, B., & Partridge, H. (2016). Predictable Policing: Measuring the Crime Control Benefits of Hotspots Policing at Bus Stops. Journal of Quantitative Criminology, 1–25. doi:10.1007/s10940-016-9312-y

Risk Terrain Modeling

A spatial technique to identify spatial features which lead to crime. Works by identifying risk factors (bars, foreclosures, schools, etc.), mapping these, and then seeing how well they predict crime.

The initial iteration just added up the number of risk factors, then used a logistic regression to predict presence or absence of crime: Kennedy, L. W., Caplan, J. M., & Piza, E. L. (2010). Risk Clusters, Hotspots, and Spatial Intelligence: Risk Terrain Modeling as an Algorithm for Police Resource Allocation Strategies. Journal of Quantitative Criminology, 27(3), 339–362. doi:10.1007/s10940-010-9126-2

Model selection was just “which logistic regression has the biggest slope”, which naturally biases it to the models with fewer risk factors, since their risk values have a smaller range (as just a count of present factors) and hence must have a larger slope. Variable selection used a bunch of univariate chi-squareds, and I’m dubious about using p values to decide which variable predicts best.

Then came an update which uses elastic net penalized regression to fit a Poisson model, picking the best penalty via cross-validation, then further reducing the model with stepwise regression and BIC. (Why not just adjust the penalty parameter for more sparsity?) Features were included as three binary variables for proximity (within 426, 852, or 1278 feet) and three different kernel densities (with those three bandwidths), for reasons I do not understand: Kennedy, L. W., Caplan, J. M., Piza, E. L., & Buccine-Schraeder, H. (2016). Vulnerability and Exposure to Crime: Applying Risk Terrain Modeling to the Study of Assault in Chicago. Applied Spatial Analysis and Policy, 9(4), 529–548. doi:10.1007/s12061-015-9165-z
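
To show what that feature construction looks like for one grid cell and one risk factor, here is a sketch. The three distances are the ones quoted above; the quartic kernel, the planar coordinates in feet, and every name in the code are assumptions of mine, since I don't know which kernel the paper used.

```python
import math

BANDWIDTHS_FT = (426, 852, 1278)  # distances quoted in the text


def rtm_features(cell, feature_points):
    """Six covariates for one grid cell: at each bandwidth, a 0/1
    indicator that some risk feature lies within that distance, plus a
    kernel density value. Coordinates are assumed to be planar, in feet."""
    dists = [math.dist(cell, p) for p in feature_points]
    row = {}
    for h in BANDWIDTHS_FT:
        row[f"within_{h}"] = int(any(d <= h for d in dists))
        # Quartic kernel is an assumption; the text does not name the kernel.
        row[f"density_{h}"] = sum((1 - (d / h) ** 2) ** 2
                                  for d in dists if d < h)
    return row


# One hypothetical bar 1000 ft east of the cell centre.
print(rtm_features((0.0, 0.0), [(1000.0, 0.0)]))
```

The six columns for a single risk factor are strongly correlated with one another by construction, which is part of why the choice puzzles me.
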

Other spatial methods

Andrew Palmer Wheeler, “Quantifying the Local and Spatial Effects of Alcohol Outlets on Crime”, https://ssrn.com/abstract=2869198.

Uses negative binomial regression with a bunch of covariates to estimate the effect of alcohol outlets on various crimes, including burglary, which has a similar relationship to the other variables, somewhat surprisingly. Finds that different kinds of alcohol outlets have statistically indistinguishable effects on crime, suggesting it’s not just drunk people from bars causing problems but the increased traffic to liquor stores and shops as well.
Xu, J., & Griffiths, E. (2016). Shooting on the street: Measuring the spatial influence of physical features on gun violence in a bounded street network. Journal of Quantitative Criminology, 33(2), 237–253. doi:10.1007/s10940-016-9292-y

Evaluates the connection between crimes and spatial point features, like bus stops, by a cross K function, sort of a continuous spatial generalization of a Knox test. Advocates measuring distance in terms of road network shortest path distance, rather than Euclidean distance. Uses the K functions to estimate the distance of influence of each feature.

Doesn’t use the K functions to test for near repeats (see section below), as time isn’t included in the analysis.

I suspect these results are biased by not accounting for self-excitation at all. The null hypothesis used in the K function plots is complete spatial randomness of points, which nobody seriously believes to be true, so it would be rejected even if the spatial point features were completely unrelated to the crime pattern. I’d need to see simulations showing what happens to the K function when the spatial distributions of events are clustered but independent.

Near repeats

Crimes tend to be followed by nearby crimes, e.g. from a burglar returning to an area to try a new target.

Counting repeats

A bunch of papers use the Knox test, a permutation test that compares the number of pairs of crimes that are close in both space and time with the number expected under a permutation null. It requires a discrete choice of cutoffs for “nearby”, so claims about the distances of effects are really claims about the power of the test. (If significance is only found within 200m, would it be found at 300m if we had more data?) Implemented in the widely used Near Repeat Calculator.
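
A bare-bones sketch of the test with a single space cutoff and a single time cutoff, permuting the event times over the fixed locations. This is my own minimal version, not the Near Repeat Calculator's implementation; the function name, defaults, and the add-one p-value are assumptions.

```python
import math
import random


def knox_test(points, times, d_max, t_max, n_perm=199, seed=0):
    """Return (observed close-pair count, permutation p-value).
    A pair is 'close' if it is within d_max in space and t_max in time."""
    n = len(points)
    # Spatially close pairs do not change when the times are shuffled,
    # so find them once.
    close = [(i, j) for i in range(n) for j in range(i + 1, n)
             if math.dist(points[i], points[j]) <= d_max]

    def statistic(ts):
        return sum(1 for i, j in close if abs(ts[i] - ts[j]) <= t_max)

    observed = statistic(times)
    rng = random.Random(seed)
    shuffled = list(times)
    at_least = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any link between place and time
        if statistic(shuffled) >= observed:
            at_least += 1
    # Add-one correction so the p-value is never exactly zero.
    return observed, (1 + at_least) / (1 + n_perm)
```

Calling it with points as (x, y) tuples in metres and times in days, e.g. `knox_test(points, times, d_max=200, t_max=14)`, gives one p-value for one pair of cutoffs. Running it over a grid of cutoffs, as the papers do, is where the power caveat above bites.
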

Townsley, M., Homel, R., & Chaseling, J. (2003). Infectious burglaries: A test of the near repeat hypothesis. British Journal of Criminology, 43(3), 615–633. doi:10.1093/bjc/43.3.615
Comparing distances with significant near-repeat effects across crimes (though perhaps not meaningful for power reasons, as counts varied from 3,000 to 12,000 across crime types): Youstin, T. J., Nobles, M. R., Ward, J. T., & Cook, C. L. (2011). Assessing the Generalizability of the Near Repeat Phenomenon. Criminal Justice and Behavior, 38(10), 1042–1063. doi:10.1177/0093854811417551
Combining near-repeats into chains of crimes: Haberman, C. P., & Ratcliffe, J. H. (2012). The Predictive Policing Challenges of Near Repeat Armed Street Robberies. Policing, 6(2), 151–166. doi:10.1093/police/pas012
Chainey, S. P., & Silva, B. F. A. (2016). Examining the extent of repeat and near repeat victimisation of domestic burglaries in Belo Horizonte, Brazil. Crime Science, 5(1), 1–10. doi:10.1186/s40163-016-0049-6

Another approach models choice of houses to burgle with a multinomial logit, where the outcome is the choice of house: Ratcliffe, J. H., & Rengert, G. F. (2008). Near-Repeat Patterns in Philadelphia Shootings. Security Journal, 21(1-2), 58–76. doi:10.1057/palgrave.sj.8350068

K functions

Ripley’s K function provides a continuous analog of the Knox test statistic. It’s a normalized count of the average number of points within a given distance of an arbitrary event, so it’s a function of distance instead of having an arbitrary cutoff; a natural space-time generalization counts the average number within a given distance and a given time. Plotting these gives a sense of the scale and decay of near-repeat effects.
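
A naive sketch of both estimators, ignoring edge correction entirely (real analyses should not). The normalizations are the usual textbook ones as I understand them: under a homogeneous Poisson process the spatial K should be roughly pi * d**2, and the space-time version roughly 2 * pi * d**2 * t. Function names are mine.

```python
import math


def ripley_k(points, distances, area):
    """Naive K estimate with no edge correction: area / (n (n - 1)) times
    the number of ordered pairs of distinct points at most d apart,
    evaluated for each d in `distances`."""
    n = len(points)
    pair_dists = [math.dist(points[i], points[j])
                  for i in range(n) for j in range(i + 1, n)]
    scale = area / (n * (n - 1))
    # Each unordered pair counts twice, once for each ordered pair.
    return [scale * 2 * sum(1 for pd in pair_dists if pd <= d)
            for d in distances]


def space_time_k(points, times, d, t, area, duration):
    """Naive space-time analog: ordered pairs within distance d and
    within time t of each other, scaled by area * duration / (n (n - 1))."""
    n = len(points)
    pairs = sum(1 for i in range(n) for j in range(i + 1, n)
                if math.dist(points[i], points[j]) <= d
                and abs(times[i] - times[j]) <= t)
    return area * duration * 2 * pairs / (n * (n - 1))
```

Plotting `ripley_k` against `distances` next to pi * d**2 shows clustering as an excess, but without edge correction the estimate is biased downward at larger distances.
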

Used to compare before and after stop-and-frisk events: Wooditch, A., & Weisburd, D. (2016). Using Space-Time Analysis to Evaluate Criminal Justice Programs: An Application to Stop-Question-Frisk Practices. Journal of Quantitative Criminology, 32(2), 191–213. doi:10.1007/s10940-015-9259-4

Heterogeneity vs. state dependence

Burglaries are the most common crime studied, presumably because the theory is clear: burglars like returning to areas they’re familiar with. But this is easily confounded with spatial heterogeneity: some places are better to burgle than others, regardless of whether they were recently burgled. This seems connected to the state dependence vs. heterogeneity problem, Heckman, J. J. (1991). Identifying the hand of past: Distinguishing state dependence from heterogeneity. The American Economic Review, 81(2), 75–79. http://www.jstor.org/stable/2006829

Johnson, S. D. (2008). Repeat burglary victimisation: a tale of two theories. Journal of Experimental Criminology, 4(3), 215–240. doi:10.1007/s11292-008-9055-3

Via simulation, tries to show that heterogeneity can’t account for the entire observed effects, since it wouldn’t cause as many very rapid repeats. This doesn’t actually settle the issue: all it demonstrates is that these particular simulation models are distinguishable, not that all possible state-dependent or heterogeneous processes are distinguishable from each other.
Short, M. B., D’Orsogna, M. R., Brantingham, P. J., & Tita, G. E. (2009). Measuring and Modeling Repeat and Near-Repeat Burglary Effects. Journal of Quantitative Criminology, 25(3), 325–339. doi:10.1007/s10940-009-9068-8

Another attempt to disentangle the two effects; also claims they can be distinguished, since they lead to different distributions of inter-event times.
Ornstein, J. T., & Hammond, R. A. (2017). The burglary boost: A note on detecting contagion using the knox test. Journal of Quantitative Criminology, 33(1), 65–75. doi:10.1007/s10940-016-9281-1

Shows the problem also affects Knox tests, which confound contagion and spatial heterogeneity.

Interventions

Johnson, S. D., Davies, T., Murray, A., Ditta, P., Belur, J., & Bowers, K. (2017). Evaluation of operation swordfish: A near-repeat target-hardening strategy. Journal of Experimental Criminology. doi:10.1007/s11292-017-9301-7

Experimental intervention to reduce near-repeat burglaries by providing prevention information and tools (like light timers and neighborhood watch information) to victims of burglaries and their neighbors. Found only a small, marginally significant effect on crime rates, and a small increase in satisfaction with police.

Self-exciting point process models

Other prediction methods

Bao Wang, Duo Zhang, Duanhao Zhang, P. Jeffrey Brantingham, Andrea L. Bertozzi (2017). “Deep Learning for Real Time Crime Forecasting.” https://arxiv.org/abs/1707.03340

Because interpretability is for wimps.

Not compared to other methods for accuracy.

Weather

Crime is, naturally, affected by the weather.

LeBeau, J. L., & Corcoran, W. T. (1990). Changes in calls for police service with changes in routine activities and the arrival and passage of weather fronts. Journal of Quantitative Criminology, 6(3), 269–291. doi:10.1007/BF01065411
Field, S. (1992). The effect of temperature on crime. British Journal of Criminology, 32(3), 340–351.
Brunsdon, C., Corcoran, J., Higgs, G., & Ware, A. (2009). The influence of weather on local geographical patterns of police calls for service. Environment and Planning B: Planning and Design, 36(5), 906–926. doi:10.1068/b32133
Mares, D. (2013). Climate change and crime: monthly temperature and precipitation anomalies and crime rates in St. Louis, MO 1990-2009. Crime, Law and Social Change, 59(2), 185–208. doi:10.1007/s10611-013-9411-8

Predictive policing and the law

A series of papers on how predictive policing interacts with the Fourth Amendment:

First, it’s surprising to see that courts have already recognized an implied Fourth Amendment exception for “high-crime areas”, presence in which can contribute to a finding of reasonable suspicion for a stop and search: Ferguson, A. G. (2011). Crime Mapping and the Fourth Amendment: Redrawing "High-Crime Areas". Hastings Law Journal, 63(1), 179–232. http://www.hastingslawjournal.org/2014/04/03/crime-mapping-and-the-fourth-amendment-redrawing-high-crime-areas/

Next, more on the concerns caused by data and predictive policing being used to justify searches:

Ferguson, A. G. (2012). Predictive Policing and Reasonable Suspicion. Emory Law Journal, 62(2), 259–325. http://law.emory.edu/elj/content/volume-62/issue-2/articles/predicting-policing-and-reasonable-suspicion.html
Ferguson, A. G. (2015). Big Data and Predictive Reasonable Suspicion. University of Pennsylvania Law Review, 163(2), 327–410. https://ssrn.com/abstract=2394683
Ferguson, A. G. (2017). Policing Predictive Policing. Washington University Law Review, 94. https://ssrn.com/abstract=2765525

A skeptical review of predictive policing claims, and the gradual evolution from property crime to violent crime to person-based prediction. Discusses the various dangers, such as low-quality or biased data, and criticizes claims (like PredPol’s) of effectiveness made with minimal experimental evidence. I think the skepticism is well-deserved: hotspot policing based on secret algorithms and low-quality data gives me the heebie-jeebies.
Kelly K. Koss, “Leveraging predictive policing algorithms to restore fourth amendment protections in high-crime areas in a Post-Wardlow world”, 90 Chicago-Kent Law Review 301 (2015). http://scholarship.kentlaw.iit.edu/cklawreview/vol90/iss1/12/

Explores the results of the Supreme Court’s Illinois v. Wardlow decision that being in a “high-crime area” can be a factor in establishing reasonable suspicion for a Terry stop. Proposes putting the FBI in charge of national standards for crime analysis; I’m not sure the FBI has the statistical expertise to deal with rapidly evolving methods, or expertise dealing with ordinary street crime that most police agencies would target. Given the FBI’s role in distributing Stingray surveillance devices, I’m not sure they’re interested in training police on constitutional concerns of predictive policing either.

I’d like to see an approach like NIST’s Forensic Science Center of Excellence or previous National Academies reports on forensics: convene some outside experts, not FBI agents without quantitative backgrounds.
Brayne, S. (2017). Big data surveillance: The case of policing. American Sociological Review, 82(5), 977–1008. doi:10.1177/0003122417725865

Brayne embedded herself in the LAPD for several years, conducting interviews with officers and analysts to see how they put surveillance into practice. She finds some cases where data analysis simply launders what officers already do; for example, LAPD developed a points-based system for identifying high-risk people, and being stopped by an officer for a field interview counts for points, so bias in who is stopped quickly turns into official risk scores. Teaming with Palantir, LAPD swept in massive amounts of data from varied sources (license plate readers, foreclosures, vehicle registration, warrants, social media…), and automated search tools make it easy to find everyone matching certain criteria. This also encourages “system avoidance”, where those in trouble with the law systematically avoid using medical, financial, or other institutions that keep formal records that might be used to track them.