Decision making with limited feedback: Error bounds for predictive policing and recidivism prediction

Danielle Ensign, Sorelle A. Friedler, Scott Neville, Carlos Scheidegger, Suresh Venkatasubramanian

Research output: Contribution to journal › Conference article › peer-review

5 Scopus citations


When models are trained for deployment in decision-making in various real-world settings, they are typically trained in batch mode. Historical data is used to train and validate the models prior to deployment. However, in many settings, feedback changes the nature of the training process. Either the learner does not get full feedback on its actions, or the decisions made by the trained model influence what future training data it will see. In this paper, we focus on the problems of recidivism prediction and predictive policing. We present the first algorithms with provable regret for these problems, by showing that both problems (and others like these) can be abstracted into a general reinforcement learning framework called partial monitoring. We also discuss the policy implications of these solutions.

Original language: English (US)
Pages (from-to): 359-367
Number of pages: 9
Journal: Proceedings of Machine Learning Research
State: Published - 2018
Event: 29th International Conference on Algorithmic Learning Theory, ALT 2018 - Lanzarote, Spain
Duration: Apr 7 2018 to Apr 9 2018


Keywords

  • Online learning
  • Partial monitoring
  • Predictive policing
  • Recidivism prediction

ASJC Scopus subject areas

  • Artificial Intelligence
  • Software
  • Control and Systems Engineering
  • Statistics and Probability

