Berk Ustun (Massachusetts Institute of Technology); Cynthia Rudin (Duke University)
Risk scores are simple classification models that let users quickly assess risk by adding, subtracting, and multiplying a few small numbers. Such models are widely used in healthcare and criminology, but are still built ad hoc. In this paper, we present a new approach to learn risk scores that are fully optimized for feature selection, integer coefficients, and operational constraints. We formulate the risk score problem as a mixed integer nonlinear program, and present a new cutting plane algorithm to efficiently recover its optimal solution. Our approach learns optimized risk scores in a way that scales linearly with the sample size of a dataset, provides a proof of optimality, and accommodates complex constraints without parameter tuning. We illustrate these benefits by building a customized risk score for ICU seizure prediction and by running an extensive set of numerical experiments.
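To make the notion of a risk score concrete, the following minimal Python sketch shows how such a model might be applied once it has been learned. The feature names, point values, and intercept are hypothetical placeholders and are not taken from the paper. The logistic mapping from score to risk is likewise an assumption made for illustration. The sketch covers only how a score is used, not the cutting plane algorithm that learns it.

```python
import math

# Hypothetical integer points per binary feature (illustrative only, not from the paper).
POINTS = {"feature_a": 2, "feature_b": 1, "feature_c": -1}
INTERCEPT = -3  # hypothetical integer offset


def total_score(patient):
    """Add the intercept to the points of each feature, weighted by its value (0 or 1)."""
    return INTERCEPT + sum(POINTS[name] * patient.get(name, 0) for name in POINTS)


def predicted_risk(score):
    """Map an integer score to a probability (assumed logistic link)."""
    return 1.0 / (1.0 + math.exp(-score))


patient = {"feature_a": 1, "feature_b": 1, "feature_c": 0}
score = total_score(patient)   # -3 + 2 + 1 + 0 = 0
risk = predicted_risk(score)   # 1 / (1 + e^0) = 0.5
print(score, round(risk, 3))   # prints: 0 0.5
```

Because every coefficient is a small integer, the same calculation can be carried out by hand, which is what makes this kind of model practical at the point of care.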