KDD Papers

Pharmacovigilance via Baseline Regularization with Large-Scale Longitudinal Observational Data

Zhaobin Kuang (University of Wisconsin, Madison); Peggy Peissig (Biomedical Informatics Research Center, Marshfield Clinic Research Foundation, Marshfield, WI); Vitor Santos Costa (Universidade do Porto); Richard Maclin (University of Minnesota, Duluth); David Page (Department of Computer Sciences and Department of Biostatistics, University of Wisconsin, Madison, WI)


Several prominent public health hazards caused by adverse drug events (ADEs) at the beginning of this century have raised awareness among governments and industry worldwide of pharmacovigilance (PhV), the science and activities of monitoring and preventing adverse events caused by pharmaceutical products after they are introduced to the market. Large-scale longitudinal observational databases (LODs), such as electronic health records (EHRs) and medical insurance claim databases, are a major data source for PhV. Inspired by the Self-Controlled Case Series (SCCS) model, arguably the leading method for ADE discovery from LODs, we propose baseline regularization, a regularized generalized linear model that leverages the diverse health profiles available in LODs across different individuals at different times. We apply both the proposed method and SCCS to the Marshfield Clinic EHR. Experimental results suggest that the proposed method outperforms SCCS under various settings in identifying benchmark ADEs from the Observational Medical Outcomes Partnership ground truth.