Online Program

Estimation of Re-Identification Risk in De-identified Health Care Data

*Seth Eisen, Department of Veterans Affairs 
*Aleksandra Slavkovic, Penn State University 
*Stephen Fienberg, Carnegie Mellon University 
*Lawrence Cox, National Institute of Statistical Sciences 
*Xiao-Hua Andrew Zhou, University of Washington 
*Yaniv Erlich, Whitehead Institute for Biomedical Research 


The US Department of Veterans Affairs Health Administration (VHA) and Office of Research and Development (ORD) are committed to promoting transparency by making de-identified health data available. One of the greatest concerns about releasing such data is the threat of re-identification of individual Veterans. Previously, patient data de-identified according to HIPAA guidelines was thought not to be re-identifiable. It is now recognized that the dramatic increase in the scope of electronic health data, the ability to merge de-identified health data with identified data obtained from various sources, and the availability of substantial computing power together pose a significant re-identification risk. Numerous statistical methods to assess re-identification risk have been proposed in the literature. The goal of this workshop is for invited panelists to provide guidance on methods that have been used, or could be applied, to protect health data that is already de-identified.