ASA Committee on Privacy and Confidentiality

Protecting Biological and Health Data: Special Issues and Applications

This page describes protection methods for biological and health data, which are often protected under the Health Insurance Portability and Accountability Act (HIPAA). These data typically contain demographic and other potentially identifying information, along with sensitive health variables. Most of the typical alteration strategies can be applied to the demographic and other identifying variables; see the web page on data protection methods for an explanation of the methods. Below are links to illustrative applications of confidentiality protections to biological and health data. The list is by no means exhaustive, but it illustrates the techniques typically used to protect these data.

Aggregation and top-coding in the Health and Retirement Study (HRS)
The HRS uses aggregation of categories (e.g., geographies, occupations), rounding and top-coding of monetary data, and suppression of variables related to the survey design. These actions result in a restricted-access data file, which researchers can obtain after applying and signing promises to maintain data confidentiality.
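As a rough illustration of what top-coding, rounding, and category aggregation do to individual values, here is a minimal Python sketch. It is not the HRS procedure: the cap of 250,000, the rounding base of 1,000, the state-to-region lookup, and all function names are assumptions chosen only for the example.

```python
def top_code(values, cap):
    """Replace every value above `cap` with `cap` (hypothetical cap)."""
    return [min(v, cap) for v in values]


def round_to(values, base):
    """Round each value to the nearest multiple of `base`.

    Python's round() sends exact halves to the nearest even integer.
    """
    return [base * round(v / base) for v in values]


def aggregate(categories, mapping):
    """Collapse detailed categories into coarser ones via a lookup table."""
    return [mapping[c] for c in categories]


# Hypothetical data: three incomes and three state codes.
incomes = [12_340, 58_900, 1_250_000]
states = ["MA", "VT", "TX"]
STATE_TO_REGION = {"MA": "Northeast", "VT": "Northeast", "TX": "South"}

protected_incomes = round_to(top_code(incomes, cap=250_000), base=1_000)
protected_regions = aggregate(states, STATE_TO_REGION)

print(protected_incomes)  # [12000, 59000, 250000]
print(protected_regions)  # ['Northeast', 'Northeast', 'South']
```

The very large income no longer stands out after top-coding, and the two Northeast respondents can no longer be distinguished by state.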

Noise addition and synthetic data in the National Health Interview Survey Linked Mortality Files
For each person deemed at risk of identification, staff at the Centers for Disease Control and Prevention (CDC) either add noise to the date of death or generate a synthetic value of the underlying cause of death (after aggregating death codes). The results from the perturbed and original data are compared in a 2008 paper in the American Journal of Epidemiology (volume 168, pages 336-344).
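The idea of adding noise to a date can be sketched as follows. This is only an assumed illustration, not the procedure used for the Linked Mortality Files: the uniform shift, the 30-day window, the `at_risk` flag, and the function name are all hypothetical.

```python
import random
from datetime import date, timedelta


def perturb_date(d, max_shift_days, rng):
    """Shift a date by a random whole number of days in [-max, +max]."""
    shift = rng.randint(-max_shift_days, max_shift_days)
    return d + timedelta(days=shift)


# Hypothetical records: (date of death, flagged as at risk of identification).
records = [
    (date(2004, 3, 15), True),
    (date(2001, 11, 2), False),
]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
released = [
    perturb_date(d, max_shift_days=30, rng=rng) if at_risk else d
    for d, at_risk in records
]
```

Only the flagged record is altered; the other date is released unchanged. A wider window gives more protection but distorts survival-time analyses more.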

Data swapping and microaggregation in the Substance Abuse and Mental Health Data Archive (SAMHDA)
The Inter-university Consortium for Political and Social Research (ICPSR) archives and safeguards many datasets, including those in the SAMHDA. ICPSR uses data swapping and microaggregation to protect records in these data.
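For readers unfamiliar with these two techniques, a simplified Python sketch follows. It is not ICPSR's implementation: it uses univariate microaggregation with a fixed group size `k` and a purely random swap, whereas production data swapping usually pairs records that match on selected key variables. All names and parameter values are assumptions.

```python
import random


def microaggregate(values, k):
    """Replace each value with the mean of its group of sorted neighbors.

    Groups contain at least k records; a short tail joins the last group.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    n = len(order)
    start = 0
    while start < n:
        end = start + k
        if n - end < k:  # too few records left for another full group
            end = n
        group = order[start:end]
        mean = sum(values[i] for i in group) / len(group)
        for i in group:
            out[i] = mean
        start = end
    return out


def swap_values(values, fraction, rng):
    """Randomly pair about `fraction` of the records and exchange values."""
    out = list(values)
    n_swap = int(len(out) * fraction) // 2 * 2  # need an even count
    chosen = rng.sample(range(len(out)), n_swap)
    for a, b in zip(chosen[0::2], chosen[1::2]):
        out[a], out[b] = out[b], out[a]
    return out


# Hypothetical ages for seven respondents.
ages = [41, 23, 67, 25, 40, 31, 38]
aggregated = microaggregate(ages, k=3)
swapped = swap_values(ages, fraction=0.6, rng=random.Random(1))
```

Microaggregation preserves the overall mean while ensuring every released value is shared by at least `k` records. Swapping preserves the marginal distribution of the variable exactly but weakens its relationship with other variables on the same record.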

The Personal Genome Project (PGP)
Genetic data are extremely difficult to protect without a substantial sacrifice in data usefulness. Researchers at the PGP at Harvard University have taken a different approach: they ask individuals to consent to making their genetic data available to the public without modification. Although the data are not protected, we include this link as an alternative approach to data access.

Copyright ©2003, 2009 American Statistical Association