Methods for Reducing Disclosure Risks When Sharing Data
Statistical agencies, survey organizations, academic researchers, and business
establishments often collect data that they seek to share with
others. Dissemination of data can facilitate advances in
research, inform public policy, further citizens' knowledge, and
improve students' learning. Typically, groups that share
data are ethically or legally obligated to protect the confidentiality
of data subjects' identities and sensitive attributes. Failure to
do so can break promises or violate laws, can cause data subjects to
give lower-quality answers, and can reduce participation rates in
future data collection efforts. Data disseminators thus are
pulled in two directions: the benefits of data access encourage them to
release data, but the need to protect confidentiality encourages them
not to release data. The links below describe this dilemma and
approaches to dealing with it.
A.
Defining the problem: Disclosure risk and data utility
Data providers strive to balance the quality of data for statistical analysis with
the risk of confidentiality breaches. This page describes this trade-off.
B.
Overviews of statistical disclosure protection methods
To protect confidentiality, data providers can alter the original data before sharing them.
This page describes data alteration strategies typically used in practice.
C.
Technological solutions: secure data enclaves, remote access, data licensing
Another approach is to protect confidentiality by restricting or controlling who has access to
the sensitive data. This page describes various mechanisms for providing restricted access.
D.
Computer science approaches and privacy preserving data mining
Computer scientists develop algorithms and protocols for safely sharing data distributed
across multiple parties, for safe query systems, and for general disclosure limitation
settings. This page describes these approaches.
E.
Further investigation (books, journals, web sites, technical reports)
This page suggests some resources for further investigation of the topics in parts A through D.