ASA Committee on Privacy and Confidentiality

Protecting Business and Tax Data: Special Issues and Applications

Business and tax data are among the most sensitive data collected by government agencies and researchers. These data often contain highly skewed variables, which makes individual records vulnerable to disclosure. For example, given the actual total payroll of each manufacturer in the aircraft industry, it may be relatively easy to identify Boeing: it is the record with the largest payroll. Furthermore, businesses and individuals understandably want to guard the privacy of this information. Private companies do not want their competitors to know the amounts they spend on marketing, research and development, payroll, and so on, as this might compromise their business practices. Likewise, individuals may be reluctant for others to learn their salaries or total incomes.

If data collectors disseminated business and tax data in ways that harmed businesses and individuals, data subjects might become unwilling to provide their data. This would damage the government's ability to make economic policy and reduce researchers' opportunities to analyze economic data. Thus, most business and tax data are altered before release, if they are released at all (in fact, no public use business microdata are available in the U.S.).

Nearly all of the typical alteration strategies are applied to business and tax data; see the web page on data protection methods for an explanation of the methods. Below are links to illustrative applications of confidentiality protections to business and tax data. This list is by no means exhaustive, but it does illustrate the techniques typically used to protect these data.

Aggregation in the County Business Patterns (CBP)
Business and tax microdata are frequently aggregated for public use. This link to the CBP, released by the Census Bureau, illustrates how establishments' payroll and employee size are aggregated to create public use tables.
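
As a rough illustration of aggregation, the sketch below rolls hypothetical establishment records up into county-by-industry cells. The records, the field layout, and the `min_establishments` threshold for withholding magnitudes are all assumptions made for this example; they are not the Census Bureau's actual rules for the CBP.

```python
from collections import defaultdict

# Hypothetical establishment records: (county, industry, employees, payroll).
ESTABLISHMENTS = [
    ("County A", "Manufacturing", 120, 6_000_000),
    ("County A", "Manufacturing", 45, 1_900_000),
    ("County A", "Manufacturing", 300, 17_500_000),
    ("County A", "Retail", 12, 310_000),
    ("County B", "Manufacturing", 2500, 190_000_000),
]


def aggregate(records, min_establishments=3):
    """Sum employees and payroll within each (county, industry) cell.

    Cells with fewer than `min_establishments` contributors keep their
    establishment count but have magnitudes withheld (set to None).
    The threshold is an illustrative assumption.
    """
    cells = defaultdict(lambda: {"establishments": 0, "employees": 0, "payroll": 0})
    for county, industry, employees, payroll in records:
        cell = cells[(county, industry)]
        cell["establishments"] += 1
        cell["employees"] += employees
        cell["payroll"] += payroll

    table = {}
    for key, cell in cells.items():
        if cell["establishments"] < min_establishments:
            table[key] = {
                "establishments": cell["establishments"],
                "employees": None,
                "payroll": None,
            }
        else:
            table[key] = dict(cell)
    return table


TABLE = aggregate(ESTABLISHMENTS)
```

In this toy table the single large manufacturer in County B would otherwise be exposed exactly, which is the skewness problem described earlier; withholding that cell's magnitudes is one simple way to respond.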

Noise addition in the Commodity Flow Survey (CFS)
This paper illustrates how noise can be added to underlying economic microdata when the released data are tabular. The CFS is released by the Census Bureau.
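
The general idea of perturbing microdata before tabulation can be sketched as follows. This is a generic multiplicative-noise example, not the procedure used for the CFS: the function names, the noise range (5% to 15% away from the true value), and the sample records are assumptions chosen for illustration.

```python
import random


def noise_multiplier(rng, low=0.05, high=0.15):
    """Return a multiplier that is between `low` and `high` away from 1.

    Keeping the multiplier away from 1 ensures every record is
    actually perturbed. The range is an illustrative assumption.
    """
    magnitude = rng.uniform(low, high)
    sign = rng.choice((-1, 1))
    return 1.0 + sign * magnitude


def noisy_cell_totals(records, seed=0):
    """Perturb each record's value, then sum the perturbed values by cell.

    `records` is an iterable of (cell, value) pairs.
    """
    rng = random.Random(seed)
    totals = {}
    for cell, value in records:
        totals[cell] = totals.get(cell, 0.0) + value * noise_multiplier(rng)
    return totals


# Hypothetical shipment values keyed by table cell.
SHIPMENTS = [("A", 100.0), ("A", 200.0), ("B", 1000.0)]
NOISY = noisy_cell_totals(SHIPMENTS, seed=1)
```

Cells with many contributors tend to see the perturbations partly cancel, while a cell dominated by one record (cell "B" here) stays at least 5% away from its true total under these assumed parameters.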

Noise addition in the Longitudinal Employer-Household Dynamics (LEHD) Program
This presentation provides an example of adding noise to establishment-level data. The LEHD program is run by the Census Bureau.

Microaggregation in the Individual Tax Model Public Use File (ITMPUF)
This link is to a paper in the 2002 proceedings of the Joint Statistical Meetings that describes the microaggregation strategy used for the ITMPUF, which is released by the Statistics of Income division of the Internal Revenue Service.
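
A minimal sketch of univariate microaggregation is shown below: records are sorted by value, partitioned into groups of at least `k`, and each value is replaced by its group mean. The group size, the function name, and the sample incomes are assumptions for illustration; the procedure actually used for the ITMPUF is described in the linked paper and may differ.

```python
from statistics import mean


def microaggregate(values, k=3):
    """Replace each value with the mean of its size-k neighborhood.

    Values are ranked, split into consecutive groups of k, and a short
    final group is merged into the previous one so that every group
    has at least k members (when there are at least k values).
    Results are returned in the original record order.
    """
    if k < 1:
        raise ValueError("k must be at least 1")

    order = sorted(range(len(values)), key=values.__getitem__)
    groups = [order[i:i + k] for i in range(0, len(order), k)]
    if len(groups) > 1 and len(groups[-1]) < k:
        tail = groups.pop()
        groups[-1].extend(tail)

    result = [0.0] * len(values)
    for group in groups:
        group_mean = mean(values[i] for i in group)
        for i in group:
            result[i] = group_mean
    return result


# Hypothetical reported incomes.
INCOMES = [50_000, 1_200_000, 52_000, 48_000, 900_000, 1_500_000, 55_000]
MASKED = microaggregate(INCOMES, k=3)
```

Because each value is replaced by a group mean, the overall total is preserved, while no released value corresponds to fewer than `k` underlying records.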

Synthetic data in the Survey of Consumer Finances
The Federal Reserve Board protects sensitive monetary values by replacing them with multiple imputations. This is the first published instance of what is now known as partially synthetic data.
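
The idea of partially synthetic data can be sketched as follows: only values flagged as sensitive are replaced, each with a draw from a model fitted to the data, and the process is repeated to produce several imputed copies. Everything specific here is an assumption for illustration: the log-log regression of a monetary value on a size covariate, the threshold rule for flagging records, and the sample data. A fuller treatment would also draw the model parameters from their posterior distribution; this sketch uses point estimates for brevity and is not the Federal Reserve Board's procedure.

```python
import math
import random
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, residual sd)."""
    xbar, ybar = mean(xs), mean(ys)
    sxx = sum((x - xbar) ** 2 for x in xs)
    slope = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sxx
    intercept = ybar - slope * xbar
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    sigma = math.sqrt(sum(r * r for r in residuals) / (len(xs) - 2))
    return intercept, slope, sigma


def partially_synthesize(records, threshold, m=3, seed=0):
    """Return m copies of `records` with values above `threshold` imputed.

    `records` is a list of (size, value) pairs with positive entries.
    Values at or below the threshold are released unchanged.
    """
    rng = random.Random(seed)
    log_sizes = [math.log(size) for size, _ in records]
    log_values = [math.log(value) for _, value in records]
    a, b, sigma = fit_line(log_sizes, log_values)

    datasets = []
    for _ in range(m):
        copy = []
        for size, value in records:
            if value > threshold:
                value = math.exp(rng.gauss(a + b * math.log(size), sigma))
            copy.append((size, value))
        datasets.append(copy)
    return datasets


# Hypothetical (size, monetary value) records.
RECORDS = [
    (10, 400_000),
    (25, 900_000),
    (40, 2_500_000),
    (200, 9_000_000),
    (5000, 400_000_000),
]
SYNTHETIC = partially_synthesize(RECORDS, threshold=5_000_000, m=3, seed=42)
```

Releasing several imputed copies rather than one lets analysts account for the extra uncertainty introduced by the replacement step.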

Synthetic data in the Longitudinal Business Database (LBD)
The U.S. Bureau of the Census is developing a partially synthetic public use data set for the LBD. This JSM proceedings paper summarizes some of the initial development.

Copyright ©2003, 2009 American Statistical Association