Potential Uses of Administrative Records for Triple System Modeling for Estimation of Census Coverage Error in 2020
*Richard Arthur Griffin, U.S. Census Bureau 

Keywords: Triple system estimation; correlation bias; administrative records

Residual heterogeneity is known to bias the dual system estimates that have been used to estimate census coverage in U.S. censuses since 1980. Triple system estimation, which uses an administrative records list as a third source alongside the census and the coverage measurement survey, has the potential to produce estimates with less bias. This is particularly important for hard-to-reach populations. The paper presents potential statistical methods for estimating net census undercount using three systems for obtaining population information: (1) a decennial census; (2) an independent enumeration of the population in a sample of block clusters; and (3) administrative records. The 2010 Census Match Study will create census-like files for the entire nation using federal and commercial sources of administrative records. The 2010 Census Coverage Measurement (CCM) Survey is an enumeration in a sample of block clusters that is independent of the 2010 Census. Planned empirical studies are discussed; they will use data files that become available once the 2010 Census, the 2010 CCM, and the 2010 Census Match Study are completed. The incomplete 2 by 2 by 2 table of counts for triple system estimation can be divided into one complete 2 by 2 sub-table (persons found in the administrative records) and one incomplete 2 by 2 sub-table (persons not found in the administrative records, for whom the count of persons missed by both the census and the coverage measurement survey is unobserved). The additional source provides data with which to evaluate the previously untestable assumption of independence between the census and the coverage measurement survey. Direct evidence comes from the odds ratio in the 2 by 2 sub-table formed by restricting consideration to cases observed in the administrative records source, because complete information is available there for all four cells defined by capture or non-capture in the census and in the coverage measurement survey.
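
As a rough illustration of the sub-table argument above, the following Python sketch computes the census-by-survey odds ratio in a complete 2 by 2 table and compares a standard Lincoln-Petersen dual system estimate with the known total. The function names and all counts are hypothetical and chosen only for illustration; real coverage measurement data are sample-based and weighted, which this sketch ignores.

```python
def odds_ratio(n11, n10, n01, n00):
    """Cross-product ratio of a complete 2 by 2 table.

    n11: in census and in survey     n10: in census only
    n01: in survey only              n00: in neither
    A value of 1 corresponds to independence of the two sources.
    """
    if n10 <= 0 or n01 <= 0:
        raise ValueError("off-diagonal cells must be positive")
    return (n11 * n00) / (n10 * n01)


def dual_system_estimate(n11, n10, n01):
    """Lincoln-Petersen estimate of the total, assuming independence."""
    if n11 <= 0:
        raise ValueError("matched cell must be positive")
    return (n11 + n10) * (n11 + n01) / n11


# Hypothetical counts for persons found in the administrative records,
# where all four census-by-survey cells are observed.
n11, n10, n01, n00 = 800, 100, 60, 40

theta = odds_ratio(n11, n10, n01, n00)      # 32000 / 6000, about 5.33
dse = dual_system_estimate(n11, n10, n01)   # 900 * 860 / 800 = 967.5
true_total = n11 + n10 + n01 + n00          # 1000

print(f"odds ratio = {theta:.2f}")
print(f"DSE = {dse:.1f} versus true total = {true_total}")
```

In this made-up table the odds ratio is well above 1, and the dual system estimate falls short of the known total of 1000, which is the direction of bias expected under positive dependence between the two sources.
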
The analysis indicates that estimating the dependence between the census and the coverage measurement survey for persons not on the administrative list, using the dependence observed among persons on the list, may reduce bias in the estimation of census coverage error. Dual system estimation cannot achieve this reduction in correlation bias in the presence of dependence and heterogeneity, because two sources alone provide no data with which to estimate the dependence. Potential triple system estimators are suggested in papers by Fienberg (1972), Zaslavsky and Wolfgang (1990, 1993) and Darroch (1993). Some of these estimators have a closed-form expression, and others are based on models that require an iterative fitting procedure. Empirical evidence is available to motivate some of the estimators. This paper provides an overview of these estimators and discusses the conditions under which each might be accurate for a decennial census. Concerns about correlation bias and about potential data errors, such as missing data, erroneous inclusions, and matching error, are presented. A simulation study using hypothetical populations with relatively low census capture probabilities is presented, and potential triple system estimators are compared on bias, variance, and model fit. Data that will be available from the 2010 CCM and the 2010 Census Match Study will be applied to the most promising triple system estimators from the simulation study to show how the alternative estimators compare for populations found to be hard to reach in the 2010 Census.
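
The idea of carrying the dependence observed on the administrative list over to persons off the list can be sketched with one closed-form estimator. The Python below assumes that the census-by-survey odds ratio is the same on and off the list (the log-linear model with no three-factor interaction); this is only one of the candidate models, and it is not claimed to be the estimator the paper favors. The function name, the cell keys, and the counts are hypothetical, and survey weighting is ignored.

```python
def triple_system_estimate(n):
    """Estimate the population total from an incomplete 2 by 2 by 2 table.

    `n` maps (census, survey, admin) indicator tuples to counts, where
    1 = captured and 0 = missed. The cell (0, 0, 0) is unobserved.

    Assumption: equal census-by-survey odds ratios on and off the
    administrative list, i.e.
        n111 * n001 / (n101 * n011) = n110 * n000 / (n100 * n010),
    which is solved for the missing cell n000.
    """
    denom = n[(1, 1, 0)] * n[(1, 0, 1)] * n[(0, 1, 1)]
    if denom <= 0:
        raise ValueError("cells in the denominator must be positive")
    n000_hat = (n[(1, 1, 1)] * n[(1, 0, 0)] * n[(0, 1, 0)] * n[(0, 0, 1)]) / denom
    observed = sum(n.values())
    return observed + n000_hat, n000_hat


# Hypothetical counts; keys are (census, survey, admin).
counts = {
    (1, 1, 1): 800, (1, 0, 1): 100, (0, 1, 1): 60, (0, 0, 1): 40,
    (1, 1, 0): 400, (1, 0, 0): 100, (0, 1, 0): 90,
}

total_hat, missing_hat = triple_system_estimate(counts)
print(f"estimated missing cell = {missing_hat:.1f}")
print(f"estimated population total = {total_hat:.1f}")
```

With these counts the estimated missing cell is 120 and the estimated total is 1710. Models that require iterative fitting would replace the closed-form step with a fitted log-linear model.
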

