High Dimension Multiple Imputation: Missing Blood Glucose Values in the Epidemiology of Diabetes Interventions and Complications Study
Keywords: chained equations, missing data, model checking, neuropathy, retinopathy, variable selection
The Epidemiology of Diabetes Interventions and Complications (EDIC) study follows subjects from the Diabetes Control and Complications Trial, which compared conventional (n=729) and intensive (n=711) treatment for type 1 (insulin-dependent) diabetes. Its reported findings have had a fundamental impact on treatment in practice, and recent work has found genetic associations with health measurements and diabetes-related conditions. Missing data in the blood glucose profile measurements (7 measurements in a day, one day per quarter over nine years) compromise the ability to evaluate the association of characteristics of glycemia (e.g., variation) with long-term outcomes. Multiple imputation (MI) using chained equations is a feasible and realistic approach to filling in the missing values and enabling certain analyses. Implementing the MI strategy requires appropriate models for imputation. The number of glucose values (more than 250) and the high fraction missing (16% nonresponse and another 23% truncated) present a variety of challenges. The selection of models and predictor variables for imputation, along with procedures for assessing the quality of the imputations, is presented.
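As a rough illustration of the chained-equations idea mentioned above, the following is a minimal sketch in plain Python, not the imputation models used in the study. It cycles through the columns of a small simulated table, regresses each incomplete column on the others by ordinary least squares, and replaces its missing entries with the prediction plus random residual noise. All names (`solve`, `fit_ols`, `impute_chained`, `toy`) and the simulated glucose-like values are assumptions made for illustration. A proper MI procedure would also draw the regression parameters from their posterior distribution, which this sketch omits.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def fit_ols(X, y):
    """Return least-squares coefficients (intercept first) and residual SD."""
    Z = [[1.0] + row for row in X]
    p = len(Z[0])
    A = [[sum(z[i] * z[j] for z in Z) for j in range(p)] for i in range(p)]
    b = [sum(z[i] * yi for z, yi in zip(Z, y)) for i in range(p)]
    beta = solve(A, b)
    resid = [yi - sum(bj * zj for bj, zj in zip(beta, z)) for z, yi in zip(Z, y)]
    sd = (sum(r * r for r in resid) / max(len(y) - p, 1)) ** 0.5
    return beta, sd

def impute_chained(data, n_iter=10, seed=0):
    """Return one completed copy of data; None marks a missing value."""
    rng = random.Random(seed)
    n_cols = len(data[0])
    missing = [[v is None for v in row] for row in data]
    filled = [row[:] for row in data]
    # Initialise every missing entry with its column mean.
    for j in range(n_cols):
        obs = [row[j] for row in data if row[j] is not None]
        mean = sum(obs) / len(obs)
        for row in filled:
            if row[j] is None:
                row[j] = mean
    # Cycle through the columns, re-imputing each from the current others.
    for _ in range(n_iter):
        for j in range(n_cols):
            others = [k for k in range(n_cols) if k != j]
            train = [i for i in range(len(data)) if not missing[i][j]]
            X = [[filled[i][k] for k in others] for i in train]
            y = [filled[i][j] for i in train]
            beta, sd = fit_ols(X, y)
            for i in range(len(data)):
                if missing[i][j]:
                    pred = beta[0] + sum(
                        bk * filled[i][k] for bk, k in zip(beta[1:], others)
                    )
                    # Add residual noise so imputations are not all on the fit.
                    filled[i][j] = pred + rng.gauss(0.0, sd)
    return filled

# Simulated (not EDIC) data: 60 subjects, 3 correlated glucose-like values,
# with roughly 16% of entries set to missing at random.
gen = random.Random(1)
rows = []
for _ in range(60):
    base = gen.gauss(150, 40)
    rows.append([base + gen.gauss(0, 15) for _ in range(3)])
toy = [[None if gen.random() < 0.16 else v for v in row] for row in rows]

# Five completed datasets, one per seed, as in multiple imputation.
completed = [impute_chained(toy, seed=s) for s in range(5)]
```

Each element of `completed` is a full copy of `toy` with the observed values untouched and the gaps filled in. An analysis would be run on each copy and the results combined. With more than 250 glucose values per subject, regressing every column on all the others is no longer practical, which is why the choice of predictor variables matters.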