TL17: Evaluation of diagnostic tests/devices with imperfect reference standard
*Bipasa Biswas, FDA 

Keywords: Diagnostic tests, imperfect reference standard, bias

A diagnostic test or device that classifies subjects by the presence or absence of a condition of interest is evaluated against a reference standard to assess its performance. Sometimes a definitive diagnosis cannot be established, i.e., a reference standard may not exist or may be impractical to administer. For example, the diagnosis of Alzheimer’s disease cannot be definitive until a patient has died. Even a diagnosis regarded as definitive, such as the detection of polyps by optical colonoscopy in colorectal screening, is subject to error, since optical colonoscopy can miss polyps. Thus, in many diagnostic accuracy studies, an imperfect reference standard is used to evaluate the test. The accuracy of a test evaluated against an imperfect reference standard will often be under- or overestimated. This type of bias is called imperfect reference standard bias. I would like to discuss possibilities for correcting such biases in clinical studies evaluating the performance of a new diagnostic test.
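
The size of this bias can be illustrated with a small calculation. The sketch below is illustrative only: it assumes that the new test and the reference standard err independently given true disease status (conditional independence). The function name and all numeric inputs are hypothetical and are not taken from any study. It computes the agreement of the test with the imperfect reference, which is what a study would actually observe.

```python
def apparent_agreement(prev, se_test, sp_test, se_ref, sp_ref):
    """Return (PPA, NPA) of a test against an imperfect reference.

    Assumes test and reference errors are conditionally independent
    given true disease status. All arguments are probabilities in [0, 1].
    """
    # Joint probabilities of concordant results, summed over true status.
    both_pos = prev * se_test * se_ref + (1 - prev) * (1 - sp_test) * (1 - sp_ref)
    both_neg = prev * (1 - se_test) * (1 - se_ref) + (1 - prev) * sp_test * sp_ref
    # Marginal probability that the reference standard is positive / negative.
    ref_pos = prev * se_ref + (1 - prev) * (1 - sp_ref)
    ref_neg = 1 - ref_pos
    return both_pos / ref_pos, both_neg / ref_neg


# Hypothetical inputs chosen only for illustration.
ppa, npa = apparent_agreement(prev=0.10, se_test=0.90, sp_test=0.95,
                              se_ref=0.80, sp_ref=0.99)
print(f"PPA = {ppa:.3f}, NPA = {npa:.3f}")
```

With these inputs, both apparent measures fall below the true sensitivity (0.90) and specificity (0.95) of the test. With a perfect reference (se_ref = sp_ref = 1) they coincide with the true values. If the errors of the test and the reference are positively correlated rather than independent, the bias can run in the opposite direction.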

Key Questions: (1) How should performance measures be reported when the reference standard is not perfect? (2) Optical colonoscopy is currently used for detecting and removing polyps. How best can devices for detecting polyps be evaluated against optical colonoscopy, given that optical colonoscopy is itself not a perfect test? (3) Should agreement measures such as positive percent agreement (PPA) and negative percent agreement (NPA) be reported against the imperfect reference standard? Is there a better way to report performance than PPA and NPA? (4) Can we assume conditional independence? When is it appropriate to do so?
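
For reference, the agreement measures named in question (3) can be computed from a 2x2 table of test results against reference results. This is a minimal sketch: the function name, the cell labels a, b, c, d, and the counts are all hypothetical and serve only to show the arithmetic.

```python
def percent_agreement(a, b, c, d):
    """PPA and NPA from a 2x2 table of test vs. reference results.

    a: test+ / reference+    b: test+ / reference-
    c: test- / reference+    d: test- / reference-
    """
    ppa = a / (a + c)  # share of reference-positives that the test calls positive
    npa = d / (b + d)  # share of reference-negatives that the test calls negative
    return ppa, npa


# Hypothetical counts, for illustration only.
ppa, npa = percent_agreement(a=40, b=5, c=10, d=145)
print(f"PPA = {ppa:.1%}, NPA = {npa:.1%}")
```

These quantities have the same arithmetic form as sensitivity and specificity. They are called agreement measures because the reference result, not the true disease status, defines the denominators.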