An Adaptive Approach to Hidiroglou-Berthelot Outlier Detection

*Matthew C Nelson, US Census Bureau

Keywords: outlier detection, Hidiroglou-Berthelot, periodic surveys, maximum likelihood logistic regression, ratio edit

The Hidiroglou-Berthelot (HB) ratio edit is a common method of bivariate outlier detection within establishment surveys. Among its strengths is the ability to vary acceptance thresholds depending on the size of a given case. Yet parameterization requires adjustment of up to three variables, and the determination of optimal parameters is not a straightforward process. Moreover, outliers are identified in a strictly binary sense, not allowing for uncertainty that a case may or may not be an outlier.

In this paper I propose an adaptive variation of HB, in which outlier probabilities are modeled against transformed terms of the HB formula using maximum likelihood logistic regression. These model parameters are then trained using the results of supervised learning (user classification). In the paper I investigate several logistic models, considering their efficacy in replicating HB decision boundaries and their training efficiency. The goal of this research is to produce a flexible, continuously learning variation of HB that still employs the terms of the original formula. Proposed methodology is demonstrated using example scenarios.
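For readers unfamiliar with the baseline method, the following is a minimal sketch of the standard HB ratio edit as it is commonly described in the literature, not the adaptive variant proposed in the paper. The function name `hb_edit`, the parameter names `U`, `A`, and `C` (taken here to be the three tuning variables), and their default values are assumptions chosen for illustration; the abstract itself does not specify them.

```python
import numpy as np

def hb_edit(prev, curr, U=0.5, A=0.05, C=4.0):
    """Sketch of the standard HB ratio edit (illustrative defaults).

    prev, curr: positive values for the same cases in two periods.
    U: size-importance exponent, A: guard against tiny quartile
    ranges, C: width multiplier for the acceptance interval.
    """
    prev = np.asarray(prev, dtype=float)
    curr = np.asarray(curr, dtype=float)

    # Period-to-period ratio and its median.
    r = curr / prev
    r_med = np.median(r)

    # Symmetric transform so increases and decreases are treated alike.
    s = np.where(r < r_med, 1.0 - r_med / r, r / r_med - 1.0)

    # Size-weighted effect: larger cases get tighter relative tolerance.
    e = s * np.maximum(prev, curr) ** U

    # Quartile-based distances, floored at |A * median effect|.
    e_q1, e_med, e_q3 = np.percentile(e, [25, 50, 75])
    d_q1 = max(e_med - e_q1, abs(A * e_med))
    d_q3 = max(e_q3 - e_med, abs(A * e_med))

    # Acceptance interval; anything outside is flagged (binary decision).
    lower = e_med - C * d_q1
    upper = e_med + C * d_q3
    flags = (e < lower) | (e > upper)
    return flags, e, (lower, upper)

# Hypothetical data: case index 4 triples between periods.
prev = [100, 200, 50, 400, 1000, 80, 150]
curr = [105, 190, 52, 410, 3000, 82, 148]
flags, effects, bounds = hb_edit(prev, curr)
```

The returned `flags` array is the strictly binary decision the abstract refers to. An adaptive variant along the lines described would presumably replace that hard threshold with a fitted probability, but the specific model terms are left to the paper.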
