Prediction of In Vivo Toxicity Endpoints Using In Vitro Bioassay and Numerical Descriptors
*Stanley Young, National Institute of Statistical Sciences 

Keywords: prediction algorithms

In the EPA's Phase-I ToxCast dataset, 289 distinct small molecules are characterized by 76 in vivo toxicity endpoint measurements, 524 in vitro biological assay predictors, and 68 physical/chemical property predictors. In our analysis, we further describe the compounds with 1544 augmented atom descriptors. Each of these blocks of predictors is fed into a recursive partitioning algorithm to predict the in vivo toxicity endpoints. Our results show that the augmented atom descriptors are better predictors than either the in vitro bioassay data or the physical/chemical property data. This suggests that it would be cost-effective to use numerical chemical descriptors to rank untested compounds for further study. Another expected benefit of this research is identifying which biological assays or structural predictors are most useful for a given in vivo toxicity endpoint.
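
The block-wise comparison described above can be illustrated with a minimal sketch. It is not the authors' implementation and does not use ToxCast data: the recursive partitioning routine is a bare-bones Gini-based tree written for illustration, the block names (structure_block, noise_block) are hypothetical, and the data are synthetic, constructed so that one block carries the signal and the other is pure noise. The scores it prints therefore say nothing about the actual results. It only shows the mechanics of fitting one tree per predictor block and ranking the blocks by cross-validated accuracy.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y):
    """Return (feature, threshold, weighted impurity) of the best split, or None."""
    best, n = None, len(y)
    for j in range(len(X[0])):
        # Excluding the largest value keeps both child nodes non-empty.
        for t in sorted(set(row[j] for row in X))[:-1]:
            left = [y[i] for i in range(n) if X[i][j] <= t]
            right = [y[i] for i in range(n) if X[i][j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (j, t, score)
    return best

def grow(X, y, depth=0, max_depth=3, min_size=5):
    """Recursively partition the data; leaves are class labels, nodes are tuples."""
    majority = Counter(y).most_common(1)[0][0]
    if depth >= max_depth or len(y) < min_size or gini(y) == 0.0:
        return majority
    split = best_split(X, y)
    if split is None or split[2] >= gini(y):  # no split reduces impurity
        return majority
    j, t, _ = split
    li = [i for i in range(len(y)) if X[i][j] <= t]
    ri = [i for i in range(len(y)) if X[i][j] > t]
    return (j, t,
            grow([X[i] for i in li], [y[i] for i in li], depth + 1, max_depth, min_size),
            grow([X[i] for i in ri], [y[i] for i in ri], depth + 1, max_depth, min_size))

def predict(tree, row):
    """Walk from the root to a leaf and return its label."""
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if row[j] <= t else right
    return tree

def cv_accuracy(X, y, k=5):
    """k-fold cross-validated accuracy of a tree grown on one predictor block."""
    n, hits = len(y), 0
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        test = [i for i in range(n) if i % k == fold]
        tree = grow([X[i] for i in train], [y[i] for i in train])
        hits += sum(predict(tree, X[i]) == y[i] for i in test)
    return hits / n

# Synthetic stand-in data (assumption): 120 "compounds", one binary endpoint.
rng = random.Random(0)
n = 120
y = [rng.randint(0, 1) for _ in range(n)]
blocks = {
    # First descriptor equals the endpoint by construction; the rest are noise.
    "structure_block": [[y[i]] + [rng.randint(0, 1) for _ in range(4)] for i in range(n)],
    # Every descriptor is independent of the endpoint.
    "noise_block": [[rng.randint(0, 1) for _ in range(5)] for _ in range(n)],
}

# Fit one tree per block and rank the blocks by cross-validated accuracy.
scores = {name: cv_accuracy(X, y) for name, X in blocks.items()}
for name, acc in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {acc:.2f}")
```

In a real analysis each block would hold the measured descriptors for the same set of compounds, and the procedure would be repeated for each in vivo endpoint. The splits chosen within a tree also indicate which individual predictors are most informative for that endpoint.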