Extract

Thomas et al. (2012) recently published an evaluation of statistical models for classifying in vivo toxicity endpoints from ToxRefDB (Knudsen et al., 2009; Martin et al., 2009a, b) using ToxCast in vitro bioactivity data (Judson et al., 2010) and chemical structure descriptors. We commend the authors for a thorough assessment of statistical tools for uncovering patterns of association among thousands of covariate features derived from in vitro measurements, chemical structure, and toxicity endpoints from animal studies. They were largely unsuccessful in accurately classifying toxicities based on in vitro bioactivity or chemical structure. However, their conclusion that the current ToxCast phase I assays and chemicals have limited applicability for predicting in vivo chemical hazards using statistical classification methods is misleading and warrants clarification.

The approach of Thomas et al. (2012) is primarily a statistical path to producing classifiers, one that does not incorporate knowledge of biological or adverse outcome pathways to group assays or endpoints from the in vitro or in vivo data sets. Classification accuracy has two key ingredients: relevant assays or descriptors, and a sufficient number of representative chemicals, both positives and negatives, for each type of toxicity endpoint being predicted. This is a classical statistical issue, and it underscores the difficulty of finding robust statistical associations with a relatively small number of samples for the many ToxRefDB apical toxicity endpoints. Hence, Thomas et al. (2012) conclude that the current ToxCast phase I assays and chemical library have limited applicability for predicting in vivo chemical hazards. The results described in Thomas et al. (2012) are not altogether surprising and are consistent with findings from other research groups, including our own. In a modeling study of ToxCast/ToxRefDB-like data (Judson et al., 2008), we demonstrated that although most statistical or machine-learning methods perform well in the presence of a small number of causal features, most show significant degradation in performance as irrelevant features are added, a well-known characteristic of such models (Almuallim and Dietterich, 1991). We also demonstrated in Judson et al. (2008) that all statistical and machine-learning methods performed better with feature selection. This earlier study with simulated data led us to a hypothesis-driven approach for feature selection and data aggregation that incorporates biological, chemical, and toxicological knowledge into modeling from the ToxCast and ToxRefDB data. Using this hypothesis-driven approach, we have developed useful predictive models from the ToxCast phase I data.
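
The effect of irrelevant features described above can be illustrated with a toy simulation. This is a minimal sketch and not the simulation reported in Judson et al. (2008): the 1-nearest-neighbor classifier, the Gaussian features, the sample sizes, and the mean-difference filter used for feature selection are all assumptions chosen for illustration, and every function name is hypothetical.

```python
import random


def make_data(n, n_causal, n_noise, rng):
    """Simulate n 'chemicals' with balanced 0/1 labels.

    Causal features are shifted by class (mean -1 or +1, unit variance);
    noise features are standard normal and carry no class information.
    """
    X, y = [], []
    for i in range(n):
        label = i % 2
        shift = 1.0 if label else -1.0
        row = [rng.gauss(shift, 1.0) for _ in range(n_causal)]
        row += [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
        X.append(row)
        y.append(label)
    return X, y


def columns(X, idx):
    """Keep only the feature columns listed in idx."""
    return [[row[j] for j in idx] for row in X]


def one_nn_accuracy(X_train, y_train, X_test, y_test):
    """Accuracy of a 1-nearest-neighbor classifier (squared Euclidean distance)."""
    correct = 0
    for x, label in zip(X_test, y_test):
        nearest = min(
            range(len(X_train)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(X_train[i], x)),
        )
        correct += y_train[nearest] == label
    return correct / len(y_test)


def select_features(X, y, k):
    """Univariate filter: rank features by |mean(class 1) - mean(class 0)|, keep top k."""
    scores = []
    for j in range(len(X[0])):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append(abs(sum(pos) / len(pos) - sum(neg) / len(neg)))
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


rng = random.Random(0)  # fixed seed so the run is repeatable
N_CAUSAL, N_NOISE = 2, 200  # assumed sizes, for illustration only
X_train, y_train = make_data(100, N_CAUSAL, N_NOISE, rng)
X_test, y_test = make_data(100, N_CAUSAL, N_NOISE, rng)

# 1) Only the causal features (known here because the data are simulated).
causal = list(range(N_CAUSAL))
acc_causal = one_nn_accuracy(columns(X_train, causal), y_train,
                             columns(X_test, causal), y_test)

# 2) Causal features buried among irrelevant ones.
acc_all = one_nn_accuracy(X_train, y_train, X_test, y_test)

# 3) Feature selection on the training set only, then classify.
selected = select_features(X_train, y_train, N_CAUSAL)
acc_selected = one_nn_accuracy(columns(X_train, selected), y_train,
                               columns(X_test, selected), y_test)

print("causal features only:   ", acc_causal)
print("all features:           ", acc_all)
print("after feature selection:", acc_selected, "selected:", sorted(selected))
```

In this sketch the filter is purely statistical. The hypothesis-driven approach described above would instead choose or aggregate features using biological, chemical, and toxicological knowledge, which a toy simulation of this kind does not attempt to represent.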
