Motivation: Polymorphisms in human genes are being described in remarkable numbers. Determining which polymorphisms and which environmental factors are associated with common, complex diseases has become a daunting task. This is partly because the effect of any single genetic variation likely depends on other genetic variations (gene–gene interaction, or epistasis) and on environmental factors (gene–environment interaction). Detecting and characterizing interactions among multiple factors is both a statistical and a computational challenge. To address this problem, we have developed a multifactor dimensionality reduction (MDR) method that collapses high-dimensional genetic data into a single dimension, thus permitting interactions to be detected in relatively small samples. In this paper, we describe the MDR approach and an MDR software package.

Results: We developed a program that integrates MDR with a cross-validation strategy for estimating the classification and prediction error of multifactor models. The software can be used to analyze interactions among 2–15 genetic and/or environmental factors. The dataset may contain up to 500 total variables and a maximum of 4000 study subjects.
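
As an illustration only, the pairing of MDR with cross-validation can be sketched in a few lines of Python. This is not the distributed software. The function names, the case:control threshold of 1.0, the 10-fold default, and the rule that genotype combinations unseen in training are treated as low-risk are all assumptions made for this sketch. Each combination of genotypes at the chosen factors is labeled high-risk or low-risk, which reduces the multifactor data to a single two-level variable; classification error is then measured on the training folds and prediction error on the held-out fold.

```python
from collections import Counter
import random


def fit_mdr(rows, labels, factors, threshold=1.0):
    """Return the set of genotype combinations labeled high-risk.

    threshold is an assumed case:control ratio cutoff (hypothetical default).
    """
    cases, controls = Counter(), Counter()
    for row, label in zip(rows, labels):
        cell = tuple(row[f] for f in factors)
        (cases if label == 1 else controls)[cell] += 1
    # High-risk when cases / controls > threshold (written without division).
    return {c for c in set(cases) | set(controls)
            if cases[c] > threshold * controls[c]}


def error_rate(high_risk, rows, labels, factors):
    """Fraction of subjects whose status disagrees with the risk label."""
    wrong = 0
    for row, label in zip(rows, labels):
        # Assumption: combinations unseen in training count as low-risk.
        predicted = 1 if tuple(row[f] for f in factors) in high_risk else 0
        wrong += predicted != label
    return wrong / len(rows)


def cross_validate(rows, labels, factors, k=10, seed=0):
    """Return (mean classification error, mean prediction error) over k folds."""
    if len(rows) < k:
        raise ValueError("need at least k subjects")
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    cls_err, pred_err = [], []
    for i in range(k):
        test = idx[i::k]
        held = set(test)
        train = [j for j in idx if j not in held]
        tr_rows, tr_y = [rows[j] for j in train], [labels[j] for j in train]
        te_rows, te_y = [rows[j] for j in test], [labels[j] for j in test]
        model = fit_mdr(tr_rows, tr_y, factors)
        cls_err.append(error_rate(model, tr_rows, tr_y, factors))
        pred_err.append(error_rate(model, te_rows, te_y, factors))
    return sum(cls_err) / k, sum(pred_err) / k


# Toy data (fabricated for illustration): three factors with genotypes 0/1/2;
# status depends only on the interaction of factors 0 and 1.
rows = [(a, b, c) for a in range(3) for b in range(3) for c in range(3)] * 4
labels = [(a + b) % 2 for a, b, _ in rows]
print(cross_validate(rows, labels, factors=(0, 1)))
```

In a fuller analysis, `cross_validate` would be called for every candidate combination of factors (for example via `itertools.combinations`) and the combination with the lowest prediction error retained; the selection rule used by the actual software may differ.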

Availability: Information on obtaining the executable code, example data, example analysis, and documentation is available upon request.

Contact: moore@phg.mc.vanderbilt.edu

Supplementary information: All supplementary information can be found at http://phg.mc.vanderbilt.edu/Software/MDR.

