Simulating Populations with Complex Diseases
Diabetes, breast cancer, multiple sclerosis, Alzheimer’s disease. All are associated with alleles of several genes interacting in complex ways with one another and with the environment. Now, using a computationally intensive method known as forward-time simulation of human populations, researchers are hoping to gain a better understanding of how such complex diseases become established.
“In a real population you just see people with the disease,” says Marek Kimmel, PhD, professor of statistics at Rice University and co-author of the work. “You don’t see who in the population has the disease genes because people carrying these genes do not necessarily become diseased.” But in the model population, he says, “you see both.” And the researchers’ approach allows them to simulate a very complicated scenario—including changes in types of selection pressure.
“This lets us evaluate how well statistical genetics tests determine what genes are responsible for the symptoms of a disease and how frequently those genes appear in the population.” That’s a non-trivial exercise, he says, because it has been impossible, until now, to compare the many existing gene-mapping methods head-to-head. The work was published in PLoS Genetics in March 2007.
Before now, the most commonly used approach to simulating diseases in human populations—called the “coalescent” method—worked by coalescing backward in time to a most-recent common ancestor. But it’s extremely difficult to take selection into account using the coalescent method, says co-author Bo Peng, PhD, a postdoctoral fellow at the University of Texas MD Anderson Cancer Center. Moreover, that approach gets too complicated if more than one disease gene is involved. So Peng and his colleagues turned to forward-time simulation, an approach that’s been around for about one hundred years.
But that technique is not without its problems. When a population evolves forward in time, there are simply too many possible outcomes. Most notably, when you introduce a disease allele, it can rapidly be eliminated and replaced with new alleles. So Peng came up with a trick: He pre-sets the desired disease allele frequencies in the current generation, extrapolates them backward, and starts the simulation from there. As Kimmel puts it, “We are restricting potential variability in one aspect of the present in order to produce a simulation that resembles something close to the actual variability that exists now.”
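To see why a newly introduced allele is so easily lost, consider a minimal forward-time sketch in plain Python. It is not simuPOP code and not the authors' method. It assumes a neutral Wright-Fisher model with a fixed number of gene copies and no selection, and the function names and parameters are hypothetical, chosen only for this illustration.

```python
import random


def next_generation(count, n_copies, rng):
    """One neutral Wright-Fisher step: every gene copy in the next
    generation is drawn at random from the current generation."""
    freq = count / n_copies
    return sum(1 for _ in range(n_copies) if rng.random() < freq)


def simulate_forward(start_count, n_copies, generations, rng):
    """Follow the number of copies of one allele forward in time.

    Returns a list of allele counts, one per generation, starting
    with the initial count."""
    counts = [start_count]
    for _ in range(generations):
        counts.append(next_generation(counts[-1], n_copies, rng))
    return counts


def fraction_lost(start_count, n_copies, generations, replicates, seed=0):
    """Share of independent runs in which the allele has disappeared
    by the final generation."""
    rng = random.Random(seed)
    lost = 0
    for _ in range(replicates):
        trajectory = simulate_forward(start_count, n_copies, generations, rng)
        if trajectory[-1] == 0:
            lost += 1
    return lost / replicates


if __name__ == "__main__":
    # Hypothetical numbers: one new allele copy among 100 gene copies,
    # followed for 100 generations in 200 independent runs.
    print(fraction_lost(start_count=1, n_copies=100,
                        generations=100, replicates=200))
```

In this toy model, most runs that start from a single copy end with the allele gone, so a simulation that simply introduces a disease allele and waits will rarely reach a chosen present-day frequency. That is the difficulty that fixing the present-day frequency in advance is meant to sidestep.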
The simulation uses a scripting language called simuPOP, a general-purpose forward-time simulation environment based on Python. The software is freely available at http://simupop.sourceforge.net, under a GPL license.
When Peng and his colleagues used their method to compare several gene-mapping techniques, they found that certain methods worked better for loci located far from one another, while other methods were more effective when loci were close together. Overall, though, says Kimmel, “We’re mildly pessimistic” about current gene-mapping approaches. “When the number of loci involved in complex disease is greater than two, the methods rapidly lose their power.” Until recently, gene mapping for complex diseases has been disappointing, he says. Loci identified in such efforts have later turned out to be statistical artifacts. “Our modeling could figure out if this is inevitable,” he says, and help guide people toward more effective approaches.
David Balding, PhD, a professor of statistical genetics at Imperial College in London, does similar work using forward-time simulations of large genomic regions. He has become pessimistic about the method’s usefulness for understanding complex diseases because no one really knows what kind of selection is going on. Nevertheless, he says, this work can be useful for studying selection itself. “People tend to look at selection one allele at a time,” he says. “But forward-time simulation lets us do it with complex interactions.”