Despite their identical genomes, cells in the body develop distinct personalities—become neurons or liver cells, for instance—due to differences in gene expression. The mechanism that regulates this process has remained obscure, but a new study explains it using a simple thermodynamic model.
“Much of this phenomenon can be explained by a simple model of protein-protein and protein-DNA interactions,” says principal author Barak Cohen, PhD, of the Washington University Medical School in St. Louis. “In our system there is no need to account for complicated chemical processes.”
According to large-scale studies of eukaryotic genomes, gene expression is turned up and down when transcription factors interact with a zone of noncoding DNA located upstream from the gene—the gene’s promoter. This interaction is complex, and can involve a variety of transcription factors operating in concert. Indeed, a typical promoter may include 20 or more sites that can each bind any one of about 250 known transcription factors. The number of possible promoters and their interactions is thus enormous, but data about their behavior is limited to a few thousand known promoters. “This makes it real hard to tease out the rules of gene regulation,” says Cohen.
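To give a sense of scale, the figures quoted above can be turned into a rough count. The sketch below assumes, purely for illustration, that each of 20 sites independently binds one of 250 factors and that "a few thousand" characterized promoters means 3,000; neither simplification comes from the study itself.

```python
# Figures quoted in the article: about 20 sites per promoter, about 250 factors.
SITES_PER_PROMOTER = 20
KNOWN_FACTORS = 250
# Assumption for illustration: "a few thousand" characterized promoters taken as 3,000.
CHARACTERIZED_PROMOTERS = 3_000

# Each site independently holds one of the known factors (a simplification).
arrangements = KNOWN_FACTORS ** SITES_PER_PROMOTER
coverage = CHARACTERIZED_PROMOTERS / arrangements

print(f"possible arrangements: {arrangements:.2e}")
print(f"fraction characterized: {coverage:.1e}")
```

Under these assumptions the number of arrangements is about 9 × 10^47, so the characterized promoters cover a vanishingly small fraction of the possibilities.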
To make the problem tractable, Cohen and his collaborators built 2800 synthetic promoters, each combining three to five transcription factor binding sites drawn from about 20 known sites. Experiments on yeast cells showed that varying the mini-promoter driving a gene changed its expression by nearly three orders of magnitude. To analyze the promoter-expression relationship, the researchers invoked a thermodynamic model developed in earlier studies. In this model the interactions between proteins and their binding sites either help or hinder the recruitment of RNA polymerase (the enzyme that transcribes DNA into RNA) to the promoters. The researchers “trained” the model using measured gene expression levels for a set of about 400 promoters, and tested it on an independent set of another 83 promoters.
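The general idea behind a thermodynamic model of this kind can be sketched in a few lines. The code below is a generic textbook-style illustration, not the study's actual model: it enumerates every combination of occupied binding sites, gives each state a Boltzmann weight, and reports the probability that polymerase is bound. The function name, the parameters, and all numerical values are hypothetical.

```python
from itertools import product
from math import exp


def polymerase_occupancy(site_weights, tf_pol_energies, pol_weight=0.01):
    """Probability that RNA polymerase is bound to the promoter.

    site_weights: statistical weight of each site being occupied by its
        factor (roughly concentration times binding affinity).
    tf_pol_energies: interaction energy, in units of kT, between the
        factor at each site and polymerase. Negative values help
        recruitment (activation); positive values hinder it (repression).
    pol_weight: hypothetical weight of polymerase on a bare promoter.
    """
    z_bound = 0.0  # summed weight of states with polymerase bound
    z_total = 0.0  # summed weight of all states
    for occupied in product([0, 1], repeat=len(site_weights)):
        weight = 1.0
        interaction = 1.0
        for occ, q, energy in zip(occupied, site_weights, tf_pol_energies):
            if occ:
                weight *= q
                interaction *= exp(-energy)
        with_pol = weight * pol_weight * interaction
        z_total += weight + with_pol  # polymerase absent plus polymerase present
        z_bound += with_pol
    return z_bound / z_total


# Hypothetical promoters: bare, one activator site, one repressor site.
bare = polymerase_occupancy([], [])
activated = polymerase_occupancy([1.0], [-2.0])
repressed = polymerase_occupancy([1.0], [2.0])
# Two weakly occupied activator sites versus one.
one_weak = polymerase_occupancy([0.2], [-2.0])
two_weak = polymerase_occupancy([0.2, 0.2], [-2.0, -2.0])
```

In this toy version, predicted expression is taken to be proportional to polymerase occupancy, and fitting the weights and energies to measured expression levels would play the role of the "training" step. The sketch omits interactions between the factors themselves, which the researchers' description of protein-protein interactions suggests a fuller model would include.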
The trained thermodynamic model explained nearly 50 percent of the variation in gene expression for the training set, and about 44 percent of the variation for the independent set. In contrast, empirical models relying on genomic data explain less than 25 percent of the variation in gene expression, says Cohen. The system also showed how weak binding sites cooperate to regulate gene expression, an effect that prior models failed to address. When applied to actual yeast genome data, the system found that Mig1, a transcription factor associated with glucose metabolism, regulated several additional genes not previously known to be regulated by this protein. “This is remarkable because Mig1 is one of the most widely studied transcription factors,” says Cohen.
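The article does not say how "variation explained" was computed. One common definition is the coefficient of determination, R², which compares a model's prediction errors with the spread of the measurements around their mean. The sketch below assumes that definition and uses made-up toy numbers, not data from the study.

```python
def r_squared(observed, predicted):
    """Fraction of the variance in `observed` accounted for by `predicted`."""
    mean_obs = sum(observed) / len(observed)
    ss_total = sum((y - mean_obs) ** 2 for y in observed)
    ss_resid = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1.0 - ss_resid / ss_total


# Toy values for illustration only (hypothetical, not from the study).
toy_observed = [1.0, 2.0, 3.0, 4.0]
toy_predicted = [1.5, 1.5, 3.5, 3.5]
toy_score = r_squared(toy_observed, toy_predicted)
```

Reporting the score separately for the training promoters and for the held-out promoters, as the researchers did, shows whether a model generalizes or has merely memorized its training data.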
In addition to shedding light on gene regulation, the findings could facilitate in silico engineering of promoters with completely novel expression patterns, says Cohen. Such custom-designed promoters could be a boon for stem cell development, tissue engineering, regeneration, and similar areas. As a step toward this goal, the researchers plan to extend their work to mammalian cells, Cohen says.
“This paper is an important advance developing quantitative models for transcriptional regulation,” says Eran Segal, PhD, of the Weizmann Institute of Science in Israel. “It shows on a large scale what has been demonstrated previously on smaller sets of genes in fly and bacteria.” Paturu Kondaiah, PhD, of the Indian Institute of Science in Bangalore, agrees with this assessment but points out that transcription factors behave differently depending on their conformation and can also recruit co-activators or co-repressors. “The next step is to take these effects into account,” he says.