RNA Takes Shape
Predicting RNA structure remains an open challenge, but progress is being made
RNA is not just a single-stranded template. Like proteins, many RNA molecules can fold into three-dimensional structures that catalyze reactions and regulate gene expression. Predicting this structure, though, remains an open challenge. Scientists at the University of Montreal have devised a novel way to attack the problem, which they describe in the March 6 issue of Nature.
“Our approach is to generate a more complete RNA secondary structure and from there go to three dimensions directly, whereas before, going to 3-D from secondary structure was impossible,” says François Major, PhD, professor of computer science and operations research, who developed the method with graduate student Marc Parisien.
RNA nucleotides bind with each other to form secondary structures such as hairpins (a stem with a loop) and helices. Though most nucleotides pair according to Watson-Crick or wobble rules (C-G, A-U, and G-U), a small number (about 15 percent of nucleotides in hairpins, for example) form alternate pairings, such as A-C or a G-U-A base triple, in which the bases meet in different orientations. Previous programs have fallen short of predicting these “non-canonical” pairings, which are key to 3-D structure and often drive the most interesting geometries, such as loops, bulges, and twists.
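The pairing rules above can be illustrated with a minimal Python sketch. It sorts a pair into Watson-Crick, wobble, or non-canonical by base identity alone. This is a simplification: real classification also depends on the orientation in which the bases meet, which the sketch ignores. The function name `classify_pair` is hypothetical and is not taken from the researchers' software.

```python
# Base-identity rules only. Geometry (which edges of the bases interact)
# is deliberately ignored in this illustrative sketch.
WATSON_CRICK = {frozenset("CG"), frozenset("AU")}
WOBBLE = {frozenset("GU")}


def classify_pair(a: str, b: str) -> str:
    """Classify a pair of RNA bases by identity (hypothetical helper)."""
    pair = frozenset((a.upper(), b.upper()))
    if pair in WATSON_CRICK:
        return "watson-crick"
    if pair in WOBBLE:
        return "wobble"
    # Anything else, such as A-C or G-A, counts as non-canonical here.
    return "non-canonical"


if __name__ == "__main__":
    for a, b in [("G", "C"), ("U", "G"), ("A", "C")]:
        print(a, b, classify_pair(a, b))
```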
To better predict non-canonical pairings, Major and Parisien identified 19 regular, repeated small motifs (mostly 3 to 5 nucleotides) in solved RNA structures. They call these the RNA structural alphabet or “nucleotide cyclic motifs” (NCMs). The most common “letter” (or NCM) consists of two Watson-Crick base pairs stacked on top of each other; a bunch of these together form a basic helix. But many of the other NCMs are defined by non-Watson-Crick base pairs. One example is a four-nucleotide loop with a G-A pair at the bottom.
To determine the 3-D structure of a given RNA primary sequence, Major and Parisien feed it through two programs: MC-Fold and MC-Sym. MC-Fold enumerates all possible base pairings (including non-canonicals) and all possible arrangements of NCMs. It then picks the most probable arrangement based on statistical data from solved RNA structures. Next, MC-Sym translates the NCMs directly into 3-D structures. The pipeline is available as a web service (http://www.major.iric.ca/MC-Pipeline/). Currently, accuracy is limited to sequences of fewer than 75 base pairs—unless experimental or multiple-sequence data are incorporated into the program, Major says.
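The enumerate-then-rank idea can be sketched with a toy Python example. It is not MC-Fold: instead of scoring arrangements of NCMs with statistics from solved structures, it lists every non-crossing set of base pairs for a short sequence and ranks them with made-up per-pair scores. The score table, the minimum loop size, and all function names are assumptions chosen for illustration.

```python
from typing import Iterator, Tuple

Pairs = Tuple[Tuple[int, int], ...]

# HYPOTHETICAL scores, not from the paper. Higher means "more probable".
PAIR_SCORE = {
    frozenset("CG"): 3.0,
    frozenset("AU"): 2.0,
    frozenset("GU"): 1.0,
    frozenset("AG"): 0.5,  # stand-in for one non-canonical pair
}
MIN_LOOP = 3  # assumed minimum number of unpaired bases inside a hairpin


def structures(seq: str, i: int, j: int) -> Iterator[Pairs]:
    """Yield every non-crossing set of allowed pairs within seq[i..j]."""
    if i >= j:
        yield ()
        return
    # Case 1: position i stays unpaired.
    yield from structures(seq, i + 1, j)
    # Case 2: position i pairs with some k further along the chain.
    for k in range(i + MIN_LOOP + 1, j + 1):
        if frozenset((seq[i], seq[k])) in PAIR_SCORE:
            for inner in structures(seq, i + 1, k - 1):
                for outer in structures(seq, k + 1, j):
                    yield ((i, k),) + inner + outer


def score(seq: str, pairs: Pairs) -> float:
    """Sum the toy scores of all pairs in one candidate structure."""
    return sum(PAIR_SCORE[frozenset((seq[i], seq[k]))] for i, k in pairs)


def best_structure(seq: str) -> Pairs:
    """Return the highest-scoring candidate (exhaustive; short inputs only)."""
    return max(structures(seq, 0, len(seq) - 1), key=lambda p: score(seq, p))


def to_dot_bracket(length: int, pairs: Pairs) -> str:
    """Render pairs in dot-bracket notation, e.g. '((...))'."""
    chars = ["."] * length
    for i, k in pairs:
        chars[i], chars[k] = "(", ")"
    return "".join(chars)


if __name__ == "__main__":
    seq = "GGGAAACCC"
    print(to_dot_bracket(len(seq), best_structure(seq)))
```

Exhaustive enumeration like this grows very quickly with sequence length, so the sketch is only practical for sequences of a dozen or so nucleotides.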
As a test case, Major and Parisien folded several precursor microRNAs (with previously unknown structures). Such molecules would be expected to share a common structural element for binding to the enzyme Dicer, which processes them into functional microRNAs. The result: despite different primary sequences as well as non-canonical base pairs and bulges, the pre-microRNAs all folded into double helices.
“That’s a pretty powerful result,” comments Philip Bevilacqua, PhD, professor of chemistry at Penn State University. “I think this method is going to be of practical benefit to the RNA community,” he says. “This has the potential for enormous impact, and hopefully it will get fulfilled.”