Teaching an Old Model New Tricks
Hidden Markov models estimate DNA loop kinetics
The hidden Markov model—a statistical model used for decades in fields as diverse as speech recognition and climatology—has received an update and a new application. Researchers at the University of Pennsylvania and Emory University adapted the model for tethered molecule experiments, and used it to obtain the most accurate estimates to date of the kinetics of DNA looping. Their results appeared online in Biophysical Journal on February 2, 2007.
“Before now, no one has ever been able to measure the kinetics of DNA loop formation and breakdown in a realistically sized system,” comments Philip Nelson, PhD, professor of physics at the University of Pennsylvania. DNA forms loops to turn certain genes off; accurate measurements of the rates of looping and unlooping are needed to build realistic models of this switching mechanism.
Researchers cannot directly see a strand of DNA in action, so they use a trick pioneered by Laura Finzi, PhD (now at Emory University), and Jeff Gelles, PhD (Brandeis University): they tether one end of the DNA to a microscope slide and attach a visible bead to the other end. Single-particle tracking of the bead’s motion is used to infer the DNA’s state: when the DNA is looped, the bead is pulled closer to the slide and its range of movement is more limited. Previously, researchers analyzed the data by averaging the motion of the bead within fixed windows of time, a procedure called “binning the data.” But this method is imperfect because the results are heavily influenced by the choice of bin size. So Nelson’s team turned to hidden Markov models.
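To see why the choice of bin size matters, consider a minimal Python sketch on synthetic data. Everything in it is an assumption made for illustration and is not taken from the study: the per-frame switching probabilities, the two noise levels, the 0.75 threshold, and the helper names `windowed_rms` and `count_transitions`.

```python
import math
import random

def windowed_rms(xs, width):
    """RMS excursion in consecutive, non-overlapping windows of `width` frames."""
    return [
        math.sqrt(sum(x * x for x in xs[i:i + width]) / width)
        for i in range(0, len(xs) - width + 1, width)
    ]

def count_transitions(labels):
    """Number of places where neighboring labels differ."""
    return sum(1 for a, b in zip(labels, labels[1:]) if a != b)

# Synthetic trace (illustrative values): state 0 = unlooped, 1 = looped.
rng = random.Random(0)
state, trace, true_states = 0, [], []
for _ in range(4000):
    if rng.random() < (0.02 if state == 0 else 0.05):
        state = 1 - state
    true_states.append(state)
    # A looped tether restricts the bead, so its excursions are smaller.
    trace.append(rng.gauss(0.0, 0.5 if state else 1.0))

print("true transitions:", count_transitions(true_states))
for width in (5, 20, 80):
    # Call a window "looped" when its RMS excursion falls below the threshold.
    labels = [1 if r < 0.75 else 0 for r in windowed_rms(trace, width)]
    print("bin width", width, "-> apparent transitions:", count_transitions(labels))
```

In a sketch like this, short windows tend to produce noisy labels with spurious transitions, while long windows can average away brief loops, so the apparent kinetics shift with the bin width.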
“Hidden Markov models have a long and illustrious history in the study of single ion channels, but recently they have also increasingly been the method of choice when analyzing single-molecule biophysics experiments,” Nelson says. Hidden Markov models help scientists make inferences about quantities they cannot observe (e.g., DNA states) from observable but noisy data (e.g., bead movements). The algorithm estimates the unknown rates by finding the values that make the observed data most likely, an approach known as maximum likelihood.
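The idea of choosing the rates that make the data most likely can be sketched with a classic two-state hidden Markov model. This is a toy illustration, not the researchers' code: the Gaussian noise levels, the per-frame switching probabilities, the coarse grid search, and the function names are all assumptions.

```python
import math
import random

def simulate(n, p_loop, p_unloop, sigma=(1.0, 0.5), seed=0):
    """Simulate hidden states (0 = unlooped, 1 = looped) and noisy readings."""
    rng = random.Random(seed)
    state, states, obs = 0, [], []
    for _ in range(n):
        # Switch with the per-frame probability that applies to the current state.
        if rng.random() < (p_loop if state == 0 else p_unloop):
            state = 1 - state
        states.append(state)
        obs.append(rng.gauss(0.0, sigma[state]))
    return states, obs

def gauss_pdf(x, s):
    """Density of a zero-mean Gaussian with standard deviation s."""
    return math.exp(-0.5 * (x / s) ** 2) / (s * math.sqrt(2 * math.pi))

def log_likelihood(obs, p_loop, p_unloop, sigma=(1.0, 0.5)):
    """Scaled forward algorithm: log P(obs | rates), summed over hidden paths."""
    trans = [[1 - p_loop, p_loop], [p_unloop, 1 - p_unloop]]
    alpha = [0.5, 0.5]  # assumed uniform belief about the state before frame 1
    total = 0.0
    for x in obs:
        new = [
            sum(alpha[i] * trans[i][j] for i in range(2)) * gauss_pdf(x, sigma[j])
            for j in range(2)
        ]
        norm = new[0] + new[1]
        total += math.log(norm)             # accumulate the log-likelihood
        alpha = [v / norm for v in new]     # rescale to avoid underflow
    return total

_, obs = simulate(4000, p_loop=0.02, p_unloop=0.05)
grid = [0.005, 0.02, 0.05, 0.1, 0.3]
best = max(
    ((a, b) for a in grid for b in grid),
    key=lambda rates: log_likelihood(obs, *rates),
)
print("most likely (p_loop, p_unloop) on the grid:", best)
```

A practical analysis would replace the grid search with a proper optimizer and would also fit the noise parameters; the grid is used here only to keep the sketch short.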
“For a physicist, it’s really beautiful to see the same ideas getting recycled in very different contexts,” Nelson says. “But we had a technical challenge, we couldn’t just take it off the shelf and use it because the classic set up wasn’t quite applicable.” Classic hidden Markov modeling assumes that the noise in each observation is independent of the noise in every other observation. In tethered particle analysis, however, this assumption is violated: the position of the bead at one moment depends on its position the instant before. So Nelson’s team built a new model, called a diffusive hidden Markov model, that accounts for this dependency.
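One way to picture the dependency is to let the probability of each bead position be conditioned on the previous position as well as on the hidden state. The sketch below does this with a simple autoregressive step, which is an assumption chosen for illustration and not the model described in the paper; the parameter values in `PARAMS` and all function names are likewise hypothetical.

```python
import math
import random

# Assumed (memory, noise) pair per state: 0 = unlooped, 1 = looped.
PARAMS = [(0.9, 0.44), (0.8, 0.30)]

def ar1_step_pdf(x, x_prev, a, s):
    """Density of x given x_prev when x = a * x_prev + Gaussian noise (std s)."""
    z = (x - a * x_prev) / s
    return math.exp(-0.5 * z * z) / (s * math.sqrt(2 * math.pi))

def simulate_bead(n, p_loop, p_unloop, params, seed=1):
    """Bead trace whose next position depends on the current one."""
    rng = random.Random(seed)
    state, x, xs = 0, 0.0, []
    for _ in range(n):
        if rng.random() < (p_loop if state == 0 else p_unloop):
            state = 1 - state
        a, s = params[state]
        x = a * x + rng.gauss(0.0, s)
        xs.append(x)
    return xs

def lag1_autocorr(xs):
    """Correlation between each position and the one in the next frame."""
    m = sum(xs) / len(xs)
    num = sum((a - m) * (b - m) for a, b in zip(xs, xs[1:]))
    den = sum((a - m) ** 2 for a in xs)
    return num / den

def diffusive_log_likelihood(xs, p_loop, p_unloop, params):
    """Forward algorithm with emissions conditioned on the previous position."""
    trans = [[1 - p_loop, p_loop], [p_unloop, 1 - p_unloop]]
    alpha = [0.5, 0.5]  # assumed uniform belief about the initial state
    total = 0.0
    for x_prev, x in zip(xs, xs[1:]):
        new = [
            sum(alpha[i] * trans[i][j] for i in range(2))
            * ar1_step_pdf(x, x_prev, *params[j])
            for j in range(2)
        ]
        norm = new[0] + new[1]
        total += math.log(norm)
        alpha = [v / norm for v in new]
    return total

xs = simulate_bead(4000, 0.02, 0.05, PARAMS)
print("lag-1 autocorrelation:", round(lag1_autocorr(xs), 2))
print("log-likelihood at the simulated rates:",
      round(diffusive_log_likelihood(xs, 0.02, 0.05, PARAMS), 1))
```

The strong frame-to-frame correlation in the simulated trace is what an independent-noise model would ignore; conditioning each step on the previous position is one simple way to account for it.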
The resulting estimates of the rates of loop formation and breakdown were robust: they did not change when the team re-analyzed the data after removing every other data point.
“I think their approach seems very novel and sound, and it’s clear that by doing this they can obtain more accurate information about DNA looping kinetics,” says Taekjip Ha, PhD, associate professor of physics at the University of Illinois at Urbana-Champaign. Ha has done work using hidden Markov modeling for single-molecule fluorescence studies not involving tethered molecules.