Dynamic Simulations of BiomoleculesComputational techniques have emerged rapidly in recent years as an important complement to experiment. Molecular dynamics simulations, in particular, add an exciting element of motion in time, an aspect not easy to capture by experiment for large, floppy macromolecules in solution.
Dr. Schlick's theoretical laboratory is focusing on developing new algorithmic approaches for dynamic simulations as well as modeling tools for studying biological molecules on larger spatial and temporal scales. New numerical approaches are needed to increase our ability to reliably simulate molecular motion, thereby enhancing our understanding of biological function through the structure/function connection. Specifically, algorithms are being developed to permit longer dynamic simulations than possible to date. To realize computational speedup for macromolecular systems, these programs are adapted to advanced computer architectures. New modeling approaches are also under development involving simplified (macroscopic) representations of DNA to permit the study of large spatial scales, very difficult to achieve in all-atom models. Applications to supercoiled DNA, polypeptides, and protein folding are pursued in collaboration with biophysicists.
New Algorithms for Molecular DynamicsThe group's integration scheme based on normal modes and implicit integration, "LIN", passed the important break-even milestone last year. By solving the linearized equations of motion numerically, rather than analytically, and by optimizing other components of the method (via adaptive timestep selection and accelerated minimization), the additional work required to implement the implicit solver (which makes the large timestep possible) was more than compensated for by the decreased number of steps required to cover the same simulation time. Tests of LIN on a model dipeptide and the small protein bovine pancreatic trypsin inhibitor (BPTI) showed excellent agreement with explicit trajectories generated at timesteps of 0.5 fs. Significantly, the LIN timestep was 60 times larger for the dipeptide and 30 times larger for BPTI. This advance was achieved within the framework of the widely-used CHARMM program developed at Harvard.
A delightful surprise also emerged from these studies which addressed the computational competitiveness of LIN. Examination of the range of validity of the harmonic approximation led to development of a related method termed LN (for Langevin/Normal Modes), which includes LIN's linearization, but not correction, step. LN at a timestep of 5 fs also shows very good agreement with traditional MD and, already for BPTI (904 atoms), gives a speedup factor of four on a serial machine. Speedup of LN increases with system size.
This unexpected windfall illustrates the value of developing novel approaches (e.g., based on normal-mode techniques) that might initially appear not practical for macromolecules.
Important work lies ahead, involving: extension of the LIN and LN methods to large biomolecules in aqueous environments, incorporation of LN into fast electrostatic treatments, and establishment of practical speedups with increasing system size. Since the speedup factor in LN depends on calculations of a sparse Hessian (matrix of second derivatives), sparse data structures and sparse matrix routines will be necessary to realize optimal speedups. Parallel implementations might also offer additional speedup opportunities. Further algorithmic improvements can also be devised to allow still larger timesteps to be used and/or larger computational gains.
Error Analysis for Langevin DynamicsThe notion of error in practical molecular and Langevin dynamics simulations of large biomolecules is far from understood because of the relatively large value of the timestep used, the short simulation length, and the low-order methods employed. Recent reports on inconsistency and stability problems in large-scale molecular dynamics simulations of protein and nucleic acids stimulated our interest in this problem.
We began a rigorous analysis of model problems by analyzing the behavior of representative implicit and explicit integration algorithms for the Langevin equation. Specifically, we derived: local stability criteria for these integrators; averages of the potential, kinetic, and total energy; and various properties at special limiting cases of physical relevance (e.g., the timestep and friction constant approaching zero). These results were then compared to the corresponding exact solutions for the continuous problem. Two new functions of practical and theoretical importance emerged: scheme-dependent perturbative damping and scheme-dependent perturbative frequency. Both are numerical artifacts of the integrator used. Interesting differences in the asymptotic behavior among the algorithms became apparent through this analysis, explaining some observations from molecular dynamics simulations, and suggesting algorithmic guidelines for practitioners. Two particular algorithms, "LIM2" (implicit, developed in our group) and "BBK" (explicit, commonly used), appeared the most promising on theoretical grounds. The former allows larger timesteps to be used.
Although this linear analysis serves only as a reference for the highly nonlinear dynamics relevant to macromolecules, it already offering insights of practical value and guidance into selecting various integration schemes. Extensions to the nonlinear case are difficult, but form our very long-term goal.
Applications to Protein DynamicsThe development of methods for enhanced configurational sampling is an important goal in biomolecular simulations. Intrigued by recent successes of hybrid Monte Carlo / molecular dynamics schemes for structure optimization, we developed a dynamics driver approach ("DA") for enhanced sampling based on our implicit integrator for Langevin dynamics. Algorithmic refinements and testing were performed through applications to several dipeptides, a tetrapeptide, and a 13-residue oligoalanine system. We found the DA approach capable of generating pseudo-continuous trajectories that sample more of the relevant conformational space. DA was also successful at reproducing reasonably Boltzmann statistics for the smaller system. The timescale for transition is accelerated in DA but can be estimated by scaling approximations. Indeed, we found the transitional information obtained for the alanine dipeptide and the tetrapeptide consistent with that obtained by several other theoretical approaches. For the oligoalanine helix of 13 residues, the DA trajectory also revealed continuous unfolding and refolding. The range of conformations with an intermediate helix content is consistent with spectroscopic experiments on small peptides, as well as the cooperative two-state model for helix-coil transition.
The good, near-Boltzmann statistics reported for the smaller systems above, in combination with the interesting unfolding results, encouraged us to further explore folding/unfolding processes in proteins. In collaboration with Philippe Derreumaux (CNRS, Paris) and Martin Karplus (Harvard and University of Strasborug), we are studying the dynamics of the protein triosephosphate isomerase (TIM). Our goal is to explore the forces that trigger the large-scale motion of TIM's flexible loop at the active site. This very difficult problem was previously studied with partial success by various approaches.
Our DA simulations of TIM in solvent reveal that intermolecular hydrogen bonds play a crucial role in lowering the energy along the transition pathway of the enzyme. The estimated activation energy barrier to the motion ties well with recent solid-state NMR data. Our simulations further suggest that the opening and closing motion of the TIM loop is a natural dynamic property of the enzyme at room temperature and not necessarily substrate induced. The DA results highlight the crucial role of three key residues in triggering loop motion and suggest that the loop opening/closing process involves a semi-rigid lid with one hinge, rather than a rigid-body movement, as previously thought. This work is ongoing.
Applications to DNA SupercoilingOur studies on long DNA continue with the macroscopic elastic model for long closed-circular DNA systems developed in our group in collaboration with Wilma Olson (Rutgers University). The macroscopic B-spline model for DNA relies on the convenient curve-fitting technique of piecewise interpolation, an indispensable tool in aircraft and automotive industries. (B-spline mathematics was recently used in the film Terminator 2 to display the morphing of Arnold Schwarzenegger's enemy into various forms!)
Our recent studies focused on buckling transitions in superhelical DNA, sudden changes in shape that accompany a smooth variation in a key parameter, such as the superhelical density. These changes were found to depend profoundly on the elastic constants for bending and twisting, important characteristics of DNA's bending and twisting persistence lengths, and on DNA size. An examination of the configurational distributions obtained from thermal ensembles could also be interpreted by the buckling profiles. These views suggested a slightly different experimental procedure for estimating the torsional stiffness of supercoiled DNA than considered to date.
To examine this possibility, an experimental collaboration was established with a chemical engineer and cryoelectron-microscopy expert, Oren Regev (Ben Gurion University of the Negev, Israel). We have begun to produce cryoelectron images of DNA plasmids (kindly provided by Avi Minsky of the Weizmann Institute) and experiment with the laboratory conditions (salt concentration, relative humidity, etc.) to produce optimal photos. Techniques for scanning the microscopic images, as well as for digitizing and analyzing the flat analogs of three-dimensional images, are currently under development.
We are also working on an extension of our B-spline DNA model to allow long-time Brownian dynamics simulations within the classic hydrodynamic polymer framework. We found that application of a common mathematical tool in cryptanalysis (code deciphering) called the singular value decomposition (SVD) can be rigorously applied to generate a bead model from our spline formulation. The mathematical advantage is clear: a more compact and efficient representation, since all the large computations can be done in terms of the lower-dimensional, spline system, which is then converted cheaply to the bead space via SVD.
The algorithmic details have been developed and carefully tested. An important demonstration of the reliability of the new approach is the excellent agreement found between our configurational profiles, generated by Brownian simulations, and the experimentally-reported small-angle X-ray scattering profiles reported as a function of salt concentration. Planned applications of the new Brownian bead-model dynamics involve studies of DNA slithering, a snake-like polymer motion likened to a conveyer-belt-like movement. This slow (microsecond and longer) phenomena is important for understanding the rate of spatial approach among distant segments along the DNA contour. Interesting applications to site-specific recombination reactions are on the horizon.