Statistical clustering techniques for the analysis of long molecular dynamics trajectories: analysis of 2.2-ns trajectories of YPGDV

Biochemistry. 1993 Jan 19;32(2):412-20. doi: 10.1021/bi00053a005.

Abstract

The microscopic interactions and mechanisms leading to nascent protein folding events are generally unknown. While such short time-scale events are difficult to study experimentally, molecular dynamics simulations of peptides can provide a useful model for studying events related to protein folding initiation. Recently, two extremely long molecular dynamics simulations (2.2 ns each) were carried out on the pentapeptide Tyr-Pro-Gly-Asp-Val [Tobias, D. J., Mertz, J. E., & Brooks, C. L., III (1991) Biochemistry 30, 6054-6058] that forms stable reverse turns in solution. Tobias et al. examined folding events in this large system (approximately 30,000 conformations) using traditional methods of trajectory analysis. The sheer magnitude of this problem prompted us to develop an automated approach, based on self-organizing neural nets, to extract the key features of the molecular dynamics trajectory. The neural net is used to perform conformational clustering, which reduces the complexity of a system while minimizing the loss of information. The conformations were grouped together using distances in dihedral angle space as a measure of conformational similarity. The resulting clusters represent "conformational states", and transitions between these states were examined to identify mechanisms of conformational change. Many conformational changes involved the rotation of only a single dihedral angle, but concerted angle changes were also found. Most of the conformational information in the 30,000 samples from the full trajectories was retained in the relatively few resultant clusters, providing a powerful tool for analysis of an expanding base of large molecular simulations.
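
As an illustration only, the clustering idea described in the abstract can be sketched in a few lines of Python. This is not the authors' network: it is a simplified winner-take-all competitive learning scheme without the neighborhood function of a full self-organizing map. All function names, the learning-rate schedule, and the choice of a periodic Euclidean distance over dihedral angles in degrees are assumptions made for this sketch.

```python
import math
import random


def wrap(angle):
    """Map an angle in degrees onto the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def dihedral_distance(a, b):
    """Euclidean distance between two dihedral-angle vectors (degrees),
    taking the 360-degree periodicity of each angle into account."""
    total = 0.0
    for x, y in zip(a, b):
        d = abs(x - y) % 360.0
        d = min(d, 360.0 - d)  # shortest way around the circle
        total += d * d
    return math.sqrt(total)


def nearest(centers, conf):
    """Index of the cluster center closest to a conformation."""
    return min(range(len(centers)),
               key=lambda k: dihedral_distance(centers[k], conf))


def competitive_cluster(confs, n_clusters, epochs=20, rate=0.3, seed=0):
    """Hypothetical winner-take-all clustering: for each conformation,
    only the closest center is moved toward it, with a decaying rate."""
    rng = random.Random(seed)
    centers = [list(c) for c in rng.sample(confs, n_clusters)]
    for epoch in range(epochs):
        lr = rate * (1.0 - epoch / epochs)
        for conf in confs:
            k = nearest(centers, conf)
            # Move along the shortest periodic direction, then re-wrap.
            centers[k] = [wrap(c + lr * wrap(x - c))
                          for c, x in zip(centers[k], conf)]
    return centers


def count_transitions(labels):
    """Count changes of cluster label between consecutive trajectory frames."""
    counts = {}
    for a, b in zip(labels, labels[1:]):
        if a != b:
            counts[(a, b)] = counts.get((a, b), 0) + 1
    return counts
```

Under these assumptions, each frame of a trajectory would be labeled with `nearest(centers, frame)`, and `count_transitions` applied to the resulting label sequence gives a rough picture of which "conformational states" interconvert.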

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Algorithms
  • Amino Acid Sequence
  • Computer Simulation
  • Mathematical Computing*
  • Molecular Sequence Data
  • Oligopeptides / chemistry
  • Protein Conformation*

Substances

  • Oligopeptides