Form Follows Sequence
6 Jan 2001
Dubchak "trains" neural networks, built with computer processors, to recognize sequences that produce SCOP-like folds; at present, about a fourth of new sequences can be matched confidently to folds already in the library. Those that don't match known shapes represent folds that have not yet been discovered (or they signal that the neural network doesn't have enough information or hasn't yet learned to recognize the relationship).
Armed with the knowledge that the fold of a new protein resembles familiar folds, biologists can hypothesize the new protein's evolutionary relationships and biological functions, as well as how it may bind to other proteins and to specific chemicals, including drugs.
However, because entirely different DNA sequences may produce structures of similar topology, large uncertainties remain. For example, the resolution of a neural-network fold prediction may be limited to several times the typical distance between atoms, and two structures possessing the same fold may differ significantly in size.
[Figure: Using global optimization programs such as GOSPEL, small protein structures can be predicted.]
Teresa Head-Gordon seeks to reduce these uncertainties by invoking the GOSPEL, that is, "global optimization strategies to probe energy landscapes." Head-Gordon's goal is to find, among the possible structures for a specific sequence, the one that has the lowest energy.
Neural-network predictions such as Dubchak's supply "soft constraints" on shape and specify known secondary structures such as alpha helices and beta sheets. Applying GOSPEL, with force-field models such as AMBER and CHARMM and descriptions of aqueous solvation learned from theory and experiment, can also resolve the vaguely defined "coil" structures, which are more challenging.
In the course of comparing candidates, the algorithm applies these empirically derived functions to areas of the fold accessible to water; it imposes an extra energy penalty on structures with exposed hydrophobic surfaces. GOSPEL then repeatedly perturbs amino-acid positions to lower the energy further, homing in on the lowest possible total energy.
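The perturb-and-compare idea can be illustrated with a toy model. The sketch below is not GOSPEL and does not use AMBER or CHARMM; it assumes a made-up two-dimensional bead chain, a made-up energy function (a harmonic bond term plus a penalty on hydrophobic beads that have few neighbours, standing in for solvent exposure), and a greedy accept-if-lower rule, which is a local search and far simpler than a true global optimization strategy. All function names and parameters are hypothetical.

```python
import math
import random


def energy(coords, hydrophobic, bond_length=1.0, penalty=1.0, contact_radius=1.5):
    """Toy energy: harmonic bond term plus a penalty for exposed hydrophobic beads."""
    total = 0.0
    # Harmonic term keeps consecutive beads near the ideal bond length.
    for (x1, y1), (x2, y2) in zip(coords, coords[1:]):
        d = math.hypot(x2 - x1, y2 - y1)
        total += (d - bond_length) ** 2
    # Crude stand-in for solvent exposure: a hydrophobic bead with few
    # non-bonded neighbours counts as exposed and is penalised more heavily.
    n = len(coords)
    for i in range(n):
        if not hydrophobic[i]:
            continue
        neighbours = 0
        for j in range(n):
            if abs(i - j) > 1:
                dx = coords[i][0] - coords[j][0]
                dy = coords[i][1] - coords[j][1]
                if math.hypot(dx, dy) < contact_radius:
                    neighbours += 1
        total += penalty / (1 + neighbours)
    return total


def minimize(coords, hydrophobic, steps=2000, step_size=0.2, seed=0):
    """Greedy search: perturb one bead at a time, keep the move only if energy drops."""
    rng = random.Random(seed)
    current = [tuple(c) for c in coords]
    e_current = energy(current, hydrophobic)
    for _ in range(steps):
        i = rng.randrange(len(current))
        x, y = current[i]
        trial = list(current)
        trial[i] = (x + rng.uniform(-step_size, step_size),
                    y + rng.uniform(-step_size, step_size))
        e_trial = energy(trial, hydrophobic)
        if e_trial < e_current:
            current, e_current = trial, e_trial
    return current, e_current


if __name__ == "__main__":
    # Extended chain of eight beads; True marks a hydrophobic bead (assumed pattern).
    start = [(float(i), 0.0) for i in range(8)]
    pattern = [True, False, True, True, False, True, False, True]
    final, e_final = minimize(start, pattern)
    print("start energy:", energy(start, pattern))
    print("final energy:", e_final)
```

Because a move is kept only when it lowers the energy, the final energy can never exceed the starting one, but the search can stall in a local minimum. Avoiding that trap is the reason the article's approach relies on global optimization and on soft constraints from fold prediction rather than on a simple descent like this one.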
Global optimization is a voracious consumer of computer power and time. Using the Cray T3E-900 at NERSC, Head-Gordon and her colleagues have tested their algorithm against simple "target" proteins. In the case of 1POU, for example, a DNA-binding protein with 72 amino acids arranged as several alpha helices, the structure predicted by GOSPEL from sequence gave a reasonable estimate of the fold but had some six percent higher binding energy than the known structure derived from nuclear magnetic resonance (NMR) spectroscopy.
"We have still not reached crystal structure energy yet, so further improvements in structure are still possible!" Head-Gordon exclaims.





