An elaborate knowledge-based energy function is designed for fold
recognition. It is a residue-level single-body potential, so that a
highly efficient dynamic programming method can be used for alignment
optimization. It contains a backbone torsion term, a buried surface
term, and a contact-energy term. The energy score combined with
sequence profile and secondary structure information leads to an
algorithm called SPARKS (Sequence, secondary structure Profiles And
Residue-level Knowledge-based energy Score) for fold
recognition. Compared with the popular PSI-BLAST, SPARKS is 21% more
accurate in sequence-sequence alignment in the ProSup benchmark and 10%,
25%, and 20% more sensitive in detecting family, superfamily, and
fold similarities, respectively, in the Lindahl benchmark.
Moreover, it is one of the best methods for sensitivity (the number of
correctly recognized proteins), alignment accuracy (based on the
MaxSub score), and specificity (the average number of correctly
recognized proteins whose scores are higher than the first false
positives) in LiveBench 7 among more than twenty servers of
non-consensus methods. The simple algorithm used in SPARKS has the
potential for further improvement. This highly efficient method can
be used for fold recognition on genomic scales.
[See Software/Service for the Server]
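Because the energy is a single-body term, the cost of placing query residue i at template position j depends only on the pair (i, j), so all terms can be folded into one position-by-position score matrix and optimized by standard dynamic programming. The sketch below illustrates that idea only; it is not the SPARKS implementation. The function names, the linear weights (w_ss, w_energy), and the constant gap penalty are assumptions introduced for illustration.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def combined_score(profile: Matrix, sec_struct: Matrix, energy: Matrix,
                   w_ss: float = 1.0, w_energy: float = 1.0) -> Matrix:
    """Merge per-position terms into one score matrix (hypothetical weighting).

    Lower energy is better, so the energy term is subtracted.
    """
    return [
        [profile[i][j] + w_ss * sec_struct[i][j] - w_energy * energy[i][j]
         for j in range(len(profile[0]))]
        for i in range(len(profile))
    ]


def align(match: Matrix, gap: float = -1.0) -> Tuple[float, List[Tuple[int, int]]]:
    """Global alignment (Needleman-Wunsch) with a constant gap penalty."""
    n = len(match)
    m = len(match[0]) if n else 0
    # best[i][j]: best score for the first i query and first j template positions
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        best[i][0] = i * gap
    for j in range(1, m + 1):
        best[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best[i][j] = max(best[i - 1][j - 1] + match[i - 1][j - 1],
                             best[i - 1][j] + gap,
                             best[i][j - 1] + gap)
    # Trace back from the corner to recover the aligned (query, template) pairs.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        if best[i][j] == best[i - 1][j - 1] + match[i - 1][j - 1]:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif best[i][j] == best[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return best[n][m], pairs
```

The alignment step costs O(nm) for a query of length n and a template of length m, which is what makes scanning a whole fold library practical. A pairwise (two-body) energy would not fit this scheme, because the score of one aligned pair would then depend on how other positions are aligned.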
Helices in membrane-spanning regions are more tightly packed than the helices
in soluble proteins. Thus, we introduce a method that uses a simple scale of
burial propensity and a new algorithm to predict transmembrane helical (TMH)
segments, and a positive-inside rule to predict N-terminal orientation. The
method (the topology predictor of Transmembrane Helical proteins Using Mean
BUrial Propensity, or THUMBUP) correctly predicted the topology of 55
out of 73 proteins (75%) with known three-dimensional structures (the
3D_helix database). This is the best accuracy achieved by any current
state-of-the-art method. Moreover, we found that the 1D_helix database, because of
its inaccuracy, should be avoided as either a training or a testing database.
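The two ingredients named above, a windowed mean of a per-residue burial propensity and the positive-inside rule, can be sketched as follows. This is a minimal illustration and not the published THUMBUP algorithm: the scale is passed in by the caller, and the window length, the threshold, and the tie-breaking rule are all hypothetical choices.

```python
from typing import Dict, List, Tuple


def window_means(seq: str, scale: Dict[str, float], window: int) -> List[float]:
    """Mean propensity of every full-length window along the sequence."""
    vals = [scale.get(aa, 0.0) for aa in seq]
    return [sum(vals[i:i + window]) / window
            for i in range(len(seq) - window + 1)]


def predict_segments(seq: str, scale: Dict[str, float],
                     window: int, threshold: float) -> List[Tuple[int, int]]:
    """Return (start, end) ranges, end exclusive, covered by high-scoring windows."""
    covered = [False] * len(seq)
    for i, mean in enumerate(window_means(seq, scale, window)):
        if mean >= threshold:
            for k in range(i, i + window):
                covered[k] = True
    segments, start = [], None
    for i, flag in enumerate(covered + [False]):  # sentinel closes a final run
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            segments.append((start, i))
            start = None
    return segments


def n_terminus_inside(seq: str, segments: List[Tuple[int, int]]) -> bool:
    """Positive-inside rule: the side whose loops carry more K/R is taken as inside.

    Loops alternate sides of the membrane, so even-numbered loops share the
    side of the N-terminus. Ties are resolved toward 'inside' (an assumption).
    """
    loops, prev = [], 0
    for start, end in segments:
        loops.append(seq[prev:start])
        prev = end
    loops.append(seq[prev:])
    positives = [loop.count("K") + loop.count("R") for loop in loops]
    return sum(positives[0::2]) >= sum(positives[1::2])
```

With a toy scale such as {"L": 1.0} (made up for this example), a window of 8, and a threshold of 0.9, the sequence "KRK" + "L" * 8 + "GSG" + "L" * 8 + "KK" yields two segments and an N-terminus predicted inside.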
Topology prediction of transmembrane proteins using mean burial propensity
Cooperative binding of ligands to proteins is one of the methods
that nature uses to increase binding efficiency and regulate binding activity.
Understanding the mechanism of binding cooperativity is one of the central problems
in molecular biology. Much of the current understanding of binding cooperativity
is built on experimental and theoretical studies of human tetrameric hemoglobin.
However, a detailed dynamic mechanism from one crystallographic endpoint to the
other is still missing due to limited experimental information on partially liganded
intermediates and the difficulty of simulating conformational changes that have
a time scale longer than tens of nanoseconds. In this paper, we investigate partially
liganded intermediate states of a small but strongly cooperative Scapharca
dimeric hemoglobin using molecular dynamics simulation methods and reveal the
direct role of water molecules in binding cooperativity.
[See publication #54] [See a movie clip]
One of the bottlenecks for the solution of the protein
folding problem is the lack of an accurate potential to describe the
interactions among amino acid residues and the interactions between the amino
acids and the aqueous solvent. This is a complex and challenging problem because
it involves the interplay among several different types of interactions. The
interaction potential that would yield a complete understanding of the folding
phenomena should be derived from the laws of physics. However, the use of such
physics-based potentials for ab initio folding studies is limited by available
computing power. An alternative method is to extract the potential of mean force
from known protein structures. This yields what are called knowledge-based
statistical potentials. They are simpler and easier to use than physics-based
potentials. The distance-dependent structure-derived potentials developed so far
have all employed a reference state that can be characterized as a residue
(atom)-averaged state. Here, we establish a new reference state based on the
principle of statistical mechanics. Results show that the new method significantly
improves structure-derived potentials of mean force for structure selection
and stability prediction. [See publication #52] [PDF]
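A knowledge-based potential of mean force is typically obtained by comparing observed pair counts with the counts expected in a reference state, u(r) = -RT ln[N_obs(r) / N_exp(r)]. The abstract above does not spell out the reference state, so the sketch below makes an explicit assumption: the expected count in each distance bin is scaled from the count in the cutoff bin as (r / r_cut)^alpha, corrected for bin width. The function name, the exponent alpha, the binning, and the handling of empty bins are all hypothetical and chosen only for illustration.

```python
import math
from typing import List


def pair_potential(n_obs: List[float], r_mid: List[float], dr: List[float],
                   alpha: float, rt: float = 1.0) -> List[float]:
    """Distance-dependent potential for one atom-type pair.

    n_obs : observed pair counts per distance bin (last bin = cutoff bin)
    r_mid : bin midpoints; dr : bin widths
    alpha : assumed exponent for how reference counts grow with distance
    """
    r_cut, dr_cut, n_cut = r_mid[-1], dr[-1], n_obs[-1]
    potential = []
    for n, r, width in zip(n_obs, r_mid, dr):
        expected = (r / r_cut) ** alpha * (width / dr_cut) * n_cut
        if n > 0 and expected > 0:
            potential.append(-rt * math.log(n / expected))
        else:
            # No statistics in this bin: set to zero here (a simplification).
            potential.append(0.0)
    return potential
```

By construction the potential is zero in the cutoff bin, negative where a pair is observed more often than the reference predicts, and positive where it is observed less often.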
It has been well established that the folding transition of
many proteins, in particular small globular proteins, is a first-order-like
transition (i.e., it is a two-state transition with no detectable intermediates).
Proteins can fold cooperatively either from a coil or from a molten globule
state with variable secondary structural contents. The origin of cooperativity,
however, is not fully understood. The proposed origins of proteins' two-state
behavior range from helix-coil transitions, heteropolymer collapse, and sidechain
packing to the existence of elementary folding units. Although simplified models
can exhibit first-order-like transitions, their interpretations vary. In sophisticated
lattice models, the cooperativity arises from multibody interactions while different
mechanisms (collective orientational rearrangement versus cooperative native-contact
formation) are suggested for lattice models with and without sidechains.
Linear regression analysis found that either the contact order
(CO) or the long-range order (LRO) parameter has a significant correlation with the
logarithms of folding rates. This suggests that sequence separation per contact
and total number of contacts are both important in determining the rate of
folding. Here, the two factors are incorporated into a new parameter, total
contact distance (TCD). Significant improvement in correlation is observed. [See publication #44] [PDF]
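One way to see how the two factors combine is to compute them from a contact list. The sketch below assumes the usual relative contact order, CO = sum|i - j| / (n_c * n_r), and takes TCD to be the product of CO and the number of contacts per residue, that is, sum|i - j| / n_r^2. That normalization, the function name, and the min_separation cutoff are assumptions made for illustration; the publication should be consulted for the exact definitions.

```python
from typing import Iterable, Tuple


def contact_parameters(contacts: Iterable[Tuple[int, int]], n_residues: int,
                       min_separation: int = 1) -> Tuple[float, float, float]:
    """Return (contact order, contacts per residue, total contact distance).

    contacts       : residue index pairs (i, j) that are in contact
    min_separation : ignore contacts closer than this in sequence (assumed cutoff)
    """
    seps = [abs(i - j) for i, j in contacts if abs(i - j) >= min_separation]
    n_contacts = len(seps)
    total = sum(seps)
    co = total / (n_contacts * n_residues) if n_contacts else 0.0
    per_residue = n_contacts / n_residues
    tcd = total / n_residues ** 2
    return co, per_residue, tcd
```

For contacts (0, 4), (1, 9), and (2, 5) in a 10-residue chain, the separations sum to 15, giving CO = 0.5, 0.3 contacts per residue, and TCD = 0.15, which equals the product of the first two. The computed values would then be regressed against the logarithms of measured folding rates.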
Distance-scaled, finite ideal-gas reference (DFIRE) state improves structure-derived potentials of mean force for structure selection and stability prediction.
The dual role of a loop with low loop contact distance in folding and domain swapping
α-helices, β-strands, and loops are the basic building blocks of protein
structures. The folding kinetics of α-helices and β-strands has been
investigated extensively. However, little is known about the
formation of loops. Experimental studies show that for some proteins, the formation
of a single loop is the rate-determining step for folding, while for others,
a loop (or turn) can misfold to serve as the hinge loop region for domain-swapped
species. These two seemingly opposite behaviors both appear to characterize
a single loop of a model three-helix bundle (fragment B of Staphylococcal
protein A) in our all-atom folding simulations. To interpret the modeling result,
we developed a simple structural parameter, the loop contact
distance (LCD), defined as the sequence distance of contacting residues between
a loop and the rest of the protein. The parameter is then applied to a number
of other proteins, including SH3 domains and the prion protein. The results suggest
that a locally interacting loop (low LCD) can either promote folding or serve
as the hinge region for domain swapping. Thus, there is an intimate connection
between folding and domain swapping, a possible cause of misfolding and aggregation.
[See publication #50] [PDF]
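The LCD idea can be illustrated with a short calculation over a contact list. The abstract defines LCD only as the sequence distance of contacting residues between a loop and the rest of the protein, so the averaging used below, the function name, and the inclusive loop boundaries are assumptions made for this sketch.

```python
from typing import Iterable, Tuple


def loop_contact_distance(contacts: Iterable[Tuple[int, int]],
                          loop_start: int, loop_end: int) -> float:
    """Mean sequence separation of contacts between a loop and the rest.

    Only contacts with exactly one partner inside the loop
    (loop_start..loop_end, inclusive) are counted.
    """
    def in_loop(k: int) -> bool:
        return loop_start <= k <= loop_end

    seps = [abs(i - j) for i, j in contacts if in_loop(i) != in_loop(j)]
    return sum(seps) / len(seps) if seps else 0.0
```

For a loop spanning residues 10 to 12 and contacts (10, 12), (11, 14), (3, 11), and (20, 30), only (11, 14) and (3, 11) link the loop to the rest of the chain, so the value is (3 + 8) / 2 = 5.5. A low value indicates a loop that interacts mainly with residues nearby in sequence.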
Role of hydrophilic and hydrophobic contacts in folding of the second β-hairpin fragment of protein G: Molecular dynamics simulation studies of an all-atom model
Predicting the folding mechanism of the second β-hairpin fragment of the Ig-binding
domain B of streptococcal protein G is unexpectedly challenging for simplified
reduced models because the models developed so far indicated a different folding
mechanism from what was suggested from high-temperature unfolding and equilibrium
free-energy surface analysis based on established all-atom empirical force fields
in explicit or implicit solvent. This happened despite the use of empirical
residue-based interactions, multibody hydrophobic interactions, and the inclusion
of hydrogen-bonding effects in the simplified models. In this paper, a recently
developed all-atom (except nonpolar hydrogens) model interacting with simple
square-well potentials is employed to fold the peptide fragment by molecular
dynamics simulation methods. Folding of the new all-atom model is found to be
initiated by collapse prior to the formation of main-chain hydrogen bonds. This
verifies the mechanism proposed from previous all-atom unfolding and
equilibrium simulations. The new model further predicts that the collapse is
initiated by two nucleation contacts (a hydrophilic contact between D46 and
T49 and a hydrophobic contact between Y45 and F52), in agreement with recent
NMR measurements. The results suggest that atomic packing and native contact
interactions play a dominant role in the folding mechanism. [See publication #48] [PDF]
Thermodynamics of an all-atom off-lattice model of the fragment B of Staphylococcal protein A: Implication for the origin of the cooperativity of protein folding
Studies of Cα-based (without sidechains)
off-lattice models, on the other hand, failed to produce a first-order coil-to-native
folding transition even for highly optimized sequences. Instead, a strong transition
to a molten globule state, followed by a weak folding transition, is observed.
To better understand the folding thermodynamics as well as the kinetics, a more
accurate off-lattice model is needed; such a model is reported in this paper.
[See publication #47] [PDF]
Folding Rate Prediction Using Total Contact Distance