Get 20M+ Full-Text Papers For Less Than $1.50/day. Start a 14-Day Trial for You or Your Team.

Learn More →

Solvent entropy-driven searching for protein modeling examined and tested in simplified models

Solvent entropy-driven searching for protein modeling examined and tested in simplified models Abstract Solvent entropy is a force to consider in protein folding and protein design but is difficult to model. It is investigated here in the context of the hp model: Two types of residues, hydrophobic and hydrophilic, are modeled on a lattice. Nine chains and two- and three-dimensional simulations are compared. We show that considering solvent entropy alone, efficient folding of lattice chains (identification of the native fold) can be achieved by an entropy-driven simulation on its own. Moreover, in a detailed comparison over a wide range of parameters, entropy-guided searching outperforms an energy-driven search in the model. The combination of energy- and entropy-driven search yields the most efficient searching. It is compared in detail with the above results, indicating also how this solvent shell model may advantageously be implemented in more complex protein modeling simulations. Introduction Solvent entropy has an impact on protein folding and protein interactions as well as protein design. However, the enumeration of the different microstates can become difficult. Implementation of the solvent and especially its entropy is not often found in the literature (Shortle et al., 1992; Warwicker, 1997). Exact calculations on entropy are difficult in real systems. Calculating the number of solvent microstates stresses a different perspective of protein folding and protein stability: The global minimum no longer appears as one rare state among a large number of alternative conformations, but as the protein conformation with the highest number of microstate representations of the solvent. A computationally unintensive way is to approximate the impact of the solvent by mean field calculations (Hsiang-Ai and Karplus, 1988; Cramer and Truhlar, 1992; Fraternali and van-Gunsteren, 1996). The mean field calculations neglect the effect of locally ordered structures and can underestimate solvent–protein interactions (Smith and van-Gunsteren, 1994). Simulations taking each water molecule of the protein surrounding shell into account (Levitt and Sharon, 1988; Braxenthaler et al., 1997; Warwicker, 1997; Scheraga and Hao, 1999) are computationally intensive and usually are only done for simulation times of a few nanoseconds. We present here a promising implementation of solvent entropy in the context of the hp model. As a search method we took an application of the common Monte Carlo (MC) method (Metropolis et al., 1953, Kirkpatrick et al., 1983; Aarts and Korst, 1989), implemented in a standard way for this hp model [see Unger and Moult for further details of this standard implementation (Unger and Moult, 1993)]. The protein chain is modeled on a lattice and only two types of residues are considered, hydrophobic (h) and hydrophilic (p) (Lau and Dill, 1989). We further take the local order of the solvent shell around the protein into account. Two- and three-dimensional models are compared. Long-range interactions in the rest of the solvent (e.g. long-range order brought about by solvating ions or exposed polar groups) are not considered in this simplified model. We show that efficient folding of lattice chains (identification of the native fold) can be achieved by an entropy-driven simulation completely on its own. Furthermore, we show in a detailed comparison in our model that entropy-guided searching can outperform an energy-driven search. A combination of energy- and entropy-driven searching yields the most efficient searching and is compared in detail with the above results. This illustrates, in addition, how our simple solvent shell model may advantageously be implemented in more complex simulations. Materials and methods Two-dimensional hp model All simulations used the hp model (Lau and Dill, 1989) and modeled the protein chain on a lattice. Only two types of residues are considered, hydrophobic (H; filled squares) and hydrophilic (P; open squares). The model meets the basic characteristics of real proteins, e.g. similar distributions of secondary structure (Lau and Dill, 1990). The protein is represented as a chain on a two-dimensional square lattice (Figure 1). At each point the chain can turn 90° left or right or continue ahead. The following chains with 12, 18, 24, 33 and 48 residues, respectively, were tested: P H P P H P P H P P H P H H H P H P H P P H P H P P H P P H P H P P H H P P P P H H P P P P H H P P P P H H P H H P P H H P P P P P H H H H H H H P P H H P P P P H H P P H P P P H P P H H P P H H P P P P P H H H H H H H H H H P P P P P P H H P P H H P P H P P H H H H H These chains yield well-folded compact structures and their global energy minima were known a priori as –4, –9, –8, –14 and –23, respectively. Three-dimensional hp model In the three-dimensional model at each point the chain can turn left, right, straight, up and down; otherwise the construction of the chain is similar to the two-dimensional model. Four chains with 12, 14, 22 and 28 residues and different topologies were tested (Figure 2a): H H P H P H P H P H P H P H H H H P P P P H H H H P H P P H P P H P P H P P H P P H P P H P P H H P P H P P P P H P P H P P P P H P P H P P P P H P P H Their energy minima were also known a priori as –5, –6, –4 and –12, respectively. Energy function The energy function ΔE was kept simplistic. It adds –1 for each direct (non-diagonal) lattice contact between two non-consecutive hydrophobic amino acids. `Clashes' (i.e. two residues in the same place in the lattice) are not allowed. Trials with clashes had to be newly created. Implementation of entropy To study entropy effects, lattice spaces adjacent to the protein chain were modeled to be filled by solvent in the following (small) water ensembles (Figure 2a). Long-range interactions in the rest of the solvent (e.g. long-range order brought about by solvating ions or exposed polar groups) were not considered in this simple model. Small water ensembles with two different properties surrounded the model protein: ordered and less ordered. The ordered water ensembles are adjacent to hydrophobic residues or the solvent (i.e. other water ensembles). The less ordered (with high entropy) exist if no hydrophobic residue is adjacent to them. An O-block (ordered) on the lattice is asigned to the ensemble of ordered water molecules, an L-block to a less ordered ensemble (Figure 2b). The number of unordered water ensembles, Ni, was counted. It has been shown experimentally that hydrophobic molecules reduce the entropy of the surrounding (aqueous) solvent (Schulz and Schirmer, 1996). The solvent entropy difference, S, between one protein chain conformation, N1 and a tested next one, N2, during the simulation was set to be proportional to the difference in the number of unordered adjacent lattice spaces counted (entropy is additive; we assigned 1 unordered block = 1 high-entropy unit = 1/f, with f < 1; whereas 1 ordered block = 1 low-entropy unit = 1). With the order parameter f (see below) the probability for the new configuration to be chosen is p ~ e(TS/T) = f(N1 – N2). This implementation considers the entropy of the solvent according to Boltzmann statistics. The derivation of this can be done by simple algebra (see http://www.embl-heidelberg.de/~dandekar/entropy.html). In contrast, the simple energy function from above does not differentiate between ordered and less ordered ensembles (Figure 2b). Instead, it is derived (–1 for any hydrophobic contact), if the sum (`hydrogen bonds' in our model) of the connections of (water ensembles–water ensembles) and (water ensembles–hydrophilic residues) is counted and compared between two conformations. The order parameter f allowed the testing of different entropy weights during simulations, either alone or multiplying the entropy term, f(N1–N2)), with the energy term, e(–ΔE/T), yielding p ~ e–F/T = e(ΔE – TS)/T to consider also the energy difference ΔE between two chain conformations. The model allowed the examination of three implementations: A, energy effects (ΔE; simplified in the context of the model; only hydrophobic interactions are considered or, alternatively, the effect of `hydrogen bonds'); B, entropy (ΔS, difference of ordered small water ensembles); and C, the combination of them (more realistic implementation, F = E – TS). F denotes the Helmholtz free energy (corresponding also to the Gibbs free energy if changes in volume and pressure are neglected). The new chain conformation was accepted if a random number between zero and 1/constant was smaller than e–F/T. Simulation steps A Monte Carlo algorithm was used with the following steps [implemented as in Unger and Moult (Unger and Moult, 1993)]: Start from a random conformation. From a conformation C1 with energy E1 and entropy S1 a single random change yields conformation C2. If C2 is not clash-free a new conformation is created. Otherwise, the algorithms A, B and C decide by different criteria on the acceptance of the new conformation C2. The variable `energy evaluations' is increased by one. This variable is taken as the time-counting variable since the energy evaluation is often the most time-demanding function in more complex protein folding simulations. The MC implementation was, as in common classical statistical mechanics, the probability of the system being in state i was pi = exp[(E – TS)/T], where T is the artificial temperature, an optimized parameter (details are given below). The Boltzmann constant k was set to 1 for simplicity. T was decreased for algorithms A and C as in the common simulated annealing method (Kirkpatrick et al., 1983). Check if the stop criterion is met. Go back to step 2 if the counter for energy evaluations is <100 000. The value 100 000 was taken as the limit to keep the runs fast (a few seconds for the smallest chain, 30 s for the longest) and results were collected from 100 runs for each different parameter set. Moreover, this was sufficient to allow identification of the global minimum for most of the chains. Two implementations were tested and compared for each algorithm. Implementation 2 models and examines a more detailed partition function as a decision rule to change from conformation C1 to C2 (see Appendix for details), but this may prevent the algorithm from exploring new areas in search space when passing along flat surfaces. A decision constant (see Results) was introduced potentially to balance this effect and compare both implementations for a range of different values of the constant (including constant = 1, no constant introduced; the probability to accept C2 could maximally become 1, corresponding to accepting with certainty a change to conformation C2). Implementation of algorithms A, B and C Algorithm A (energy) Accept C2 if rnd < constant×e–ΔE/T (standard, implementation 1) rnd < constant×1/(1 + e–ΔE/T) (implementation 2) where rnd is a random number between 0 and 1 and T is an artificial temperature that gradually decreased (cooled) during the run to achieve convergence. T started with T0 = 2.00 and was decreased after every 100 energy evaluations with the cooling rate c: T = Ti = Ti – 1c (c < 1) c was optimized as a parameter (see Results). Algorithm B (entropy) Accept C2 if rnd < constant×f(N1–N2) (standard, implementation 1) rnd < constant×1/[1 + f(N2–N1)] (implementation 2) f < 1 was the order parameter and optimized (see Results). Note that the entropy of the protein conformations need not be considered as the algorithm itself samples over a representative ensemble of equally weighted protein microstates. Long-range interactions in the rest of the solvent (e.g. long-range order brought about by solvating ions or exposed polar groups) were not considered in this simple model. Furthermore, calculation of the term is easily effected. Comparison with the algorithm A above shows that the calculation is also not more computationally expensive. Algorithm C (energy and entropy) Algorithm C calculated F = E – TS (the free enthalpy F, equal to the energy E minus the product of temperature T and entropy S) before the decision: rnd < constant×f(N1–N2)e–ΔE/T (standard, implementation 1) rnd < constant×1/[1 + f(N2 – N1)e–ΔE/T] (implementation 2) T and f were taken as in algorithms A and B, respectively. Note that in the comparisons, the optimal folded states (maximally packed, minimum energy but also highest solvent shell entropy) are always the same and known beforehand (cf. Figure 1). Only their energy values are given in the tables for comparison. However, in searching for these states in algorithms B and C their entropy is also considered, comparing alternative conformations. The objective function of the three different algorithms is different and favors different suboptimal folds during searching; the potential surface separating and leading to the optimal folded states is different for the three algorithms. Results In our model the protein is surrounded by water composed of clusters with two different properties, ordered and less ordered, to calculate solvent shell entropy. The ordered water clusters are adjacent to hydrophobic residues and thus engage in hydrogen bonds with hydrophilic residues or the solvent. The less ordered clusters exist if no hydrophobic residue is adjacent to them. For our model no further knowledge about side chain properties is needed. This approach may be compared to an algorithm that favors a reduced hydrophobic surface of the protein. However, the latter would be a mean field approximation of the whole surface, whereas our model takes local structures, the solvent microstates, into account. The search efficiency of this entropy implementation in protein folding was compared with that of energy-driven searching. Different chains on both a two- and a three-dimensional lattice were tested to avoid effects depending on the lattice topography chosen. Three search strategies were compared in Monte Carlo (Metropolis et al., 1953; Kirkpatrick et al., 1983; Unger and Moult, 1993) simulations: Searching for the energy minimum. The number of hydrogen bonds gave the energy for each conformation (Table I). Searching for the conformation allowing the highest amount of microstates for the solvent shell (entropy search; Table II): at each step the new and the old conformation were compared. The conformation with the higher entropy (more microstate representations) of the solvent is taken for the next step. A combination of energy and entropy (Table III). Both the number of hydrogen bonds and the number of conformations were taken into account for search selection. All algorithms were run 100 times with each parameter set (with different random seeds). The comparison was done with five chains in the two-dimensional model and four chains in the three-dimensional model. Except for the two longest chains in the two-dimensional model (33-mer and 48-mer), the global minimum was found quite reliable by all algorithms. If the global minimum was not found by each run, our comparisons indicate how often the minimum was found per 100 runs and the standard deviation of this figure (numbers in parentheses). If the global minimum was found reliably, the number of conformation trials to find it was computed. A difference of >20% in the number of trials required was considered a significant difference between the search strategies examined. Two different implementations (see Materials and methods) to evaluate the probability, including different values of the decision constant to accept a new conformation in the next step of the simulation, were compared in each strategy. Additionally, we compared different parameter values for the cooling rate c in A and C and for the entropy parameter f in B and C. We tested a broad range, for f between 0.9 and 0.001 (0.9, 0.7, 0.5, 0.3, 0.1, 0.05, 0.01, 0.001), for c between 0.999999 and 0.9 (0.999999, 0.99999, 0.9999, 0.999, 0.99, 0.9) and for the decision constant the values 1, 2, 4, 8, 16 and 32. f, c and decision constant were all optimized for the results given. The optimal range for f was 0.1 to 0.7; f = 0.3 produced the best results over all conditions. For c the optimal range was 0.99–0.99999 with c = 0.99 yielding the best results comparing all conditions. A decision constant >2 was advantageous and constant = 4 produced the best results. The search strategy on solvent shell entropy alone is effective in identifying the global minimum. It is more effective in searching than the standard Monte Carlo energy minimization procedure with the simplified energy function in more than 60% of the parameter conditions investigated (33 of 54 cases; including the longer chains). Sixteen cases showed no significant difference, five cases were less efficient, including middle sized chain 3 in two dimensions, implementation 1, constant = 1 and chain 4 in three dimensions, implementation 2, constant = 2. The combination of both search strategies is easily achieved (algorithm C). This yields a further significant improvement for all tested chains including the longer ones in both lattices. It outperformed the energy-driven search (algorithm A) in 45 of 54 cases. In two of the remaining nine cases it showed a clear positive trend. It was also clearly and significantly superior (but here only in 40 of 54 cases) than entropy alone (algorithm B); in 10 cases there was no significant difference. Implementation 2 (see Materials and methods; Appendix for details) was not superior, but slightly less efficient than the standard implementation. Exactly 100 000 steps were compared for each strategy and each chain to allow a fair comparison of all search strategies, comparing batches of 100 runs for each condition; 100 000 steps were taken to collect a large amount of data for the algorithm comparison in a convenient time so that hundreds of runs could be compared and average statistics obtained. On average, this value was not sufficient for the 48-mer on the two-dimensional lattice to find the global minimum. For the 33-mer it was found about once (out of 100 trials) in the better strategies. Therefore, the energy of the best local minimum found was compared if the global minimum was not identified. It was not our aim to identify the global minimum under each condition but rather to compare the different algorithms under simple (global minimum always found) and demanding conditions (only the best strategies succeed sometimes). Note that a gain of one step lower in the energy of a local minimum is a significant search success as the number of local minima increases exponentially with higher energy (Lau and Dill, 1990). Solvent shell-guided searching was also here an improvement compared with the simple energy-guided search strategy and algorithm C also performed best in the comparison for longer chains. It found a local optimum with energy –22 for the 48-mer. It also found the global optimum for the 33-mer four times (Tables I–III). Discussion Entropy is an important concept in protein folding (Scheraga, 1998) including advances from recent work (e.g. Warwicker, 1997; Scheraga and Hao, 1999). In general, the implementation of entropy is complex. It includes the correct enumeration of microstates in microcanonical ensembles and considering that protein entropy decreases while solvent shell entropy increases during protein folding. We have shown in a well established, simple model for protein folding (hp model) and for different types of lattices that consideration of the entropy of the adjacent water shell is sufficient to locate the correct fold for these lattice chains. The target function (high entropy, low number of hydrophobic residues on the surface) differs from the energy function (again focusing on hydrophobicity) with better access to the native fold in the overall comparison. As a result, the search speed of the entropy implementation outperformed an energy-driven search. Several factors are potentially involved in this: Considering the entropic forces of the solvent shell helps to enhance globular packing (cf. Schulz and Schirmer, 1996). Some of the local minima and traps are more smoothed out as conformations with identical energy are distinguished by their solvent shell entropy differences (cf. Figure 2b). Otherwise no further complications are introduced in this simplified model, in particular regarding the energy term. We did not observe trapping in new entropy-driven minima with HP contacts, probably as they can be more easily left than pure energy minima as both entropy forces and energy function may open up potentially favorable changes for such conformations. Considering the microstates focuses on a different perspective of the folding problem. The preferred state of the protein chain is not one rare and single conformation but instead in the context of this model (and in the entropy aspect more realistic than in some other simple protein models) the global minimum of energy displays the highest number of microstates. Furthermore, breathing (rearrangement of microstate ensembles) can be seen and investigated in such models (König and Dandekar, 1999). This allows for the investigation of entropy effects as well as partial unfolding of the protein chain after the global minimum has been reached in these simulations. The solvent shell entropy model presented here achieves efficient folding for a number of simplified protein chains. In the context of the model it was not necessary to include long-range forces to achieve this. This will have to be tested further, in particular further refinement of the structures obtained by inclusion of such forces. The starting information required for any of the search strategies A, B and C was the same, the protein sequence was sufficient. An exhaustive enumeration for some of the smaller protein chains modeled is possible, but here we wanted to compare the speed and efficiency of the different search strategies. The hydrophobic effect was compared and considered both by its entropic and energetic consequences. This focus was chosen as hydrophobic interactions are principal forces in protein folding and to compare both strategies with their respective effects on search success. The combined strategy where energy and entropy are both considered is easily achieved and can accommodate also more complex energy functions than the simplistic one chosen for testing. It identifies the native fold best. It yielded a significantly better search effectiveness on both lattices. Additionally, we compared the algorithms with a standard (as in common MC methods) and an alternative implementation (partition function motivated) of the decision constant and showed implementation-independent, similar results for both implementations. Furthermore, both lattices tested (two- and three-dimensional) showed similar success as a control against effects from lattice choices. The use of a decision constant further improved search success. The search strategies compared fulfill ergodicity over sufficient long observation times and to the extent that the search space is non-reducible and contiguous, any conformation can be reached with some probability from any other conformation and it is aperiodic. Detailed balance is fulfilled for the three search strategies at least so far as reversibility for any conformational change effected is possible during the simulation. Entropy maximization was implemented in the model according to the natural behavior of proteins (Schulz and Schirmer, 1996) and, in fact, entropy maximization occurs during time for any molecular interaction (more details on entropy maximization in protein–solvent systems are given on our Web page). Solvent shell optimization can certainly be considered to have an important impact on protein folding. In our simulations this aspect of folding was shown to be sufficient to localize the optimal protein structure of small model proteins. It should be noted that also other well known factors involved in protein folding such as hydrophobic forces, hydrogen bonds and cooperativity (Kolinski et al., 1999) are in part determined by solvent shell entropic forces. Nevertheless, exact experimental quantification of solvent shell effects remains a challenge. Our simplistic model suggests that at present their contribution to protein folding may perhaps be underestimated. Shortle et al. showed that consideration of entropic forces is helpful in predicting and understanding the effects of mutations (Shortle et al., 1992). Also antibody and other protein–protein interaction models are more precise if entropic effects are included. The inclusion of a simplified solvent shell entropy algorithm as given and compared in detail here improves the search success in protein folding simulations and will be taken to improve more complex folding simulations such as more extended, grid free models as a next step. Appendix We compared the standard implementation to an alternative implementation 2. Implementation 2 considered a more detailed (entropy motivated) partition function comparing the new state C2 with the old state C1:  \[\mathit{p\ }_{\mathit{i}}\ =\frac{1}{\mathit{Z}}e^{\frac{{\mbox{--}}\mathit{E}_{\mathit{i}}}{\mathit{T}}}\mathit{i}\ =\ 1,\ 2\] 1 where Z is the normalization constant (partition function). With pi = 1,  \[\mathit{p}_{2}\ =\frac{e^{\frac{{\mbox{--}}\mathit{E}2}{\mathit{T}}}}{e^{\frac{\mathit{E}1}{\mathit{T}}}\ +\ e^{\frac{{\mbox{--}}\mathit{E1}}{\mathit{T}}}}\ =\ \frac{1}{1\ +\ e^{\frac{{\mbox{--}}{\Delta}\mathit{E}}{\mathit{T}}}}\ e\ ^{\frac{{\mbox{--}}{\Delta}\mathit{E}}{\mathit{T}}}\ with\ \mathit{E}\ =\ \mathit{E}_{2}\ {\mbox{--}}\mathit{E}_{1}\] 2 If C1 is the old and C2 the new conformation, p2 is the exact probability of accepting the new conformation (implementation 2 throughout the text). If the new conformation has the same or only slightly better energy value, the probability of accepting the new conformation is according to the partition function only around 0.5. Metropolis et al. (1953) made the following modification which more easily accepts a new conformation (Figure 3). If E becomes negative, p2 becomes close to 1. If E is positive, we have  \[\frac{1}{1\ +\ e^{\frac{{\mbox{--}}{\Delta}\mathit{E}}{\mathit{T}}}}\ {\varepsilon}\ {[}0.5;1{]}\] 3 Furthermore, expression (3) is close enough to 1 if –E is smaller than or close to T. This yields the approximation  \[\mathit{p}_{2}\ =\ \{\ e^{\frac{{\mbox{--}}{\Delta}\mathit{E}}{\mathit{T}}\ {\forall}\ {\Delta}\mathit{E}{>}0}_{\ 1\ {\forall}\ {\Delta}\mathit{E}{\leq}0}\] 4and is applied in the standard Metropolis Monte Carlo method (called implementation 1 throughout the text). In a similar way, considering again the partition function one can implement the entropy calculation in two ways, replacing e–ΔE/T by f(N1 – N2) (order parameter f; number of microstates N1 and N2). Table I. Identification of the global minimum in 100 trials (`found') is compared for two implementations (imp1 and imp2) and different values for the decision constant c to accept a new conformation: results considering energy only (algorithm A) Chaina  [E]  c = 1  c = 2  c > 2      imp1  imp2  imp1  imp2  imp1  imp2      Found  Stepsa (dev)b [E]c  Found  Steps (dev) [E]  Found  Steps (dev) [E]  Found  Steps (dev) [E]  Found  Steps (dev) [E]  Found  Steps (dev) [E]  aNine chains with two (2D) or three (3D) dimensions were tested. The energy of the respective global minimum is given in square brackets [E]. If every time the global minimum was found the average number of energy evaluations (steps) to find it is given.  bValues with standard deviations (dev) are given in parentheses if the global minimum was not always found in 100 trials.  cFor longer protein chains (2D, chains 4 and 5) the 100 000 simulation steps used in this search strategy comparison were on average too few to locate the global minimum; instead the lowest energy minimum found and how often it was found in 100 trials is given in square brackets.  2D, chain 1  [–4]  54b  (5)b  53  (5)  52  (5)  50  (5)  53  (5)  47  (5)  2D, chain 2  [–9]  4  (2)  4  (2)  7  (3)  6  (2)  9  (3)  3  (2)  2D, chain 3  [–8]  11  (3)  5  (2)  5  (2)  5  (2)  10  (3)  8  (3)  2D, chain 4  [–14]  0c  [–13, 6×]c  0  [–13, 2×]  0  [–13, 6×]  0  [–13, 2×]  1  (1)  0  [–13, 2×]  2D, chain 5  [–23]  0  [–20, 1×]  0  [–20, 1×]  0  [–19, 3×]  0  [–20, 1×]  0  [–20, 1×]  0  [–20, 1×]  3D, chain 1  [–5]  3  (2)  0  (2)  4  (2)  2  (1)  4  (2)  5  (2)  3D, chain 2  [–6]  95  (2)  91  (3)  93  (3)  97  (2)  92  (3)  94  (3)  3D, chain 3  [–4]  100a  28907a  100  23536  98  (1)  99  (1)  98  (1)  99  (1)  3D, chain 4  [–12]  2  (1)  1  (1)  3  (2)  2  (1)  2  (1)  3  (2)  Chaina  [E]  c = 1  c = 2  c > 2      imp1  imp2  imp1  imp2  imp1  imp2      Found  Stepsa (dev)b [E]c  Found  Steps (dev) [E]  Found  Steps (dev) [E]  Found  Steps (dev) [E]  Found  Steps (dev) [E]  Found  Steps (dev) [E]  aNine chains with two (2D) or three (3D) dimensions were tested. The energy of the respective global minimum is given in square brackets [E]. If every time the global minimum was found the average number of energy evaluations (steps) to find it is given.  bValues with standard deviations (dev) are given in parentheses if the global minimum was not always found in 100 trials.  cFor longer protein chains (2D, chains 4 and 5) the 100 000 simulation steps used in this search strategy comparison were on average too few to locate the global minimum; instead the lowest energy minimum found and how often it was found in 100 trials is given in square brackets.  2D, chain 1  [–4]  54b  (5)b  53  (5)  52  (5)  50  (5)  53  (5)  47  (5)  2D, chain 2  [–9]  4  (2)  4  (2)  7  (3)  6  (2)  9  (3)  3  (2)  2D, chain 3  [–8]  11  (3)  5  (2)  5  (2)  5  (2)  10  (3)  8  (3)  2D, chain 4  [–14]  0c  [–13, 6×]c  0  [–13, 2×]  0  [–13, 6×]  0  [–13, 2×]  1  (1)  0  [–13, 2×]  2D, chain 5  [–23]  0  [–20, 1×]  0  [–20, 1×]  0  [–19, 3×]  0  [–20, 1×]  0  [–20, 1×]  0  [–20, 1×]  3D, chain 1  [–5]  3  (2)  0  (2)  4  (2)  2  (1)  4  (2)  5  (2)  3D, chain 2  [–6]  95  (2)  91  (3)  93  (3)  97  (2)  92  (3)  94  (3)  3D, chain 3  [–4]  100a  28907a  100  23536  98  (1)  99  (1)  98  (1)  99  (1)  3D, chain 4  [–12]  2  (1)  1  (1)  3  (2)  2  (1)  2  (1)  3  (2)  View Large Table II. Identification of the global minimum in 100 trials (`found') is compared for two implementations (imp1 and imp2) and different values for the decision constant c to accept a new conformation: results considering entropy only (algorithm B) Chaina  [E]  c = 1  c = 2  c > 2      imp1  imp2  imp1  imp2  imp1  imp2      Found  Stepsa (dev)b [E]c  Found  Steps (dev) [E]  Found  Steps (dev) [E]  Found  Steps (dev) [E]  Found  Steps (dev) [E]  Found  Steps (dev) [E]  a–cSee Table I.  2D, chain 1  [–4]  100a  13978a  100  16790  100  8058  100  8852  100  3748  100  4228  2D, chain 2  [–9]  14b  (4)b  12  (3)  13  (3)  13  (3)  33  (5)  32  (5)  2D, chain 3  [–8]  4  (2)  10  (3)  11  (3)  10  (3)  25  (4)  26  (4)  2D, chain 4  [–14]  0c  [–12, 6×]c  0  [–13, 1×]  0  [–12, 16×]  0  [–13, 1×]  1  (1)  0  [–13, 3×]  2D, chain 5  [–23]  0  [–20, 1×]  0  [–19, 1×]  0  [–20, 1×]  0  [–20, 1×]  0  [–21, 1×]  0  [–20, 1×]  3D, chain 1  [–5]  2  (1)  1  (1)  1  (1)  2  (1)  2  (1)  3  (2)  3D, chain 2  [–6]  100  8502  100  10161  100  13806  100  19192  100  9255  100  9879  3D, chain 3  [–4]  100  5383  100  6582  100  5898  100  5872  100  5961  100  5209  3D, chain 4  [–12]  2  (1)  1  (1)  1  (1)  0  (1)  4  (2)  3  (2)  Chaina  [E]  c = 1  c = 2  c > 2      imp1  imp2  imp1  imp2  imp1  imp2      Found  Stepsa (dev)b [E]c  Found  Steps (dev) [E]  Found  Steps (dev) [E]  Found  Steps (dev) [E]  Found  Steps (dev) [E]  Found  Steps (dev) [E]  a–cSee Table I.  2D, chain 1  [–4]  100a  13978a  100  16790  100  8058  100  8852  100  3748  100  4228  2D, chain 2  [–9]  14b  (4)b  12  (3)  13  (3)  13  (3)  33  (5)  32  (5)  2D, chain 3  [–8]  4  (2)  10  (3)  11  (3)  10  (3)  25  (4)  26  (4)  2D, chain 4  [–14]  0c  [–12, 6×]c  0  [–13, 1×]  0  [–12, 16×]  0  [–13, 1×]  1  (1)  0  [–13, 3×]  2D, chain 5  [–23]  0  [–20, 1×]  0  [–19, 1×]  0  [–20, 1×]  0  [–20, 1×]  0  [–21, 1×]  0  [–20, 1×]  3D, chain 1  [–5]  2  (1)  1  (1)  1  (1)  2  (1)  2  (1)  3  (2)  3D, chain 2  [–6]  100  8502  100  10161  100  13806  100  19192  100  9255  100  9879  3D, chain 3  [–4]  100  5383  100  6582  100  5898  100  5872  100  5961  100  5209  3D, chain 4  [–12]  2  (1)  1  (1)  1  (1)  0  (1)  4  (2)  3  (2)  View Large Table III. Identification of the global minimum in 100 trials (`found') is compared for two implementations (imp1 and imp2) and different values for the decision constant c to accept a new conformation: results combining entropy and energy (algorithm C) Chaina  [E]  c = 1  c = 2  c > 2      imp1  imp2  imp1  imp2  imp1  imp2      Found  Stepsa (dev)b [E]c  Found  Steps (dev) [E]  Found  Steps (dev) [E]  Found  Steps (dev) [E]  Found  Steps (dev) [E]  Found  Steps (dev) [E]  a–cSee Table I.  2D, chain 1  [–4]  100a  10686a  100  11861  100  9553  100  8894  100  3659  100  3686  2D, chain 2  [–9]  33b  (5)b  27  (4)  44  (5)  40  (5)  51  (5)  47  (5)  2D, chain 3  [–8]  22  (4)  16  (4)  34  (5)  24  (4)  50  (5)  45  (5)  2D, chain 4  [–14]  0c  [–13, 4×]c  0  [–13, 7×]  1  (1)  1  (1)  1  (1)  1  (1)  2D, chain 5  [–23]  0  [–20, 3×]  0  [–20, 1×]  0  [–22, 1×]  0  [–21, 1×]  0  [–22, 1×]  0  [–21, 2×]  3D, chain 1  [–5]  9  (3)  9  (3)  6  (2)  7  (3)  7  (3)  4  (2)  3D, chain 2  [–6]  100  6428  100  7074  100  6660  100  5221  100  7400  100  7642  3D, chain 3  [–4]  100  3223  100  3837  100  4066  100  3070  100  3638  100  3997  3D, chain 4  [–12]  6  (2)  5  (2)  13  (3)  5  (2)  4  (2)  4  (2)  Chaina  [E]  c = 1  c = 2  c > 2      imp1  imp2  imp1  imp2  imp1  imp2      Found  Stepsa (dev)b [E]c  Found  Steps (dev) [E]  Found  Steps (dev) [E]  Found  Steps (dev) [E]  Found  Steps (dev) [E]  Found  Steps (dev) [E]  a–cSee Table I.  2D, chain 1  [–4]  100a  10686a  100  11861  100  9553  100  8894  100  3659  100  3686  2D, chain 2  [–9]  33b  (5)b  27  (4)  44  (5)  40  (5)  51  (5)  47  (5)  2D, chain 3  [–8]  22  (4)  16  (4)  34  (5)  24  (4)  50  (5)  45  (5)  2D, chain 4  [–14]  0c  [–13, 4×]c  0  [–13, 7×]  1  (1)  1  (1)  1  (1)  1  (1)  2D, chain 5  [–23]  0  [–20, 3×]  0  [–20, 1×]  0  [–22, 1×]  0  [–21, 1×]  0  [–22, 1×]  0  [–21, 2×]  3D, chain 1  [–5]  9  (3)  9  (3)  6  (2)  7  (3)  7  (3)  4  (2)  3D, chain 2  [–6]  100  6428  100  7074  100  6660  100  5221  100  7400  100  7642  3D, chain 3  [–4]  100  3223  100  3837  100  4066  100  3070  100  3638  100  3997  3D, chain 4  [–12]  6  (2)  5  (2)  13  (3)  5  (2)  4  (2)  4  (2)  View Large Fig. 1. View largeDownload slide Three chains (18-mer, 24-mer, 33-mer) for the two-dimensional model, shown in their optimal folded lowest energy state. Their energies (direct hydrophobic contacts) are –9, –8 and –14. Fig. 1. View largeDownload slide Three chains (18-mer, 24-mer, 33-mer) for the two-dimensional model, shown in their optimal folded lowest energy state. Their energies (direct hydrophobic contacts) are –9, –8 and –14. Fig. 2. View large Download slide View large Download slide (a) Chain 3 of the three-dimensional model, surrounded by ordered (gray spheres) and less ordered (black spheres) water. (b) Two structures of a 12-mer on the two-dimensional lattice. L stands for an ensemble of less ordered water, O for a (higher) ordered ensemble. The energy implementation alone (algorithm A) would consider both stuctures the same (Eleft = Eright). The objective function of B and C considers the amount of ordered water ensembles (O) and favors the left, globule like, native structure (Nleft = 3 < Nright = 4, so Fleft > Fright (for further details, see Materials and methods and Results). Fig. 2. View large Download slide View large Download slide (a) Chain 3 of the three-dimensional model, surrounded by ordered (gray spheres) and less ordered (black spheres) water. (b) Two structures of a 12-mer on the two-dimensional lattice. L stands for an ensemble of less ordered water, O for a (higher) ordered ensemble. The energy implementation alone (algorithm A) would consider both stuctures the same (Eleft = Eright). The objective function of B and C considers the amount of ordered water ensembles (O) and favors the left, globule like, native structure (Nleft = 3 < Nright = 4, so Fleft > Fright (for further details, see Materials and methods and Results). Fig. 3. View largeDownload slide Probability of accepting a new conformation C2. The (standard) implementation 1 and implementation 2 are compared. (A) The probability curve of accepting C2 for implementation 2. (B) Probability curve for the (standard) implementation 1. It can be seen that in (A) even for F < 0 there is still the possibility of not accepting the step, whereas in (B) every new step with F < 0 is accepted. Fig. 3. View largeDownload slide Probability of accepting a new conformation C2. The (standard) implementation 1 and implementation 2 are compared. (A) The probability curve of accepting C2 for implementation 2. (B) Probability curve for the (standard) implementation 1. It can be seen that in (A) even for F < 0 there is still the possibility of not accepting the step, whereas in (B) every new step with F < 0 is accepted. 1 To whom correspondence should be addressed. E-mail: dandekar@embl-heidelberg.de We thank Warren C.Lathe,III for stylistic corrections. Part of the research was funded by Graduiertenkolleg (University of Heidelberg) and SFB544/B2. References Aarts,E. and Korst,J. (1989) Simulated Annealing and Boltzmann Machines. Wiley, New York. Google Scholar Braxenthaler,M., Unger,R., Auerbach,D., Given,J.A. and Moult,J. ( 1997) Proteins , 29, 417–425. Google Scholar Cramer,C.J. and Truhlar,D.G. ( 1992) Science , 256, 213–217. Google Scholar Fraternali,F. and van-Gunsteren,W.F. ( 1996) J. Mol. Biol. , 256, 939–948. Google Scholar Hsiang-Ai,Y. and Karplus,M. ( 1988) J. Chem. Phys. , 89, 2366–2379. Google Scholar Kirkpatrick,S., Gelatt,C.D. and Vecchi,M.P. ( 1983) Science , 220, 671–680. Google Scholar Kolinski,A, Ilkowsko,B. and Skolnick,J. ( 1999) Biophys. J. , 77, 2942–2952. Google Scholar König,R. and Dandekar,T. ( 1999) J. Mol. Model. , 5, 317–324. Google Scholar Lau,K.F. and Dill,K.A. ( 1989) Macromolecules , 22, 3986–3997. Google Scholar Lau,K.F. and Dill,K.A. ( 1990) Proc. Natl Acad. Sci. USA , 87, 638–642. Google Scholar Levitt,M. and Sharon,R. ( 1988) Proc. Natl Acad. Sci. USA , 85, 7557–7561. Google Scholar Metropolis,N., Rosenbluth,A.W., Rosenbluth,M.N., Teller,A.H. and Teller,E. ( 1953) J. Chem. Phys. , 21, 1087–1092. Google Scholar Scheraga,H.A. ( 1998) J. Biomol. Struct. Dyn. , 16, 447–460. Google Scholar Scheraga,H.A. and Hao,M.-H. ( 1999) Adv. Chem. Phys. , 105, 243–272. Google Scholar Schulz,G. and Schirmer,R.H. (1996) Principles of Protein Structure. Springer, Heidelberg. Google Scholar Shortle,D., Chan,H.-S., Dill,K.A. ( 1992) Protein Sci. , 1, 201–215. Google Scholar Smith,P.E. and van-Gunsteren,W.F. ( 1994) J. Mol. Biol. , 236, 629–636. Google Scholar Unger,R. and Moult,J. ( 1993) J. Mol. Biol. , 231, 75–81. Google Scholar Warwicker,J. ( 1997) Protein Eng. , 10, 809–814. Google Scholar © Oxford University Press http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png Protein Engineering, Design and Selection Oxford University Press

Solvent entropy-driven searching for protein modeling examined and tested in simplified models

Loading next page...
 
/lp/oxford-university-press/solvent-entropy-driven-searching-for-protein-modeling-examined-and-tqls7HuUPA

References (33)

Publisher
Oxford University Press
Copyright
© Oxford University Press
ISSN
1741-0126
eISSN
1741-0134
DOI
10.1093/protein/14.5.329
Publisher site
See Article on Publisher Site

Abstract

Abstract Solvent entropy is a force to consider in protein folding and protein design but is difficult to model. It is investigated here in the context of the hp model: Two types of residues, hydrophobic and hydrophilic, are modeled on a lattice. Nine chains and two- and three-dimensional simulations are compared. We show that considering solvent entropy alone, efficient folding of lattice chains (identification of the native fold) can be achieved by an entropy-driven simulation on its own. Moreover, in a detailed comparison over a wide range of parameters, entropy-guided searching outperforms an energy-driven search in the model. The combination of energy- and entropy-driven search yields the most efficient searching. It is compared in detail with the above results, indicating also how this solvent shell model may advantageously be implemented in more complex protein modeling simulations. Introduction Solvent entropy has an impact on protein folding and protein interactions as well as protein design. However, the enumeration of the different microstates can become difficult. Implementation of the solvent and especially its entropy is not often found in the literature (Shortle et al., 1992; Warwicker, 1997). Exact calculations on entropy are difficult in real systems. Calculating the number of solvent microstates stresses a different perspective of protein folding and protein stability: The global minimum no longer appears as one rare state among a large number of alternative conformations, but as the protein conformation with the highest number of microstate representations of the solvent. A computationally unintensive way is to approximate the impact of the solvent by mean field calculations (Hsiang-Ai and Karplus, 1988; Cramer and Truhlar, 1992; Fraternali and van-Gunsteren, 1996). The mean field calculations neglect the effect of locally ordered structures and can underestimate solvent–protein interactions (Smith and van-Gunsteren, 1994). Simulations taking each water molecule of the protein surrounding shell into account (Levitt and Sharon, 1988; Braxenthaler et al., 1997; Warwicker, 1997; Scheraga and Hao, 1999) are computationally intensive and usually are only done for simulation times of a few nanoseconds. We present here a promising implementation of solvent entropy in the context of the hp model. As a search method we took an application of the common Monte Carlo (MC) method (Metropolis et al., 1953, Kirkpatrick et al., 1983; Aarts and Korst, 1989), implemented in a standard way for this hp model [see Unger and Moult for further details of this standard implementation (Unger and Moult, 1993)]. The protein chain is modeled on a lattice and only two types of residues are considered, hydrophobic (h) and hydrophilic (p) (Lau and Dill, 1989). We further take the local order of the solvent shell around the protein into account. Two- and three-dimensional models are compared. Long-range interactions in the rest of the solvent (e.g. long-range order brought about by solvating ions or exposed polar groups) are not considered in this simplified model. We show that efficient folding of lattice chains (identification of the native fold) can be achieved by an entropy-driven simulation completely on its own. Furthermore, we show in a detailed comparison in our model that entropy-guided searching can outperform an energy-driven search. A combination of energy- and entropy-driven searching yields the most efficient searching and is compared in detail with the above results. This illustrates, in addition, how our simple solvent shell model may advantageously be implemented in more complex simulations. Materials and methods Two-dimensional hp model All simulations used the hp model (Lau and Dill, 1989) and modeled the protein chain on a lattice. Only two types of residues are considered, hydrophobic (H; filled squares) and hydrophilic (P; open squares). The model meets the basic characteristics of real proteins, e.g. similar distributions of secondary structure (Lau and Dill, 1990). The protein is represented as a chain on a two-dimensional square lattice (Figure 1). At each point the chain can turn 90° left or right or continue ahead. The following chains with 12, 18, 24, 33 and 48 residues, respectively, were tested: P H P P H P P H P P H P H H H P H P H P P H P H P P H P P H P H P P H H P P P P H H P P P P H H P P P P H H P H H P P H H P P P P P H H H H H H H P P H H P P P P H H P P H P P P H P P H H P P H H P P P P P H H H H H H H H H H P P P P P P H H P P H H P P H P P H H H H H These chains yield well-folded compact structures and their global energy minima were known a priori as –4, –9, –8, –14 and –23, respectively. Three-dimensional hp model In the three-dimensional model at each point the chain can turn left, right, straight, up and down; otherwise the construction of the chain is similar to the two-dimensional model. Four chains with 12, 14, 22 and 28 residues and different topologies were tested (Figure 2a): H H P H P H P H P H P H P H H H H P P P P H H H H P H P P H P P H P P H P P H P P H P P H P P H H P P H P P P P H P P H P P P P H P P H P P P P H P P H Their energy minima were also known a priori as –5, –6, –4 and –12, respectively. Energy function The energy function ΔE was kept simplistic. It adds –1 for each direct (non-diagonal) lattice contact between two non-consecutive hydrophobic amino acids. `Clashes' (i.e. two residues in the same place in the lattice) are not allowed. Trials with clashes had to be newly created. Implementation of entropy To study entropy effects, lattice spaces adjacent to the protein chain were modeled to be filled by solvent in the following (small) water ensembles (Figure 2a). Long-range interactions in the rest of the solvent (e.g. long-range order brought about by solvating ions or exposed polar groups) were not considered in this simple model. Small water ensembles with two different properties surrounded the model protein: ordered and less ordered. The ordered water ensembles are adjacent to hydrophobic residues or the solvent (i.e. other water ensembles). The less ordered (with high entropy) exist if no hydrophobic residue is adjacent to them. An O-block (ordered) on the lattice is asigned to the ensemble of ordered water molecules, an L-block to a less ordered ensemble (Figure 2b). The number of unordered water ensembles, Ni, was counted. It has been shown experimentally that hydrophobic molecules reduce the entropy of the surrounding (aqueous) solvent (Schulz and Schirmer, 1996). The solvent entropy difference, S, between one protein chain conformation, N1 and a tested next one, N2, during the simulation was set to be proportional to the difference in the number of unordered adjacent lattice spaces counted (entropy is additive; we assigned 1 unordered block = 1 high-entropy unit = 1/f, with f < 1; whereas 1 ordered block = 1 low-entropy unit = 1). With the order parameter f (see below) the probability for the new configuration to be chosen is p ~ e(TS/T) = f(N1 – N2). This implementation considers the entropy of the solvent according to Boltzmann statistics. The derivation of this can be done by simple algebra (see http://www.embl-heidelberg.de/~dandekar/entropy.html). In contrast, the simple energy function from above does not differentiate between ordered and less ordered ensembles (Figure 2b). Instead, it is derived (–1 for any hydrophobic contact), if the sum (`hydrogen bonds' in our model) of the connections of (water ensembles–water ensembles) and (water ensembles–hydrophilic residues) is counted and compared between two conformations. The order parameter f allowed the testing of different entropy weights during simulations, either alone or multiplying the entropy term, f(N1–N2)), with the energy term, e(–ΔE/T), yielding p ~ e–F/T = e(ΔE – TS)/T to consider also the energy difference ΔE between two chain conformations. The model allowed the examination of three implementations: A, energy effects (ΔE; simplified in the context of the model; only hydrophobic interactions are considered or, alternatively, the effect of `hydrogen bonds'); B, entropy (ΔS, difference of ordered small water ensembles); and C, the combination of them (more realistic implementation, F = E – TS). F denotes the Helmholtz free energy (corresponding also to the Gibbs free energy if changes in volume and pressure are neglected). The new chain conformation was accepted if a random number between zero and 1/constant was smaller than e–F/T. Simulation steps A Monte Carlo algorithm was used with the following steps [implemented as in Unger and Moult (Unger and Moult, 1993)]: Start from a random conformation. From a conformation C1 with energy E1 and entropy S1 a single random change yields conformation C2. If C2 is not clash-free a new conformation is created. Otherwise, the algorithms A, B and C decide by different criteria on the acceptance of the new conformation C2. The variable `energy evaluations' is increased by one. This variable is taken as the time-counting variable since the energy evaluation is often the most time-demanding function in more complex protein folding simulations. The MC implementation was, as in common classical statistical mechanics, the probability of the system being in state i was pi = exp[(E – TS)/T], where T is the artificial temperature, an optimized parameter (details are given below). The Boltzmann constant k was set to 1 for simplicity. T was decreased for algorithms A and C as in the common simulated annealing method (Kirkpatrick et al., 1983). Check if the stop criterion is met. Go back to step 2 if the counter for energy evaluations is <100 000. The value 100 000 was taken as the limit to keep the runs fast (a few seconds for the smallest chain, 30 s for the longest) and results were collected from 100 runs for each different parameter set. Moreover, this was sufficient to allow identification of the global minimum for most of the chains. Two implementations were tested and compared for each algorithm. Implementation 2 models and examines a more detailed partition function as a decision rule to change from conformation C1 to C2 (see Appendix for details), but this may prevent the algorithm from exploring new areas in search space when passing along flat surfaces. A decision constant (see Results) was introduced potentially to balance this effect and compare both implementations for a range of different values of the constant (including constant = 1, no constant introduced; the probability to accept C2 could maximally become 1, corresponding to accepting with certainty a change to conformation C2). Implementation of algorithms A, B and C Algorithm A (energy) Accept C2 if rnd < constant×e–ΔE/T (standard, implementation 1) rnd < constant×1/(1 + e–ΔE/T) (implementation 2) where rnd is a random number between 0 and 1 and T is an artificial temperature that gradually decreased (cooled) during the run to achieve convergence. T started with T0 = 2.00 and was decreased after every 100 energy evaluations with the cooling rate c: T = Ti = Ti – 1c (c < 1) c was optimized as a parameter (see Results). Algorithm B (entropy) Accept C2 if rnd < constant×f(N1–N2) (standard, implementation 1) rnd < constant×1/[1 + f(N2–N1)] (implementation 2) f < 1 was the order parameter and optimized (see Results). Note that the entropy of the protein conformations need not be considered as the algorithm itself samples over a representative ensemble of equally weighted protein microstates. Long-range interactions in the rest of the solvent (e.g. long-range order brought about by solvating ions or exposed polar groups) were not considered in this simple model. Furthermore, calculation of the term is easily effected. Comparison with the algorithm A above shows that the calculation is also not more computationally expensive. Algorithm C (energy and entropy) Algorithm C calculated F = E – TS (the free enthalpy F, equal to the energy E minus the product of temperature T and entropy S) before the decision: rnd < constant×f(N1–N2)e–ΔE/T (standard, implementation 1) rnd < constant×1/[1 + f(N2 – N1)e–ΔE/T] (implementation 2) T and f were taken as in algorithms A and B, respectively. Note that in the comparisons, the optimal folded states (maximally packed, minimum energy but also highest solvent shell entropy) are always the same and known beforehand (cf. Figure 1). Only their energy values are given in the tables for comparison. However, in searching for these states in algorithms B and C their entropy is also considered, comparing alternative conformations. The objective function of the three different algorithms is different and favors different suboptimal folds during searching; the potential surface separating and leading to the optimal folded states is different for the three algorithms. Results In our model the protein is surrounded by water composed of clusters with two different properties, ordered and less ordered, to calculate solvent shell entropy. The ordered water clusters are adjacent to hydrophobic residues and thus engage in hydrogen bonds with hydrophilic residues or the solvent. The less ordered clusters exist if no hydrophobic residue is adjacent to them. For our model no further knowledge about side chain properties is needed. This approach may be compared to an algorithm that favors a reduced hydrophobic surface of the protein. However, the latter would be a mean field approximation of the whole surface, whereas our model takes local structures, the solvent microstates, into account. The search efficiency of this entropy implementation in protein folding was compared with that of energy-driven searching. Different chains on both a two- and a three-dimensional lattice were tested to avoid effects depending on the lattice topography chosen. Three search strategies were compared in Monte Carlo (Metropolis et al., 1953; Kirkpatrick et al., 1983; Unger and Moult, 1993) simulations: Searching for the energy minimum. The number of hydrogen bonds gave the energy for each conformation (Table I). Searching for the conformation allowing the highest amount of microstates for the solvent shell (entropy search; Table II): at each step the new and the old conformation were compared. The conformation with the higher entropy (more microstate representations) of the solvent is taken for the next step. A combination of energy and entropy (Table III). Both the number of hydrogen bonds and the number of conformations were taken into account for search selection. All algorithms were run 100 times with each parameter set (with different random seeds). The comparison was done with five chains in the two-dimensional model and four chains in the three-dimensional model. Except for the two longest chains in the two-dimensional model (33-mer and 48-mer), the global minimum was found quite reliable by all algorithms. If the global minimum was not found by each run, our comparisons indicate how often the minimum was found per 100 runs and the standard deviation of this figure (numbers in parentheses). If the global minimum was found reliably, the number of conformation trials to find it was computed. A difference of >20% in the number of trials required was considered a significant difference between the search strategies examined. Two different implementations (see Materials and methods) to evaluate the probability, including different values of the decision constant to accept a new conformation in the next step of the simulation, were compared in each strategy. Additionally, we compared different parameter values for the cooling rate c in A and C and for the entropy parameter f in B and C. We tested a broad range, for f between 0.9 and 0.001 (0.9, 0.7, 0.5, 0.3, 0.1, 0.05, 0.01, 0.001), for c between 0.999999 and 0.9 (0.999999, 0.99999, 0.9999, 0.999, 0.99, 0.9) and for the decision constant the values 1, 2, 4, 8, 16 and 32. f, c and decision constant were all optimized for the results given. The optimal range for f was 0.1 to 0.7; f = 0.3 produced the best results over all conditions. For c the optimal range was 0.99–0.99999 with c = 0.99 yielding the best results comparing all conditions. A decision constant >2 was advantageous and constant = 4 produced the best results. The search strategy on solvent shell entropy alone is effective in identifying the global minimum. It is more effective in searching than the standard Monte Carlo energy minimization procedure with the simplified energy function in more than 60% of the parameter conditions investigated (33 of 54 cases; including the longer chains). Sixteen cases showed no significant difference, five cases were less efficient, including middle sized chain 3 in two dimensions, implementation 1, constant = 1 and chain 4 in three dimensions, implementation 2, constant = 2. The combination of both search strategies is easily achieved (algorithm C). This yields a further significant improvement for all tested chains including the longer ones in both lattices. It outperformed the energy-driven search (algorithm A) in 45 of 54 cases. In two of the remaining nine cases it showed a clear positive trend. It was also clearly and significantly superior (but here only in 40 of 54 cases) than entropy alone (algorithm B); in 10 cases there was no significant difference. Implementation 2 (see Materials and methods; Appendix for details) was not superior, but slightly less efficient than the standard implementation. Exactly 100 000 steps were compared for each strategy and each chain to allow a fair comparison of all search strategies, comparing batches of 100 runs for each condition; 100 000 steps were taken to collect a large amount of data for the algorithm comparison in a convenient time so that hundreds of runs could be compared and average statistics obtained. On average, this value was not sufficient for the 48-mer on the two-dimensional lattice to find the global minimum. For the 33-mer it was found about once (out of 100 trials) in the better strategies. Therefore, the energy of the best local minimum found was compared if the global minimum was not identified. It was not our aim to identify the global minimum under each condition but rather to compare the different algorithms under simple (global minimum always found) and demanding conditions (only the best strategies succeed sometimes). Note that a gain of one step lower in the energy of a local minimum is a significant search success as the number of local minima increases exponentially with higher energy (Lau and Dill, 1990). Solvent shell-guided searching was also here an improvement compared with the simple energy-guided search strategy and algorithm C also performed best in the comparison for longer chains. It found a local optimum with energy –22 for the 48-mer. It also found the global optimum for the 33-mer four times (Tables I–III). Discussion Entropy is an important concept in protein folding (Scheraga, 1998) including advances from recent work (e.g. Warwicker, 1997; Scheraga and Hao, 1999). In general, the implementation of entropy is complex. It includes the correct enumeration of microstates in microcanonical ensembles and considering that protein entropy decreases while solvent shell entropy increases during protein folding. We have shown in a well established, simple model for protein folding (hp model) and for different types of lattices that consideration of the entropy of the adjacent water shell is sufficient to locate the correct fold for these lattice chains. The target function (high entropy, low number of hydrophobic residues on the surface) differs from the energy function (again focusing on hydrophobicity) with better access to the native fold in the overall comparison. As a result, the search speed of the entropy implementation outperformed an energy-driven search. Several factors are potentially involved in this: Considering the entropic forces of the solvent shell helps to enhance globular packing (cf. Schulz and Schirmer, 1996). Some of the local minima and traps are more smoothed out as conformations with identical energy are distinguished by their solvent shell entropy differences (cf. Figure 2b). Otherwise no further complications are introduced in this simplified model, in particular regarding the energy term. We did not observe trapping in new entropy-driven minima with HP contacts, probably as they can be more easily left than pure energy minima as both entropy forces and energy function may open up potentially favorable changes for such conformations. Considering the microstates focuses on a different perspective of the folding problem. The preferred state of the protein chain is not one rare and single conformation but instead in the context of this model (and in the entropy aspect more realistic than in some other simple protein models) the global minimum of energy displays the highest number of microstates. Furthermore, breathing (rearrangement of microstate ensembles) can be seen and investigated in such models (König and Dandekar, 1999). This allows for the investigation of entropy effects as well as partial unfolding of the protein chain after the global minimum has been reached in these simulations. The solvent shell entropy model presented here achieves efficient folding for a number of simplified protein chains. In the context of the model it was not necessary to include long-range forces to achieve this. This will have to be tested further, in particular further refinement of the structures obtained by inclusion of such forces. The starting information required for any of the search strategies A, B and C was the same, the protein sequence was sufficient. An exhaustive enumeration for some of the smaller protein chains modeled is possible, but here we wanted to compare the speed and efficiency of the different search strategies. The hydrophobic effect was compared and considered both by its entropic and energetic consequences. This focus was chosen as hydrophobic interactions are principal forces in protein folding and to compare both strategies with their respective effects on search success. The combined strategy where energy and entropy are both considered is easily achieved and can accommodate also more complex energy functions than the simplistic one chosen for testing. It identifies the native fold best. It yielded a significantly better search effectiveness on both lattices. Additionally, we compared the algorithms with a standard (as in common MC methods) and an alternative implementation (partition function motivated) of the decision constant and showed implementation-independent, similar results for both implementations. Furthermore, both lattices tested (two- and three-dimensional) showed similar success as a control against effects from lattice choices. The use of a decision constant further improved search success. The search strategies compared fulfill ergodicity over sufficient long observation times and to the extent that the search space is non-reducible and contiguous, any conformation can be reached with some probability from any other conformation and it is aperiodic. Detailed balance is fulfilled for the three search strategies at least so far as reversibility for any conformational change effected is possible during the simulation. Entropy maximization was implemented in the model according to the natural behavior of proteins (Schulz and Schirmer, 1996) and, in fact, entropy maximization occurs during time for any molecular interaction (more details on entropy maximization in protein–solvent systems are given on our Web page). Solvent shell optimization can certainly be considered to have an important impact on protein folding. In our simulations this aspect of folding was shown to be sufficient to localize the optimal protein structure of small model proteins. It should be noted that also other well known factors involved in protein folding such as hydrophobic forces, hydrogen bonds and cooperativity (Kolinski et al., 1999) are in part determined by solvent shell entropic forces. Nevertheless, exact experimental quantification of solvent shell effects remains a challenge. Our simplistic model suggests that at present their contribution to protein folding may perhaps be underestimated. Shortle et al. showed that consideration of entropic forces is helpful in predicting and understanding the effects of mutations (Shortle et al., 1992). Also antibody and other protein–protein interaction models are more precise if entropic effects are included. The inclusion of a simplified solvent shell entropy algorithm as given and compared in detail here improves the search success in protein folding simulations and will be taken to improve more complex folding simulations such as more extended, grid free models as a next step. Appendix We compared the standard implementation to an alternative implementation 2. Implementation 2 considered a more detailed (entropy motivated) partition function comparing the new state C2 with the old state C1:  \[\mathit{p\ }_{\mathit{i}}\ =\frac{1}{\mathit{Z}}e^{\frac{{\mbox{--}}\mathit{E}_{\mathit{i}}}{\mathit{T}}}\mathit{i}\ =\ 1,\ 2\] 1 where Z is the normalization constant (partition function). With pi = 1,  \[\mathit{p}_{2}\ =\frac{e^{\frac{{\mbox{--}}\mathit{E}2}{\mathit{T}}}}{e^{\frac{\mathit{E}1}{\mathit{T}}}\ +\ e^{\frac{{\mbox{--}}\mathit{E1}}{\mathit{T}}}}\ =\ \frac{1}{1\ +\ e^{\frac{{\mbox{--}}{\Delta}\mathit{E}}{\mathit{T}}}}\ e\ ^{\frac{{\mbox{--}}{\Delta}\mathit{E}}{\mathit{T}}}\ with\ \mathit{E}\ =\ \mathit{E}_{2}\ {\mbox{--}}\mathit{E}_{1}\] 2 If C1 is the old and C2 the new conformation, p2 is the exact probability of accepting the new conformation (implementation 2 throughout the text). If the new conformation has the same or only slightly better energy value, the probability of accepting the new conformation is according to the partition function only around 0.5. Metropolis et al. (1953) made the following modification which more easily accepts a new conformation (Figure 3). If E becomes negative, p2 becomes close to 1. If E is positive, we have  \[\frac{1}{1\ +\ e^{\frac{{\mbox{--}}{\Delta}\mathit{E}}{\mathit{T}}}}\ {\varepsilon}\ {[}0.5;1{]}\] 3 Furthermore, expression (3) is close enough to 1 if –E is smaller than or close to T. This yields the approximation  \[\mathit{p}_{2}\ =\ \{\ e^{\frac{{\mbox{--}}{\Delta}\mathit{E}}{\mathit{T}}\ {\forall}\ {\Delta}\mathit{E}{>}0}_{\ 1\ {\forall}\ {\Delta}\mathit{E}{\leq}0}\] 4and is applied in the standard Metropolis Monte Carlo method (called implementation 1 throughout the text). In a similar way, considering again the partition function one can implement the entropy calculation in two ways, replacing e–ΔE/T by f(N1 – N2) (order parameter f; number of microstates N1 and N2). Table I. Identification of the global minimum in 100 trials (`found') is compared for two implementations (imp1 and imp2) and different values for the decision constant c to accept a new conformation: results considering energy only (algorithm A) Chaina  [E]  c = 1  c = 2  c > 2      imp1  imp2  imp1  imp2  imp1  imp2      Found  Stepsa (dev)b [E]c  Found  Steps (dev) [E]  Found  Steps (dev) [E]  Found  Steps (dev) [E]  Found  Steps (dev) [E]  Found  Steps (dev) [E]  aNine chains with two (2D) or three (3D) dimensions were tested. The energy of the respective global minimum is given in square brackets [E]. If every time the global minimum was found the average number of energy evaluations (steps) to find it is given.  bValues with standard deviations (dev) are given in parentheses if the global minimum was not always found in 100 trials.  cFor longer protein chains (2D, chains 4 and 5) the 100 000 simulation steps used in this search strategy comparison were on average too few to locate the global minimum; instead the lowest energy minimum found and how often it was found in 100 trials is given in square brackets.  2D, chain 1  [–4]  54b  (5)b  53  (5)  52  (5)  50  (5)  53  (5)  47  (5)  2D, chain 2  [–9]  4  (2)  4  (2)  7  (3)  6  (2)  9  (3)  3  (2)  2D, chain 3  [–8]  11  (3)  5  (2)  5  (2)  5  (2)  10  (3)  8  (3)  2D, chain 4  [–14]  0c  [–13, 6×]c  0  [–13, 2×]  0  [–13, 6×]  0  [–13, 2×]  1  (1)  0  [–13, 2×]  2D, chain 5  [–23]  0  [–20, 1×]  0  [–20, 1×]  0  [–19, 3×]  0  [–20, 1×]  0  [–20, 1×]  0  [–20, 1×]  3D, chain 1  [–5]  3  (2)  0  (2)  4  (2)  2  (1)  4  (2)  5  (2)  3D, chain 2  [–6]  95  (2)  91  (3)  93  (3)  97  (2)  92  (3)  94  (3)  3D, chain 3  [–4]  100a  28907a  100  23536  98  (1)  99  (1)  98  (1)  99  (1)  3D, chain 4  [–12]  2  (1)  1  (1)  3  (2)  2  (1)  2  (1)  3  (2)  Chaina  [E]  c = 1  c = 2  c > 2      imp1  imp2  imp1  imp2  imp1  imp2      Found  Stepsa (dev)b [E]c  Found  Steps (dev) [E]  Found  Steps (dev) [E]  Found  Steps (dev) [E]  Found  Steps (dev) [E]  Found  Steps (dev) [E]  aNine chains with two (2D) or three (3D) dimensions were tested. The energy of the respective global minimum is given in square brackets [E]. If every time the global minimum was found the average number of energy evaluations (steps) to find it is given.  bValues with standard deviations (dev) are given in parentheses if the global minimum was not always found in 100 trials.  cFor longer protein chains (2D, chains 4 and 5) the 100 000 simulation steps used in this search strategy comparison were on average too few to locate the global minimum; instead the lowest energy minimum found and how often it was found in 100 trials is given in square brackets.  2D, chain 1  [–4]  54b  (5)b  53  (5)  52  (5)  50  (5)  53  (5)  47  (5)  2D, chain 2  [–9]  4  (2)  4  (2)  7  (3)  6  (2)  9  (3)  3  (2)  2D, chain 3  [–8]  11  (3)  5  (2)  5  (2)  5  (2)  10  (3)  8  (3)  2D, chain 4  [–14]  0c  [–13, 6×]c  0  [–13, 2×]  0  [–13, 6×]  0  [–13, 2×]  1  (1)  0  [–13, 2×]  2D, chain 5  [–23]  0  [–20, 1×]  0  [–20, 1×]  0  [–19, 3×]  0  [–20, 1×]  0  [–20, 1×]  0  [–20, 1×]  3D, chain 1  [–5]  3  (2)  0  (2)  4  (2)  2  (1)  4  (2)  5  (2)  3D, chain 2  [–6]  95  (2)  91  (3)  93  (3)  97  (2)  92  (3)  94  (3)  3D, chain 3  [–4]  100a  28907a  100  23536  98  (1)  99  (1)  98  (1)  99  (1)  3D, chain 4  [–12]  2  (1)  1  (1)  3  (2)  2  (1)  2  (1)  3  (2)  View Large Table II. Identification of the global minimum in 100 trials (`found') is compared for two implementations (imp1 and imp2) and different values for the decision constant c to accept a new conformation: results considering entropy only (algorithm B) Chaina  [E]  c = 1  c = 2  c > 2      imp1  imp2  imp1  imp2  imp1  imp2      Found  Stepsa (dev)b [E]c  Found  Steps (dev) [E]  Found  Steps (dev) [E]  Found  Steps (dev) [E]  Found  Steps (dev) [E]  Found  Steps (dev) [E]  a–cSee Table I.  2D, chain 1  [–4]  100a  13978a  100  16790  100  8058  100  8852  100  3748  100  4228  2D, chain 2  [–9]  14b  (4)b  12  (3)  13  (3)  13  (3)  33  (5)  32  (5)  2D, chain 3  [–8]  4  (2)  10  (3)  11  (3)  10  (3)  25  (4)  26  (4)  2D, chain 4  [–14]  0c  [–12, 6×]c  0  [–13, 1×]  0  [–12, 16×]  0  [–13, 1×]  1  (1)  0  [–13, 3×]  2D, chain 5  [–23]  0  [–20, 1×]  0  [–19, 1×]  0  [–20, 1×]  0  [–20, 1×]  0  [–21, 1×]  0  [–20, 1×]  3D, chain 1  [–5]  2  (1)  1  (1)  1  (1)  2  (1)  2  (1)  3  (2)  3D, chain 2  [–6]  100  8502  100  10161  100  13806  100  19192  100  9255  100  9879  3D, chain 3  [–4]  100  5383  100  6582  100  5898  100  5872  100  5961  100  5209  3D, chain 4  [–12]  2  (1)  1  (1)  1  (1)  0  (1)  4  (2)  3  (2)  Chaina  [E]  c = 1  c = 2  c > 2      imp1  imp2  imp1  imp2  imp1  imp2      Found  Stepsa (dev)b [E]c  Found  Steps (dev) [E]  Found  Steps (dev) [E]  Found  Steps (dev) [E]  Found  Steps (dev) [E]  Found  Steps (dev) [E]  a–cSee Table I.  2D, chain 1  [–4]  100a  13978a  100  16790  100  8058  100  8852  100  3748  100  4228  2D, chain 2  [–9]  14b  (4)b  12  (3)  13  (3)  13  (3)  33  (5)  32  (5)  2D, chain 3  [–8]  4  (2)  10  (3)  11  (3)  10  (3)  25  (4)  26  (4)  2D, chain 4  [–14]  0c  [–12, 6×]c  0  [–13, 1×]  0  [–12, 16×]  0  [–13, 1×]  1  (1)  0  [–13, 3×]  2D, chain 5  [–23]  0  [–20, 1×]  0  [–19, 1×]  0  [–20, 1×]  0  [–20, 1×]  0  [–21, 1×]  0  [–20, 1×]  3D, chain 1  [–5]  2  (1)  1  (1)  1  (1)  2  (1)  2  (1)  3  (2)  3D, chain 2  [–6]  100  8502  100  10161  100  13806  100  19192  100  9255  100  9879  3D, chain 3  [–4]  100  5383  100  6582  100  5898  100  5872  100  5961  100  5209  3D, chain 4  [–12]  2  (1)  1  (1)  1  (1)  0  (1)  4  (2)  3  (2)  View Large Table III. Identification of the global minimum in 100 trials (`found') is compared for two implementations (imp1 and imp2) and different values for the decision constant c to accept a new conformation: results combining entropy and energy (algorithm C) Chaina  [E]  c = 1  c = 2  c > 2      imp1  imp2  imp1  imp2  imp1  imp2      Found  Stepsa (dev)b [E]c  Found  Steps (dev) [E]  Found  Steps (dev) [E]  Found  Steps (dev) [E]  Found  Steps (dev) [E]  Found  Steps (dev) [E]  a–cSee Table I.  2D, chain 1  [–4]  100a  10686a  100  11861  100  9553  100  8894  100  3659  100  3686  2D, chain 2  [–9]  33b  (5)b  27  (4)  44  (5)  40  (5)  51  (5)  47  (5)  2D, chain 3  [–8]  22  (4)  16  (4)  34  (5)  24  (4)  50  (5)  45  (5)  2D, chain 4  [–14]  0c  [–13, 4×]c  0  [–13, 7×]  1  (1)  1  (1)  1  (1)  1  (1)  2D, chain 5  [–23]  0  [–20, 3×]  0  [–20, 1×]  0  [–22, 1×]  0  [–21, 1×]  0  [–22, 1×]  0  [–21, 2×]  3D, chain 1  [–5]  9  (3)  9  (3)  6  (2)  7  (3)  7  (3)  4  (2)  3D, chain 2  [–6]  100  6428  100  7074  100  6660  100  5221  100  7400  100  7642  3D, chain 3  [–4]  100  3223  100  3837  100  4066  100  3070  100  3638  100  3997  3D, chain 4  [–12]  6  (2)  5  (2)  13  (3)  5  (2)  4  (2)  4  (2)  Chaina  [E]  c = 1  c = 2  c > 2      imp1  imp2  imp1  imp2  imp1  imp2      Found  Stepsa (dev)b [E]c  Found  Steps (dev) [E]  Found  Steps (dev) [E]  Found  Steps (dev) [E]  Found  Steps (dev) [E]  Found  Steps (dev) [E]  a–cSee Table I.  2D, chain 1  [–4]  100a  10686a  100  11861  100  9553  100  8894  100  3659  100  3686  2D, chain 2  [–9]  33b  (5)b  27  (4)  44  (5)  40  (5)  51  (5)  47  (5)  2D, chain 3  [–8]  22  (4)  16  (4)  34  (5)  24  (4)  50  (5)  45  (5)  2D, chain 4  [–14]  0c  [–13, 4×]c  0  [–13, 7×]  1  (1)  1  (1)  1  (1)  1  (1)  2D, chain 5  [–23]  0  [–20, 3×]  0  [–20, 1×]  0  [–22, 1×]  0  [–21, 1×]  0  [–22, 1×]  0  [–21, 2×]  3D, chain 1  [–5]  9  (3)  9  (3)  6  (2)  7  (3)  7  (3)  4  (2)  3D, chain 2  [–6]  100  6428  100  7074  100  6660  100  5221  100  7400  100  7642  3D, chain 3  [–4]  100  3223  100  3837  100  4066  100  3070  100  3638  100  3997  3D, chain 4  [–12]  6  (2)  5  (2)  13  (3)  5  (2)  4  (2)  4  (2)  View Large Fig. 1. View largeDownload slide Three chains (18-mer, 24-mer, 33-mer) for the two-dimensional model, shown in their optimal folded lowest energy state. Their energies (direct hydrophobic contacts) are –9, –8 and –14. Fig. 1. View largeDownload slide Three chains (18-mer, 24-mer, 33-mer) for the two-dimensional model, shown in their optimal folded lowest energy state. Their energies (direct hydrophobic contacts) are –9, –8 and –14. Fig. 2. View large Download slide View large Download slide (a) Chain 3 of the three-dimensional model, surrounded by ordered (gray spheres) and less ordered (black spheres) water. (b) Two structures of a 12-mer on the two-dimensional lattice. L stands for an ensemble of less ordered water, O for a (higher) ordered ensemble. The energy implementation alone (algorithm A) would consider both stuctures the same (Eleft = Eright). The objective function of B and C considers the amount of ordered water ensembles (O) and favors the left, globule like, native structure (Nleft = 3 < Nright = 4, so Fleft > Fright (for further details, see Materials and methods and Results). Fig. 2. View large Download slide View large Download slide (a) Chain 3 of the three-dimensional model, surrounded by ordered (gray spheres) and less ordered (black spheres) water. (b) Two structures of a 12-mer on the two-dimensional lattice. L stands for an ensemble of less ordered water, O for a (higher) ordered ensemble. The energy implementation alone (algorithm A) would consider both stuctures the same (Eleft = Eright). The objective function of B and C considers the amount of ordered water ensembles (O) and favors the left, globule like, native structure (Nleft = 3 < Nright = 4, so Fleft > Fright (for further details, see Materials and methods and Results). Fig. 3. View largeDownload slide Probability of accepting a new conformation C2. The (standard) implementation 1 and implementation 2 are compared. (A) The probability curve of accepting C2 for implementation 2. (B) Probability curve for the (standard) implementation 1. It can be seen that in (A) even for F < 0 there is still the possibility of not accepting the step, whereas in (B) every new step with F < 0 is accepted. Fig. 3. View largeDownload slide Probability of accepting a new conformation C2. The (standard) implementation 1 and implementation 2 are compared. (A) The probability curve of accepting C2 for implementation 2. (B) Probability curve for the (standard) implementation 1. It can be seen that in (A) even for F < 0 there is still the possibility of not accepting the step, whereas in (B) every new step with F < 0 is accepted. 1 To whom correspondence should be addressed. E-mail: dandekar@embl-heidelberg.de We thank Warren C.Lathe,III for stylistic corrections. Part of the research was funded by Graduiertenkolleg (University of Heidelberg) and SFB544/B2. References Aarts,E. and Korst,J. (1989) Simulated Annealing and Boltzmann Machines. Wiley, New York. Google Scholar Braxenthaler,M., Unger,R., Auerbach,D., Given,J.A. and Moult,J. ( 1997) Proteins , 29, 417–425. Google Scholar Cramer,C.J. and Truhlar,D.G. ( 1992) Science , 256, 213–217. Google Scholar Fraternali,F. and van-Gunsteren,W.F. ( 1996) J. Mol. Biol. , 256, 939–948. Google Scholar Hsiang-Ai,Y. and Karplus,M. ( 1988) J. Chem. Phys. , 89, 2366–2379. Google Scholar Kirkpatrick,S., Gelatt,C.D. and Vecchi,M.P. ( 1983) Science , 220, 671–680. Google Scholar Kolinski,A, Ilkowsko,B. and Skolnick,J. ( 1999) Biophys. J. , 77, 2942–2952. Google Scholar König,R. and Dandekar,T. ( 1999) J. Mol. Model. , 5, 317–324. Google Scholar Lau,K.F. and Dill,K.A. ( 1989) Macromolecules , 22, 3986–3997. Google Scholar Lau,K.F. and Dill,K.A. ( 1990) Proc. Natl Acad. Sci. USA , 87, 638–642. Google Scholar Levitt,M. and Sharon,R. ( 1988) Proc. Natl Acad. Sci. USA , 85, 7557–7561. Google Scholar Metropolis,N., Rosenbluth,A.W., Rosenbluth,M.N., Teller,A.H. and Teller,E. ( 1953) J. Chem. Phys. , 21, 1087–1092. Google Scholar Scheraga,H.A. ( 1998) J. Biomol. Struct. Dyn. , 16, 447–460. Google Scholar Scheraga,H.A. and Hao,M.-H. ( 1999) Adv. Chem. Phys. , 105, 243–272. Google Scholar Schulz,G. and Schirmer,R.H. (1996) Principles of Protein Structure. Springer, Heidelberg. Google Scholar Shortle,D., Chan,H.-S., Dill,K.A. ( 1992) Protein Sci. , 1, 201–215. Google Scholar Smith,P.E. and van-Gunsteren,W.F. ( 1994) J. Mol. Biol. , 236, 629–636. Google Scholar Unger,R. and Moult,J. ( 1993) J. Mol. Biol. , 231, 75–81. Google Scholar Warwicker,J. ( 1997) Protein Eng. , 10, 809–814. Google Scholar © Oxford University Press

Journal

Protein Engineering, Design and SelectionOxford University Press

Published: May 1, 2001

There are no references for this article.