Role of Water in the Selection of Stable Proteins at Ambient and Extreme Thermodynamic Conditions

PHYSICAL REVIEW X 7, 021047 (2017)

Valentino Bianco,¹ Giancarlo Franzese,² Christoph Dellago,¹ and Ivan Coluzza¹
¹Faculty of Physics, University of Vienna, Vienna 1090, Austria
²Secció de Física Estadística i Interdisciplinària–Departament de Física de la Matèria Condensada, Facultat de Física & Institute of Nanoscience and Nanotechnology (IN2UB), Universitat de Barcelona, Barcelona 08028, Spain

(Received 21 October 2016; revised manuscript received 1 February 2017; published 26 June 2017)

Proteins that are functional at ambient conditions do not necessarily work at extreme conditions of temperature T and pressure P. Furthermore, there are limits of T and P above which no protein has a stable functional state. Here, we show that these limits and the selection mechanisms for working proteins depend on how the properties of the surrounding water change with T and P. We find that proteins selected at high T are superstable and are characterized by a nonextreme segregation of a hydrophilic surface and a hydrophobic core. Surprisingly, a larger segregation reduces the stability range in T and P. Our computer simulations, based on a new protein design protocol, explain the hydropathy profile of proteins as a consequence of a selection process influenced by water. Our results, potentially useful for engineering proteins and drugs working far from ambient conditions, offer an alternative rationale to the evolutionary action exerted by the environment in extreme conditions.

DOI: 10.1103/PhysRevX.7.021047
Subject Areas: Biological Physics, Soft Matter, Statistical Physics

Published by the American Physical Society under the terms of the Creative Commons Attribution 4.0 International license. Further distribution of this work must maintain attribution to the author(s) and the published article's title, journal citation, and DOI. 2160-3308/17/7(2)/021047(15)

I. INTRODUCTION

Proteins are molecules made of a sequence of amino acid residues that fold to target structures (native states) for a range of temperatures T and pressures P. Different proteins, with different sequences, have different T–P ranges of stability for their native state. As a matter of fact, many life forms survive under extreme T–P conditions [1,2], implying that their proteins fit that environment. Recent works [3–10] have shown that changes in T and P significantly alter the water-mediated hydrophilic and hydrophobic interactions of the residues along the chain, with an effect that depends on the sequence. However, a direct observation of how the protein selection responds to extreme changes in the aqueous conditions is still lacking. One of the reasons for this is that, for the few notable studies accounting for explicit water [11–16], an exhaustive sampling of the T–P stability region of the selected proteins is beyond the reach of current computers. Here, we fill this gap. We present a computational study based on a novel strategy that mimics the adaptation process of solvated proteins in a wide range of thermodynamic conditions. Our results reveal the border beyond which artificial protein design and natural evolution fail. Moreover, we show that high-T adaptation alone selects for superstable sequences that are highly resistant to both cold and pressure denaturation. We define proteins as superstable if their average stability region encompasses the average stability region of proteins designed at ambient conditions. Our results confirm, for the first time, the hypothesis that a strong stability of the folded state at high T corresponds to strong stability also at low T and high P [17–20].

We develop a protein design strategy that focuses on the relation between sequence and folded structure, allowing us to calculate the stability of many proteins in a wide range of T and P in explicit water, a formidable task that is currently infeasible for atomistic models. Following the standard approach introduced by Shakhnovich and Gutin [21,22], we adopt a coarse-grained lattice representation of proteins that is computationally effective and yields results that can also be extended to off-lattice proteins, as recently demonstrated for the implicit solvent case [23–25]. The main difference between our strategy and the standard approach is that here, instead of considering the phenomenology of water implicitly, we explicitly include water in our coarse-graining. As an explicit solvent, we adopt a many-body water model that has recently been shown to provide relevant information on the role of water in the cold- and pressure-denaturation mechanisms of proteins [9].

II. DESIGN APPROACH

Protein and solvent interactions are described by the Hamiltonian (see Appendix A for details)

$$H \equiv H_{R,R} + H_{R,w} + H^{(b)}_{w,w} + H^{(h)}_{w,w},$$

where the first term accounts for the residue-residue interactions and the second term for the interaction of the hydrophobic or hydrophilic residues with the solvent. The third term, $H^{(b)}_{w,w}$, accounts for the water-water interaction in the bulk, including the isotropic van der Waals interaction, as well as the directional and cooperative components of the hydrogen bonds (HBs) [8,9,32–36]. The last term, $H^{(h)}_{w,w}$, describes the water-water interaction in the hydration layer (the first layer of water molecules in contact with the protein) and accounts for the effect of the protein's hydrophobic and hydrophilic residues on the interaction and structure of the hydration water [9].

By assuming that the protein interface affects the water properties in the hydration shell, we use a design strategy based on the enthalpy associated with the hydrated protein, regardless of the bulk contribution,

$$H^{(h)}_{R} \equiv H_{R,R} + H_{R,w} + H^{(h)}_{w,w} + P\,(V - V^{(b)}),$$

with a bias that enhances the sequence heterogeneity to avoid homopolymer design solutions [23,24,29–31]. In the above expression for $H^{(h)}_{R}$, $V$ is the total volume of the system and $V^{(b)}$ is the volume associated with the bulk water, i.e., the water not in direct contact with the protein. Since our optimization accounts for changes in the aqueous conditions, the enthalpies of each tested sequence must be averaged over the water configurations. We then envisage two possible design protocols based on measuring the average sequence enthalpies $\langle H^{(h)}_{R} \rangle$, where the angular brackets $\langle \ldots \rangle$ refer to the thermodynamic average calculated over equilibrium water configurations, including the bulk. The first strategy consists in minimizing the average enthalpy $\langle H^{(h)}_{R,f} \rangle$ of the hydrated protein in its folded (f) conformation. The second consists in maximizing the enthalpic gap $\Delta H^{(h)}$ between the f state and the unfolded (u) protein conformation, both hydrated. As a u state, we use a completely stretched structure that is a representative of all equivalent unbounded configurations. For both protocols, the enthalpy of the hydration water is averaged over equilibrium configurations, accounting for the energy, density, and entropy of the solvent at the given thermodynamic conditions and protein interface. We refer to the two protocols as MIN ENTHALPY and MAX GAP, respectively. In order to quantify the solvent contribution, we compare the results obtained with these two protocols with those of a simpler method (IMPLICIT SOLVENT) based on the minimization of the residue interactions without any water contribution.

III. RESULTS AND DISCUSSION

A. Thermal and pressure stability of proteins

We focus our attention on ten different target proteins with lengths between 30 and 90 amino acids that are compact enough that the exposed surface to the solvent is the smallest possible. For the sake of clarity, here we show only the calculations for the 30 amino acid proteins. The same results hold for longer proteins. In each protocol, we design the protein at a given temperature $T_d$ and pressure $P_d$ and test its stability in a wide range of T and P [37]. We choose $T_d$ and $P_d$ uniformly over values ranging from the liquid-gas transition to the glass-transition temperature, and from negative pressure to ≃1 GPa. For each $(T_d, P_d)$ and each protocol, we identify 5–15 optimized sequences, for a total of more than $1.5 \times 10^{[\ldots]}$ optimizations, a number far beyond the capability of any fully atomistic protein model. For each designed sequence, we test the stability in T–P by checking, with Monte Carlo simulations, if the average equilibrium conformation coincides with the protein native-target structure. Then, we average the stability regions of all sequences optimized for a given target-native state. For all designed proteins, we find stability regions that resemble those predicted by theory [38], previous numerical works [5,7,9,39], and those observed in experiments [18,40–52] (Fig. 1). Throughout the paper, we use internal units (IU) for T and P, defined in Methods. We find that the boundaries of the stability regions extend from low T, where proteins undergo cold denaturation, to high T, where thermal denaturation occurs. The T range of stability depends on the protein sequence and size. Small proteins undergo cold denaturation for T < 0.1 IU [53], consistent with previous results [9]. Longer proteins, with higher content of hydrophobic residues, undergo cold denaturation at higher T. All the sequences we consider are unstable above the threshold pressure P ≃ 0.7 IU (Figs. 5 and 6 in Appendix D), qualitatively consistent with what is observed in Refs. [40,54,55]. We find that denaturation occurs also at low or negative P, consistent with recent findings [9,39,56].

Experimental data for different proteins [Fig. 1(b)] show that the higher the thermal stability of the proteins, the higher the pressure-stability region [17,18]. Our simulations have the same trend. In particular, we find that the stability region is more extended for sequences designed at higher $T_d$ and intermediate $P_d$, corroborating the hypothesis that stability ranges in T and P are positively correlated: Proteins with a pronounced thermal stability are also more stable with respect to pressure. Furthermore, we observe that sequences designed at high $T_d$ are stable both at high and low T, while those designed at lower $T_d$ can fold only at low T. For example, sequences designed at $P_d = 0$ and $T_1 = 0.1$ IU are stable up to T ≃ 0.3 IU, while to get a sequence that also folds at higher T = 0.4 IU, we need to design it at $T_d = T_2$ [Fig. 1(a)]. These results are consistent with experiments comparing the stability range of mesophilic and thermophilic proteins [57] and with data revealing the higher resistance of thermophilic proteins to cold denaturation [19,20]. More generally, we find that proteins that are designed at high $T_d$ are superstable; i.e., they also show a remarkable stability at low T and at high P.

FIG. 1. T–P stability regions of designed and experimental proteins. (a) Average stability regions (closed curves) within which the protein sequences designed with the MIN ENTHALPY protocol at different $T_d$ and $P_d$ fold into the native state. The regions for proteins designed at $P_d = 0$ (red, green, and blue curves), close to ambient pressure, are enclosed one into another with an extension that is proportional to $T_d$. All of these regions enclose the stability region of proteins designed at higher $P_d$ (pink curve).
The dotted line shows the stability region for a sequence calculated with the IMPLICIT SOLVENT protocol. The "glass-transition" (solid black) line defines the temperatures below which the system does not equilibrate. The long-dashed line represents the limit of stability (spinodal) of the liquid with respect to the gas phase. Pressure and temperature are expressed in internal units (IU). (b) Experimental stability region for different proteins indicated in the legend (adapted from Refs. [40–49]). The long-dashed line is the liquid-gas transition line. The data show a clear positive correlation between the T range and P range of stability. Pressure denaturation is observed in a range of −0.1 ≲ P/GPa ≲ 0.6, while stability at higher P is reached by introducing artificial covalent bridges between the amino acids, as in the case of Zn cytochrome c [43]. In both panels, the shaded region corresponds to ambient conditions.

B. Water effect on the protein surface and core hydropathy profiles

We rationalize this fundamental property of protein evolution in terms of the relative fraction of hydrophobic and hydrophilic residues in the sequence and how it changes depending on the thermodynamic state of the solvent. We calculate the fraction $P^{\rm PHI}_{\rm surf}(T_d, P_d)$ of hydrophilic (PHI) amino acids on the protein surface (the number of hydrophilic amino acids on the surface divided by the total number of surface amino acids) and the fraction $Q^{\rm PHO}_{\rm core}(T_d, P_d)$ of hydrophobic (PHO) amino acids in the protein core (the number of hydrophobic amino acids in the core divided by the total number of core amino acids) (Figs. 2 and 3). The comparison between $P^{\rm PHI}_{\rm surf}(T_d, P_d)$ and $Q^{\rm PHO}_{\rm core}(T_d, P_d)$ characterizes the level of segregation of the sequence in the folded state. By segregation, we mean the tendency of the optimized sequence to have a surface exposed to the solvent that is rich in hydrophilic amino acids and a core that is abundant in hydrophobic amino acids. For all designed proteins and for both design protocols with solvent, we find that the optimized sequences at ambient conditions ($P_d \simeq 0.7 \times 10^{-4}$ IU, $T_d \simeq 0.35$ IU) are mostly hydrophilic on the surface and hydrophobic in the core, as expected. However, upon lowering $T_d$, both protocols with solvent generate protein sequences that, in their folded state, have a lower segregation between a hydrophilic surface and a hydrophobic core (Fig. 3). These proteins with less segregation in their sequence are less stable at high T [Fig. 1(a)].

This finding is consistent with previous studies on thermophilic proteins, i.e., proteins stable at high T, showing that the higher thermostability is correlated with a stronger segregation between the polar surface [60,61] and the hydrophobic core [62,63]. In particular, a systematic analysis of the hydropathy profile of thermophilic [58], mesophilic [58] (i.e., stable at intermediate T), and ice-binding proteins (i.e., proteins that interfere with ice growth, Fig. 7) [59] reveals that these categories of proteins have, on average, compositions that are close to those that we calculate with our design strategies at high $T_d$ ($T_d \gtrsim 0.45$ IU and $P_d \simeq 0$), ambient $T_d$ ($T_d = 0.35$ IU and $P_d \simeq 0$), and low $T_d$ ($T_d \lesssim 0.15$ IU and $P_d \simeq 0$). Indeed, the data [58] (Fig. 3) show that ≃78.8% of the surface of thermophilic proteins is hydrophilic and ≃54.5% of their core is hydrophobic, while for mesophilic proteins, the percentages are ≃78.4% and ≃54.7%, respectively. Moreover, our direct analysis of ice-binding proteins [59] (Fig. 7) shows that ≃67.0% of their surface is hydrophilic and ≃48.5% of their core is hydrophobic.

Hypothesizing that the protein's composition is optimized at the environmental conditions at which the protein works, we observe that the real proteins have a segregation between hydrophilic surface and hydrophobic core that follows the same qualitative trend as our prediction (Figs. 3 and 8). Hence, both our theory and the real data show that the higher the temperature of stability of the protein, the stronger the segregation of the amino acid sequence into the hydrophilic surface and hydrophobic core, although an extreme segregation does not imply the stability of the folded state at higher T, as shown by the sequences generated with the implicit solvent design [Figs. 1(a) and 3].

FIG. 2. Hydropathy profile of the designed protein surface and core. The color-coded contour plots of the fraction $P^{\rm PHI}_{\rm surf}(T_d, P_d)$ of surface hydrophilic amino acids [panels (a) and (b)] and the fraction $Q^{\rm PHO}_{\rm core}(T_d, P_d)$ of core hydrophobic amino acids [panels (c) and (d)] as a function of the design temperature $T_d$ and pressure $P_d$, calculated with both design protocols, MIN ENTHALPY [panels (a) and (c)] and MAX GAP [panels (b) and (d)], show a stronger segregation between the hydrophilic surface and the hydrophobic core for proteins designed at high $T_d$ and low or intermediate $P_d$. The straight (green) line in the lower-right corner of each panel marks the liquid-gas spinodal line of the solvent. $T_d$ and $P_d$ are expressed in internal units.

FIG. 3. Hydrophilic profile of a protein surface designed at ambient pressure. The fraction $P^{\rm PHI}_{\rm surf}(T_d, P_d)$ of surface hydrophilic amino acids decreases as the design temperature $T_d$ is lowered, for both the design protocols MIN ENTHALPY (black circles) and MAX GAP (diamonds). Data are from Figs. 2(a) and 2(b) at $P_d = 0$. Our results qualitatively follow the trend observed for the average hydropathy of real thermophilic [58] (large red circle), mesophilic [58] (large green circle), and ice-binding proteins [59] (large blue circle), plotted at high T (T = 0.45 IU), intermediate T (T = 0.3 IU), and low T (T = 0.15 IU), respectively, to ease the comparison. Such values of T resemble the average working temperature of real thermophilic, mesophilic, and ice-binding proteins. The hydropathy profile (independent of $T_d$) of sequences designed with implicit solvent (dashed line) is unable to reproduce the observed trend.

To get insight into the role of water in the evolutionary process of sequence segregation, we perform a detailed comparison of the results of the two design protocols MIN ENTHALPY and MAX GAP. We find that they select similar sequences at high $T_d$ but not at low $T_d$. In particular, at high $T_d$ and intermediate $P_d$, e.g., $T_d = 0.45$ IU and $P_d = 0.2$ IU, both protocols select sequences with strong segregation between the hydrophilic surface and the hydrophobic core (Fig. 2). Instead, at low $T_d$, e.g., $T_d = 0.1$ IU and $P_d = 0.2$ IU, the MIN ENTHALPY protocol selects sequences with a large number of surface hydrophobic residues with minor changes in the core with respect to the high-$T_d$ case [Figs. 2(a) and 2(c)], while the MAX GAP protocol leads to proteins with a large number of hydrophilic residues in the core without major changes on the surface with respect to the high-$T_d$ case [Figs. 2(b) and 2(d)].

These differences in the selection are a consequence of the fact that at low T the hydrophobic amino acids are more soluble [3,9,64,65] because the energetic gain of forming water-water HBs at the hydrophobic interface compensates for the loss of water-hydrophilic residue HBs [9]. Hence, on the one hand, the first protocol minimizes the enthalpy of the f state by increasing the number of exposed hydrophobic residues.

On the other hand, the increased solubility at low T of a largely hydrophobic core would decrease the enthalpy of the u state and, as a consequence, the relative enthalpic gain of the f state. Hence, to maximize this gain, the second protocol selects sequences with a less hydrophobic core with respect to the high-$T_d$ case and without changes of the surface composition, i.e., of the enthalpy of the f state.

Remarkably, both categories of sequences (those with a less hydrophilic surface and those with a less hydrophobic core) have similar stability at low T [Figs. 1(a) and 9(a)]. Hence, at low T, the variety of folding (less segregated) sequences is larger than at high T. Furthermore, we observe that proteins designed at low $T_d$, e.g., at ($T_d = 0.2$ IU, $P_d = 0$), have a stability range that includes ambient conditions [Fig. 1(a)]. Hence, the lack of sequence segregation found in naturally evolved proteins [66,67] is a way to maximize the number of folding sequences at ambient conditions, taking advantage of the T–P-dependent water properties.
This observation allows us to reconsider the usual understanding of the high number of hydrophobic residues on the surface of natural proteins as a necessary compromise between solubility and functionality [66,67]. To clarify this point, we apply the IMPLICIT SOLVENT protocol, which uses implicit water without accounting for the water contribution to the enthalpy; we find that it generates only sequences that are highly segregated (with ≃90% hydrophilic residues on the surface and ≃55% hydrophobic residues in the core) and that are unstable at extreme conditions, as well as those designed at ambient T and P [Fig. 1(a)] [68]. Therefore, our results suggest that at extreme T and P, proteins are more stable if not completely segregated in their sequence, and this is the result of an adaptation process in water since water is a solvent that changes properties with temperature and pressure.

C. High-pressure design and stability

We now focus on the role of water in selecting sequences that fold at high P. While proteins designed at high $T_d$ are stable in their f state in a range of $T > T_d$, we find that those generated by the protocols at high $P_d$ fold only at $P < P_d$ [Fig. 1(a)]. Furthermore, for $P_d \gtrsim 0.7$ IU, no protein folds.

These findings are surprising because, although the design protocols at high $P_d$ select for proteins that are more homogeneous than those generated at lower $P_d$, with surfaces less hydrophilic and cores less hydrophobic (Fig. 2), the generated sequences are such that the hydrated protein enthalpy of the f state is much less than that of the u state (Fig. 4, inset). Therefore, the larger instability of the folded protein is due to all the terms in the Gibbs free energy that are not explicitly included in our calculation of the hydrated protein enthalpy, i.e., to the bulk-water contributions.

To clearly show this result, we calculate, as a function of P and T, the denaturation free-energy gain $\Delta_G \equiv G^f - G^u$, where $G^f$ and $G^u$ are the free energy for the entire system (protein, hydration water, and bulk water) calculated for the f state and for a completely stretched conformation representative of the u state, respectively. This quantity is given by the difference $\Delta_G = \Delta_H - \Delta_{TS}$ between the denaturation enthalpy gain $\Delta_H \equiv H^f - H^u$ and the folding entropy surplus $\Delta_{TS} \equiv T(S^f - S^u)$, where $H^\alpha$ ($S^\alpha$) is the enthalpy (entropy) of the $\alpha$ state with $\alpha = f, u$. A positive value of $\Delta_H$ favors the denaturation, while upon protein folding, there is an entropy surplus, $\Delta_{TS} > 0$, due to the larger bulk-water contribution. Hence, the balance $\Delta_H - \Delta_{TS}$ regulates the protein stability, favoring the u state when $\Delta_G > 0$ and the f state when $\Delta_G < 0$.

FIG. 4. Isothermal free energy, enthalpy, and entropy variation. Inset: Color-coded (in IU) maximum value of $\langle \Delta H^{(h)} \rangle$ found for the sequences designed with the MAX GAP protocol. Main panel: The variations of the denaturation free-energy gain $\langle \Delta_G(P) - \Delta_G(0) \rangle$, of the denaturation enthalpy gain $\langle \Delta_H(P) - \Delta_H(0) \rangle$, and of the folding entropy surplus $\langle \Delta_{TS}(P) - \Delta_{TS}(0) \rangle$ increase with P along the isotherm T = 0.15 IU for a protein designed at $P_d = 0.8$ IU and $T_d = 0.2$ IU. We calculate $\langle \Delta_H \rangle$ as thermodynamic average over equilibrium configurations, $\langle \Delta_G(P) - \Delta_G(0) \rangle = \int_0^P \langle V^f - V^u \rangle \, dP$, and $\langle \Delta_{TS}(P) - \Delta_{TS}(0) \rangle = \langle \Delta_H(P) - \Delta_H(0) \rangle - \langle \Delta_G(P) - \Delta_G(0) \rangle$.

We find that for proteins designed at high $P_d$ and low $T_d$, e.g., $P_d = 0.8$ IU and $T_d = 0.2$ IU (Fig. 4), both $\Delta_H$ and $\Delta_{TS}$ become larger for increasing P at constant T, with the denaturation enthalpy gain increasing faster than the folding entropy surplus, which induces the unfolding of the protein. The increase of $\Delta_{TS}$ is expected because, for increasing P, the number of HBs decreases, making the entropy difference between the f and the u state larger when the water-protein interface decreases upon folding. On the other hand, our results (Fig. 4, inset) suggest that the protein and hydration-water contributions to $\Delta_H$ are negative for the considered range of P and T and are slowly changing with P at T = 0.15 IU. Therefore, the increase of $\Delta_H$ implies that the contribution coming from bulk water is increasingly larger for increasing P. We conclude that the proteins designed at high P are unstable in their native state under pressurization because of the bulk-water contribution to the total free energy G of the system. This contribution completely dominates G for those pressures for which the number of HBs is vanishing, in our case at P > 0.6 IU [69,70]. Hence, according to our simulations, both the cold- and pressure-denaturation boundaries are due to the free-energy contribution of the hydration and bulk water [71]. This limit is a consequence of the specific features of the HB, representing a natural barrier above which folding is not expected. More importantly, such boundaries are also barriers for evolution and can be crossed only by introducing artificial interactions between the amino acids to stabilize the protein native state (Fig. 1) [72].

IV. CONCLUSIONS

We present a computationally efficient model capable of describing the artificial evolution of protein stability regions as a function of the thermodynamic properties of water. With our method, we study a large number of scenarios and demonstrate that the resulting stability regions are qualitatively similar to those of natural proteins. Our results elucidate the role that water has in the selection process of protein sequences. In particular, we show that the maximum denaturation pressure ≃1 GPa above which all proteins denature in experiments is a consequence of the specific features of the hydration and bulk-water hydrogen bonds.

We adopt design protocols that mimic the natural selection of proteins and find that the proteins selected at high T are superstable (Fig. 1), i.e., are stable in a wide range of T and P, while those selected at lower T are stable only in a reduced range of T and P. We observe that proteins designed to be stable at different conditions of T and P are characterized by sequences with different degrees of segregation between the hydrophilic surface and the hydrophobic core (Figs. 2 and 3). The optimal degree of segregation is selected spontaneously as a consequence of the free-energy balance of the protein aqueous solution without the necessity of an external active process of selection. In particular, we find that the segregation decreases by decreasing $T_d$, and at ambient conditions, it is moderate (≃70% of the surface is hydrophilic and ≃50% of the core is hydrophobic). This result is consistent with a trend observed in the composition of thermophilic, mesophilic, and ice-binding proteins (Fig. 3). The broader stability of high-T designed proteins implies that these segregated sequences are a subset of those designed at lower $T_d$.

Furthermore, the general observation that many proteins expose to the solvent a high percentage of hydrophobic residues [66,67], as predicted by our model, suggests that such an exposure is not a compromise between stability and biological functionality of the proteins but rather a natural consequence of the water properties. As a matter of fact, selecting artificial sequences with an extreme segregation does not increase the stability of proteins but rather reduces it.

We believe that our findings could potentially improve the engineering of artificial biopolymers since the aggregation can be prevented or enhanced in different thermodynamic conditions, according to the hydrophilic-hydrophobic ratio of the protein surface, although an experimental validation of the proposed design strategies remains crucial.

Putting the hydrophobic effect in an evolutionary perspective, our results substantiate an intriguing hypothesis: Many features observed in natural proteins generally arise when the selection process explicitly takes into account the thermodynamic properties of the solvent.

ACKNOWLEDGMENTS

We are thankful to L. Rovigatti, E. Locatelli, and M. Goethe for helpful discussions and suggestions. V. B. and I. C. acknowledge support from the Austrian Science Fund (FWF), Grant No. P 26253-N27. V. B. acknowledges support from the FWF, Grant No. M 2150-N36. G. F. acknowledges support from Spanish MINECO Grants No. FIS2012-31025 and No. FIS2015-66879-C2-2-P. The computational results presented here have been achieved using the Vienna Scientific Cluster (VSC).

APPENDIX A: PROTEIN-WATER MODEL

At any P and T, we partition the total volume $V(P,T)$ of our system into $N$ regular, nonoverlapping cells of volume $v \equiv V/N \geq v_0$, where $v_0$ is the water-excluded volume, each occupied by one of the $N_C$ protein's residues or by water molecules (Fig. 10). We adopt a coarse-grain representation of the protein that follows previous works [29–31] and has been extensively used in the literature to get a qualitative understanding of protein properties (e.g., Refs. [4,5,9,73,74]) and protein design [21,22,75–78]. For the solvent, we consider a coarse-grain model capable of reproducing, at least in a qualitative way, the properties of water [8,32–34,36,79] and its contribution to the protein folding [9,35].

The Hamiltonian for the entire system is given by

$$H \equiv H_{R,R} + H_{R,w} + H_{w,w}, \tag{A1}$$

where each term is defined in the following. As in Refs. [29–31], we assume that the protein's residues have only nearest-neighbor interactions,

$$H_p \equiv H_{R,R} + H_{R,w} \equiv \sum_{i}^{N_C} \left[ \sum_{j \neq i}^{N_C} C_{ij} S_{ij} + \sum_{j'}^{N_W} C_{ij'} S^{W}_{i} \right], \tag{A2}$$

where the indices $i$ and $j$ run over the residues and $j'$ runs over the $N_W \leq N - N_C$ water molecules in the hydration shell [80]; $C$ is a contact matrix, with $C_{l,k} = 1$ if $l$ and $k$ are first neighbors, and 0 otherwise; $S_{ij}$ are elements of the Miyazawa-Jernigan residue-residue interaction matrix $S$ [29,81,82] (solvent independent; see Table 1 in Appendix C) accounting for the correlations between real amino acids; and, for the protein-water interaction, $S^{W}_{i} = \varepsilon^{\rm PHO}$ if the residue $i$ is hydrophobic and $S^{W}_{i} = \varepsilon^{\rm PHI}$ if it is hydrophilic [9].

Each cell accommodates, at most, one molecule, with the average O–O distance between next-neighbor water molecules given by $r = v^{1/3}$. We associate with each cell $i$ a variable $n_i$, with $n_i = 1$ if the cell is occupied by a water molecule and has $v_0/v > 0.5$, and $n_i = 0$ otherwise. Hence, $n_i$ is a discretized density field replacing the water translational degrees of freedom.

The Hamiltonian of bulk water is

$$H^{(b)}_{w,w} \equiv \sum_{ij} U(r_{ij}) - J N^{(b)}_{\rm HB} - J_\sigma N^{(b)}_{\rm coop}. \tag{A3}$$

The first term, with $U(r) \equiv \infty$ for $r < r_0 \equiv v_0^{1/3} = 2.9$ Å (water van der Waals diameter), $U(r) \equiv 4\epsilon[(r_0/r)^{12} - (r_0/r)^{6}]$ for $r \geq r_0$, with $\epsilon \equiv 2.9$ kJ/mol, and $U(r) \equiv 0$ for $r > r_c \equiv 3 r_0$ (cutoff), accounts for the O–O van der Waals interaction between molecules $i$ and $j$. The sum runs over all possible water-molecule couples.

The second term represents the directional component of the HB interaction, where $N^{(b)}_{\rm HB} \equiv \sum_{\langle ij \rangle} n_i n_j \delta_{\sigma_{ij},\sigma_{ji}}$ is the total number of bulk HBs. The sum is over each nearest-neighbor pair, and the argument is nonzero if the following conditions are satisfied: (i) $n_i n_j = 1$, i.e., $r_{ij} \leq 2^{1/3} r_0 \simeq 3.6$ Å, the maximum distance that is conventionally associated with a HB (while $n_i n_j = 0$ otherwise), and (ii) $\delta_{\sigma_{ij},\sigma_{ji}} = 1$, with $\delta_{ab} = 1$ if $a = b$, and 0 otherwise. Here, $\sigma_{ij} = 1, \ldots, q$ is a bonding index. Conventionally, a HB is broken if $\widehat{\rm OOH} > 30°$, implying that only 1/6 of the entire range of values [0, 360°] for the $\widehat{\rm OOH}$ angle is associated with a bonded state. Thus, choosing $q = 6$, we correctly account for the entropy variation due to HB formation and breaking. With this definition, each molecule can form up to four HBs with its neighbors. Bifurcated HBs are excluded.

The third term accounts for the HB cooperativity due to the quantum many-body interaction [83,84] and leads to the low-P tetrahedral structure [85]. The cooperativity is defined as the sum $N^{(b)}_{\rm coop} \equiv \sum_i n_i \sum_{(l,k)_i} \delta_{\sigma_{ik},\sigma_{il}}$ over all the water molecules $i$ and over all the $lk$ pairs of the bonding indices $\sigma_{il}$ and $\sigma_{ik}$ of the molecule $i$. We choose $J_\sigma \ll J$ to guarantee an asymmetry between the two HB terms.

The formation of a HB in the bulk leads to the local volume increase $v^{(b)}_{\rm HB}/v_0$, with an enthalpic variation $-J + P v^{(b)}_{\rm HB}$, which accounts for the P-disrupting effect on the HB network. Here, $v^{(b)}_{\rm HB}/v_0$ represents the average volume increase between high-density ices VI and VIII and low-density (tetrahedral) ice Ih, and it is chosen as an approximation of the average volume variation per HB when a tetrahedral HB network is formed. Hence, the volume of bulk molecules is $V^{(b)} = N v + N^{(b)}_{\rm HB} v^{(b)}_{\rm HB}$.

The presence of the hydrophobic or hydrophilic protein interface affects the water-water hydrogen bonding in the hydration shell [7,86–94] (water molecules that are first neighbors of the protein amino acids). Hence, the Hamiltonian for water, including hydration molecules and the many-body effect on HB formation close to the protein interface, reads

$$H_{w,w} \equiv H^{(b)}_{w,w} + H^{(h)}_{w,w}, \tag{A4}$$

where

$$H^{(h)}_{w,w} \equiv -\left[ J^{\rm PHO} N^{\rm PHO}_{\rm HB} + J^{\rm PHI} N^{\rm PHI}_{\rm HB} \right] - \left[ J^{\rm PHO}_\sigma N^{\rm PHO}_{\rm coop} + J^{\rm PHI}_\sigma N^{\rm PHI}_{\rm coop} \right]. \tag{A5}$$

Within the square brackets of Eq. (A5), we have explicitly indicated the contribution from the water molecules at the protein interface (h) neighboring hydrophobic or hydrophilic residues.

The hydrophobic interface strengthens the water-water hydrogen bonding in the first hydration shell [7,86–90] and increases the local water density upon pressurization [90,95–97]. Therefore, we assume $J^{\rm PHO} > J$ for HBs between water molecules at the hydrophobic interface and $J^{\rm PHO}_\sigma > J_\sigma$ for the cooperative component. This condition guarantees that the solvation free energy of a hydrophobic amino acid decreases at low T [65]. Following Ref. [9], we express the average volume change per water-water HB at the hydrophobic interface as a series expansion in P up to the linear term, $v^{\rm PHO}_{\rm HB}/v^{\rm PHO}_{{\rm HB},0} \equiv 1 - k P$ [98], where $v^{\rm PHO}_{{\rm HB},0}$ is the volume change associated with the HB formation in the hydrophobic hydration shell at P = 0 and $k > 0$. Therefore, the volume contribution $V^{\rm PHO}$ to the total volume $V$ due to the HBs in the hydrophobic shell is $V^{\rm PHO} \equiv N^{\rm PHO}_{\rm HB} v^{\rm PHO}_{\rm HB}$, where $V^{\rm PHO}$ and $N^{\rm PHO}_{\rm HB}$ are the hydrophobic hydration shell volume and number of HBs, respectively. We assume that the water-water hydrogen bonding and the water density at the hydrophilic interface are not affected by the protein, i.e., they are equal to the bulk-water values. Therefore, $J^{\rm PHI} = J$, $J^{\rm PHI}_\sigma = J_\sigma$, and $v^{\rm PHI}_{\rm HB} = v^{(b)}_{\rm HB}$. Moreover, the polarization effect of hydrophilic residues on the HB network in the hydrophilic shell is not included here because, in our coarse-grain description, it does not show a qualitative change in protein denaturation mechanisms [9], consistent with previous observations [99]. A HB between two water molecules, one hydrating a PHO amino acid and the other a PHI amino acid, is formed with a coupling constant $(J^{\rm PHO} + J^{\rm PHI})/2$, and it leads to a local increase of volume $(v^{\rm PHO}_{\rm HB} + v^{\rm PHI}_{\rm HB})/2$. Hence, the total volume is

$$V \equiv V^{(b)} + V^{(h)}, \tag{A6}$$

where $V^{(h)} \equiv V^{\rm PHO} + V^{\rm PHI}$ is the volume due to the HBs in the hydration shell.

In order to favor the visualization and the understanding of our results, we adopt a model representation in two dimensions [4,5,7,75–78,100]. We choose the parameters in Eq. (A1) in such a way as to get proteins that are stable for 250 ≲ T/K ≲ 350 and P < 1 GPa, consistent with experimental observations [40–51,54,55,101–103]. We express all the quantities in IU, adopting $8\epsilon$ as the energy unit, $v_0$ as the volume unit, $8\epsilon/k_B$ as the temperature unit, and $8\epsilon/v_0$ as the pressure unit. Accordingly, we fix the interaction constants (all expressed in IU) $J = 0.3$ and $J_\sigma = 0.05$ for bulk water, $J^{\rm PHI} = J$ and $J^{\rm PHI}_\sigma = J_\sigma$ for water at hydrophilic interfaces, and $J^{\rm PHO} = 1.2$ and $J^{\rm PHO}_\sigma = 0.2$ for water at hydrophobic interfaces. Moreover, we fix $k = 4$ and $v^{\rm PHO}_{{\rm HB},0} = 2$ (in IU). These choices have two effects: (i) They balance the residue-

… homopolymer solutions. We set $T \ll T_d$, the design temperature, because we look for sequences with either minimum or maximum values of the enthalpy. A Monte Carlo step consists in an attempt to modify the protein sequence followed by a number of water moves that is large enough to equilibrate the solvent around the fixed protein.
For the MAX GAP design protocol, we also residue, residue-water, and water-water interactions and perform a simulation with a completely stretched protein, make the proteins stable for thermodynamic conditions whose sequence is identical to the folded protein, in order comprised in the (stable and metastable) liquid phase, to calculate the enthalpy difference between the two including ambient conditions; (ii) they account for the lower conformations (native and unfolded). For both the water- surface volume ratio in the two-dimensional system with dependent design protocols, the protein enthalpy that we respect to a three-dimensional one, enhancing the interface ðhÞ ðhÞ ðhÞ interactions. Changes in parameters combine in a nontrivial consider is hH i ≡ hH iþhH iþ PhV i. w;w way, resulting in a shift, broadening, or reduction of the We perform the design for a large number of thermo- stability regions of proteins, but leaving the qualitative dynamic state points, sampling ½3; 5 × 10 independent scenario unaffected [104]. Lastly, since our preliminary sequences for each T, P and protein target-native structure. results on the three-dimensional many-body water model Each water-dependent design is performed on a grid of [105] show a phase diagram qualitatively similar to the ≃90 different T–P points. Implicit water design is per- ðhÞ one in two dimensions [36], we expect that our findings formed once per structure. We sample the averages hH i will remain substantially unaltered in a more realistic three- ðhÞ and hΔH i for each sequence over ≃10 water configu- dimensional version of the model. rations. We sort, in ascending order, the sequences accord- ðhÞ ðhÞ ing to their values of hH i, −hΔH i, and H , and R;R APPENDIX B: DESIGN AND R R consider, for characterization and stability analysis, only FOLDING SIMULATIONS the top 5; 15 sequences from each list. Overall, we perform The concept of protein design refers to an optimization more than 1500 independent designs. 
scheme for the sequence of amino acids, aiming to We test the validity of the design by folding the protein maximize the probability of folding into a specific target with Monte Carlo simulations at constant values of P, T, N , conformation. The design simulation consists in a broad and N. We start from a stretched protein conformation and, sampling of the space of sequences, on top of a fixed keeping the amino acid sequence, allow the protein to move protein structure that defines the native-target conforma- using local corner flip, pivot, and crankshaft algorithms tions (Fig. 10). We perform Monte Carlo simulations, [106]. Water is equilibrated using cluster moves [36].For keeping P, T, N , and N constant, with point mutation each sequence and each state point, we calculate the free of the sequence and residue swapping moves with the energy as a function of the number of native contacts. A following acceptance probability: protein is defined to be stable if, at the free-energy minimum, 90% of its native contacts are formed. This definition min −Δ=T n o ϵ o p P ≡ minf1;e g min f1; ðN =N Þ g; ðB1Þ acc P P guarantees that a stable protein folds in its native state. Computing the free energy on a grid of values of T and P where Δ is the difference between the new and the old yields the stability region in the T–P plane. By averaging the ðhÞ ðhÞ configurations in hH i or hΔH i or H , depending on R;R stability region over all sequences designed at the same T R R the design protocol; T ¼ 0.05 IU is the optimization and P , we calculate the average stability curve. n o temperature, N and N are the number of permutations P P for the new (n) and old (o) amino acid sequences, APPENDIX C: RESIDUE-RESIDUE respectively; and ϵ ¼ 14 is a weighting parameter. The INTERACTION MATRIX S n o ϵ term ðN =N Þ is added to bias towards highly hetero- P P In Table I we report the interaction matrix between the geneous sequences, which are better folders [24,30]. amino acids. 
This matrix does not include solvent contri- Therefore, we minimize the enthalpy of the folded structure butions, since these are explicitly stated in the Hamiltonians via a Monte Carlo scheme with separate acceptance criteria ðhÞ for the water moves and the sequence moves. While the H and H , and has been scaled by a factor 2 to balance w;w R;w water is simulated at T and P , the sequences are sampled with the water-water HB interaction. Amino acids are d d at low optimization temperature T . In Fig. 8, we show that indicated with letters following the FASTA code. The amino tuning ϵ does not significantly affect the hydropathy acids I, V, L, F, C, M, and A are assumed to be hydrophobic, profile of proteins, although a low value of ϵ generates according to the Kyte-Doolittle hydropathy scale [82]. 021047-8 ROLE OF WATER IN THE SELECTION OF STABLE … PHYS. REV. X 7, 021047 (2017) TABLE I. Residue-residue interaction energy expressed in internal units. ACD E F G H I K L M N P Q R S T V W Y A −0.26 0 0.24 0.52 0.06 −0.14 0.68 −0.44 0.28 −0.02 0.5 0.56 0.2 0.16 0.86 −0.12 −0.18 −0.2 −0.18 0.18 C −2.12 0.06 1.38 −0.46 −0.16 −0.38 0.32 1.42 −0.16 0.38 0.26 0 0.1 0.48 −0.04 0.38 0.12 0.16 0.08 D 0.08 −0.3 0.78 −0.44 −0.78 1.18 −1.52 1.34 1.3 −0.6 0.08 −0.34 −1.44 −0.62 −0.58 1.16 0.48 0 E −0.06 0.54 0.5 −0.9 0.7 −1.94 0.86 0.88 −0.64 −0.2 −0.34 −1.48 −0.52 0 0.68 0.58 −0.2 F −0.88 −0.76 −0.32 −0.38 0.88 −0.6 −0.84 0.36 0.4 −0.58 0.82 0.58 0.62 −0.44 −0.32 0 G −0.76 0.4 0.5 0.22 0.46 0.38 −0.28 −0.22 −0.12 −0.08 −0.32 −0.52 0.32 0.36 0.28 H −0.58 0.98 0.44 0.32 1.98 −0.48 −0.42 −0.04 −0.24 −0.1 −0.38 0.38 −0.24 −0.68 I −0.44 0.72 −0.82 −0.56 1.06 0.5 0.72 0.84 0.42 0.28 −0.5 0.04 0.22 K 0.5 0.38 0 −0.66 0.22 −0.76 1.5 −0.26 −0.18 0.88 0.44 −0.42 L −0.54 −0.4 0.6 0.84 0.52 0.7 0.5 0.4 −0.58 −0.18 0.48 M 0.08 0.16 −0.68 0.92 0.62 0.28 0.38 −0.28 −1.34 −0.26 N −1.06 −0.36 −0.5 −0.28 −0.28 −0.22 1 0.12 −0.4 P 0.52 −0.84 −0.76 0.02 −0.14 0.18 −0.56 −0.66 Q 0.58 −1.04 −0.28 −0.28 
0.48 0.16 −0.4 R 0.22 0.34 −0.7 0.6 −0.32 −0.5 S −0.4 −0.16 0.36 0.68 0.18 T 0.06 0.5 0.44 0.26 V −0.58 −0.14 0.04 W −0.24 −0.08 Y −0.12 APPENDIX D: ADDITIONAL FIGURES In this Appendix we report supplementary figures. Temperature [IU] FIG. 5. Average stability regions for proteins designed at constant pressure according to the MIN ENTHALPY protocol. The proteins are designed along the isobar P ¼ 0.2 IU. We observe how designed sequences at high T are more resistant to thermal fluctuations. d d Percentage of compactness 1 100 0.8 0.6 0.4 0.2 0.1 0.2 0.3 0.4 0.5 Temperature [IU] FIG. 6. Average compactness of the designed proteins. The compactness is calculated as the average number of residue-residue contact points, regardless of if they correspond to the native contacts, as a function of T and P. It is expressed as a (color-coded) percentage over the maximum possible number of contacts and averaged over proteins designed at different values of P and T . The straight (green) line on the d d bottom represents the liquid-gas spinodal line. 021047-9 Pressure [IU] Pressure [IU] BIANCO, FRANZESE, DELLAGO, and COLUZZA PHYS. REV. X 7, 021047 (2017) 0.6 0.5 0.4 0.3 Antifreeze & Ice-nucleating proteins 0.2 0.5 0.6 0.7 0.8 Hydrophilicity of the protein surface FIG. 7. Hydropathy of the protein surface and the protein core of the ice-binding proteins. (a) 0.7 0.6 ε =18 ε =14 ε =10 ε =7 0.5 0.1 0.2 0.3 0.4 0.5 Design temperature T [IU] (b) 0.5 0.4 0.3 0.1 0.2 0.3 0.4 0.5 Design temperature T [IU] FIG. 8. Hydropathy profile of proteins designed at ambient pressure for different values of ϵ . Here, we report how the hydrophilic profile of the protein surface and the hydrophobic profile of the protein core, respectively shown in panels (a) and (b), changes as a function of T ,at ambient pressure, accordingtothe parameter ϵ , appearingin Eq. (B1). 
The data are calculated following the protocol MIN ENTHALPY. In all cases, the segregation between the hydrophilic surface and the hydrophobic core decreases with $T_d$.

(Figure panels (a) and (b); axes: Temperature [IU], Pressure [IU]; panel legend: sequences designed at $T_d = 0.2$ and $P_d = -0.1$, 0.5, and 0.8.)
FIG. 9. Average stability regions for proteins designed according to the MAX GAP protocol. Average stability regions for proteins designed along the isobar $P_d = 0$ and the isotherm $T_d = 0.2$ IU, respectively, shown in panels (a) and (b). Arrows point at the design pressure $P_d = 0$ IU and to the design temperature $T_d = 0.2$ IU.

FIG. 10. Schematic representation of the water-protein model. The position of the water molecules is coarse-grained, assigning a cell to each water molecule. The cell size coincides with the average distance between first-neighbor molecules and fluctuates according to the Boltzmann weight of isobaric-isothermal ensemble. The conformational state of water molecule $i$ is described via four bonding indices $\sigma_{ij}$, each one accounting for the bonding conformation of molecule $i$ with respect to the neighbor in the direction $j$. Water molecules around the protein form the hydration shell (dark blue cells). Protein is modeled as a self-avoiding lattice chain, composed of 20 different amino acids. Each amino acid can be hydrophobic or hydrophilic, according to the Kyte-Doolittle hydropathy scale [82]. Red and green amino acids in the figure refer to core and surface elements, respectively.

[1] K. E. Zachariassen and E. Kristiansen, Ice Nucleation and Antinucleation in Nature, Cryobiology 41, 257 (2000).
[2] L. J. Rothschild and R. L. Mancinelli, Life in Extreme Environments, Nature (London) 409, 1092 (2001).
[3] G. Salvi, S. Mölbert, and P. De Los Rios, Design of Lattice Proteins with Explicit Solvent, Phys. Rev. E 66, 061911 (2002).
[4] M. I. Marqués, J. M. Borreguero, H. E. Stanley, and N. V. Dokholyan, Possible Mechanism for Cold Denaturation of Proteins at High Pressure, Phys. Rev. Lett. 91, 138103 (2003).
[5] B. A. Patel, P. G. Debenedetti, F. H. Stillinger, and P. J. Rossky, A Water-Explicit Lattice Model of Heat-, Cold-, and Pressure-Induced Protein Unfolding, Biophys. J. 93, 4116 (2007).
[6] I. N. Berezovsky, K. B. Zeldovich, and E. I. Shakhnovich, Positive and Negative Design in Stability and Thermal Adaptation of Natural Proteins, PLoS Biol. 3, e52 (2007).
[7] C. L. Dias, T. Ala-Nissila, M. Karttunen, I. Vattulainen, and M. Grant, Microscopic Mechanism for Cold Denaturation, Phys. Rev. Lett. 100, 118101 (2008).
[8] G. Franzese and V. Bianco, Water at Biological and Inorganic Interfaces, Food Biophys. 8, 153 (2013).
[9] V. Bianco and G. Franzese, Contribution of Water to Pressure and Cold Denaturation of Proteins, Phys. Rev. Lett. 115, 108101 (2015).
[10] S. B. Kim, J. C. Palmer, and P. G. Debenedetti, Computational Investigation of Cold Denaturation in the Trp-Cage Miniprotein, Proc. Natl. Acad. Sci. U.S.A. 113, 8991 (2016).
[11] J. R. Desjarlais and T. M. Handel, De Novo Design of the Hydrophobic Cores of Proteins, Protein Sci. 4, 2006 (1995).
[12] B. I. Dahiyat, De Novo Protein Design: Fully Automated Sequence Selection, Science 278, 82 (1997).
[13] B. I. Dahiyat and S. L. Mayo, Probing the Role of Packing Specificity in Protein Design, Proc. Natl. Acad. Sci. U.S.A. 94, 10172 (1997).
[14] B. Kuhlman et al., Design of a Novel Globular Protein Fold with Atomic-Level Accuracy, Science 302, 1364 (2003).
[15] D. Rothlisberger et al., Kemp Elimination Catalysts by Computational Enzyme Design, Nature (London) 453, 190 (2008).
[16] L. Jiang et al., De Novo Computational Design of Retro-Aldol Enzymes, Science 319, 1387 (2008).
[17] D. J. Hei and D. S. Clark, Pressure Stabilization of Proteins from Extreme Thermophiles, Appl. Environ. Microbiol. 60, 932 (1994).
[18] J. Torrent, P. Rubens, M. Ribó, K. Heremans, and M. Vilanova, Pressure Versus Temperature Unfolding of Ribonuclease A: An FTIR Spectroscopic Characterization of 10 Variants at the Carboxy-Terminal Site, Protein Sci. 10, 725 (2001).
[19] S. Kumar, C. J. Tsai, and R. Nussinov, Temperature Range of Thermodynamic Stability for the Native State of Reversible Two-State Proteins, Biochemistry 42, 4864 (2003).
[20] B. N. Dominy, H. Minoux, and C. L. Brooks, An Electrostatic Basis for the Stability of Thermophilic Proteins, Proteins 57, 128 (2004).
[21] E. I. Shakhnovich and A. M. Gutin, A New Approach to the Design of Stable Proteins, Protein Eng. 6, 793 (1993).
[22] E. I. Shakhnovich and A. M. Gutin, Engineering of Stable and Fast-Folding Sequences of Model Proteins, Proc. Natl. Acad. Sci. U.S.A. 90, 7195 (1993).
[23] I. Coluzza, Transferable Coarse-Grained Potential for De Novo Protein Folding and Design, PLoS One 9, e112852 (2014).
[24] I. Coluzza, A Coarse-Grained Approach to Protein Design: Learning from Design to Understand Folding, PLoS One 6, e20853 (2011).
[25] In order to optimize a sequence for a given target structure, a common strategy, known as negative design, consists in minimizing the free energy of the target conformation with respect to the free energy of all the possible compact conformations [26–28]. Negative design is normally necessary when, due to the limitations of the model used (e.g., reduced alphabet size or coarse-graining), the standard procedure cannot find a sequence that folds to the target structure. In all the models we considered so far, we could always directly design on- and off-lattice structures using an alphabet of 20 letters [29–31].
[26] C. Micheletti, F. Seno, A. Maritan, and J. R. Banavar, Design of Proteins with Hydrophobic and Polar Amino Acids, Proteins 32, 80 (1998).
[27] I. Samish, C. M. Macdermaid, J. M. Perez-Aguilar, and J. G. Saven, Theoretical and Computational Protein Design, Annu. Rev. Phys. Chem. 62, 129 (2011).
[28] N. Koga, R. Tatsumi-Koga, G. Liu, R. Xiao, T. B. Acton, G. T. Montelione, and D. Baker, Principles for Designing Ideal Protein Structures, Nature (London) 491, 222 (2012).
[29] I. Coluzza and D. Frenkel, Designing Specificity of Protein-Substrate Interactions, Phys. Rev. E 70, 051917 (2004).
[30] I. Coluzza and D. Frenkel, Monte Carlo Study of Substrate-Induced Folding and Refolding of Lattice Proteins, Biophys. J. 92, 1150 (2007).
[31] I. Coluzza and J.-P. Hansen, Transition from Highly to Fully Stretched Polymer Brushes in Good Solvent, Phys. Rev. Lett. 100, 016104 (2008).
[32] K. Stokely, M. G. Mazza, H. E. Stanley, and G. Franzese, Effect of Hydrogen Bond Cooperativity on the Behavior of Water, Proc. Natl. Acad. Sci. U.S.A. 107, 1301 (2010).
[33] M. G. Mazza, K. Stokely, S. E. Pagnotta, F. Bruni, H. E. Stanley, and G. Franzese, More than One Dynamic Crossover in Protein Hydration Water, Proc. Natl. Acad. Sci. U.S.A. 108, 19873 (2011).
[34] G. Franzese, V. Bianco, and S. Iskrov, Water at Interface with Proteins, Food Biophys. 6, 186 (2011).
[35] V. Bianco, S. Iskrov, and G. Franzese, Understanding the Role of Hydrogen Bonds in Water Dynamics and Protein Stability, J. Biol. Phys. 38, 27 (2012).
[36] V. Bianco and G. Franzese, Critical Behavior of a Water Monolayer under Hydrophobic Confinement, Sci. Rep. 4, 4440 (2014).
[37] Here, $T_d$ defines the fluctuations of the water's energy, in units of the Boltzmann constant $k_B$, over which the enthalpy is averaged. This parameter is different from what is called "design temperature" in V. S. Pande, A. Y. Grosberg, and T. Tanaka, Biophys. J. 73, 3192 (1997), which in our case has a constant and very low value.
[38] S. A. Hawley, Reversible Pressure-Temperature Denaturation of Chymotrypsinogen, Biochemistry 10, 2436 (1971).
[39] H. W. Hatch, F. H. Stillinger, and P. G. Debenedetti, Computational Study of the Stability of the Miniprotein Trp-Cage, the GB1 β-Hairpin, and the AK16 Peptide, under Negative Pressure, J. Phys. Chem. B 118, 7761 (2014).
[40] L. Smeller, Pressure-Temperature Phase Diagrams of Biomolecules, Biochim. Biophys. Acta 1595, 11 (2002).
[41] R. Ravindra and R. Winter, On the Temperature–Pressure Free-Energy Landscape of Proteins, Chem. Phys. Chem. 4, 359 (2003).
[42] H. Herberhold and R. Winter, Temperature- and Pressure-Induced Unfolding and Refolding of Ubiquitin: A Static and Kinetic Fourier Transform Infrared Spectroscopy Study, Biochemistry 41, 2396 (2002).
[43] H. Lesch, H. Stadlbauer, J. Friedrich, and J. M. Vanderkooi, Stability Diagram and Unfolding of a Modified Cytochrome c: What Happens in the Transformation Regime?, Biophys. J. 82, 1644 (2002).
[44] J. Wiedersich, S. Köhler, A. Skerra, and J. Friedrich, Temperature and Pressure Dependence of Protein Stability: The Engineered Fluorescein-Binding Lipocalin FluA Shows an Elliptic Phase Diagram, Proc. Natl. Acad. Sci. U.S.A. 105, 5756 (2008).
[45] A. Zipp and W. Kauzmann, Pressure Denaturation of Metmyoglobin, Biochemistry 12, 4217 (1973).
[46] M. W. Lassalle, H. Yamada, and K. Akasaka, The Pressure-Temperature Free Energy-Landscape of Staphylococcal Nuclease Monitored by (1)H NMR, J. Mol. Biol. 298, 293 (2000).
[47] A. Maeno, H. Matsuo, and K. Akasaka, The Pressure-Temperature Phase Diagram of Hen Lysozyme at Low pH, Biophysik 5, 1 (2009).
[48] J. Somkuti, Z. Mártonfalvi, M. S. Z. Kellermayer, and L. Smeller, Different Pressure-Temperature Behavior of the Structured and Unstructured Regions of Titin, Biochim. Biophys. Acta 1834, 112 (2013).
[49] J. Somkuti, S. Jain, S. Ramachandran, and L. Smeller, Folding-Unfolding Transitions of Rv3221c on the Pressure-Temperature Plane, High Press. Res. 33, 250 (2013).
[50] F. Meersman, L. Smeller, and K. Heremans, Pressure-Assisted Cold Unfolding of Proteins and Its Effects on the Conformational Stability Compared to Pressure and Heat Unfolding, High Press. Res. 19, 263 (2000).
[51] A. Pastore, S. R. Martin, A. Politou, K. C. Kondapalli, T. Stemmler, and P. A. Temussi, Unbiased Cold Denaturation: Low- and High-Temperature Unfolding of Yeast Frataxin under Physiological Conditions, J. Am. Chem. Soc. 129, 5374 (2007).
[52] J. Roche, J. A. Caro, D. R. Norberto, P. Barthe, C. Roumestand, J. L. Schlessman, A. E. Garcia, B. Garcia-Moreno E., and C. A. Royer, Cavities Determine the Pressure Unfolding of Proteins, Proc. Natl. Acad. Sci. U.S.A. 109, 6945 (2012).
[53] With the adopted units, 1 atm corresponds to $P \simeq 0.7 \times 10^{-4}$ IU. For such values of $P$, the liquid-gas transition occurs at $T_{LG} \simeq 0.44$ IU. Therefore, the ambient temperature is $T = 0.8 \times T_{LG} \simeq 0.35$ IU.
[54] W. B. Floriano, M. A. Nascimento, G. B. Domont, and W. A. Goddard, Effects of Pressure on the Structure of Metmyoglobin: Molecular Dynamics Predictions for Pressure Unfolding through a Molten Globule Intermediate, Protein Sci. 7, 2301 (1998).
[55] M. Gross and R. Jaenicke, Proteins under Pressure. The Influence of High Hydrostatic Pressure on Structure, Function and Assembly of Proteins and Protein Complexes, Eur. J. Biochem. 221, 617 (1994).
[56] E. Larios and M. Gruebele, Protein Stability at Negative Pressure, Methods 52, 51 (2010).
[57] J. Hollien and S. Marqusee, A Thermodynamic Comparison of Mesophilic and Thermophilic Ribonucleases H, Biochemistry 38, 3831 (1999).
[58] S. Fukuchi and K. Nishikawa, Protein Surface Amino Acid Compositions Distinctively Differ Between Thermophilic and Mesophilic Bacteria, J. Mol. Biol. 309, 835 (2001).
[59] We analysed the PDB files of the following ice-binding proteins: 1MSI, 1AME, 1B71, 1B7J, 1B7K, 1C3Z, 1C8A, 1EKL, 1EZG, 1HG7, 1JAB, 1KDF, 1OPS, 1WFB, 1XUZ, 2AME, 2BRD, 2MSI, 2PY2, 2ZDR, 2ZIB, 3P4G, 3ULT, 3UYV, 3WP9, 4DT5, 4KDV, 4NU3, 4UR4, and 4UR6. We assume that an amino acid is exposed to the solvent if the average accessible surface area is ≥ 0.3 of the total accessible area of the amino acid (calculations are performed with the MDTRAJ program at http://mdtraj.org/1.7.2). Protein structures are taken from the PDB database at http://www.rcsb.org/pdb/home/home.do.
[60] G. Vogt, S. Woell, and P. Argos, Protein Thermal Stability, Hydrogen Bonds, and Ion Pairs, J. Mol. Biol. 269, 631 (1997).
[61] G. Vogt and P. Argos, Protein Thermal Stability: Hydrogen Bonds or Internal Packing?, Folding Des. 2, S40 (1997).
[62] V. Z. Spassov, A. D. Karshikof, and R. Ladenstein, The Optimization of Protein-Solvent Interactions: Thermostability and the Role of Hydrophobic and Electrostatic Interactions, Protein Sci. 4, 1516 (1995).
[63] P. Haney, J. Konisky, K. K. Koretke, Z. Luthey-Schulten, and P. G. Wolynes, Structural Basis for Thermostability and Identification of Potential Active Site Residues for Adenylate Kinases from the Archaeal Genus Methanococcus, Proteins 28, 117 (1997).
[64] E. van Dijk, P. Varilly, T. P. J. Knowles, D. Frenkel, and S. Abeln, Consistent Treatment of Hydrophobicity in Protein Lattice Models Accounts for Cold Denaturation, Phys. Rev. Lett. 116, 078101 (2016).
[65] M. S. Moghaddam and H. S. Chan, Pressure and Temperature Dependence of Hydrophobic Hydration: Volumetric, Compressibility, and Thermodynamic Signatures, J. Chem. Phys. 126, 114507 (2007).
[66] L. Lins, A. Thomas, and R. Brasseur, Analysis of Accessible Surface of Residues in Proteins, Protein Sci. 12, 1406 (2003).
[67] S. Moelbert, E. Emberly, and C. Tang, Correlation Between Sequence Hydrophobicity and Surface-Exposure Pattern of Database Proteins, Protein Sci. 13, 752 (2004).
[68] The sequences designed with $H^{\min}_{R,R}$ are independent of $T_d$ and $P_d$.
[69] The condition for a vanishing number of HBs is $P \simeq J/v_{\mathrm{HB}}$ (where $J$ and $v_{\mathrm{HB}}$ are parameters of the model described in the Methods section), which in Ref. [70] gives $P \geq 1$ IU, while here it gives $P \geq 0.6$ IU ≃ 1 GPa when converted to real units.
[70] G. Franzese and H. E. Stanley, The Widom Line of Supercooled Water, J. Phys. Condens. Matter 19, 205126 (2007).
[71] Details of our results reveal that the protein interface affects up to the second hydration shell and that this contributes significantly to the enthalpy difference between the u and the f states.
[72] Depending on the specific set of residue-residue interactions, the exact boundary can move, but the fact that both in experiments and in our simulations the high pressure boundary is ≃1 GPa implies that the values we are using to represent the interactions between the amino acids have the correct order of magnitude.
[73] G. Caldarelli and P. De Los Rios, Cold and Warm Denaturation of Proteins, J. Biol. Phys. 27, 229 (2001).
[74] S. Matysiak, P. G. Debenedetti, and P. J. Rossky, Role of Hydrophobic Hydration in Protein Stability: A 3D Water-Explicit Protein Model Exhibiting Cold and Heat Denaturation, J. Phys. Chem. B 116, 8095 (2012).
[75] K. F. Lau and K. A. Dill, A Lattice Statistical Mechanics Model of the Conformational and Sequence Spaces of Proteins, Macromolecules 22, 3986 (1989).
[76] K. Yue and K. A. Dill, Inverse Protein Folding Problem: Designing Polymer Sequences, Proc. Natl. Acad. Sci. U.S.A. 89, 4163 (1992).
[77] M. S. Shell, P. G. Debenedetti, and A. Z. Panagiotopoulos, Computational Characterization of the Sequence Landscape in Simple Protein Alphabets, Proteins 62, 232 (2006).
[78] B. A. Patel, P. G. Debenedetti, F. H. Stillinger, and P. J. Rossky, The Effect of Sequence on the Conformational Stability of a Model Heteropolymer in Explicit Water, J. Chem. Phys. 128, 175102 (2008).
[79] F. de los Santos and G. Franzese, Understanding Diffusion and Density Anomaly in a Coarse-Grained Model for Water Confined Between Hydrophobic Walls, J. Phys. Chem. B 115, 14311 (2011).
[80] Here, for the sake of simplicity, following Ref. [9], we assume that, in each volume $v$, there is at most one water molecule. This approximation could be removed by generalizing the model in such a way that the water variables associated with each cell could account for the state of more than one water molecule at a time.
[81] S. Miyazawa and R. L. Jernigan, Estimation of Effective Interresidue Contact Energies from Protein Crystal Structures: Quasi-chemical Approximation, Macromolecules 18, 534 (1985).
[82] J. Kyte and R. F. Doolittle, A Simple Method for Displaying the Hydropathic Character of a Protein, J. Mol. Biol. 157, 105 (1982).
[83] L. H. de la Peña and P. G. Kusalik, Temperature Dependence of Quantum Effects in Liquid Water, J. Am. Chem. Soc. 127, 5246 (2005).
[84] U. Góra, R. Podeszwa, W. Cencek, and K. Szalewicz, Interaction Energies of Large Clusters from Many-Body Expansion, J. Chem. Phys. 135, 224102 (2011).
[85] A. K. Soper and M. A. Ricci, Structures of High-Density and Low-Density Water, Phys. Rev. Lett. 84, 2881 (2000).
[86] N. Giovambattista, P. J. Rossky, and P. G. Debenedetti, Effect of Pressure on the Phase Behavior and Structure of Water Confined Between Nanoscale Hydrophobic and Hydrophilic Plates, Phys. Rev. E 73, 041604 (2006).
[87] C. Petersen, K.-J. Tielrooij, and H. J. Bakker, Strong Temperature Dependence of Water Reorientation in Hydrophobic Hydration Shells, J. Chem. Phys. 130, 214511 (2009).
[88] Y. I. Tarasevich, State and Structure of Water in Vicinity of Hydrophobic Surfaces, Colloid J. 73, 257 (2011).
[89] J. G. Davis, K. P. Gierszal, P. Wang, and D. Ben-Amotz, Water Structural Transformation at Molecular Hydrophobic Interfaces, Nature (London) 491, 582 (2012).
[90] S. Sarupria and S. Garde, Quantifying Water Density Fluctuations and Compressibility of Hydration Shells of Hydrophobic Solutes and Proteins, Phys. Rev. Lett. 103, 037803 (2009).
[91] K. A. Dill, T. M. Truskett, V. Vlachy, and B. Hribar-Lee, Modeling Water, the Hydrophobic Effect, and the Ion Solvation, Annu. Rev. Biophys. Biomol. Struct. 34, 173 (2005).
[92] A. Badasyan, Y. S. Mamasakhlisov, R. Podgornik, and V. A. Parsegian, Solvent Effects in the Helix-Coil Transition Model Can Explain the Unusual Biophysics of Intrinsically Disordered Proteins, J. Chem. Phys. 143, 014102 (2015).
[93] C. J. Fennell, C. W. Kehoe, and K. A. Dill, Modeling Aqueous Solvation with Semi-Explicit Assembly, Proc. Natl. Acad. Sci. U.S.A. 108, 3234 (2011).
[94] C. Camilloni, D. Bonetti, A. Morrone, R. Giri, C. M. Dobson, M. Brunori, S. Gianni, and M. Vendruscolo, Towards a Structural Biology of the Hydrophobic Effect in Protein Folding, Sci. Rep. 6, 28285 (2016).
[95] P. Das and S. Matysiak, Direct Characterization of Hydrophobic Hydration During Cold and Pressure Denaturation, J. Phys. Chem. B 116, 5342 (2012).
[96] T. Ghosh, A. E. García, and S. Garde, Molecular Dynamics Simulations of Pressure Effects on Hydrophobic Interactions, J. Am. Chem. Soc. 123, 10997 (2001).
[97] C. L. Dias and H. S. Chan, Pressure-Dependent Properties of Elementary Hydrophobic Interactions: Ramifications for Activation Properties of Protein Folding, J. Phys. Chem. B 118, 7488 (2014).
[98] This approximation implies that our calculations are valid only for $P < 1/k$.
[99] M. Kurnik, L. Hedberg, J. Danielsson, and M. Oliveberg, Folding without Charges, Proc. Natl. Acad. Sci. U.S.A. 109, 5705 (2012).
[100] P. De Los Rios and G. Caldarelli, Putting Proteins Back into Water, Phys. Rev. E 62, 8449 (2000).
[101] G. Hummer, S. Garde, A. E. García, M. E. Paulaitis, and L. R. Pratt, The Pressure Dependence of Hydrophobic Interactions Is Consistent with the Observed Pressure Denaturation of Proteins, Proc. Natl. Acad. Sci. U.S.A. 95, 1552 (1998).
[102] F. Meersman, C. M. Dobson, and K. Heremans, Protein Unfolding, Amyloid Fibril Formation and Configurational Energy Landscapes under High Pressure Conditions, Chem. Soc. Rev. 35, 908 (2006).
[103] N. V. Nucci, B. Fuglestad, E. A. Athanasoula, and A. J. Wand, Role of Cavities and Hydration in the Pressure Unfolding of T4 Lysozyme, Proc. Natl. Acad. Sci. U.S.A. 111, 13846 (2014).
[104] V. Bianco, N. P. Gelabert, I. Coluzza, and G. Franzese, How the Stability of a Folded Protein Depends on Interfacial Water Properties and Residue-Residue Interactions, arXiv:1704.03370.
[105] L. E. Coronas, V. Bianco, A. Zantop, and G. Franzese, Liquid-Liquid Critical Point in 3D Many-Body Water Model, arXiv:1610.00419.
[106] D. Frenkel and B. Smit, Understanding Molecular Simulation (Academic Press, San Diego, London, 2002).

Physical Review X, American Physical Society (APS)
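The sequence-move acceptance rule of Eq. (B1) in Appendix B can be illustrated with a short sketch. It is not the authors' code. The values $T_o = 0.05$ IU and $\epsilon_p = 14$ are taken from the text, while the function names and the definition of the number of permutations as the multinomial $N_P = L!/\prod_a n_a!$ (for a sequence of length $L$ with $n_a$ copies of amino acid $a$) are assumptions made here for illustration.

```python
import math
from collections import Counter

T_OPT = 0.05   # optimization temperature T_o in internal units (value from the text)
EPS_P = 14.0   # weighting parameter epsilon_p (value from the text)


def log_permutations(sequence):
    """Return ln N_P, ASSUMING N_P = L! / prod_a(n_a!) (multinomial count)."""
    counts = Counter(sequence)
    return math.lgamma(len(sequence) + 1) - sum(
        math.lgamma(n + 1) for n in counts.values()
    )


def acceptance_probability(delta, old_seq, new_seq, t_opt=T_OPT, eps_p=EPS_P):
    """Eq. (B1): min{1, exp(-delta/T_o)} * min{1, (N_P^n / N_P^o)^eps_p}.

    delta is the change (new minus old) of the quantity being minimized.
    The permutation ratio is evaluated in log space to avoid overflow.
    """
    energy_factor = 1.0 if delta <= 0 else math.exp(-delta / t_opt)
    log_ratio = eps_p * (log_permutations(new_seq) - log_permutations(old_seq))
    perm_factor = 1.0 if log_ratio >= 0 else math.exp(log_ratio)
    return energy_factor * perm_factor


if __name__ == "__main__":
    # A mutation that lowers the heterogeneity of the sequence is strongly
    # penalized even when the enthalpy does not change.
    print(acceptance_probability(0.0, "AAAC", "AAAA"))
```

Under this assumed definition of $N_P$, a homopolymer has $N_P = 1$, so the second factor penalizes moves toward homopolymers. This matches the remark in Appendix B that a low value of $\epsilon_p$ generates homopolymer solutions.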

Publisher
The American Physical Society
Copyright
Copyright © Published by the American Physical Society
eISSN
2160-3308
D.O.I.
10.1103/PhysRevX.7.021047

Abstract

PHYSICAL REVIEW X 7, 021047 (2017)

Role of Water in the Selection of Stable Proteins at Ambient and Extreme Thermodynamic Conditions

Valentino Bianco (1), Giancarlo Franzese (2), Christoph Dellago (1), and Ivan Coluzza (1)
(1) Faculty of Physics, University of Vienna, Vienna 1090, Austria
(2) Secció de Física Estadística i Interdisciplinària–Departament de Física de la Matèria Condensada, Facultat de Física & Institute of Nanoscience and Nanotechnology (IN2UB), Universitat de Barcelona, Barcelona 08028, Spain

(Received 21 October 2016; revised manuscript received 1 February 2017; published 26 June 2017)

Proteins that are functional at ambient conditions do not necessarily work at extreme conditions of temperature T and pressure P. Furthermore, there are limits of T and P above which no protein has a stable functional state. Here, we show that these limits and the selection mechanisms for working proteins depend on how the properties of the surrounding water change with T and P. We find that proteins selected at high T are superstable and are characterized by a nonextreme segregation of a hydrophilic surface and a hydrophobic core. Surprisingly, a larger segregation reduces the stability range in T and P. Our computer simulations, based on a new protein design protocol, explain the hydropathy profile of proteins as a consequence of a selection process influenced by water. Our results, potentially useful for engineering proteins and drugs working far from ambient conditions, offer an alternative rationale to the evolutionary action exerted by the environment in extreme conditions.

DOI: 10.1103/PhysRevX.7.021047
Subject Areas: Biological Physics, Soft Matter, Statistical Physics

Published by the American Physical Society under the terms of the Creative Commons Attribution 4.0 International license. Further distribution of this work must maintain attribution to the author(s) and the published article's title, journal citation, and DOI.

I. INTRODUCTION

Proteins are molecules made of a sequence of amino acid residues that fold to target structures (native states) for a range of temperatures T and pressures P. Different proteins, with different sequences, have different T–P ranges of stability for their native state. As a matter of fact, many life forms survive under extreme T–P conditions [1,2], implying that their proteins fit that environment. Recent works [3–10] have evidenced that changes in T and P significantly alter the water-mediated hydrophilic and hydrophobic interactions of the residues along the chain, with an effect that depends on the sequence. However, a direct observation of how the protein selection responds to extreme changes in the aqueous conditions is still lacking. One of the reasons for this is that, for the few notable studies accounting for explicit water [11–16], an exhaustive sampling of the T–P stability region of the selected proteins is beyond the reach of current computers. Here, we fill this gap. We present a computational study based on a novel strategy that mimics the adaptation process of solvated proteins in a wide range of thermodynamic conditions. Our results reveal the border beyond which artificial protein design and natural evolution fail. Moreover, we show that high-T adaptation alone selects for superstable sequences that are highly resistant to both cold and pressure denaturation. Here, we define the proteins as superstable if their average stability region encompasses the average stability region of proteins designed at ambient conditions. Our results confirm, for the first time, the hypothesis that a strong stability of the folded state at high T corresponds to strong stability also at low T and high P [17–20].

We develop a protein design strategy that focuses on the relation between sequence and folded structure, allowing us to calculate the stability of many proteins in a wide range of T and P in explicit water, a formidable task that is currently infeasible for atomistic models. Following the standard approach introduced by Shakhnovich and Gutin [21,22], we adopt a coarse-grained lattice representation of proteins that is computationally effective and yields results that can also be extended to off-lattice proteins, as recently demonstrated for the implicit solvent case [23–25]. The main difference between our strategy and the standard approach is that here, instead of considering the phenomenology of water implicitly, we explicitly include water in our coarse-graining. As an explicit solvent, we adopt a many-body water model that has recently been shown to provide relevant information on the role of water in the cold- and pressure-denaturation mechanisms of proteins [9].

II. DESIGN APPROACH

Protein and solvent interactions are described by the Hamiltonian (see Appendix A for details) $H \equiv H_{R,R} + H_{R,w} + H^{(b)}_{w,w} + H^{(h)}_{w,w}$, where the first term accounts for the residue-residue interactions and the second term for the residue-hydrophobic or hydrophilic interaction with the solvent, respectively. The third term, $H^{(b)}_{w,w}$, accounts for the water-water interaction in the bulk, including the isotropic van der Waals interaction, as well as the directional and cooperative components of the hydrogen bonds (HBs) [8,9,32–36]. The last term, $H^{(h)}_{w,w}$, describes the water-water interaction in the hydration layer (the first layer of water molecules in contact with the protein) and accounts for the effect of the protein's hydrophobic and hydrophilic residues on the interaction and structure of the hydration water [9].

By assuming that the protein interface affects the water properties in the hydration shell, we use a design strategy based on the enthalpy associated with the hydrated protein, regardless of the bulk contribution, $H^{(h)}_{R} \equiv H_{R,R} + H_{R,w} + H^{(h)}_{w,w} + P(V - V^{(b)})$, with a bias that enhances the sequence heterogeneity to avoid homopolymer design solutions [23,24,29–31]. In the above expression for $H^{(h)}_{R}$, $V$ is the total volume of the system and $V^{(b)}$ is the volume associated with the bulk water, i.e., the water not in direct contact with the protein. Since our optimization accounts for changes in the aqueous conditions, the enthalpies of each tested sequence must be averaged over the water configurations. We then envisage two possible design protocols based on measuring the average sequence enthalpies $\langle H^{(h)}_{R} \rangle$, where the angular brackets $\langle \ldots \rangle$ refer to the thermodynamic average calculated over equilibrium water configurations, including the bulk. The first strategy consists in minimizing the average enthalpy $\langle H^{(h)}_{R,f} \rangle$ of the hydrated protein in its folded (f) conformation. The second consists in maximizing the enthalpic gap $\Delta H^{(h)}$ between the f state and the unfolded (u) protein conformation, both hydrated. As a u state, we use a completely stretched structure that is a representative of all equivalent unbounded configurations. For both protocols, the enthalpy of the hydration water is averaged over equilibrium configurations, accounting for the energy, density, and entropy of the solvent at the given thermodynamic conditions and protein interface. We refer to the two protocols as MIN ENTHALPY and MAX GAP, respectively. In order to quantify the solvent contribution, we compare the results obtained with these two protocols with those of a simpler method (IMPLICIT SOLVENT) based on the minimization of the residue interactions without any water contribution.

III. RESULTS AND DISCUSSION

A. Thermal and pressure stability of proteins

We focus our attention on ten different target proteins with lengths between 30 and 90 amino acids that are compact enough that the exposed surface to the solvent is the smallest possible. For the sake of clarity, here we show only the calculations for the 30 amino acid proteins. The same results hold for longer proteins. In each protocol, we design the protein at a given temperature $T_d$ and pressure $P_d$ and test its stability in a wide range of T and P [37]. We choose $T_d$ and $P_d$ uniformly over values ranging from the liquid-gas transition to the glass-transition temperature, and from negative pressure to ≃1 GPa. For each ($T_d$, $P_d$) and each protocol, we identify 5–15 optimized sequences, for a total of more than $1.5 \times 10^{[\ldots]}$ optimizations, a number far beyond the capability of any fully atomistic protein model. For each designed sequence, we test the stability in T–P by checking, with Monte Carlo simulations, if the average equilibrium conformation coincides with the protein native-target structure. Then, we average the stability regions of all sequences optimized for a given target-native state. For all designed proteins, we find stability regions that resemble those predicted by theory [38], previous numerical works [5,7,9,39], and those observed in experiments [18,40–52] (Fig. 1). Throughout the paper, we use internal units (IU) for T and P, as defined in Methods. We find that the boundaries of the stability regions extend from low T, where proteins undergo cold denaturation, to high T, where thermal denaturation occurs. The T range of stability depends on the protein sequence and size. Small proteins undergo cold denaturation for T < 0.1 IU [53], consistent with previous results [9]. Longer proteins, with higher content of hydrophobic residues, cold-denature at higher T. All the sequences we consider are unstable above the threshold pressure P ≃ 0.7 IU (Figs. 5 and 6 in Appendix D), qualitatively consistent with what is observed in Refs. [40,54,55]. We find that denaturation occurs also at low or negative P, consistent with recent findings [9,39,56].

Experimental data for different proteins [Fig. 1(b)] show that the higher the thermal stability of the proteins, the higher the pressure-stability region [17,18]. Our simulations have the same trend. In particular, we find that the stability region is more extended for sequences designed at higher $T_d$ and intermediate $P_d$, corroborating the hypothesis that stability ranges in T and P are positively correlated: Proteins with a pronounced thermal stability are also more stable with respect to pressure. Furthermore, we observe that sequences designed at high $T_d$ are stable both at high and low T, while those designed at lower $T_d$ can fold only at low T. For example, sequences designed at $P_d = 0$ and $T_1 = 0.1$ IU are stable up to T ≃ 0.3 IU, while to get a sequence that also folds at higher T = 0.4 IU, we need to design it at $T_d = T_2$ [Fig. 1(a)]. These results are consistent with experiments comparing the stability range of mesophilic and thermophilic proteins [57] and with data revealing the higher resistance of thermophilic proteins to cold denaturation [19,20]. More generally, we find that proteins that are designed at high $T_d$ are superstable; i.e., they also show a remarkable stability at low T and at high P.

FIG. 1. T–P stability regions of designed and experimental proteins. (a) Average stability regions (closed curves) within which the protein sequences designed with the MIN ENTHALPY protocol at different $T_d$ and $P_d$ fold into the native state. The regions for proteins designed at $P_d = 0$ (red, green, and blue curves), close to ambient pressure, are enclosed one into another with an extension that is proportional to $T_d$. All of these regions enclose the stability region of proteins designed at higher $P_d$ (pink curve).
The dotted line shows the stability region for a sequence calculated with the IMPLICIT SOLVENT protocol. The "glass-transition" (solid black) line defines the temperatures below which the system does not equilibrate. The long-dashed line represents the limit of stability (spinodal) of the liquid with respect to the gas phase. Pressure and temperature are expressed in internal units (IU). (b) Experimental stability region for different proteins indicated in the legend (adapted from Refs. [40–49]). The long-dashed line is the liquid-gas transition line. The data show a clear positive correlation between the T range and P range of stability. Pressure denaturation is observed in a range of −0.1 ≲ P/GPa ≲ 0.6, while stability at higher P is reached by introducing artificial covalent bridges between the amino acids, as in the case of Zn cytochrome c [43]. In both panels, the shaded region corresponds to ambient conditions.

B. Water effect on the protein surface and core hydropathy profiles

We rationalize this fundamental property of protein evolution in terms of the relative fraction of hydrophobic and hydrophilic residues in the sequence and how it changes depending on the thermodynamic state of the solvent. We calculate the fraction $P^{\mathrm{PHI}}_{\mathrm{surf}}(T_d, P_d)$ of hydrophilic (PHI) amino acids on the protein surface (number of hydrophilic amino acids on the surface divided by the total number of surface amino acids) and the fraction $Q^{\mathrm{PHO}}_{\mathrm{core}}(T_d, P_d)$ of hydrophobic (PHO) amino acids in the protein core (number of hydrophobic amino acids divided by the total number of core amino acids) (Figs. 2 and 3). The comparison between $P^{\mathrm{PHI}}_{\mathrm{surf}}(T_d, P_d)$ and $Q^{\mathrm{PHO}}_{\mathrm{core}}(T_d, P_d)$ characterizes the level of segregation of the sequence in the folded state. By segregation, we mean the tendency of the optimized sequence to have a surface exposed to the solvent that is rich in hydrophilic amino acids and a core that is abundant in hydrophobic amino acids. For all designed proteins and for both design protocols with solvent, we find that the optimized sequences at ambient conditions ($P_d \simeq 0.7 \times 10^{-4}$ IU, $T_d \simeq 0.35$ IU) are mostly hydrophilic on the surface and hydrophobic in the core, as expected. However, upon lowering $T_d$, both protocols with solvent generate protein sequences that, in their folded state, have a lower segregation between a hydrophilic surface and a hydrophobic core (Fig. 3). These proteins with less segregation in their sequence are less stable at high T [Fig. 1(a)].

This finding is consistent with previous studies on thermophilic proteins, i.e., proteins stable at high T, showing that the higher thermostability is correlated with a stronger segregation between the polar surface [60,61] and the hydrophobic core [62,63]. In particular, a systematic analysis of the hydropathy profile of thermophilic [58], mesophilic [58] (i.e., stable at intermediate T), and ice-binding proteins (i.e., proteins that interfere with ice growth, Fig. 7) [59] reveals that these categories of proteins have, on average, compositions that are close to those that we calculate with our design strategies at high $T_d$ ($T_d \gtrsim 0.45$ IU and $P_d \simeq 0$), ambient $T_d$ ($T_d = 0.35$ IU and $P_d \simeq 0$), and low $T_d$ ($T_d \lesssim 0.15$ IU and $P_d \simeq 0$). Indeed, the data [58] (Fig. 3) show that ≃78.8% of the surface of thermophilic proteins is hydrophilic and ≃54.5% of their core is hydrophobic, while for mesophilic proteins, the percentages are ≃78.4% and ≃54.7%, respectively. Moreover, our direct analysis of ice-binding proteins [59] (Fig. 7) shows that ≃67.0% of their surface is hydrophilic and ≃48.5% of their core is hydrophobic. Hypothesizing that the protein's composition is optimized at the environmental conditions at which the protein works, we observe that the real proteins have a segregation between hydrophilic surface and hydrophobic core that follows the same qualitative trend as our prediction (Figs. 3 and 8). Hence, both our theory and the real data show that the higher the temperature of stability of the protein, the stronger the segregation of the amino acid sequence into the hydrophilic surface and hydrophobic core, although an extreme segregation does not imply the stability of the folded state at higher T, as shown by the sequences generated with the implicit solvent design [Figs. 1(a) and 3].

FIG. 2. Hydropathy profile of the designed protein surface and core. The color-coded contour plots of the fraction $P^{\mathrm{PHI}}_{\mathrm{surf}}(T_d, P_d)$ of surface hydrophilic amino acid [panels (a) and (b)] and the fraction $Q^{\mathrm{PHO}}_{\mathrm{core}}(T_d, P_d)$ of core hydrophobic amino acid [panels (c) and (d)] as a function of the design temperature $T_d$ and pressure $P_d$, calculated with both design protocols, MIN ENTHALPY [panels (a) and (c)] and MAX GAP [panels (b) and (d)], show a stronger segregation between the hydrophilic surface and the hydrophobic core for proteins designed at high $T_d$ and low or intermediate $P_d$. The straight (green) line in the lower-right corner of each panel marks the liquid-gas spinodal line of the solvent. $T_d$ and $P_d$ are expressed in internal units.

FIG. 3. Hydrophilic profile of a protein surface designed at ambient pressure. The fraction $P^{\mathrm{PHI}}_{\mathrm{surf}}(T_d, P_d)$ of surface hydrophilic amino acid decreases upon lowering the design temperature $T_d$, for both the design protocols MIN ENTHALPY (black circles) and MAX GAP (diamonds). Data are from Figs. 2(a) and 2(b) at $P_d = 0$. Our results qualitatively follow the trend observed for the average hydropathy of real thermophilic [58] (large red circle), mesophilic [58] (large green circle), and ice-binding proteins [59] (large blue circle), plotted at high T (T = 0.45 IU), intermediate T (T = 0.3 IU), and low T (T = 0.15 IU), respectively, to ease the comparison. Such values of T resemble the average working temperature of real thermophilic, mesophilic, and ice-binding proteins. The hydropathy profile (independent of $T_d$) of sequences designed with implicit solvent (dashed line) is unable to reproduce the observed trend.

To get insight into the role of water in the evolutionary process of sequence segregation, we perform a detailed comparison of the results of the two design protocols MIN ENTHALPY and MAX GAP. We find that they select similar sequences at high $T_d$ but not at low $T_d$. In particular, at high $T_d$ and intermediate $P_d$, e.g., $T_d = 0.45$ IU and $P_d = 0.2$ IU, both protocols select sequences with strong segregation between the hydrophilic surface and the hydrophobic core (Fig. 2). Instead, at low $T_d$, e.g., $T_d = 0.1$ IU and $P_d = 0.2$ IU, the MIN ENTHALPY protocol selects sequences with a large number of surface hydrophobic residues with minor changes in the core with respect to the high-T case [Figs. 2(a) and 2(c)], while the MAX GAP protocol leads to proteins with a large number of hydrophilic residues in the core without major changes on the surface with respect to the high-T case [Figs. 2(b) and 2(d)].

These differences in the selection are a consequence of the fact that at low T the hydrophobic amino acids are more soluble [3,9,64,65] because the energetic gain of forming water-water HBs at the hydrophobic interface compensates for the loss of water-hydrophilic residue HBs [9]. Hence, on the one hand, the first protocol minimizes the enthalpy of the f state by increasing the number of exposed hydrophobic residues.

On the other hand, the increased solubility at low T of a largely hydrophobic core would decrease the enthalpy of the u state and, as a consequence, the relative enthalpic gain of the f state. Hence, to maximize this gain, the second protocol selects sequences with a less hydrophobic core with respect to the high-T case and without changes of the surface composition, i.e., of the enthalpy of the f state.

Remarkably, both categories of sequences (those with less hydrophilic surface and those with less hydrophobic core) have similar stability at low T [Figs. 1(a) and 9(a)]. Hence, at low T, the variety of folding (less segregated) sequences is larger than at high T. Furthermore, we observe that proteins designed at low $T_d$, e.g., at ($T_d = 0.2$ IU, $P_d = 0$), have a stability range that includes ambient conditions [Fig. 1(a)]. Hence, the lack of sequence segregation found in naturally evolved proteins [66,67] is a way to maximize the number of folding sequences at ambient conditions, taking advantage of the T–P-dependent water properties.

This observation allows us to reconsider the usual understanding of the high number of hydrophobic residues on the surface of natural proteins as a necessary compromise between solubility and functionality [66,67]. To clarify this point, we apply the IMPLICIT SOLVENT protocol, which uses implicit water without accounting for the water contribution to the enthalpy; we find that it generates only sequences that are highly segregated (with ≃90% hydrophilic residues on the surface and ≃55% hydrophobic residues in the core) and that are unstable at extreme conditions, as are those designed at ambient T and P [Fig. 1(a)] [68]. Therefore, our results suggest that at extreme T and P, proteins are more stable if not completely segregated in their sequence, and this is the result of an adaptation process in water since water is a solvent that changes properties with temperature and pressure.

C. High-pressure design and stability

We now focus on the role of water in selecting sequences that fold at high P. While proteins designed at high $T_d$ are stable in their f state in a range of $T > T_d$, we find that those generated by the protocols at high $P_d$ fold only at $P < P_d$ [Fig. 1(a)]. Furthermore, for P ≳ 0.7 IU, no protein folds.

These findings are surprising because, although the design protocols at high $P_d$ select for proteins that are more homogeneous than those generated at lower $P_d$, with surfaces less hydrophilic and cores less hydrophobic (Fig. 2), the generated sequences are such that the hydrated protein enthalpy of the f state is much less than that of the u state (Fig. 4, inset). Therefore, the larger instability of the folded protein is due to all the terms in the Gibbs free energy that are not explicitly included in our calculation of the hydrated protein enthalpy, i.e., to the bulk-water contributions.

To clearly show this result, we calculate, as a function of P and T, the denaturation free-energy gain $\Delta_G \equiv G^f - G^u$, where $G^f$ and $G^u$ are the free energy for the entire system (protein, hydration water, and bulk water) calculated for the f state and for a completely stretched conformation representative of the u state, respectively. This quantity is given by the difference $\Delta_G = \Delta_H - \Delta_{TS}$ between the denaturation enthalpy gain $\Delta_H \equiv H^f - H^u$ and the folding entropy surplus $\Delta_{TS} \equiv T(S^f - S^u)$, where $H^\alpha$ ($S^\alpha$) is the enthalpy (entropy) of the $\alpha$ state with $\alpha = f, u$. A positive value of $\Delta_H$ favors the denaturation, while upon protein folding, there is an entropy surplus, $\Delta_{TS} > 0$, due to the larger bulk-water contribution. Hence, the balance $\Delta_H - \Delta_{TS}$ regulates the protein stability, favoring the u state when $\Delta_G > 0$ and the f state when $\Delta_G < 0$.

FIG. 4. Isothermal free energy, enthalpy, and entropy variation. Inset: Color-coded (in IU) maximum value of $\langle \Delta H^{(h)} \rangle$ found for the sequences designed with the MAX GAP protocol. Main panel: The variations of the denaturation free-energy gain $\langle \Delta_G(P) - \Delta_G(0) \rangle$, of the denaturation enthalpy gain $\langle \Delta_H(P) - \Delta_H(0) \rangle$, and of the folding entropy surplus $\langle \Delta_{TS}(P) - \Delta_{TS}(0) \rangle$ increase with P along the isotherm T = 0.15 IU for a protein designed at $P_d = 0.8$ IU and $T_d = 0.2$ IU. We calculate $\langle \Delta_H \rangle$ as thermodynamic average over equilibrium configurations, $\langle \Delta_G(P) - \Delta_G(0) \rangle = \int^{P} \langle V^{f}(P) - V(0) \rangle \, dP$, and $\langle \Delta_{TS}(P) - \Delta_{TS}(0) \rangle = \langle \Delta_H(P) - \Delta_H(0) \rangle - \langle \Delta_G(P) - \Delta_G(0) \rangle$.

We find that for proteins designed at high $P_d$ and low $T_d$, e.g., $P_d = 0.8$ IU and $T_d = 0.2$ IU (Fig. 4), both $\Delta_H$ and $\Delta_{TS}$ become larger for increasing P at constant T, with the denaturation enthalpy gain increasing faster than the folding entropy surplus, inducing the unfolding of the protein. The increase of $\Delta_{TS}$ is expected because, for increasing P, the number of HBs decreases, making the entropy difference between the f and the u state larger when the water-protein interface decreases upon folding. On the other hand, our results (Fig. 4, inset) suggest that the protein and hydration-water contributions to $\Delta_H$ are negative for the considered range of P and T and are slowly changing with P at T = 0.15 IU. Therefore, the increase of $\Delta_H$ implies that the contribution coming from bulk water is increasingly larger for increasing P. We conclude that the proteins designed at high P are unstable in their native state under pressurization because of the bulk-water contribution to the total free energy G of the system. This contribution completely dominates G for those pressures for which the number of HBs is vanishing, in our case at P > 0.6 IU [69,70].

Hence, according to our simulations, both the cold- and pressure-denaturation boundaries are due to the free-energy contribution of the hydration and bulk water [71]. This limit is a consequence of the specific features of the HB, representing a natural barrier above which folding is not expected. More importantly, such boundaries are also barriers for evolution and can be crossed only by introducing artificial interactions between the amino acids to stabilize the protein native state (Fig. 1) [72].

IV. CONCLUSIONS

We present a computationally efficient model capable of describing the artificial evolution of protein stability regions as a function of the thermodynamic properties of water. With our method, we study a large number of scenarios and demonstrate that the resulting stability regions are qualitatively similar to those of natural proteins.

Our results elucidate the role that water has in the selection process of protein sequences. In particular, we show that the maximum denaturation pressure ≃1 GPa above which all proteins denature in experiments is a consequence of the specific features of the hydration and bulk-water hydrogen bonds.

We adopt design protocols that mimic the natural selection of proteins and find that the proteins selected at high T are superstable (Fig. 1), i.e., are stable in a wide range of T and P, while those selected at lower T are stable only in a reduced range of T and P. We observe that proteins designed to be stable at different conditions of T and P are characterized by sequences with different degrees of segregation between the hydrophilic surface and the hydrophobic core (Figs. 2 and 3). The optimal degree of segregation is selected spontaneously as a consequence of the free-energy balance of the protein aqueous solution without the necessity of an external active process of selection. In particular, we find that the segregation decreases by decreasing $T_d$, and at ambient conditions, it is moderate (≃70% of the surface is hydrophilic and ≃50% of the core is hydrophobic). This result is consistent with a trend observed in the composition of thermophilic, mesophilic, and ice-binding proteins (Fig. 3). The broader stability of high-T designed proteins implies that these segregated sequences are a subset of those designed at lower $T_d$.

Furthermore, the general observation that many proteins expose to the solvent a high percentage of hydrophobic residues [66,67], as predicted by our model, suggests that such an exposure is not a compromise between stability and biological functionality of the proteins but rather a natural consequence of the water properties. As a matter of fact, selecting artificial sequences with an extreme segregation does not increase the stability of proteins but rather reduces it. We believe that our findings could potentially improve the engineering of artificial biopolymers since the aggregation can be prevented or enhanced in different thermodynamic conditions, according to the hydrophilic-hydrophobic ratio of the protein surface, although an experimental validation of the proposed design strategies remains crucial.

Putting the hydrophobic effect in an evolutionary perspective, our results substantiate an intriguing hypothesis: Many features observed in natural proteins generally arise when the selection process explicitly takes into account the thermodynamic properties of the solvent.

ACKNOWLEDGMENTS

We are thankful to L. Rovigatti, E. Locatelli, and M. Goethe for helpful discussions and suggestions. V. B. and I. C. acknowledge support from the Austrian Science Fund (FWF), Grant No. P 26253-N27. V. B. acknowledges support from the FWF, Grant No. M 2150-N36. G. F. acknowledges support from Spanish MINECO Grants No. FIS2012-31025 and No. FIS2015-66879-C2-2-P. The computational results presented here have been achieved using the Vienna Scientific Cluster (VSC).

APPENDIX A: PROTEIN-WATER MODEL

At any P and T, we partition the total volume $V(P,T)$ of our system into $N$ regular, nonoverlapping cells of volume $v \equiv V/N \geq v_0$, where $v_0$ is the water-excluded volume, each occupied by one of the $N_C$ protein residues or by water molecules (Fig. 10). We adopt a coarse-grained representation of the protein that follows previous works [29–31] and has been extensively used in the literature to get a qualitative understanding of protein properties (e.g., Refs. [4,5,9,73,74]) and protein design [21,22,75–78]. For the solvent, we consider a coarse-grained model capable of reproducing, at least in a qualitative way, the properties of water [8,32–34,36,79] and its contribution to the protein folding [9,35].

The Hamiltonian for the entire system is given by

$$H \equiv H_{R,R} + H_{R,w} + H_{w,w}, \qquad \text{(A1)}$$

where each term is defined in the following. As in Refs. [29–31], we assume that the protein's residues have only nearest-neighbor interactions,

$$H_p \equiv H_{R,R} + H_{R,w} \equiv \sum_{i}^{N_C} \left[ \sum_{j \neq i}^{N_C} C_{ij} S_{ij} + \sum_{j'}^{N_W} C_{ij'} S^{W}_{i} \right], \qquad \text{(A2)}$$

where the indices $i$ and $j$ run over the residues and $j'$ runs over the $N_W \leq N - N_C$ water molecules in the hydration shell [80]; $C$ is a contact matrix, with $C_{l,k} = 1$ if $l$ and $k$ are first neighbors, and 0 otherwise; $S_{ij}$ are elements of the Miyazawa-Jernigan residue-residue interaction matrix $S$ [29,81,82] (solvent independent, see Table 1 in Appendix C) accounting for the correlations between real amino acids; and for the protein-water interaction, $S^{W}_{i} = \varepsilon^{\mathrm{PHO}}$ if the residue $i$ is hydrophobic and $S^{W}_{i} = \varepsilon^{\mathrm{PHI}}$ if it is hydrophilic [9].

Each cell accommodates, at most, one molecule, with the average O–O distance between next-neighbor water molecules given by $r = v^{1/3}$. We associate a variable $n_i = 1$ with each cell if the cell $i$ is occupied by a water molecule and has $v_0/v > 0.5$, and $n_i = 0$ otherwise. Hence, $n_i$ is a discretized density field replacing the water translational degrees of freedom.

The Hamiltonian of bulk water is

$$H^{(b)}_{w,w} \equiv \sum_{ij} U(r_{ij}) - J N^{(b)}_{\mathrm{HB}} - J_\sigma N^{(b)}_{\mathrm{coop}}. \qquad \text{(A3)}$$

The first term, with $U(r) \equiv \infty$ for $r < r_0 \equiv v_0^{1/3} = 2.9$ Å (water van der Waals diameter), $U(r) \equiv 4\epsilon[(r_0/r)^{12} - (r_0/r)^{6}]$ for $r \geq r_0$, with $\epsilon \equiv 2.9$ kJ/mol, and $U(r) \equiv 0$ for $r > r_c \equiv 3 r_0$ (cutoff), accounts for the O–O van der Waals interaction between molecules $i$ and $j$. The sum runs over all possible water-molecule couples.

The second term represents the directional component of the HB interaction, where $N^{(b)}_{\mathrm{HB}} \equiv \sum_{\langle ij \rangle} n_i n_j \delta_{\sigma_{ij},\sigma_{ji}}$ is the total number of bulk HBs. The sum is over each nearest-neighbor pair, and the argument is nonzero if the following conditions are satisfied: (i) $n_i n_j = 1$, i.e., $r_{ij} \leq 2^{1/3} r_0 = 3.6$ Å, the maximum distance that is conventionally associated with a HB (while $n_i n_j = 0$ otherwise), and (ii) $\delta_{\sigma_{ij},\sigma_{ji}} = 1$, with $\delta_{ab} = 1$ if $a = b$, and 0 otherwise. Here, $\sigma_{ij} = 1, \ldots, q$ is a bonding index. Conventionally, a HB is broken if the $\widehat{\mathrm{OOH}}$ angle exceeds 30°, implying that only 1/6 of the entire range of values [0, 360°] for the $\widehat{\mathrm{OOH}}$ angle is associated with a bonded state. Thus, choosing $q = 6$, we correctly account for the entropy variation due to HB formation and breaking. With this definition, each molecule can form up to four HBs with its neighbors. Bifurcated HBs are excluded.

The third term accounts for the HB cooperativity due to the quantum many-body interaction [83,84] and leads to the low-P tetrahedral structure [85]. The cooperativity is defined as the sum $N^{(b)}_{\mathrm{coop}} \equiv \sum_i n_i \sum_{(l,k)_i} \delta_{\sigma_{ik},\sigma_{il}}$, over all the water molecules $i$ and over all the $lk$ pairs of the bonding indices $\sigma_{il}$ and $\sigma_{ik}$ of the molecule $i$. We choose $J_\sigma \ll J$ to guarantee an asymmetry between the two HB terms.

The formation of a HB in the bulk leads to the local volume increase $v^{(b)}_{\mathrm{HB}}/v_0$, with an enthalpic variation $-J + P v^{(b)}_{\mathrm{HB}}$, which accounts for the P-disrupting effect on the HB network. Here, $v^{(b)}_{\mathrm{HB}}/v_0$ represents the average volume increase between high-density ices VI and VIII and low-density (tetrahedral) ice Ih, and it is chosen as an approximation of the average volume variation per HB when a tetrahedral HB network is formed. Hence, the volume of bulk molecules is $V^{(b)} = N v + N^{(b)}_{\mathrm{HB}} v^{(b)}_{\mathrm{HB}}$.

The presence of the hydrophobic or hydrophilic protein interface affects the water-water hydrogen bonding in the hydration shell [7,86–94] (water molecules that are first neighbors of the protein amino acids). Hence, the Hamiltonian for water, including hydration molecules and the many-body effect on HB formation close to the protein interface, reads

$$H_{w,w} \equiv H^{(b)}_{w,w} + H^{(h)}_{w,w}, \qquad \text{(A4)}$$

where

$$H^{(h)}_{w,w} \equiv -\left[ J^{\mathrm{PHO}} N^{\mathrm{PHO}}_{\mathrm{HB}} + J^{\mathrm{PHI}} N^{\mathrm{PHI}}_{\mathrm{HB}} \right] - \left[ J^{\mathrm{PHO}}_{\sigma} N^{\mathrm{PHO}}_{\mathrm{coop}} + J^{\mathrm{PHI}}_{\sigma} N^{\mathrm{PHI}}_{\mathrm{coop}} \right]. \qquad \text{(A5)}$$

Within the square brackets of Eq. (A5), we have explicitly indicated the contribution from the water molecules at the protein interface (h) neighboring hydrophobic or hydrophilic residues.

The hydrophobic interface strengthens the water-water hydrogen bonding in the first hydration shell [7,86–90] and increases the local water density upon pressurization [90,95–97]. Therefore, we assume $J^{\mathrm{PHO}} > J$ for HBs between water molecules at the hydrophobic interface and $J^{\mathrm{PHO}}_{\sigma} > J_\sigma$ for the cooperative component. This condition guarantees that the solvation free energy of a hydrophobic amino acid decreases at low T [65]. Following Ref. [9], we express the average volume change per water-water HB at the hydrophobic interface as a series expansion in P up to the linear term, $v^{\mathrm{PHO}}_{\mathrm{HB}}/v^{\mathrm{PHO}}_{\mathrm{HB},0} \equiv 1 - k_1 P$ [98], where $v^{\mathrm{PHO}}_{\mathrm{HB},0}$ is the volume change associated with the HB formation in the hydrophobic hydration shell at P = 0 and $k_1 > 0$. Therefore, the volume contribution $V^{\mathrm{PHO}}$ to the total volume $V$ due to the HBs in the hydrophobic shell is $V^{\mathrm{PHO}} \equiv N^{\mathrm{PHO}}_{\mathrm{HB}} v^{\mathrm{PHO}}_{\mathrm{HB}}$, where $V^{\mathrm{PHO}}$ and $N^{\mathrm{PHO}}_{\mathrm{HB}}$ are the hydrophobic hydration shell volume and number of HBs, respectively. We assume that the water-water hydrogen bonding and the water density at the hydrophilic interface are not affected by the protein, i.e., they are equal to the bulk-water values. Therefore, $J^{\mathrm{PHI}} = J$, $J^{\mathrm{PHI}}_{\sigma} = J_\sigma$, and $v^{\mathrm{PHI}}_{\mathrm{HB}} = v^{(b)}_{\mathrm{HB}}$. Moreover, the polarization effect of hydrophilic residues on the HB network in the hydrophilic shell is not included here because, in our coarse-grained description, it does not show a qualitative change in protein denaturation mechanisms [9], consistent with previous observations [99]. A HB between two water molecules, one hydrating a PHO amino acid and the other a PHI amino acid, is formed with a coupling constant $(J^{\mathrm{PHO}} + J^{\mathrm{PHI}})/2$, and it leads to a local increase of volume $(v^{\mathrm{PHO}}_{\mathrm{HB}} + v^{\mathrm{PHI}}_{\mathrm{HB}})/2$. Hence, the total volume is

$$V \equiv V^{(b)} + V^{(h)}, \qquad \text{(A6)}$$

where $V^{(h)} \equiv V^{\mathrm{PHO}} + V^{\mathrm{PHI}}$ is the volume due to the HBs in the hydration shell.

In order to favor the visualization and the understanding of our results, we adopt a model representation in two dimensions [4,5,7,75–78,100]. We choose the parameters in Eq. (A1) in such a way as to get proteins that are stable for 250 ≲ T/K ≲ 350 and P < 1 GPa, consistent with experimental observations [40–51,54,55,101–103]. We express all the quantities in IU: adopting $8\epsilon$ as the energy unit, $v_0$ as the volume unit, $8\epsilon/k_B$ as the temperature unit, and $8\epsilon/v_0$ as the pressure unit. Accordingly, we fix the interaction constants (all expressed in IU) $J = 0.3$ and $J_\sigma = 0.05$ for bulk water, $J^{\mathrm{PHI}} = J$ and $J^{\mathrm{PHI}}_{\sigma} = J_\sigma$ for water at hydrophilic interfaces, and $J^{\mathrm{PHO}} = 1.2$ and $J^{\mathrm{PHO}}_{\sigma} = 0.2$ for water at hydrophobic interfaces. Moreover, we fix $k_1 = 4$ and $v^{\mathrm{PHO}}_{\mathrm{HB},0} = 2$ (in IU). These choices have two effects: (i) They balance the residue-

homopolymer solutions. We set T ≪ T, the design temperature, because we look for sequences with either minimum or maximum values of the enthalpy. A Monte Carlo step consists in an attempt to modify the protein sequence followed by a number of water moves that is large enough to equilibrate the solvent around the fixed protein.
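The free-energy bookkeeping used in this section is simple enough to state as code. The sketch below only encodes the relation between the denaturation free-energy gain, the denaturation enthalpy gain, and the folding entropy surplus, together with the sign convention for the favored state. Function names and the numerical values in the usage lines are illustrative assumptions, not simulation results.

```python
def denaturation_free_energy_gain(delta_h: float, delta_ts: float) -> float:
    """Delta_G = Delta_H - Delta_TS, with Delta_X defined as X(folded) - X(unfolded)."""
    return delta_h - delta_ts


def favored_state(delta_g: float) -> str:
    """Delta_G > 0 favors the unfolded (u) state, Delta_G < 0 the folded (f) state."""
    if delta_g > 0:
        return "unfolded"
    if delta_g < 0:
        return "folded"
    return "marginal"


def entropy_surplus_change(d_delta_h: float, d_delta_g: float) -> float:
    """Change of Delta_TS along an isotherm, obtained from the changes of
    Delta_H and Delta_G between pressure P and P = 0."""
    return d_delta_h - d_delta_g


# Illustrative numbers in internal units (not data from the paper).
dg = denaturation_free_energy_gain(delta_h=-0.5, delta_ts=0.25)
state = favored_state(dg)
```

With these made-up inputs the folded enthalpy is lower and the entropy surplus is positive, so both terms push the free-energy gain below zero and the folded state is favored.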
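As an illustration of the first two terms of the bulk-water Hamiltonian, the sketch below implements the truncated pair potential with hard core and cutoff, and the hydrogen-bond count built from the occupancy variables and the bonding indices. The data layout (a list of occupancies, a dictionary of bonding indices keyed by ordered pairs, an explicit list of nearest-neighbor pairs) is an assumption made for this example and is not the authors' implementation.

```python
import math
from typing import Dict, Iterable, Sequence, Tuple

EPSILON = 2.9      # kJ/mol, energy scale quoted in the text
R0 = 2.9           # Angstrom, water van der Waals diameter quoted in the text
R_CUT = 3.0 * R0   # cutoff distance


def pair_potential(r: float) -> float:
    """O-O van der Waals term: hard core below R0, Lennard-Jones form up to
    the cutoff, zero beyond it."""
    if r < R0:
        return math.inf
    if r > R_CUT:
        return 0.0
    x6 = (R0 / r) ** 6
    return 4.0 * EPSILON * (x6 * x6 - x6)


def count_hydrogen_bonds(
    n: Sequence[int],
    sigma: Dict[Tuple[int, int], int],
    pairs: Iterable[Tuple[int, int]],
) -> int:
    """Count bonds over nearest-neighbor pairs (i, j): a bond exists when both
    cells have n = 1 and the facing bonding indices coincide,
    sigma[(i, j)] == sigma[(j, i)]."""
    total = 0
    for i, j in pairs:
        if n[i] * n[j] == 1 and sigma[(i, j)] == sigma[(j, i)]:
            total += 1
    return total


# Three cells in a row; the third cell has n = 0, so only the (0, 1) pair,
# whose facing indices match, contributes one bond.
n_example = [1, 1, 0]
sigma_example = {(0, 1): 2, (1, 0): 2, (1, 2): 3, (2, 1): 3}
n_hb = count_hydrogen_bonds(n_example, sigma_example, [(0, 1), (1, 2)])
```

The potential vanishes at the hard-core distance and reaches its minimum, equal to minus the energy scale, at 2^(1/6) times that distance, which is the usual Lennard-Jones behavior.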
For the MAX GAP design protocol, we also residue, residue-water, and water-water interactions and perform a simulation with a completely stretched protein, make the proteins stable for thermodynamic conditions whose sequence is identical to the folded protein, in order comprised in the (stable and metastable) liquid phase, to calculate the enthalpy difference between the two including ambient conditions; (ii) they account for the lower conformations (native and unfolded). For both the water- surface volume ratio in the two-dimensional system with dependent design protocols, the protein enthalpy that we respect to a three-dimensional one, enhancing the interface ðhÞ ðhÞ ðhÞ interactions. Changes in parameters combine in a nontrivial consider is hH i ≡ hH iþhH iþ PhV i. w;w way, resulting in a shift, broadening, or reduction of the We perform the design for a large number of thermo- stability regions of proteins, but leaving the qualitative dynamic state points, sampling ½3; 5 × 10 independent scenario unaffected [104]. Lastly, since our preliminary sequences for each T, P and protein target-native structure. results on the three-dimensional many-body water model Each water-dependent design is performed on a grid of [105] show a phase diagram qualitatively similar to the ≃90 different T–P points. Implicit water design is per- ðhÞ one in two dimensions [36], we expect that our findings formed once per structure. We sample the averages hH i will remain substantially unaltered in a more realistic three- ðhÞ and hΔH i for each sequence over ≃10 water configu- dimensional version of the model. rations. We sort, in ascending order, the sequences accord- ðhÞ ðhÞ ing to their values of hH i, −hΔH i, and H , and R;R APPENDIX B: DESIGN AND R R consider, for characterization and stability analysis, only FOLDING SIMULATIONS the top 5; 15 sequences from each list. Overall, we perform The concept of protein design refers to an optimization more than 1500 independent designs. 
scheme for the sequence of amino acids, aiming to We test the validity of the design by folding the protein maximize the probability of folding into a specific target with Monte Carlo simulations at constant values of P, T, N , conformation. The design simulation consists in a broad and N. We start from a stretched protein conformation and, sampling of the space of sequences, on top of a fixed keeping the amino acid sequence, allow the protein to move protein structure that defines the native-target conforma- using local corner flip, pivot, and crankshaft algorithms tions (Fig. 10). We perform Monte Carlo simulations, [106]. Water is equilibrated using cluster moves [36].For keeping P, T, N , and N constant, with point mutation each sequence and each state point, we calculate the free of the sequence and residue swapping moves with the energy as a function of the number of native contacts. A following acceptance probability: protein is defined to be stable if, at the free-energy minimum, 90% of its native contacts are formed. This definition min −Δ=T n o ϵ o p P ≡ minf1;e g min f1; ðN =N Þ g; ðB1Þ acc P P guarantees that a stable protein folds in its native state. Computing the free energy on a grid of values of T and P where Δ is the difference between the new and the old yields the stability region in the T–P plane. By averaging the ðhÞ ðhÞ configurations in hH i or hΔH i or H , depending on R;R stability region over all sequences designed at the same T R R the design protocol; T ¼ 0.05 IU is the optimization and P , we calculate the average stability curve. n o temperature, N and N are the number of permutations P P for the new (n) and old (o) amino acid sequences, APPENDIX C: RESIDUE-RESIDUE respectively; and ϵ ¼ 14 is a weighting parameter. The INTERACTION MATRIX S n o ϵ term ðN =N Þ is added to bias towards highly hetero- P P In Table I we report the interaction matrix between the geneous sequences, which are better folders [24,30]. amino acids. 
This matrix does not include solvent contri- Therefore, we minimize the enthalpy of the folded structure butions, since these are explicitly stated in the Hamiltonians via a Monte Carlo scheme with separate acceptance criteria ðhÞ for the water moves and the sequence moves. While the H and H , and has been scaled by a factor 2 to balance w;w R;w water is simulated at T and P , the sequences are sampled with the water-water HB interaction. Amino acids are d d at low optimization temperature T . In Fig. 8, we show that indicated with letters following the FASTA code. The amino tuning ϵ does not significantly affect the hydropathy acids I, V, L, F, C, M, and A are assumed to be hydrophobic, profile of proteins, although a low value of ϵ generates according to the Kyte-Doolittle hydropathy scale [82]. 021047-8 ROLE OF WATER IN THE SELECTION OF STABLE … PHYS. REV. X 7, 021047 (2017) TABLE I. Residue-residue interaction energy expressed in internal units. ACD E F G H I K L M N P Q R S T V W Y A −0.26 0 0.24 0.52 0.06 −0.14 0.68 −0.44 0.28 −0.02 0.5 0.56 0.2 0.16 0.86 −0.12 −0.18 −0.2 −0.18 0.18 C −2.12 0.06 1.38 −0.46 −0.16 −0.38 0.32 1.42 −0.16 0.38 0.26 0 0.1 0.48 −0.04 0.38 0.12 0.16 0.08 D 0.08 −0.3 0.78 −0.44 −0.78 1.18 −1.52 1.34 1.3 −0.6 0.08 −0.34 −1.44 −0.62 −0.58 1.16 0.48 0 E −0.06 0.54 0.5 −0.9 0.7 −1.94 0.86 0.88 −0.64 −0.2 −0.34 −1.48 −0.52 0 0.68 0.58 −0.2 F −0.88 −0.76 −0.32 −0.38 0.88 −0.6 −0.84 0.36 0.4 −0.58 0.82 0.58 0.62 −0.44 −0.32 0 G −0.76 0.4 0.5 0.22 0.46 0.38 −0.28 −0.22 −0.12 −0.08 −0.32 −0.52 0.32 0.36 0.28 H −0.58 0.98 0.44 0.32 1.98 −0.48 −0.42 −0.04 −0.24 −0.1 −0.38 0.38 −0.24 −0.68 I −0.44 0.72 −0.82 −0.56 1.06 0.5 0.72 0.84 0.42 0.28 −0.5 0.04 0.22 K 0.5 0.38 0 −0.66 0.22 −0.76 1.5 −0.26 −0.18 0.88 0.44 −0.42 L −0.54 −0.4 0.6 0.84 0.52 0.7 0.5 0.4 −0.58 −0.18 0.48 M 0.08 0.16 −0.68 0.92 0.62 0.28 0.38 −0.28 −1.34 −0.26 N −1.06 −0.36 −0.5 −0.28 −0.28 −0.22 1 0.12 −0.4 P 0.52 −0.84 −0.76 0.02 −0.14 0.18 −0.56 −0.66 Q 0.58 −1.04 −0.28 −0.28 
0.48 0.16 −0.4 R 0.22 0.34 −0.7 0.6 −0.32 −0.5 S −0.4 −0.16 0.36 0.68 0.18 T 0.06 0.5 0.44 0.26 V −0.58 −0.14 0.04 W −0.24 −0.08 Y −0.12 APPENDIX D: ADDITIONAL FIGURES In this Appendix we report supplementary figures. Temperature [IU] FIG. 5. Average stability regions for proteins designed at constant pressure according to the MIN ENTHALPY protocol. The proteins are designed along the isobar P ¼ 0.2 IU. We observe how designed sequences at high T are more resistant to thermal fluctuations. d d Percentage of compactness 1 100 0.8 0.6 0.4 0.2 0.1 0.2 0.3 0.4 0.5 Temperature [IU] FIG. 6. Average compactness of the designed proteins. The compactness is calculated as the average number of residue-residue contact points, regardless of if they correspond to the native contacts, as a function of T and P. It is expressed as a (color-coded) percentage over the maximum possible number of contacts and averaged over proteins designed at different values of P and T . The straight (green) line on the d d bottom represents the liquid-gas spinodal line. 021047-9 Pressure [IU] Pressure [IU] BIANCO, FRANZESE, DELLAGO, and COLUZZA PHYS. REV. X 7, 021047 (2017) 0.6 0.5 0.4 0.3 Antifreeze & Ice-nucleating proteins 0.2 0.5 0.6 0.7 0.8 Hydrophilicity of the protein surface FIG. 7. Hydropathy of the protein surface and the protein core of the ice-binding proteins. (a) 0.7 0.6 ε =18 ε =14 ε =10 ε =7 0.5 0.1 0.2 0.3 0.4 0.5 Design temperature T [IU] (b) 0.5 0.4 0.3 0.1 0.2 0.3 0.4 0.5 Design temperature T [IU] FIG. 8. Hydropathy profile of proteins designed at ambient pressure for different values of ϵ . Here, we report how the hydrophilic profile of the protein surface and the hydrophobic profile of the protein core, respectively shown in panels (a) and (b), changes as a function of T ,at ambient pressure, accordingtothe parameter ϵ , appearingin Eq. (B1). 
The data are calculated following the MIN ENTHALPY protocol. In all cases, the segregation between the hydrophilic surface and the hydrophobic core decreases with $T_d$. (Vertical axes: hydrophilicity of the protein surface in panel (a), hydrophobicity of the protein core in panel (b).)

FIG. 9. Average stability regions for proteins designed according to the MAX GAP protocol. Average stability regions for proteins designed along the isobar $P_d = 0$ and the isotherm $T_d = 0.2$ IU, respectively, shown in panels (a) and (b). Arrows point at the design pressure $P_d = 0$ IU and to the design temperature $T_d = 0.2$ IU. (Axes: temperature [IU], pressure [IU]; legend of panel (b): sequences designed at $T_d = 0.2$ and $P_d = -0.1$, 0.5, 0.8.)

FIG. 10. Schematic representation of the water-protein model. The position of the water molecules is coarse-grained, assigning a cell to each water molecule. The cell size coincides with the average distance between first-neighbor molecules and fluctuates according to the Boltzmann weight of the isobaric-isothermal ensemble. The conformational state of water molecule $i$ is described via four bonding indices $\sigma_{ij}$, each one accounting for the bonding conformation of molecule $i$ with respect to the neighbor in the direction $j$. Water molecules around the protein form the hydration shell (dark blue cells). The protein is modeled as a self-avoiding lattice chain, composed of 20 different amino acids. Each amino acid can be hydrophobic or hydrophilic, according to the Kyte-Doolittle hydropathy scale [82]. Red and green amino acids in the figure refer to core and surface elements, respectively.

[1] K. E. Zachariassen and E. Kristiansen, Ice Nucleation and Antinucleation in Nature, Cryobiology 41, 257 (2000).
[2] L. J. Rothschild and R. L. Mancinelli, Life in Extreme Environments, Nature (London) 409, 1092 (2001).
[3] G. Salvi, S. Mölbert, and P. De Los Rios, Design of Lattice Proteins with Explicit Solvent, Phys. Rev. E 66, 061911 (2002).
[4] M. I. Marqués, J. M. Borreguero, H. E. Stanley, and N. V. Dokholyan, Possible Mechanism for Cold Denaturation of Proteins at High Pressure, Phys. Rev. Lett. 91, 138103 (2003).
[5] B. A. Patel, P. G. Debenedetti, F. H. Stillinger, and P. J. Rossky, A Water-Explicit Lattice Model of Heat-, Cold-, and Pressure-Induced Protein Unfolding, Biophys. J. 93, 4116 (2007).
[6] I. N. Berezovsky, K. B. Zeldovich, and E. I. Shakhnovich, Positive and Negative Design in Stability and Thermal Adaptation of Natural Proteins, PLoS Biol. 3, e52 (2007).
[7] C. L. Dias, T. Ala-Nissila, M. Karttunen, I. Vattulainen, and M. Grant, Microscopic Mechanism for Cold Denaturation, Phys. Rev. Lett. 100, 118101 (2008).
[8] G. Franzese and V. Bianco, Water at Biological and Inorganic Interfaces, Food Biophys. 8, 153 (2013).
[9] V. Bianco and G. Franzese, Contribution of Water to Pressure and Cold Denaturation of Proteins, Phys. Rev. Lett. 115, 108101 (2015).
[10] S. B. Kim, J. C. Palmer, and P. G. Debenedetti, Computational Investigation of Cold Denaturation in the Trp-Cage Miniprotein, Proc. Natl. Acad. Sci. U.S.A. 113, 8991 (2016).
[11] J. R. Desjarlais and T. M. Handel, De Novo Design of the Hydrophobic Cores of Proteins, Protein Sci. 4, 2006 (1995).
[12] B. I. Dahiyat, De Novo Protein Design: Fully Automated Sequence Selection, Science 278, 82 (1997).
[13] B. I. Dahiyat and S. L. Mayo, Probing the Role of Packing Specificity in Protein Design, Proc. Natl. Acad. Sci. U.S.A. 94, 10172 (1997).
[14] B. Kuhlman et al., Design of a Novel Globular Protein Fold with Atomic-Level Accuracy, Science 302, 1364 (2003).
[15] D. Rothlisberger et al., Kemp Elimination Catalysts by Computational Enzyme Design, Nature (London) 453, 190 (2008).
[16] L. Jiang et al., De Novo Computational Design of Retro-Aldol Enzymes, Science 319, 1387 (2008).
[17] D. J. Hei and D. S. Clark, Pressure Stabilization of Proteins from Extreme Thermophiles, Appl. Environ. Microbiol. 60, 932 (1994).
[18] J. Torrent, P. Rubens, M. Ribó, K. Heremans, and M. Vilanova, Pressure Versus Temperature Unfolding of Ribonuclease A: An FTIR Spectroscopic Characterization of 10 Variants at the Carboxy-Terminal Site, Protein Sci. 10, 725 (2001).
[19] S. Kumar, C. J. Tsai, and R. Nussinov, Temperature Range of Thermodynamic Stability for the Native State of Reversible Two-State Proteins, Biochemistry 42, 4864 (2003).
[20] B. N. Dominy, H. Minoux, and C. L. Brooks, An Electrostatic Basis for the Stability of Thermophilic Proteins, Proteins 57, 128 (2004).
[21] E. I. Shakhnovich and A. M. Gutin, A New Approach to the Design of Stable Proteins, Protein Eng. 6, 793 (1993).
[22] E. I. Shakhnovich and A. M. Gutin, Engineering of Stable and Fast-Folding Sequences of Model Proteins, Proc. Natl. Acad. Sci. U.S.A. 90, 7195 (1993).
[23] I. Coluzza, Transferable Coarse-Grained Potential for De Novo Protein Folding and Design, PLoS One 9, e112852 (2014).
[24] I. Coluzza, A Coarse-Grained Approach to Protein Design: Learning from Design to Understand Folding, PLoS One 6, e20853 (2011).
[25] In order to optimize a sequence for a given target structure, a common strategy, known as negative design, consists in minimizing the free energy of the target conformation with respect to the free energy of all the possible compact conformations [26–28]. Negative design is normally necessary when, due to the limitations of the model used (e.g., reduced alphabet size or coarse-graining), the standard procedure cannot find a sequence that folds to the target structure. In all the models we considered so far, we could always directly design on- and off-lattice structures using an alphabet of 20 letters [29–31].
[26] C. Micheletti, F. Seno, A. Maritan, and J. R. Banavar, Design of Proteins with Hydrophobic and Polar Amino Acids, Proteins 32, 80 (1998).
[27] I. Samish, C. M. Macdermaid, J. M. Perez-Aguilar, and J. G. Saven, Theoretical and Computational Protein Design, Annu. Rev. Phys. Chem. 62, 129 (2011).
[28] N. Koga, R. Tatsumi-Koga, G. Liu, R. Xiao, T. B. Acton, G. T. Montelione, and D. Baker, Principles for Designing Ideal Protein Structures, Nature (London) 491, 222 (2012).
[29] I. Coluzza and D. Frenkel, Designing Specificity of Protein-Substrate Interactions, Phys. Rev. E 70, 051917 (2004).
[30] I. Coluzza and D. Frenkel, Monte Carlo Study of Substrate-Induced Folding and Refolding of Lattice Proteins, Biophys. J. 92, 1150 (2007).
[31] I. Coluzza and J.-P. Hansen, Transition from Highly to Fully Stretched Polymer Brushes in Good Solvent, Phys. Rev. Lett. 100, 016104 (2008).
[32] K. Stokely, M. G. Mazza, H. E. Stanley, and G. Franzese, Effect of Hydrogen Bond Cooperativity on the Behavior of Water, Proc. Natl. Acad. Sci. U.S.A. 107, 1301 (2010).
[33] M. G. Mazza, K. Stokely, S. E. Pagnotta, F. Bruni, H. E. Stanley, and G. Franzese, More than One Dynamic Crossover in Protein Hydration Water, Proc. Natl. Acad. Sci. U.S.A. 108, 19873 (2011).
[34] G. Franzese, V. Bianco, and S. Iskrov, Water at Interface with Proteins, Food Biophys. 6, 186 (2011).
[35] V. Bianco, S. Iskrov, and G. Franzese, Understanding the Role of Hydrogen Bonds in Water Dynamics and Protein Stability, J. Biol. Phys. 38, 27 (2012).
[36] V. Bianco and G. Franzese, Critical Behavior of a Water Monolayer under Hydrophobic Confinement, Sci. Rep. 4, 4440 (2014).
[37] Here, $T_d$ defines the fluctuations of the water's energy, in units of the Boltzmann constant $k_B$, over which the enthalpy is averaged. This parameter is different from what is called "design temperature" in V. S. Pande, A. Y. Grosberg, and T. Tanaka, Biophys. J. 73, 3192 (1997), which in our case has a constant and very low value.
[38] S. A. Hawley, Reversible Pressure-Temperature Denaturation of Chymotrypsinogen, Biochemistry 10, 2436 (1971).
[39] H. W. Hatch, F. H. Stillinger, and P. G. Debenedetti, Computational Study of the Stability of the Miniprotein Trp-Cage, the GB1 β-Hairpin, and the AK16 Peptide, under Negative Pressure, J. Phys. Chem. B 118, 7761 (2014).
[40] L. Smeller, Pressure-Temperature Phase Diagrams of Biomolecules, Biochim. Biophys. Acta 1595, 11 (2002).
[41] R. Ravindra and R. Winter, On the Temperature–Pressure Free-Energy Landscape of Proteins, Chem. Phys. Chem. 4, 359 (2003).
[42] H. Herberhold and R. Winter, Temperature- and Pressure-Induced Unfolding and Refolding of Ubiquitin: A Static and Kinetic Fourier Transform Infrared Spectroscopy Study, Biochemistry 41, 2396 (2002).
[43] H. Lesch, H. Stadlbauer, J. Friedrich, and J. M. Vanderkooi, Stability Diagram and Unfolding of a Modified Cytochrome c: What Happens in the Transformation Regime?, Biophys. J. 82, 1644 (2002).
[44] J. Wiedersich, S. Köhler, A. Skerra, and J. Friedrich, Temperature and Pressure Dependence of Protein Stability: The Engineered Fluorescein-Binding Lipocalin FluA Shows an Elliptic Phase Diagram, Proc. Natl. Acad. Sci. U.S.A. 105, 5756 (2008).
[45] A. Zipp and W. Kauzmann, Pressure Denaturation of Metmyoglobin, Biochemistry 12, 4217 (1973).
[46] M. W. Lassalle, H. Yamada, and K. Akasaka, The Pressure-Temperature Free Energy-Landscape of Staphylococcal Nuclease Monitored by (1)H NMR, J. Mol. Biol. 298, 293 (2000).
[47] A. Maeno, H. Matsuo, and K. Akasaka, The Pressure-Temperature Phase Diagram of Hen Lysozyme at Low pH, Biophysik 5, 1 (2009).
[48] J. Somkuti, Z. Mártonfalvi, M. S. Z. Kellermayer, and L. Smeller, Different Pressure-Temperature Behavior of the Structured and Unstructured Regions of Titin, Biochim. Biophys. Acta 1834, 112 (2013).
[49] J. Somkuti, S. Jain, S. Ramachandran, and L. Smeller, Folding-Unfolding Transitions of Rv3221c on the Pressure-Temperature Plane, High Press. Res. 33, 250 (2013).
[50] F. Meersman, L. Smeller, and K. Heremans, Pressure-Assisted Cold Unfolding of Proteins and Its Effects on the Conformational Stability Compared to Pressure and Heat Unfolding, High Press. Res. 19, 263 (2000).
[51] A. Pastore, S. R. Martin, A. Politou, K. C. Kondapalli, T. Stemmler, and P. A. Temussi, Unbiased Cold Denaturation: Low- and High-Temperature Unfolding of Yeast Frataxin under Physiological Conditions, J. Am. Chem. Soc. 129, 5374 (2007).
[52] J. Roche, J. A. Caro, D. R. Norberto, P. Barthe, C. Roumestand, J. L. Schlessman, A. E. Garcia, B. Garcia-Moreno E., and C. A. Royer, Cavities Determine the Pressure Unfolding of Proteins, Proc. Natl. Acad. Sci. U.S.A. 109, 6945 (2012).
[53] With the adopted units, 1 atm corresponds to $P \simeq 0.7 \times 10^{-4}$ IU. For such values of $P$, the liquid-gas transition occurs at $T_{\rm LG} \simeq 0.44$ IU. Therefore, the ambient temperature is $T = 0.8 \times T_{\rm LG} \simeq 0.35$ IU.
[54] W. B. Floriano, M. A. Nascimento, G. B. Domont, and W. A. Goddard, Effects of Pressure on the Structure of Metmyoglobin: Molecular Dynamics Predictions for Pressure Unfolding through a Molten Globule Intermediate, Protein Sci. 7, 2301 (1998).
[55] M. Gross and R. Jaenicke, Proteins under Pressure. The Influence of High Hydrostatic Pressure on Structure, Function and Assembly of Proteins and Protein Complexes, Eur. J. Biochem. 221, 617 (1994).
[56] E. Larios and M. Gruebele, Protein Stability at Negative Pressure, Methods 52, 51 (2010).
[57] J. Hollien and S. Marqusee, A Thermodynamic Comparison of Mesophilic and Thermophilic Ribonucleases H, Biochemistry 38, 3831 (1999).
[58] S. Fukuchi and K. Nishikawa, Protein Surface Amino Acid Compositions Distinctively Differ Between Thermophilic and Mesophilic Bacteria, J. Mol. Biol. 309, 835 (2001).
[59] We analysed the PDB files of the following ice-binding proteins: 1MSI, 1AME, 1B71, 1B7J, 1B7K, 1C3Z, 1C8A, 1EKL, 1EZG, 1HG7, 1JAB, 1KDF, 1OPS, 1WFB, 1XUZ, 2AME, 2BRD, 2MSI, 2PY2, 2ZDR, 2ZIB, 3P4G, 3ULT, 3UYV, 3WP9, 4DT5, 4KDV, 4NU3, 4UR4, and 4UR6. We assume that an amino acid is exposed to the solvent if the average accessible surface area is ≥ 0.3 of the total accessible area of the amino acid (calculations are performed with the MDTRAJ program at http://mdtraj.org/1.7.2). Protein structures are taken from the PDB database at http://www.rcsb.org/pdb/home/home.do.
[60] G. Vogt, S. Woell, and P. Argos, Protein Thermal Stability, Hydrogen Bonds, and Ion Pairs, J. Mol. Biol. 269, 631 (1997).
[61] G. Vogt and P. Argos, Protein Thermal Stability: Hydrogen Bonds or Internal Packing?, Folding Des. 2, S40 (1997).
[62] V. Z. Spassov, A. D. Karshikof, and R. Ladenstein, The Optimization of Protein-Solvent Interactions: Thermostability and the Role of Hydrophobic and Electrostatic Interactions, Protein Sci. 4, 1516 (1995).
[63] P. Haney, J. Konisky, K. K. Koretke, Z. Luthey-Schulten, and P. G. Wolynes, Structural Basis for Thermostability and Identification of Potential Active Site Residues for Adenylate Kinases from the Archaeal Genus Methanococcus, Proteins 28, 117 (1997).
[64] E. van Dijk, P. Varilly, T. P. J. Knowles, D. Frenkel, and S. Abeln, Consistent Treatment of Hydrophobicity in Protein Lattice Models Accounts for Cold Denaturation, Phys. Rev. Lett. 116, 078101 (2016).
[65] M. S. Moghaddam and H. S. Chan, Pressure and Temperature Dependence of Hydrophobic Hydration: Volumetric, Compressibility, and Thermodynamic Signatures, J. Chem. Phys. 126, 114507 (2007).
[66] L. Lins, A. Thomas, and R. Brasseur, Analysis of Accessible Surface of Residues in Proteins, Protein Sci. 12, 1406 (2003).
[67] S. Moelbert, E. Emberly, and C. Tang, Correlation Between Sequence Hydrophobicity and Surface-Exposure Pattern of Database Proteins, Protein Sci. 13, 752 (2004).
[68] The sequences designed with $H^{\rm min}_{R,R}$ are independent of $T_d$ and $P_d$.
[69] The condition for a vanishing number of HBs is $P \simeq J/v_{\rm HB}$ (where $J$ and $v_{\rm HB}$ are parameters of the model described in the Methods section), which in Ref. [70] gives $P \geq 1$ IU, while here it gives $P \geq 0.6$ IU ≃ 1 GPa when converted to real units.
[70] G. Franzese and H. E. Stanley, The Widom Line of Supercooled Water, J. Phys. Condens. Matter 19, 205126 (2007).
[71] Details of our results reveal that the protein interface affects up to the second hydration shell and that this contributes significantly to the enthalpy difference between the u and the f states.
[72] Depending on the specific set of residue-residue interactions, the exact boundary can move, but the fact that both in experiments and in our simulations the high pressure boundary is ≃1 GPa implies that the values we are using to represent the interactions between the amino acids have the correct order of magnitude.
[73] G. Caldarelli and P. De Los Rios, Cold and Warm Denaturation of Proteins, J. Biol. Phys. 27, 229 (2001).
[74] S. Matysiak, P. G. Debenedetti, and P. J. Rossky, Role of Hydrophobic Hydration in Protein Stability: A 3D Water-Explicit Protein Model Exhibiting Cold and Heat Denaturation, J. Phys. Chem. B 116, 8095 (2012).
[75] K. F. Lau and K. A. Dill, A Lattice Statistical Mechanics Model of the Conformational and Sequence Spaces of Proteins, Macromolecules 22, 3986 (1989).
[76] K. Yue and K. A. Dill, Inverse Protein Folding Problem: Designing Polymer Sequences, Proc. Natl. Acad. Sci. U.S.A. 89, 4163 (1992).
[77] M. S. Shell, P. G. Debenedetti, and A. Z. Panagiotopoulos, Computational Characterization of the Sequence Landscape in Simple Protein Alphabets, Proteins 62, 232 (2006).
[78] B. A. Patel, P. G. Debenedetti, F. H. Stillinger, and P. J. Rossky, The Effect of Sequence on the Conformational Stability of a Model Heteropolymer in Explicit Water, J. Chem. Phys. 128, 175102 (2008).
[79] F. de los Santos and G. Franzese, Understanding Diffusion and Density Anomaly in a Coarse-Grained Model for Water Confined Between Hydrophobic Walls, J. Phys. Chem. B 115, 14311 (2011).
[80] Here, for the sake of simplicity, following Ref. [9], we assume that, in each volume $v$, there is at most one water molecule. This approximation could be removed by generalizing the model in such a way that the water variables associated with each cell could account for the state of more than one water molecule at a time.
[81] S. Miyazawa and R. L. Jernigan, Estimation of Effective Interresidue Contact Energies from Protein Crystal Structures: Quasi-chemical Approximation, Macromolecules 18, 534 (1985).
[82] J. Kyte and R. F. Doolittle, A Simple Method for Displaying the Hydropathic Character of a Protein, J. Mol. Biol. 157, 105 (1982).
[83] L. H. de la Peña and P. G. Kusalik, Temperature Dependence of Quantum Effects in Liquid Water, J. Am. Chem. Soc. 127, 5246 (2005).
[84] U. Góra, R. Podeszwa, W. Cencek, and K. Szalewicz, Interaction Energies of Large Clusters from Many-Body Expansion, J. Chem. Phys. 135, 224102 (2011).
[85] A. K. Soper and M. A. Ricci, Structures of High-Density and Low-Density Water, Phys. Rev. Lett. 84, 2881 (2000).
[86] N. Giovambattista, P. J. Rossky, and P. G. Debenedetti, Effect of Pressure on the Phase Behavior and Structure of Water Confined Between Nanoscale Hydrophobic and Hydrophilic Plates, Phys. Rev. E 73, 041604 (2006).
[87] C. Petersen, K.-J. Tielrooij, and H. J. Bakker, Strong Temperature Dependence of Water Reorientation in Hydrophobic Hydration Shells, J. Chem. Phys. 130, 214511 (2009).
[88] Y. I. Tarasevich, State and Structure of Water in Vicinity of Hydrophobic Surfaces, Colloid J. 73, 257 (2011).
[89] J. G. Davis, K. P. Gierszal, P. Wang, and D. Ben-Amotz, Water Structural Transformation at Molecular Hydrophobic Interfaces, Nature (London) 491, 582 (2012).
[90] S. Sarupria and S. Garde, Quantifying Water Density Fluctuations and Compressibility of Hydration Shells of Hydrophobic Solutes and Proteins, Phys. Rev. Lett. 103, 037803 (2009).
[91] K. A. Dill, T. M. Truskett, V. Vlachy, and B. Hribar-Lee, Modeling Water, the Hydrophobic Effect, and the Ion Solvation, Annu. Rev. Biophys. Biomol. Struct. 34, 173 (2005).
[92] A. Badasyan, Y. S. Mamasakhlisov, R. Podgornik, and V. A. Parsegian, Solvent Effects in the Helix-Coil Transition Model Can Explain the Unusual Biophysics of Intrinsically Disordered Proteins, J. Chem. Phys. 143, 014102 (2015).
[93] C. J. Fennell, C. W. Kehoe, and K. A. Dill, Modeling Aqueous Solvation with Semi-Explicit Assembly, Proc. Natl. Acad. Sci. U.S.A. 108, 3234 (2011).
[94] C. Camilloni, D. Bonetti, A. Morrone, R. Giri, C. M. Dobson, M. Brunori, S. Gianni, and M. Vendruscolo, Towards a Structural Biology of the Hydrophobic Effect in Protein Folding, Sci. Rep. 6, 28285 (2016).
[95] P. Das and S. Matysiak, Direct Characterization of Hydrophobic Hydration During Cold and Pressure Denaturation, J. Phys. Chem. B 116, 5342 (2012).
[96] T. Ghosh, A. E. García, and S. Garde, Molecular Dynamics Simulations of Pressure Effects on Hydrophobic Interactions, J. Am. Chem. Soc. 123, 10997 (2001).
[97] C. L. Dias and H. S. Chan, Pressure-Dependent Properties of Elementary Hydrophobic Interactions: Ramifications for Activation Properties of Protein Folding, J. Phys. Chem. B 118, 7488 (2014).
[98] This approximation implies that our calculations are valid only for $P < 1/k$.
[99] M. Kurnik, L. Hedberg, J. Danielsson, and M. Oliveberg, Folding without Charges, Proc. Natl. Acad. Sci. U.S.A. 109, 5705 (2012).
[100] P. De Los Rios and G. Caldarelli, Putting Proteins Back into Water, Phys. Rev. E 62, 8449 (2000).
[101] G. Hummer, S. Garde, A. E. García, M. E. Paulaitis, and L. R. Pratt, The Pressure Dependence of Hydrophobic Interactions Is Consistent with the Observed Pressure Denaturation of Proteins, Proc. Natl. Acad. Sci. U.S.A. 95, 1552 (1998).
[102] F. Meersman, C. M. Dobson, and K. Heremans, Protein Unfolding, Amyloid Fibril Formation and Configurational Energy Landscapes under High Pressure Conditions, Chem. Soc. Rev. 35, 908 (2006).
[103] N. V. Nucci, B. Fuglestad, E. A. Athanasoula, and A. J. Wand, Role of Cavities and Hydration in the Pressure Unfolding of T4 Lysozyme, Proc. Natl. Acad. Sci. U.S.A. 111, 13846 (2014).
[104] V. Bianco, N. P. Gelabert, I. Coluzza, and G. Franzese, How the Stability of a Folded Protein Depends on Interfacial Water Properties and Residue-Residue Interactions, arXiv:1704.03370.
[105] L. E. Coronas, V. Bianco, A. Zantop, and G. Franzese, Liquid-Liquid Critical Point in 3D Many-Body Water Model, arXiv:1610.00419.
[106] D. Frenkel and B. Smit, Understanding Molecular Simulation (Academic Press, San Diego, London, 2002).
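As a numerical companion to the acceptance rule of Eq. (B1) in Appendix B, the following is a minimal Python sketch, not the authors' implementation. The text only states that $N_P$ is "the number of permutations" of a sequence; the sketch assumes this means the number of distinct orderings of the amino acid composition, $N!/\prod_a n_a!$. The function names `log_permutation_count`, `acceptance_probability`, and `accept` are hypothetical, and the defaults 0.05 and 14 are the optimization temperature and weighting parameter quoted in Appendix B.

```python
import math
import random
from collections import Counter


def log_permutation_count(sequence):
    """Return ln N_P, ASSUMING N_P = N! / prod_a(n_a!), i.e. the number of
    distinct orderings of the sequence composition (an assumption, see lead-in)."""
    counts = Counter(sequence)
    log_np = math.lgamma(len(sequence) + 1)
    for n_a in counts.values():
        log_np -= math.lgamma(n_a + 1)
    return log_np


def acceptance_probability(delta, new_seq, old_seq, t_opt=0.05, eps_p=14.0):
    """Eq. (B1): min{1, exp(-delta/T_o)} * min{1, (N_P^new / N_P^old)**eps_p}.

    delta is the change (new minus old) of the quantity being minimized, in IU.
    Both factors are evaluated in log space to avoid overflow.
    """
    log_energy_factor = min(0.0, -delta / t_opt)
    log_ratio = log_permutation_count(new_seq) - log_permutation_count(old_seq)
    log_heterogeneity_factor = min(0.0, eps_p * log_ratio)
    return math.exp(log_energy_factor + log_heterogeneity_factor)


def accept(delta, new_seq, old_seq, rng=random, **kwargs):
    """Draw one Metropolis-style decision for a proposed sequence move."""
    return rng.random() < acceptance_probability(delta, new_seq, old_seq, **kwargs)
```

Under the stated assumption, a mutation that lowers the heterogeneity of the composition (for example from "AABB", with 6 orderings, to "AAAB", with 4) is accepted with probability at most $(4/6)^{14}$ even when it lowers the enthalpy, which illustrates why a large weighting parameter suppresses homopolymer-like solutions.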

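The parameters quoted in Appendix A can be cross-checked numerically. The sketch below is illustrative only: it assumes the standard 12-6 Lennard-Jones exponents for the O–O term of Eq. (A3), takes $r_0 = 2.9$ Å and $\epsilon = 2.9$ kJ/mol from the text, and converts the internal pressure unit $8\epsilon/v_0$ (with $v_0 = r_0^3$) to GPa. The function names `u_vdw`, `pressure_unit_gpa`, and `hb_max_distance` are hypothetical.

```python
import math

R0_ANGSTROM = 2.9          # r_0, water van der Waals diameter (from the text)
EPS_KJ_PER_MOL = 2.9       # epsilon (from the text)
AVOGADRO = 6.02214076e23   # 1/mol


def u_vdw(r, r0=R0_ANGSTROM, eps=EPS_KJ_PER_MOL):
    """O-O term U(r) in kJ/mol: hard core below r0, 12-6 form (assumed
    exponents) between r0 and the cutoff 3*r0, zero beyond the cutoff."""
    if r < r0:
        return math.inf
    if r > 3.0 * r0:
        return 0.0
    x6 = (r0 / r) ** 6
    return 4.0 * eps * (x6 * x6 - x6)


def pressure_unit_gpa(r0=R0_ANGSTROM, eps=EPS_KJ_PER_MOL):
    """Internal pressure unit 8*eps/v0 expressed in GPa, with v0 = r0**3."""
    energy_joule = 8.0 * eps * 1.0e3 / AVOGADRO   # 8*eps per molecule
    v0_m3 = (r0 * 1.0e-10) ** 3
    return energy_joule / v0_m3 / 1.0e9


def hb_max_distance(r0=R0_ANGSTROM):
    """Largest O-O distance (in Angstrom) compatible with n_i n_j = 1,
    i.e. with v < 2*v0, so that r < 2**(1/3) * r0."""
    return 2.0 ** (1.0 / 3.0) * r0
```

With these inputs the pressure unit comes out close to 1.58 GPa, so that 0.6 IU is about 0.95 GPa and 1 atm is about $6.4 \times 10^{-5}$ IU, in line with the values quoted in notes [69] and [53]; the maximum HB distance evaluates to about 3.65 Å.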
Journal

Physical Review X, American Physical Society (APS)

Published: Apr 1, 2017
