Three-dimensional computation of atom depth in complex molecular structures

Bioinformatics, Vol. 21 no. 12 2005, pages 2856–2860. Original paper, Structural bioinformatics. doi:10.1093/bioinformatics/bti444

Daniele Varrazzo¹, Andrea Bernini¹,², Ottavia Spiga¹,², Arianna Ciutti², Stefano Chiellini², Vincenzo Venditti¹, Luisa Bracci¹ and Neri Niccolai¹,²,∗

¹Biomolecular Structure Research Center and Department of Molecular Biology, Università di Siena, I-53100 Siena, Italy and ²SienaBioGrafix Srl, I-53100 Siena, Italy

Received on February 9, 2005; revised on March 30, 2005; accepted on April 7, 2005. Advance Access publication April 12, 2005

∗To whom correspondence should be addressed.

© The Author 2005. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oupjournals.org

ABSTRACT

Motivation: For a complex molecular system the delineation of atom–atom contacts, exposed surface and binding sites represents a fundamental step to predict its interaction with solvent, ligands and other molecules. Recently, atom depth has also been considered as an additional structural descriptor to correlate protein structure with folding and functional properties. The distance between an atom and the nearest water molecule or the closest surface dot has been proposed as a measure of the atom depth, but, in both cases, the 3D character of depth is largely lost. In the present study, a new approach is proposed to calculate atom depths in a way that the molecular shape can be taken into account.

Results: An algorithm has been developed to calculate intersections between the molecular volume and spheres centered on the atoms whose depth has to be quantified. Many proteins with different size and shape have been chosen to compare the results obtained from distance-based and volume-based depth calculations. From the wealth of experimental data available for hen egg white lysozyme, H/D exchange rates and TEMPOL induced paramagnetic perturbations have been analyzed both in terms of depth indexes and of atom distances to the solvent accessible surface. The algorithm proposed here yields better correlations between experimental data and atom depth, particularly for those atoms which are located near the protein surface.

Availability: Instructions to obtain source code and the executable program are available either from http://sienabiografix.com or http://sadic.sourceforge.net

Contact: niccolai@unisi.it

Supplementary information: http://www.Sienabiogzefix.com/publication

INTRODUCTION

Structural biology is nowadays rapidly growing, due to a synergistic post-genomic effect of the large developments of X-ray crystallography, nuclear magnetic resonance (NMR) and bioinformatics. In the Protein Data Bank (PDB) (Berman et al., 2000), a wealth of resolved and predicted protein structures is available and, on this basis, structural descriptors have been developed to correlate accessible molecular surface (Lee and Richards, 1971; Richmond, 1984; Quillin and Matthews, 2000; Totrov and Abagyan, 1996; Gerstein et al., 1995), molecular volumes (Richmond, 1984) and potential binding sites (Lo Conte et al., 1999; Shulman-Peleg et al., 2004; Tsuchiya et al., 2004; Innis et al., 2004; Gutteridge et al., 2003) with functional features, protein folding and structural stability (Serrano et al., 1992).

Recently, the calculation of atom depth from the protein surface has been proposed as an additional criterion to define protein structures more accurately (Pintar et al., 2003b; Chakravarty and Varadarajan, 1999) by exploring the interior of the molecule, as the strength of van der Waals (VdW) and electrostatic interactions might be dependent on the distance from the molecular surface (Chakravarty and Varadarajan, 1999; Richards, 1977). Moreover, once the deepest molecular moieties can be defined, a systematic analysis of their properties can be carried out to gain information on molecular structure and stability.

Atom depth can be defined as the distance between an atom and the nearest surface water molecule, either experimentally defined or hypothetically present. However, by evaluating the closest distance between an atom and a dot of the solvent accessible surface (Chakravarty and Varadarajan, 1999) or the distance between an atom and its closest solvent accessible neighbor (Pintar et al., 2003a), contributions from the 3D molecular shape to the actual atom depth are largely lost.

Hence, to estimate atom depth a new approach reflecting the molecular shape is proposed here by measuring intersections between the molecular volume and a sphere of a suitable radius, centered on the atom whose depth has to be quantified. It is apparent, indeed, that the smaller the exposed volume, the deeper is the 3D insertion of the investigated atom inside the molecular structure. Since, in general, depth is a very relative quantity which depends on the overall size and shape of the object under discussion, an atom depth index is suggested as a more appropriate parameter to discuss atom insertions within each investigated molecular system. These depth indexes, calculated by using the SADIC (Simple Atom Depth Index Calculator) algorithm, could for instance constitute a new rational basis to reanalyze inner and outer amino acid compositions of proteins or to improve the analysis of depth-related physical phenomena. Among the latter molecular processes, H/D isotopic exchange of protein amide hydrogens is particularly relevant, being commonly referred to molecular surface exposures. Exchange rates are very frequently determined from NMR (Roder et al., 1985) or mass spectrometry (Miranker et al., 1996) studies to explore protein conformations and dynamics. It has also been shown that NMR studies of through-the-space interactions, occurring between paramagnetic probes and protein nuclei, can be interpreted in terms of protein surface exposures (Niccolai et al., 2001, 2003; Pintacuda and Otting, 2002).

In the present report, volume-based and distance-based atom depths have been evaluated and compared for proteins of different size and shape. Calculated depths have also been correlated with H/D exchange and paramagnetic perturbation data available for hen egg white lysozyme (HEWL).

ALGORITHM

The SADIC algorithm is based on the simple idea of sampling the space around each atom of a given molecule by evaluating, for selected distances from the atom center, the portion of volume that is external to any protein atom. In other words, such volume, henceforth called the exposed volume, represents the space external to the molecular surface comprised at a distance $r$ in all directions around the atom. Therefore, the size of the exposed volume is a direct measure of atom depth with respect to the molecular surface, as the smaller the exposed volume, the deeper is the atom within the molecular structure. When dealing with exposed volumes instead of linear distances, as previously proposed for depth calculations (Chakravarty and Varadarajan, 1999; Pintar et al., 2003b), the information on surface shape is considered. The SADIC algorithm can yield an accurate indication of atom depth, since distances from the atom center to the solvent exposed surface are simultaneously evaluated in all directions. It follows that atoms located in protruding loops have exposed volumes greater than those exhibited by atoms which are equally close to the surface but located at the bottom of a pocket.

In principle, this algorithm can be used to analyze local depth for objects having any size and shape, provided that an 'inside' and an 'outside' can be unambiguously assigned. The 3D model of a molecule, as an assembly of sphere shaped atoms, satisfies this requirement, since all the points located inside one of these spheres, i.e. closer to an atom center than the VdW atom radius, can be considered 'inside' the molecule.

In order to approximate the volume of the intersection between the molecule and a sphere with a given radius $r$ and center $C$, the sphere interior is split into units whose volume is known. For each volume unit a representative point is taken: we approximate the exposed volume by testing the sampling points against the molecule and summing the volume relative to all the points outside the molecular model.

The choice of $r$ is of critical importance, since too small or too large $r$ values would yield null or large exposed volumes, respectively, to a very similar extent for all the atoms of the investigated molecule. It should be noticed that the values obtained by sampling inside a sphere of radius $r$ can be effectively used to calculate exposition for each sphere of radius $r' \le r$. To exploit this possibility better, sampling points are chosen over concentric spheres with growing radii $r_0 \cdots r_n = r$.

A simplistic pattern to sample a sphere interior consists of a regular grid in spherical coordinates. This method has the drawback of producing the same number of samples at each radius, thus yielding more packed points toward the poles and the center, and coarser points toward the equator and the outside. In order to overcome the problem, a different sampling pattern is used by SADIC, as described in detail in the Supplementary information section.

IMPLEMENTATION

The current SADIC implementation is written in the Python and C programming languages (see Supplementary information). The program consists of an object-oriented library providing classes responsible for generating sampling patterns, modeling solid objects (they can be subclassed to add new capabilities to the framework) and parsing PDB (Berman et al., 2000) entity files. An executable with a command line interface is provided: the program can read PDB entity files either from a local file system or from an external database through its URL (http, ftp and file protocols are supported) or its pdb ID code. The user can perform a molecule sampling on a list of given points in space or on a selection of entity atoms, either absolutely referred by serial number or selected by atom name (e.g. using 'CA' to refer to the protein backbone α-carbon), residue number or chain identifier. If the entity file contains more than one structure, as is the case of NMR determined structures, the sampling can be separately performed on a selection of structures. In this case, average and SD of the results may be automatically calculated.

RESULTS AND DISCUSSION

The SADIC program has been developed to obtain a new tool for a structural characterization of complex molecules, such as proteins and nucleic acids, by considering the atom depth. Since depth is a characteristic which can be conveniently discussed only in relation to the size of each investigated system, SADIC outputs are more conveniently analyzed as atom depth indexes, $D$, rather than absolute exposed volumes.

Thus, for an atom $i$ of a given molecule and a sampling radius $r$, a depth index $D_{i,r}$ may be defined as

$$D_{i,r} = \frac{2V_{i,r}}{V_{0,r}}, \tag{1}$$

where $V_{i,r}$ is the exposed volume of a sphere of radius $r$ centered on atom $i$ and $V_{0,r}$ is the exposed volume of the same sphere when centered on an isolated atom.

As already pointed out in the Algorithm section, to avoid flattening of the algorithm outputs towards similar $D_{i,r}$ values, the selection of the $r$ value represents a very critical step. In Figure 1 the evolution of the $D_{i,r}$ values of a representative selection of HEWL Cα carbons is shown: Thr47 and Ile58 Cα atoms are both equally close to the solvent exposed surface, but in the convex and concave molecular regions, respectively. Each of the Trp28 and Ser50 Cα atoms is, instead, deeply inserted in one of the two HEWL domains. For small and large $r$ values, all $D_{i,r}$ values converge to 0 and 2, respectively, and the atom depth index calculated by SADIC loses its structural information. Conversely, in an intermediate region of $r$ values, centered for HEWL at ∼9 Å, a large dispersion of $D_{i,r}$ values can be observed. Then, to analyze atom depths conveniently it seems appropriate to choose the biggest sphere radius which determines the condition $D_{n,r} = 0$ for only one atom, the innermost atom $n$. In the case of HEWL this condition is met by the Trp28 Cα carbon, thus resulting as the most internal atom, at an $r$ value of 9 Å.

Fig. 1. The evolution of $D_{i,r}$ values with the sphere radius $r$ calculated for selected backbone Cα carbons of HEWL (Trp28: crosses, Thr47: squares, Ser50: filled circles and Ile58: triangles).

Thus, once a suitable sphere radius has been chosen, calculated $D_{i,r}$ values readily describe the topology of each atom, as values close to 0 or 1 define the inner or outer atoms (Fig. 2). Furthermore, the $D_{i,r} > 1$ condition defines atoms which are very close to a convex molecular surface, as in the case of HEWL α carbons of Thr47, Asp48 and Gly117, whose $D_{i,9}$ values are 1.22, 1.10 and 1.15, respectively.

Fig. 2. Space fill representation of lysozyme (pdb file ID code 4lzt); the enzyme is halved into two complementary moieties to show some of the inner heavy atoms colored according to their $D_{i,9}$ values.

The validity of the proposed algorithm has been tested on many proteins by comparing SADIC outputs with distance-based atom depths. Thus, as shown in Figure 3, the $D_{i,r}$ values of a small spherical protein and of a large oblate one are compared with atom distances calculated from the closest exposed neighbor, $dpx_i$ (Pintar et al., 2003b), and from the nearest surface water molecule, $dnw_i$ (Chakravarty and Varadarajan, 1999). Among the different sets of data, a good agreement exists, as the Cα carbons exhibiting the highest $D_{i,r}$ values correspond to the shortest $dpx_i$ and $dnw_i$ values. Conversely, for the Cα carbons having the longest $dpx_i$ and $dnw_i$ values, $D_{i,r}$ values close to 0 are found. It is also evident that a higher detail in describing atom depth, particularly for those atoms which are located near the protein surface, is reached by SADIC. This feature directly derives from the fact that only the $D_{i,r}$ values depend both on surface distances and molecular shape and that atoms equally distant from the surface, but close to concave or convex surface regions, exhibit very different depth indexes.

Fig. 3. Comparison of the depth index of atom $i$, $D_{i,r}$, with the distance between atom $i$ and its closest solvent accessible neighbor, $dpx_i$, or the nearest surface water molecule, $dnw_i$, along the protein sequence positions. These different kinds of atom depth, calculated by using SADIC, the remote server http://hydra.icgeb.trieste.it/dpx/ and the software package MolMol (Koradi et al., 1996), are shown against the sequence positions of the backbone Cα of (a) human neutrophil collagenase, a small spherical protein (pdb ID code 1BZS) and (b) acetylcholinesterase, a large oblate protein (pdb ID code 1EEA). Protein shapes have been classified on the basis of the corresponding moments of inertia, as estimated with the program EdPDB (Zhang and Brian, 1995).

To check how $D_{i,r}$ values can be useful in the structural interpretation of experimental data, reported H/D exchange rates of HEWL amide protons (Pedersen et al., 1993) and paramagnetic perturbations of NMR signals (Niccolai et al., 2003) have been analyzed in terms of atom depth. As shown in Figure 4, $D_{i,9}$, $dpx_i$ and $dnw_i$ values correlate similarly with H/D exchange rates, $K_{ex}$, and paramagnetic signal attenuations, $A_i$, only in the case of the innermost HEWL atoms. For the outer ones, $K_{ex}$ and $A_i$ values are, indeed, all grouped in a very narrow range of both $dpx_i$ and $dnw_i$ values. By simple inspection of Figure 4, it is apparent that $D_{i,9}$ values generally exhibit a higher correlation than distance-based depths with the experimental data, as all the slowest exchange rates of HEWL amide hydrogens have been calculated for atoms with $D_{i,9} < 0.6$. It should be noted that for the latter amide hydrogens a large variety of distance-based atom depths is derived. Furthermore, any overlapping of the experimentally derived parameters observed at the closest surface distances is largely resolved, while the slow exchange rates measured for Ala10, Phe34 and Leu83 amide groups are more consistent with the corresponding $D_{i,9}$ values.

Fig. 4. Correlations of TEMPOL induced paramagnetic attenuations, $A_i$, of HEWL ¹³C-¹H HSQC NMR signals of Cα carbons with (a) $D_{i,9}$, (b) $dpx_i$ and (c) $dnw_i$. Correlations between $K_{ex}$ of HEWL backbone amide protons and (d) $D_{i,9}$, (e) $dpx_i$ and (f) $dnw_i$. Identical $dpx_i$ and $dnw_i$ values are exhibited by the most exposed atoms: the latter atoms are highlighted with triangles in the $D_{i,9}$ plot. Labels refer to amino acid residue positions in the protein sequence.

A linear or higher level dependence of $D_{i,9}$ values on HEWL $K_{ex}$ and $A_i$ values cannot be delineated, as fast H/D exchange rates were measured for the deeply inserted Thr40 and Cys94 amide hydrogens, in spite of their close proximity to a concave surface. Moreover, a small $A_i$ value is exhibited by the surface exposed Asp48 Cα carbon, while for the buried Ile98 a strong paramagnetic attenuation is observed. These four cases represent the most evident discrepancies, but many other anomalous behaviors of $K_{ex}$ and $A_i$ values versus both volume-based and distance-based depth can be seen in the data shown in Figure 4. In this respect, it should be stressed that both experimental parameters depend on atom depth only at a first approximation and that a more detailed discussion of exchange rates and paramagnetic perturbations in terms of atom depth would be needed. The H/D exchange process, determined by the dynamics of the hydrogen bond network within the protein and its hydration shell, is commonly related to solvent accessibility. The fact that SADIC outputs are more consistent with $K_{ex}$ values than atom depths obtained from distance-based calculations suggests that a step forward in the structural interpretation of the latter experimental parameter might be achieved. On the other hand, the weaker correlation observed between paramagnetic perturbations induced by TEMPOL and atom depths confirms that complex dynamics control the interaction of protein surfaces with paramagnetic probes (Niccolai et al., 2003). On the basis of the B factors reported in the crystal structure of HEWL with PDB (Berman et al., 2000) ID code 4lzt, it is apparent that local structural flexibility is not responsible for the limited correlation between atom depth and accessibility dependent experimental data.

It can be concluded that the use of the SADIC algorithm might favor improved depth-oriented discussions on experimental data, possibly enhancing our understanding of structure stability and dynamics of complex molecules.

ACKNOWLEDGEMENTS

Thanks are due to grants from the Italian Ministry of University PRIN03-059395 and from the University of Siena (PAR 2002). Special thanks are also due to Francesco Niccolai for technical assistance.

REFERENCES

Berman,H.M. et al. (2000) The Protein Data Bank. Nucleic Acids Res., 28, 235–242.
Chakravarty,S. and Varadarajan,R. (1999) Residue depth: a novel parameter for the analysis of protein structure and stability. Structure Fold. Des., 7, 723–732.
Gerstein,M. et al. (1995) The volume of atoms on the protein surface: calculated from simulation, using Voronoi polyhedra. J. Mol. Biol., 249, 955–966.
Gutteridge,A. et al. (2003) Using a neural network and spatial clustering to predict the location of active sites in enzymes. J. Mol. Biol., 330, 719–734.
Innis,C.A. et al. (2004) Prediction of functional sites in proteins using conserved functional group analysis. J. Mol. Biol., 337, 1053–1068.
Koradi,R. et al. (1996) MOLMOL: a program for display and analysis of macromolecular structures. J. Mol. Graph., 14, 51–55, 29–32.
Lee,B. and Richards,F.M. (1971) The interpretation of protein structures: estimation of static accessibility. J. Mol. Biol., 55, 379–400.
Lo Conte,L. et al. (1999) The atomic structure of protein–protein recognition sites. J. Mol. Biol., 285, 2177–2198.
Miranker,A. et al. (1996) Investigation of protein folding by mass spectrometry. FASEB J., 10, 93–101.
Niccolai,N. et al. (2001) NMR studies of protein surface accessibility. J. Biol. Chem., 276, 42455–42461.
Niccolai,N. et al. (2003) NMR studies of protein hydration and TEMPOL accessibility. J. Mol. Biol., 332, 437–447.
Pedersen,T.G. et al. (1993) Determination of the rate constants k1 and k2 of the Linderstrom–Lang model for protein amide hydrogen exchange. A study of the individual amides in hen egg-white lysozyme. J. Mol. Biol., 230, 651–660.
Pintacuda,G. and Otting,G. (2002) Identification of protein surfaces by NMR measurements with a paramagnetic Gd(III) chelate. J. Am. Chem. Soc., 124, 372–373.
Pintar,A. et al. (2003a) Atom depth as a descriptor of the protein interior. Biophys. J., 84, 2553–2561.
Pintar,A. et al. (2003b) DPX: for the analysis of the protein core. Bioinformatics, 19, 313–314.
Quillin,M.L. and Matthews,B.W. (2000) Accurate calculation of the density of proteins. Acta Crystallogr. D Biol. Crystallogr., 56, 791–794.
Richards,F.M. (1977) Areas, volumes, packing and protein structure. Annu. Rev. Biophys. Bioeng., 6, 151–176.
Richmond,T.J. (1984) Solvent accessible surface area and excluded volume in proteins. Analytical equations for overlapping spheres and implications for the hydrophobic effect. J. Mol. Biol., 178, 63–89.
Roder,H. et al. (1985) Individual amide proton exchange rates in thermally unfolded basic pancreatic trypsin inhibitor. Biochemistry, 24, 7407–7411.
Serrano,L. et al. (1992) The folding of an enzyme. II. Substructure of barnase and the contribution of different interactions to protein stability. J. Mol. Biol., 224, 783–804.
Shulman-Peleg,A. et al. (2004) Recognition of functional sites in protein structures. J. Mol. Biol., 339, 607–633.
Totrov,M. and Abagyan,R. (1996) The contour-buildup algorithm to calculate the analytical molecular surface. J. Struct. Biol., 116, 138–143.
Tsuchiya,Y. et al. (2004) Structure-based prediction of DNA-binding sites on proteins using the empirical preference of electrostatic potential and the shape of molecular surfaces. Proteins, 55, 885–894.
Zhang,X. and Brian,B.W. (1995) EdPDB: a multifunctional tool for protein structure analysis. J. Appl. Cryst., 28, 624–630.
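
ILLUSTRATIVE SKETCH OF THE DEPTH INDEX

The depth index of Equation (1), i.e. twice the exposed volume around atom i divided by the exposed volume of the same sphere around an isolated atom, can be illustrated with a short Python sketch. This is not the SADIC source code: it replaces the concentric-sphere sampling pattern described in the Algorithm section with plain Monte Carlo sampling, and the function names (exposed_volume, depth_index), the (x, y, z, vdw_radius) tuple layout and the default sample count are assumptions made only for this example.

```python
import math
import random


def exposed_volume(center, radius, atoms, n_samples=20000, seed=0):
    """Estimate the part of a probe sphere lying outside every atom.

    atoms is a list of (x, y, z, vdw_radius) tuples. Monte Carlo sampling
    is an assumption of this sketch, not the pattern used by SADIC.
    """
    rng = random.Random(seed)  # fixed seed: identical points on every call
    cx, cy, cz = center
    r2 = radius * radius
    outside = 0
    for _ in range(n_samples):
        # Rejection sampling: draw in the bounding cube, keep sphere points.
        while True:
            dx = rng.uniform(-radius, radius)
            dy = rng.uniform(-radius, radius)
            dz = rng.uniform(-radius, radius)
            if dx * dx + dy * dy + dz * dz <= r2:
                break
        px, py, pz = cx + dx, cy + dy, cz + dz
        # A point is 'inside' the molecule if it falls within any VdW sphere.
        inside = any(
            (px - ax) ** 2 + (py - ay) ** 2 + (pz - az) ** 2 <= ar * ar
            for ax, ay, az, ar in atoms
        )
        if not inside:
            outside += 1
    sphere_volume = 4.0 / 3.0 * math.pi * radius ** 3
    return sphere_volume * outside / n_samples


def depth_index(i, atoms, radius, **kwargs):
    """Depth index of atom i: D = 2 * V_i / V_0, as in Equation (1)."""
    x, y, z, vdw = atoms[i]
    if radius <= vdw:
        raise ValueError("sampling radius must exceed the atom VdW radius")
    v_i = exposed_volume((x, y, z), radius, atoms, **kwargs)
    # Reference: the same sphere with only the atom itself present.
    v_0 = exposed_volume((x, y, z), radius, [atoms[i]], **kwargs)
    return 2.0 * v_i / v_0
```

Because both volumes are estimated from the same seeded sample points, an isolated atom gives exactly 2 and an atom whose whole sampling sphere is filled by the molecule gives exactly 0, matching the limits quoted in the Results and Discussion section. An atom sitting on a nearly flat surface gives a value close to 1.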



Publisher
Oxford University Press
Copyright
© The Author 2005. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oupjournals.org
ISSN
1367-4803
eISSN
1460-2059
DOI
10.1093/bioinformatics/bti444
pmid
15827080

The evolution of D values with the sphere radius r calculated for i,r selected backbone Cα carbons of HEWL (Trp28: crosses, Thr47: squares, Ser50: filled circles and Ile58: triangles). Fig. 3. Comparison of the depth index of atom i, D with the distance i,r between atom i and its closest solvent accessible neighbor, dpx or the nearest surface water molecule, dnw , along the protein sequence positions. These different kinds of atom depth, calculated by using SADIC, the remote server http://hydra.icgeb.trieste.it/dpx/ and the software package MolMol (Koradi et al., 1996) are shown against the sequence positions of the backbone Cα of (a) human neutrophil collagenase, a small spherical protein (pdb ID code 1BZS and (b) acetylcholinesterase, a large oblate protein (pdb ID code 1EEA). Protein shapes have been classified on the basis of the corresponding moments of inertia, as estimated with the program EdPDB (Zhang and Brian, 1995). protein surface, is reached by SADIC. This feature directly derives from the fact that only D s depend both on surface distances and i,r molecular shape and that equally distant atoms from the surface, but close to concave or convex surface regions, exhibit very different Fig. 2. Space fill representation of lysozyme (pdb file ID code 4lzt); the depth indexes. enzyme is halved into two complementary moieties to show some of the To check how D s can be useful in the structural interpretation i,r inner heavy atoms colored according to their D values. i,9 of experimental data, reported H/D exchange rates of HEWL amide protons (Pedersen et al., 1993) and paramagnetic perturbations of D > 1 condition defines atoms which are very close to a convex NMR signals (Niccolai et al., 2003) have been analyzed in terms of i,r molecular surface, as in the case of HEWL α carbons of Thr47, atom depth. As shown in Figure 4, D s, dpx s and dnw s correl- i,9 i i Asp48 and Gly117, whose D are 1.22, 1.10 and 1.15, respectively. 
ate similarly with H/D exchange rates, K , and paramagnetic signal i,9 exi The validity of the proposed algorithm has been tested on many attenuations, A , only in the case of the innermost HEWL atoms. For proteins by comparing SADIC outputs with distance-based atom the outer ones, K s and A s are, indeed, all grouped in a very narrow ex i depths. Thus, as shown in Figure 3, D s of a small spherical pro- range of both dpx and dnw values. By simple inspection of Figure 4, i,r i i tein and of a large oblate one are compared with atom distances it is apparent that D s generally exhibit a higher correlation than i,9 calculated from the closest exposed neighbor, dpx (Pintar et al., distance-based depths with the experimental data, as all the slowest 2003b) and from the nearest surface water molecule (Chakravarty exchange rates of HEWL amide hydrogens have been calculated for and Varadarajan, 1999), dnw . Among the different sets of data, a atoms with D < 0.6. It should be noted that for the latter amide i i,9 good agreement exists, as the Cα carbons exhibiting the highest D hydrogens a large variety of distance-based atom depths is derived. i,r values correspond to the shortest dpx s and dnw s. Conversely, for Furthermore, any overlapping of the experimentally derived para- i i the Cα carbons having the longest dpx and dnw values, D s close meters observed at the closest surface distances is largely resolved, i i i,r to 0 are found. It is also evident that a higher detail in describing while the slow exchange rates measured for Ala10, Phe34 and Leu83 atom depth, particularly for those atoms which are located near to the amide groups are more consistent with the corresponding D s. i,9 2858 3D Computation of atom depth (a) (b) (c) (d) (e) (f) 13 1 Fig. 4. Correlations of TEMPOL induced paramagnetic attenuations, A , of HEWL C- H HSQC NMR signals of Cα carbons with (a) D ,(b) dpx and (c) i i,9 i dnw . 
Correlations between Kex of HEWL backbone amide protons and (d) D ,(e) dpx and (f ) dnw . Identical dpx and dnw values are exhibited by the i i i,9 i i i i most exposed atoms: the latter atoms are highlighted with triangles in the D plot. Labels refer to amino acid residue positions in the protein sequence. i,9 A linear or higher level dependence of D s on HEWL K s and factors reported in the crystal structure of HEWL with PDB (Berman i,9 ex A s cannot be delineated, as fast H/D exchange rates were measured et al., 2000) ID code 4lzt, it is apparent that local structural flexibility for the deeply inserted Thr40 and Cys94 amide hydrogens, in spite is not responsible for the limited correlation between atom depth and of their close proximity to a concave surface. Moreover, a small A accessibility dependent experimental data. value is exhibited by the surface exposed Asp48 Cα carbon, while for It can be concluded that the use of SADIC algorithm might favor the buried Ile98 a strong paramagnetic attenuation is observed. These improved depth-oriented discussions on experimental data, possibly four cases represent the most evident discrepancies, but many other enhancing our understanding of structure stability and dynamics of anomalous behaviors of K s and A s versus both volume-based and complex molecules. ex i distance-based depth can be seen in the data shown in Figure 4. In this respect, it should be stressed that both experimental parameters ACKNOWLEDGEMENTS depend on atom depth only at a first approximation and that a more detailed discussion of exchange rates and paramagnetic perturba- Thanks are due to grants from the Italian Ministry of University tions in terms of atom depth would be needed. The H/D exchange PRIN03-059395 and from the University of Siena (PAR 2002). 
process, determined by the dynamics of the hydrogen bond network Special thanks are also due to Francesco Niccolai for technical within the protein and its hydration shell, is commonly related to assistance. solvent accessibility. The fact that SADIC outputs are more con- sistent with K s than atom depths obtained from distance-based ex REFERENCES calculations, suggests that a step forward in the structural interpret- ation of the latter experimental parameter might be achieved. On the Berman,H.M. et al. (2000) The Protein Data Bank. Nucleic Acids Res., 28, 235–242. Chakravarty,S. and Varadarajan,R. (1999) Residue depth: a novel parameter other hand, the weaker correlation observed between paramagnetic for the analysis of protein structure and stability. Structure Fold. Des., 7, perturbations induced by TEMPOL and atom depths confirms that 723–732. complex dynamics control the interaction of protein surfaces with Gerstein,M. et al. (1995) The volume of atoms on the protein surface: calculated from paramagnetic probes (Niccolai et al., 2003). On the basis of the B simulation, using Voronoi polyhedra. J. Mol. Biol., 249, 955–966. 2859 D.Varrazzo et al. Gutteridge,A. et al. (2003) Using a neural network and spatial clustering to predict the Pintar,A. et al. (2003b) DPX: for the analysis of the protein core. Bioinformatics, 19, location of active sites in enzymes. J. Mol. Biol., 330, 719–734. 313–314. Innis,C.A. et al. (2004) Prediction of functional sites in proteins using conserved Quillin,M.L. and Matthews,B.W. (2000) Accurate calculation of the density of proteins. functional group analysis. J. Mol. Biol., 337, 1053–1068. Acta Crystallogr. D Biol. Crystallogr., 56, 791–794. Koradi,R. et al. (1996) MOLMOL: a program for display and analysis of macromolecular Richards,F.M. (1977) Areas, volumes, packing and protein structure. Annu. Rev. Biophys. structures. J. Mol. Graph., 14, 51–55, 29–32. Bioeng., 6, 151–176. Lee,B. and Richards,F.M. 
(1971) The interpretation of protein structures: estimation of Richmond,T.J. (1984) Solvent accessible surface area and excluded volume in proteins. static accessibility. J. Mol. Biol., 55, 379–400. Analytical equations for overlapping spheres and implications for the hydrophobic Lo Conte,L. et al. (1999) The atomic structure of protein–protein recognition sites. effect. J. Mol. Biol., 178, 63–89. J. Mol. Biol., 285, 2177–2198. Roder,H. et al. (1985) Individual amide proton exchange rates in thermally unfolded Miranker,A. et al. (1996) Investigation of protein folding by mass spectrometry. basic pancreatic trypsin inhibitor. Biochemistry, 24, 7407–7411. FASEB J., 10, 93–101. Serrano,L. et al. (1992) The folding of an enzyme. II. Substructure of barnase and Niccolai,N. et al. (2001) NMR studies of protein surface accessibility. J. Biol. Chem., the contribution of different interactions to protein stability. J. Mol. Biol., 224, 276, 42455–42461. 783–804. Niccolai,N. et al. (2003) NMR studies of protein hydration and TEMPOL accessibility. Shulman-Peleg,A. et al. (2004) Recognition of functional sites in protein structures. J. Mol. Biol., 332, 437–447. J. Mol. Biol., 339, 607–633. Pedersen,T.G. et al. (1993) Determination of the rate constants k1 and k2 of the Totrov,M. and Abagyan,R. (1996) The contour-buildup algorithm to calculate the Linderstrom–Lang model for protein amide hydrogen exchange. A study of the analytical molecular surface. J. Struct. Biol., 116, 138–143. individual amides in hen egg-white lysozyme. J. Mol. Biol., 230, 651–660. Tsuchiya,Y. et al. (2004) Structure-based prediction of DNA-binding sites on proteins Pintacuda,G. and Otting,G. (2002) Identification of protein surfaces by NMR measure- using the empirical preference of electrostatic potential and the shape of molecular ments with a paramagnetic Gd(III) chelate. J. Am. Chem. Soc., 124, 372–373. surfaces. Proteins, 55, 885–894. Pintar,A. et al. 
(2003a) Atom depth as a descriptor of the protein interior. Biophys. J., Zhang,X. and Brian,B.W. (1995) EdPDB: a multifunctional tool for protein structure 84, 2553–2561. analysis. J. Appl. Cryst., 28, 624–630.
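
The point-sampling estimate of the exposed volume described in the Algorithm section, and the depth index of Equation (1), can be illustrated with a minimal Python sketch. It is not the SADIC implementation: a regular Cartesian grid stands in for the sampling pattern given in the Supplementary information, and the function names `sphere_grid`, `exposed_volume` and `depth_index`, the grid step, and the representation of atoms as `(x, y, z, radius)` tuples are all assumptions introduced here for illustration only.

```python
import math
from itertools import product


def sphere_grid(center, r, step):
    """Yield grid points within distance r of center.

    Each point represents one cubic volume unit of size step**3
    (an assumed pattern, not the one used by SADIC).
    """
    n = int(math.ceil(r / step))
    cx, cy, cz = center
    for i, j, k in product(range(-n, n + 1), repeat=3):
        dx, dy, dz = i * step, j * step, k * step
        if dx * dx + dy * dy + dz * dz <= r * r:
            yield (cx + dx, cy + dy, cz + dz)


def is_inside(point, atoms):
    """True if the point lies inside the sphere of at least one atom."""
    px, py, pz = point
    return any(
        (px - x) ** 2 + (py - y) ** 2 + (pz - z) ** 2 < rad * rad
        for x, y, z, rad in atoms
    )


def exposed_volume(center, r, atoms, step=0.5):
    """Sum the volume units whose representative point is outside the model."""
    outside = sum(
        1 for p in sphere_grid(center, r, step) if not is_inside(p, atoms)
    )
    return outside * step ** 3


def depth_index(i, atoms, r, step=0.5):
    """Depth index of Equation (1): D = 2 * V_i / V_0 for atom i."""
    x, y, z, _ = atoms[i]
    v_i = exposed_volume((x, y, z), r, atoms, step)
    # V_0: same sphere, but with atom i treated as an isolated atom.
    v_0 = exposed_volume((x, y, z), r, [atoms[i]], step)
    if v_0 == 0:
        raise ValueError("sampling radius r must exceed the atom radius")
    return 2.0 * v_i / v_0


# Hypothetical two-atom model; coordinates and radii are illustrative only.
atoms = [(0.0, 0.0, 0.0, 1.7), (3.0, 0.0, 0.0, 1.7)]
d = depth_index(0, atoms, r=5.0)
```

With this sketch an isolated atom yields D = 2, since the two exposed volumes coincide, and an atom whose whole sampling sphere lies inside the model yields D = 0, in line with the limiting values discussed in the Results and Discussion section.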

Journal

Bioinformatics, Oxford University Press

Published: Apr 12, 2005
