Get 20M+ Full-Text Papers For Less Than $1.50/day. Start a 7-Day Trial for You or Your Team.

Learn More →

Long-timescale dynamics of the Drew–Dickerson dodecamer

Long-timescale dynamics of the Drew–Dickerson dodecamer 4052–4066 Nucleic Acids Research, 2016, Vol. 44, No. 9 Published online 15 April 2016 doi: 10.1093/nar/gkw264 Long-timescale dynamics of the Drew–Dickerson dodecamer 1,2 1,2,3 1,2 4 4,5 Pablo D. Dans , Linda Danilane ¯ , Ivan Ivani ,Toma ´sD ˇ rsata ˇ , Filip Lankas ˇ , 1,2 1,2 1,2 1,2 Adam Hospital ,Jur ¨ gen Walther , Ricard Illa Pujagut , Federica Battistini , Josep 6 7 1,2,6,* Lluis Gelp´ ı , Richard Lavery and Modesto Orozco Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac 10-12, 08028 Barcelona, Spain, Joint BSC-IRB Research Program in Computational Biology, Baldiri Reixac 10-12, 08028 Barcelona, Spain, School of Chemistry, University of East Anglia (UEA), Norwich Research Park, Norwich NR4 7TJ, UK, Institute of Organic Chemistry and Biochemistry, Academy of Sciences of the Czech Republic, Flemingovo nam ´ 2, 166 10 Prague, Czech Republic, Laboratory of Informatics and Chemistry, University of Chemistry and Technology Prague, Technicka´ 5, 166 28 Prague, Czech Republic, Department of Biochemistry and Molecular Biology, University of Barcelona, 08028 Barcelona, Spain and Bases Moleculaires ´ et Structurales des Systemes ` Infectieux, UniversiteL ´ yonI/CNRS UMR 5086, IBCP, 7 Passage du Vercors, Lyon 69367, France Received December 29, 2015; Revised March 25, 2016; Accepted March 31, 2016 ABSTRACT to due to the presence of ligands or of changes in the envi- ronment (1). Clearly, DNA structure should be explained in We present a systematic study of the long-timescale terms of conformational ensembles rather than in terms of dynamics of the Drew–Dickerson dodecamer (DDD: individual structures. d(CGCGAATTGCGC) ) a prototypical B-DNA duplex. Recently experimental techniques (2–5) are providing in- Using our newly parameterized PARMBSC1 force valuable information on DNA dynamics, however most of field, we describe the conformational landscape of what we know about the sequence-dependent flexibility of DNA comes from atomistic molecular dynamics (MD) sim- DDD in a variety of ionic environments from minimal + − + − ulations (6–10). As computer power increases and the reli- salt to 2 M Na Cl or K Cl . The sensitivity of the ability of force fields improve, more reliable information is simulations to the use of different solvent and ion derived from atomistic MD simulations (11). Such simula- models is analyzed in detail using multi-microsecond tions have revealed the extent, and the complexity, of DNA simulations. Finally, an extended (10 s) simulation movements and their tight coupling to the nature and dy- is used to characterize slow and infrequent confor- namics of the environment (8,12–14). Unfortunately, MD mational changes in DDD, leading to the identifica- simulations are extremely dependent on the quality of the tion of previously uncharacterized conformational force field ( 11,15–17) and, as simulations become longer, states of this duplex which can explain biologically errors induced by force fields accumulate, generating erro- relevant conformational transitions. With a total of neous patterns of flexibility ( 18–20). Continuous refinement more than 43 s of unrestrained molecular dynamics of the force field is therefore required in order to profit from simulation, this study is the most extensive investi- computational improvements and to gain better insight into the structure and dynamics of DNA. With this aim in mind, gation of the dynamics of the most prototypical DNA we have recently developed the PARMBSC1 force field, a duplex. new functional with an excellent ability to describe a vari- ety of DNA structures on the microsecond timescale (20). INTRODUCTION Here we use PARMBSC1 (20) to make a detailed explo- The static picture of DNA derived from the early X-Ray ration the dynamics of the best known fragment of DNA: studies is now challenged by a myriad of experimental and the Drew–Dickerson dodecamer (DDD, (21)). DDD is an theoretical studies which show DNA to be a highly flexi- ideal model system: (i) it contains a biologically relevant se- ble entity, undergoing many conformational alterations and quence that fits well into the canonical B-form of DNA, (ii) even modifications of its covalent structure. Simple inspec- it has been extensively studied experimentally (135 struc- tion of the Protein Data Bank (PDB) illustrates how differ- tures with the DDD sequence are available in the PDB, ent sequences adopt different conformations, but also how some of them solved at very high resolution) and (iii) it identical sequences can be found in different conformations To whom correspondence should be addressed.Tel: +34 93 4037156; Fax: +34 93 4037156; Email: [email protected] C The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact [email protected] Nucleic Acids Research, 2016, Vol. 44, No. 9 4053 has also been widely studied by means of nanosecond-to- mostat was also used in combination with PARMBSC1, microsecond MD simulations (7,22,23). In summary, DDD giving results without perceptible differences (20) (data not is the best-known model system of DNA, and its analysis shown). Center of mass motion was removed every 10 ps to is likely to produce results that can be extrapolated to any limit build up of the translational kinetic energy of the so- canonical B-DNA. lute. SHAKE (34) was used to keep all bonds involving hy- In a first step, we evaluate the impact on DNA of a wide drogen at their equilibrium values, which allowed us to use a variety of solvent and ion models. In a second step, we an- 2 fs step for the integration of Newton equations of motion. alyze the impact that changes in ionic strength can have on Long-range electrostatic interactions were accounted for by the collected conformational samples. Finally, we explore in using the Particle Mesh Ewald method (35) with standard detail the long-timescale dynamics of DNA by using many defaults and a real-space cutoff of 10 A. The PARMBSC1 multi-microsecond trajectories and one extended (10 s) force field ( 20) was used to represent DNA interactions. All single trajectory. Our study reveals that the main confor- simulations were carried out using the PMEMD CUDA mational characteristics of DNA are quite insensitive to the code module (36)ofAMBER 14 (37). nature of the models used to describe the solvent and ion en- + − vironment. Changes due to the nature of the salt (Na Cl Analysis + − or K Cl ), or to the ionic strength, are also quite mod- est. PARMBSC1 provides very stable conformational sam- During production runs, data was typically collected every plings, which agree well with experimental information on 1 ps, which allowed us to study infrequent, but fast move- this duplex, but also highlight unusual anharmonic defor- ments. All the trajectories were pre-processed with the CPP- mations of DNA that can explain some biologically rele- TRAJ (38) module of the AMBERTOOLS 15 package (37), vant transitions. Overall, in the framework of fixed-charge the NAFlex server (39) and tools developed in the group all-atom force fields, we present here the broadest and con- (http://mmb.irbbarcelona.org/www/tools). DNA helical pa- ceivably the most accurate study of the multi-microsecond rameters and backbone torsion angles associated with the timescale dynamics of duplex B-DNA to date. each base pair (bp) and base pair step (bps) were measured with the CURVES+ and CANAL programs (40). The sub- states of the torsion angles of the backbone (, ,  and ) MATERIALS AND METHODS were categorized following the standard definition: gauche System set-up positive (g+) = 60 ± 40 degrees; trans (t) = 180 ± 40 de- grees; and gauche negative (g−) = 300 ± 40 degrees. For Starting geometries for all systems used either Arnott-B the analysis of the vast majority of helical parameters we DNA canonical values (24) or the high resolution X-Ray took advantage of the palindromic nature of the DDD se- structure of DDD with PDB ID: 1JGR (25). The systems quence considering both strands independently, or as an av- were then solvated by adding TIP3P or SPC/Ewatersto erage between the Watson and Crick strands. For compar- a truncated octahedral box and neutralized by adding 22 + + ison with the data available in the experimental databases, Na or K . Both Smith-Dang (S&D; (26)) and Joung- which were obtained in different environments, we built a Cheatham (J&C; (27)) ion models were considered and ex- + − + − single theoretical conformational space containing almost tra salt (Na Cl or K Cl ) was added up to a chosen ionic 40 million structures taken from all the independent trajec- strength (0.15, 0.5 and 2.0 M added salt). Counterions were ˚ tories and constituting an aggregated simulation time of 43 initially placed randomly, at a minimum distance of 5 A ˚ s. from the solute and 3.5 A from one another. For simula- tions involving potassium, two extra ion parameterizations Experimental structures of the Drew–Dickerson docecamer. were tested: Jensen-Jorgensen (J&J; (28)) and Beglov-Roux The experimental conformational space of DDD was de- (B&R; (29)). All the systems were energy minimized, ther- fined as a set of experimental structures in the PDB with the malized and pre-equilibrated using our standard multi-step sequence: d(CpGpCpGpApApTpTpCpGpCpG) ; see Sup- protocol (22,30) followed by 50 ns of equilibration. All the plementary Table S1 for a detailed list. The final ensemble systems were then simulated (production runs) on the mi- contained structures of DDD either isolated or in complex crosecond timescale (see Table 1). with small organic compounds (Supplementary Table S1). Both ligands and those sequences containing non-canonical Simulation details covalent modifications were removed. After this selection All systems were simulated in the isothermal-isobaric en- procedure, the remaining 93 structures were analyzed with semble (P = 1atm; T = 298 K) using the Berendsen al- CURVES+ (40) and used as a reference conformational en- gorithm (31) to control the temperature and the pressure, semble. with a coupling constant of 5 ps. Although this has been the standard protocol adopted by the ABC consortium (8), and Solution X-ray scattering profiles from MD simulations. many others, for the simulation of short B-DNA sequences, We computed SAXS/WAXS spectra from MD, with readers should be aware that the Berendsen thermostat may PARMBSC1, by taking 1000 structures from the last mi- + − produce a non-uniform temperature distribution. While this crosecond of the simulation with 0.15 M Na Cl in TIP3P was demonstrated for proteins (32), the compact structure water, and generating 100 spectra, each being the average of of the DDD dodecamer and the weak coupling of the ther- 10 snapshots. The conditions in that simulation are the clos- mostat (5 ps) are likely to minimize such effects in our sim- est to the experimental ones, obtained in 0.10 M of added + − ulations. In a previous work, the Nose–Hoo ´ ver (33)ther- Na Cl plus 0.05 M of Tris·HCl (41). With an estimated 4054 Nucleic Acids Research, 2016, Vol. 44, No. 9 Table 1. Overall simulation information for the systems studied Initial Solvent Number of Number of System name structure model waters Ion type Ion model ions Total time Sample PARMBSC1 J&C Na neutral fiber SPC /E 4968 Na+ J&C 22 2 s1ps S&D Na neutral fiber SPC /E 4968 Na+ S&D 22 2 s1ps S&D Na neutral fiber TIP3P 4998 Na+ S&D 22 2 /10 s1/20 ps J&C Na neutral fiber TIP3P 4970 Na+ J&C 22 2 s1ps S&D NaCl 0.15M fiber TIP3P 5324 Na+ /Cl− S&D 36/14 2 s1ps S&D NaCl 0.5M fiber TIP3P 5118 Na+ /Cl− S&D 64/42 3 s1ps S&D NaCl 2.0M fiber TIP3P 5095 Na+ /Cl− S&D 162/140 2 s1ps J&C neutral 1JGR SPC/E 5037 K+ J&C 22 3 s1ps S&D neutral 1JGR SPC/E 5187 K+ S&D 22 3 s1ps S&D neutral TIP 1JGR TIP3P 5187 K+ S&D 22 2 s1ps S&D 0.15M 1JGR SPC/E 5159 K+/Cl− S&D 36/14 5 s1ps S&D 0.5M 1JGR SPC/E 5118 K+/Cl− S&D 64/42 3 s1ps S&D 2.0M 1JGR SPC/E 5095 K+/Cl− S&D 162/140 2 s1ps J&J neutral 1JGR SPC/E 8609 K+ J&J 22 1 s1ps B&R neutral 1JGR SPC/E 4993 K+ B&R 22 1 s1ps PARMBSC0 S&D neutral TIP 1BNA TIP3P 4998 Na+ S&D 22 4 s1ps S&D 0.15M fiber SPCE 5044 K+ /Cl− S&D 36/14 2.4 s10ps J&C 0.15M fiber SPCE 5046 K+ /Cl− J&C 36/14 0.6 s10ps S&D NaCl 0.15M fiber SPCE 5044 Na+ /Cl− S&D 36/14 0.6 s10ps J&C NaCl 0.15M fiber SPCE 5049 Na+ /Cl− J&C 36/14 0.6 s10ps resolution of 2 A, this experimental WAXS spectrum is, as units of molarity as detailed elsewhere (45). Special atten- far as we know, the most accurate available for the DDD se- tion was paid to the convergence of the ion population both quence (41). To measure the intensities we used the method inside and outside the DNA major and minor grooves for developed by Park et al. (42), which was implemented in each bps as previously described (47). AmberTools by Case group. Conceptually, X-ray scattering compares the scattering intensity from the sample of inter- Similarities, global and local flexibility in Cartesian and He- est, in this case the full solvated DNA, to a ‘blank’ with just lical spaces. Deformation modes were determined from solvent present, and reports the difference, or ‘excess’ inten- a principal component analysis of the collected simu- sity. Consequently, we simulated a water box with 0.15 M lations using PCASUITE (http://mmb.pcb.ub.es/software/ + − of added Na Cl (50 ns of production run), with the same pcasuite/pcasuite.html). DNA entropy values were ob- settings mentioned above, and used it as the ‘blank’ sample. tained from trajectories using the Schlitter (48) and the Only waters and ions within 10 A distance from the nearest Andricioaei–Karplus (49) methods for all heavy atoms (ex- DNA atom were considered to build the spectra, and hy- cluding terminal base-pairs). Similarity indices were calcu- drogen atoms from the DNA were explicitly considered. In lated using Hess metrics (50), and energy-weighted similar- addition, we used the recent RISM model (43) to compute ities (9). Eigenvalues (in A ) were computed by diagonal- the WAXS spectra of the experimental structures 1BNA (X- izing the covariance matrix and were ordered according to ray), 1GIP (nuclear magnetic resonance; NMR) and the av- their contribution to the total variance. Self-similarities of erage structure from the MD. 1GIP is known to be the ex- the first 10 eigenvalues were computed by comparing the perimental structure that best matches the experimental so- first and second halves of a given trajectory. Relative sim- lution scattering profile ( 44). The distribution function of ilarities were computed as described in our previous work waters and ions computed with RISM also considered a (6,51,52). Stiffness constants were determined using base + − TIP3P solution with 0.15 of added Na Cl . pair step helical stiffness matrices, and base-resolution stiff- ness matrices, always obtained from the inversion of co- variance matrices derived from the atomistic simulations Analysis of the cations. The new CANION module from (10). Persistence lengths (in nm) were obtained according CURVES+ (45–47) was used to determine the position of to (53), considering all possible DNA sub-fragments. The each cation in curvilinear helicoidal coordinates for each sequence used to calculate the persistence lengths was arti- snapshot of the simulations with respect to the instanta- ficially extended by taking the inner 8 bp of the DDD se- neous helical axis. Given a distance D along the helical axis, quence and multiplying this segment by 20 to create a 160 ion distributions were computed for each bps (defined here bp oligomer: (CGAATTCG) . The calculations were ex- as N-0.2 ≤ D ≤ N+1.2 for a generic bps N pN ) inside 20 i i+1 ˚ ecuted on ensembles of 10 structures generated by an in- the grooves (distance from the axis R ≤ 10.25 A), divid- house implementation of Olson’s Monte Carlo procedure ing the contribution between the minor groove (A = 33– ◦ ◦ ◦ (54). As discussed elsewhere (53,54), persistence length is a 147 ) and the major groove (polar angle A = 33 to 0 to macroscopic descriptor of the polymeric flexibility of du- 147 ), as shown in Supplementary Figure S1. We analyzed plex DNA that can be compared with experimental mea- the ion distribution in one-dimensional (R, D or A) and sures. two-dimensional (RA, DA, DR) curvilinear helicoidal co- ordinates. Three-dimensional distributions were also recon- structed in Cartesian coordinates using an average structure Sampling of extreme cases and anharmonic distortions. for the DNA oligomers obtained with CPPTRAJ (38)from Certain fluctuations in the global structure of the DNA can- the full-length simulations. Ion densities were obtained in not be reasonably explained in the harmonic regime. Oth- Nucleic Acids Research, 2016, Vol. 44, No. 9 4055 ers, while harmonic, represent extreme cases found at the dimensions and torsion angles sampled in the simulations margins of the distribution of sampled conformations. The in all cases matched the values expected from experimental latter were detected by looking at the distribution of defor- structures. A more detailed comparison with a large num- mation energies, calculated for a reduced set of DNA con- ber of high resolution X-ray and solution NMR structures formations taken from each MD simulation (one structure is carried out below. every 2 ns), with respect to a reference state defined by the As already suggested by previous studies (7), the impact MD-derived basepair step stiffness matrix (6)and theaver- of using two different solvent models (SPC/E and TIP3P) is age DNA conformation. We approximate the distribution negligible in terms of the structural properties of DNA. We of deformation energies to a normal distribution, and con- analyzed in detail the six helical base pair step parameters sider extreme deformations to be those structures with en- along the DDD sequence (Figure 1), and the complete set ergies above either two or three standard deviations from of 16 helical parameters and 8 torsion angles (see Supple- + − + − the average. Anharmonicity was evaluated by applying the mentary Table S3 (Na Cl ) and S4 (K Cl )). The impact Shapiro–Wilk test (55), since none of the reduced sets of de- of changing the ion force-field parameters also led to very formation energies obtained for each trajectory had more small differences in terms of the global properties of DNA than 5000 values. Furthermore, the complete ensemble of (see Figure 1 and supplementary Tables S3 and 4), in good deformation energies (combining all the trajectories) was agreement with previous PARMBSC0 (19) results (7,63). + − + − analyzed graphically by means of Q-Q and box plots, char- The use of Na Cl or K Cl has also little impact on the acterizing the structures sampled by the force field beyond DNA structure, again in good agreement with previous sim- the harmonic approximation. In these analyses the terminal ulations (46,63). Finally, increasing the ionic strength (here base pairs were not considered. we tested ∼0.15, ∼0.5 and ∼2 M), also seemed to produce little effect (see Figure 1) on the average structure of DNA (for local effects linked to ionic strength see the discussion Statistics, graphics and molecular plots below). The statistical analysis, including the Bayesian Information As the MD conformational ensembles appeared to be Criterion (BIC) and linear correlations, as well as associated quite robust to changes in the solvent or ionic atmosphere, graphics, were obtained with the R 3.0.1 statistical package we combined all the trajectories to create a 43 smeta- (56) and the ggplot2 library (57). The molecular plots were trajectory from which a ‘theoretical’ conformational space generated using either VMD 1.9 (58), or the UCSF Chimera of B-DNA can be defined and compared with that derived package version 1.8.1 (59). from experimental structures (see ‘Materials and Meth- ods’ section and Supplementary Table S1), which were RESULTS AND DISCUSSION also solved in different environments (93 structures, which thanks to the palindromic nature of DDD sequence provide Environmental impacts on DNA 186 experimental estimates of helical parameters for each MD trajectories are dependent on the solvent and ion envi- type of base pair step in the sequence: CpG, GpC, GpA, ronment in two different ways: one legitimate, since changes ApA, ApT and TpT). As shown in Figure 2,noexperi- in solvent, ionic strength or the nature of the ions should mental conformation lies outside the ‘theoretical’ confor- impact simulations, sometimes in a dramatic way (13,60), mational space derived from our simulations. Furthermore, and one illegitimate, linked to the uncertainties in the force experimentally observed conformations lie in the regions of fields used to represent water or ions. Before analyzing the higher density in the MD-derived sampling (see Figure 2 for a comparison of the rotational space, and Supplementary detailed dynamics of DNA, it is therefore necessary to eval- Figure S2 for the translational space). DDD actually cov- uate the uncertainties introduced into simulations by the ion and/or solvent models used. For water, we considered ers quite a wide conformational space, as reflected by the the two most popular three-point models: TIP3P (61)and variety of experimental structures, well reproduced by the + − + − SPC/E(62), while for salt (Na Cl or K Cl ) we con- MD simulations. sidered models by Joung-Cheatham (J&C), Smith-Dang Although X-ray and NMR derived structures have been (S&D), Jensen-Jorgensen (J&J) and Beglov-Roux (B&R) considered for the last 20 years the gold-standard for force + − (the last two only for K Cl , see ‘Materials and Methods’ field comparison ( 64), they both suffer from some limita- section). All trajectories involved at least 2–3 s of simula- tions when it comes to represent the structure of the DNA in tion and were extended up to 5 s when ion convergence solution (65). Crystal packing and crystallization artifacts was not clear. in X-ray techniques, or user-biased integration of the peaks Consistent with the general article describing the parame- and refinements based on all-atom force fields in NMR ex- terization and validation of the PARMBSC1 force field ( 20), periments, are just some of the known limitations. To com- all the simulations yielded average RMSDs in the range plement this data, Small-Angle X-ray Scattering (SAXS) 1.9–2.3 A with respect to Arnott B-DNA values (1.6–2.0 and Wide-Angle X-ray Scattering (WAXS) are able to de- ˚ ˚ A if terminal base pairs were excluded), ∼2.2 Awithre- liver information about the shape and size of the molecules spect to the X-Ray structures (PDB IDs: 1BNA, 2BNA, in solution. At high resolution, structural polymorphism 7BNA and 9BNA) and ∼1.8 A with respect to the ensemble such as the B-DNA/B’-DNA can be detected (41), although of NMR structures in the PDB ID: 1NAJ. About 91–98% the spatial averaging carried out to derive the profiles also of the Watson–Crick hydrogen bonds were maintained dur- leads to a loss of information in SAXS/WAXS compared ing the multi-microsecond simulations explored in this work to crystallography. To complement our findings, we com- (see Supplementary Table S2). Helical parameters, groove pared our simulations to the high-resolution (2 A) WAXS 4056 Nucleic Acids Research, 2016, Vol. 44, No. 9 Figure 1. Averaged base pair step helical parameters along sequence for all the simulation performed with PARMBSC1. See Table 1 for a detailed de- scription of the simulated systems. Translational parameters (shift, slide and rise) are reported in A, and rotational ones (tilt, roll and twist) in degrees. The terminal base pairs were removed from the analysis. spectrum obtained for DDD by Zuo and Tiede (41). Nev- forming very well. Further improvements are likely to re- ertheless, the reader should be aware that comparisons be- quire the inclusion of new factors such as polarization. tween theoretically-derived and experimental spectra have to be made with caution, due to the problems in generat- Similarities, global and local flexibility ing profiles from structural models (specially related to the different way to treat the solvent; see ‘Materials and Meth- Processing of the covariance matrix obtained from atom- ods’ section), the different conditions in the simulation and istic MD simulations provides a direct measure of DNA experiments, and the lack of definition in certain regions flexibility in Cartesian space, which can be described in of the spectra. With these cautions in mind, it seems clear terms of essential deformation movements and quasi- from Figure 3 that PARMBSC1 is able to overcome some harmonic entropies (see ‘Materials and Methods’ section of the deviations from experiment described previously us- and reference (51)). These estimates cannot be directly com- ing PARMBSC0 (60), and provide spectra that fit well the pared with experimental observables, but are very useful experimental ones. Quantitative comparison of peak loca- in determining the similarity between deformation patterns tion reveals that PARMBSC1 recovers the first peak (P1) and stiffnesses derived from different force fields ( 6). As −1 near q ≈ 0.45 A , which was reported to be absent in shown in Supplementary Figure S3, the dynamics of the PARMBSC0 simulation (65). The major deviation from ex- central 10 bp of DDD obtained with PARMBSC1 have periment was found at wide angles (P5), where PARMBSC1 a very high (>75%) similarity with respect to a reference is slightly shifted respect to the experimental value, but simulation using PARMBSC0. This similarity increases to where the resolution of both theory (43) and experiment is more than 90% if Boltzmann indices are considered (Sup- also lower and peak location is not so clear. It is worth not- plementary Figure S3D). We can conclude that the nature ing that in general PARMBSC1 fits the experimental pro- and the magnitude of the principal deformations of DNA files with a quality similar or better than the best experi- are very similar with both force fields. This is confirmed by mentally derived structures (by NMR or X-Ray), even in an analysis of the Cartesian entropies and the stiffnesses the most complicated region (P1–P3) that reflects the struc- associated with the main deformation movements (Supple- ture of the sugar-phosphate backbone (see Supplementary mentary Table S6). Helical stiffness analysis provides an Table S5). alternative picture of DNA dynamics by considering lo- In summary, PARMBSC1 trajectories reproduce experi- cal perturbations of helical parameters (10). Results at the mental observables accurately and seem robust with respect base pair level (Supplementary Figure S4) and at the base to the (somewhat arbitrary) selection of ion and solvent pair step level (Supplementary Figure S5) show the expected force fields. In terms of general DNA structure, the trajecto- sequence dependence (6) and confirm the similarity be- ries are also quite insensitive to ionic strength (over a ‘phys- tween PARMBSC0 and PARMBSC1 results. Finally, poly- + − iological’ range) and to the nature of the salt (Na Cl or meric MD-stiffnesses derived from the extension of DDD + − K Cl ). These results suggest that globally, despite the use to a very long duplex (see ‘Materials and Methods’ section) of simple additive potentials (66,67) PARMBSC1 is per- demonstrate that PARMBSC1 also reproduces the persis- Nucleic Acids Research, 2016, Vol. 44, No. 9 4057 Figure 2. Comparison at the bps level between the theoretical and experimental rotational spaces. Rotational parameters (tilt, roll and twist) are reported in degrees. All distinct bps found in DDD are shown (removing the ends): GC (first column), CG (second column), GA (third column), AA (fourth column) and AT (fifth column). Smoothed 2D densities, estimated by fitting the observed distributions to a bivariated normal kernel (evaluated on a square grid of 90 × 90 bins), are depicted by coloring the points coming from the MD simulations with a color gradient from low (blue) to high (red) density. Four iso-density curves are shown in white, and are quantified on the bottom right side of each plot. Experimental conformations are shown as black dots (supplementary Table S1). tence length of duplex DNA with values ranging from 48 to clearly showing that PARMBSC1 simulations significantly 57 nm depending on simulation conditions (Supplementary sample extreme conformations, more frequently than ex- Table S6), compared to experimental estimates of around 50 pected from the harmonic regime. Furthermore, we applied nm (68). the Shapiro–Wilk test to check whether the sets of distor- To further investigate the capacity of MD to sample ex- tion energies could be drawn from a normally distributed treme conformations and also structural distortions beyond population. For all the trajectories, we rejected the null hy- the harmonic regime, we fitted the deformation energies (see pothesis with P < 0.05, supporting the deviation from nor- ‘Materials and Methods’ section) with a normal distribu- mality. We also analyzed the complete space of deforma- −1 tion obtaining an average () of 1.8 kcal mol with a stan- tion energies (combining the results obtained separately for dard deviation () of 0.4 for the meta-trajectory (note that each trajectory), using a graphical approximation. The dis- we obtained the same  and  for the single 10 s trajec- tribution, Q-Q plot, and boxplot presented in Supplemen- tory). If the movements of the DNA could indeed be de- tary Figure S6, clearly supports the presence of anharmonic scribed by the harmonic regime, one would expect that the distortions related with highly bent structures. tails of the distribution beyond  ± 2 would account for We saw above that solvent and ion force-field models, in- + − + − 4.56% of the total probability distribution. Following the cluding both Na Cl or K Cl , had little impact on the same reasoning, the probability that a normal deviate would global structure or flexibility of DDD. This may seem rea- lie beyond  ± 3 is at most 0.27%. Counting the num- sonable given the ‘physiological’ conditions range used in ber of times these extreme regions are sampled in the 10 s this work (see Noy and Golestanian (69)). However, while long trajectory led to 5.64 and 1.32% beyond the  ± 2 simulations carried out with minimal (neutralizing) ionic and  ± 3 limits respectively. Using the complete meta- strength seem to lead to shorter persistence lengths (Supple- trajectory that describes 43 s of DDD in different envi- mentary Table S6), there is no systematic trend relating flex- ronments we obtained 8.91% ( ± 2) and 2.41% ( ± 3), ibility and ionic strength. Note that the use of the AMBER 4058 Nucleic Acids Research, 2016, Vol. 44, No. 9 solvent or ionic atmosphere. We investigated this possibility in more detail by looking at the interactions of DNA with ions. The first point that becomes evident when looking at the trajectories is that while DNA structure is reasonably well converged in several hundred ns (46), the ionic envi- ronment may require significantly more time to converge, as suggested from earlier simulations (47). This is indeed what the analysis of ion population at the base pair step level (see ‘Materials and Methods’ section) shows (see Sup- + − plementary Figures S7–10 for the analysis of K Cl . Simi- + − lar profiles were obtained for Na Cl ). It is also clear that the convergence of the ion distribution depends on the ion force field (Supplementary Figures S7 and 8) and also on the region of DNA that is analyzed. As an example, ions rep- resented with the J&J model converge quickly (200 ns) in- side the grooves and around the DNA, whereas the J&C ion atmosphere is not fully converged in the 3 s studied here (Supplementary Figures S7 and 8). It is also clear that con- vergence is in general faster in external regions (around the phosphates, Supplementary Figures S7 and 9), than within the grooves and notably within the narrow minor groove where saw tooth-like curves can be observed (Supplemen- tary Figures S8 and 10). In these cases, convergence is not fully guaranteed even after 5 s (see the 150 mM S&D simu- lation in Supplementary Figure S10). It is worth noting that convergence problems do not decrease when ionic strength is increased, despite the fact that more ions are available in the DNA environment, indicating that it is not a sim- ple statistical problem. Indeed, the saw tooth-like popu- lation curves of S&D ions inside the minor groove (espe- cially at the central AT step) in the minimal salt simula- tions are present, and sometimes even amplified, in simu- lations at higher ionic strength (see Supplementary Figure S10). This suggests that ions visiting some narrow regions in Figure 3. Solution scattering profiles. ( A) Solution interference patterns the grooves may be frustrated (47), trapped in an oscillatory computed with the RISM approach (43) for the DDD crystal (PDB ID: regime between two different substates. Ions with long resi- 1BNA, green), the NMR (PDB ID:1GIP, yellow) and the average structure dence times inside the grooves, could also explain part of the from the MD simulation (PARMBSC1, blue), compared to the experimen- tal profile (red). Vertical dotted lines in red represent the peaks determined oscillatory regime. Thus, using the S&D 0.15 M simulation experimentally. (B) Scattering profiles obtained from the MD simulation we compared the volume of the groove, the time evolution of with PARMBSC1 using the method from Park et al.(42) (see ‘Materials + K ions visiting the minor groove and the average residence and Methods’ section), compared to the experimental result. The positions at A T and C G , which are the two most populated bps 6 7 3 4 of the peaks are reported in Supplementary Table S5. Note that the abso- at physiological concentration (Supplementary Table S7). lute intensities were accordingly shifted to a common origin to maximize the overlap. The data to produce the experimental curve was a courtesy of The average volume of the minor groove is significantly nar- 3 3 Prof David Tiede (41). ˚ ˚ rower for AT (193 A ) than for GC (239 A ) as previously re- ported (7,22). We also found that the average residence time of K inside the minor groove was 108 ps for AT versus 50 implementation of PME to treat long-range electrostatics ps for CG (if we consider an ion to be present when it stays precludes performing simulations at very low ionic strength at least 20 consecutive ps inside the groove (46)). Indeed, (net-charged systems), where the connection between global K ions are able to remain within the AT step for several flexibility and ionic strength could become significant. The hundreds of ns (Supplementary Figure S11). During these reason is the implicit presence of a net-neutralizing plasma long periods there is a higher probability of simultaneously that appears due to the omission of the zeroeth-order term finding two cations inside the narrow groove of AT, com- in the reciprocal Ewald sum (70,71). pared to CG. Based on the visual inspection of a single ex- tended trajectory, this double occupancy seems to produce Ion atmosphere an imbalance that triggers the release of both ions from the Previous sections have shown that the global structure and groove within a few ps. This could explain the oscillatory dynamics of DNA is not dramatically dependent on the sol- ion population at the AT step, as it indeed occurs at each vent or ion force field, the nature of the monovalent cations of the sawtooth-like peak we observed in the AT time series + + (Na or K ), or the ionic strength (within the range stud- (Supplementary Figure S11). Nevertheless, a more system- ied). However, this robustness of DNA to environmental atic approach with statistical support should be undertaken conditions does not preclude local changes linked to the Nucleic Acids Research, 2016, Vol. 44, No. 9 4059 to confirm these findings. Similar events are not seen in the cations ‘fixed’ in the crystal cell, and the high concentration minor groove of CG steps (Supplementary Figure S12). at which the cations were found inside the grooves of some While remaining cautious with respect to convergence specific bps (up to two orders of magnitude higher respect problems, we can reach some general conclusions on the to the physiological background), make us think that these impact of the ion force field on ion populations around cations reached their final position in the crystal following DNA. Of the four K models tested, the Lorentz-Berthelot a clear sequence-dependent pattern. (LB) implementation of J&J is the one showing the weak- As discussed below, differences in cation population or est affinity for DNA (Figure 4 and Supplementary Figure density do not lead to significant structural or dynamic dif- S8), leading to very low ion populations inside the grooves ferences in DNA. However, a detailed analysis does show and failing to explain the regions of high cation density local changes linked to ion populations in the grooves. As an found experimentally (25,72,73) in the minor groove of the example, the J&J model which showed the weaker affinity AATT segment (note that this different behavior could be for DNA leads to wider grooves (very visible in the AATT due to the conversion from geometric to LB combination segment, see Supplementary Figure S14), while the J&C rules used to build the Lennard-Jones potential in AMBER model, with the strongest DNA affinity, leads to narrower (37,63), since these parameters were created to work with minor grooves in the central AATT segment. A clear corre- another van der Waals functional (28)). J&C is the one with lation is also observed between increasing ion concentration the strongest DNA affinity, possibly explaining its severe and the width of both DNA grooves (Supplementary Fig- convergence problems in regions with narrow grooves. Fi- ure S16). With the S&D model, the average minor groove ˚ ˚ nally, the S&D and B&R models (the two most used in width changes from 4.06 A (neutral system) to 3.88 A(2.0 −5 DNA simulations), at a first glance, give similar ion dis- M system; standard error of 2 × 10 A). The absolute dif- tributions (see Figure 4 and Supplementary Figure S8). A ference between these two extreme conditions is small in ab- more detailed picture of ion environment can be obtained solute terms, but is enough to add extra structural frustra- by looking at 3D density plots (see ‘Materials and Meth- tion to K+ ions entering the minor groove. In general, an ods’ section) such as those shown for B&R, J&C and S&D increased population of ions attached to DNA leads to an in Figure 5. increase in the local stiffness associated to the central AATT Looking at the minimal salt trajectories, differences in tract, but the differences are evident only when the ‘extreme’ ionic atmosphere are especially visible at the edges of the J&J and J&C models are compared (Supplementary Figure groove, around the phosphate groups (where only the J&C S15). In contrast to groove geometry, no noticeable changes model generates ion density when analyzing the 1.5 M iso- were found in BI/BII populations. Overall, it is clear that B- molarity surface, Supplementary Figure S13) and in the DNA is more affected by the choice of ion force fields than central minor groove region, where the J&C model tends by the bulk ion concentration (Supplementary Figures S15 to concentrate all ion density in the AT and CG steps. In and 16), but these effects remain relatively mild. contrast the J&J model predicts a low ion concentration of ions anywhere in the groove, while S&D and B&R models Long-timescale dynamics of DNA show a more homogeneous distribution of ions along the groove (Figure 4, see Supplementary Tables S7–10). Look- Results above demonstrate that, in general terms, the tra- ing at high isomolarity surfaces (5 M) shows that not all jectories obtained from MD simulations are robust with the parameterizations are able to reproduce the location respect to choices of water or ion force fields, the ionic of the K ions that co-crystallized with the DNA in the strength or the nature of the salt. We next decided to ex- high-resolution X-ray experiment (25). It could be argued tend one of our trajectories (S&D, minimal salt (22,46)) to that those co-crystallized cations reached their final loca- 10 s to explore slow conformational changes that might tion in the crystal cell due to packing or crystallization ef- be not visible in shorter simulations. The entire 10 s DDD fects. Nevertheless, the cations are found buried inside the trajectory samples canonical B-DNA conformations which grooves of the DNA in close interaction with the bases. are very close to both X-Ray (21,74–76) and NMR (77) The analysis of their precise location as been the subject of structures (Figure 6). The average helical parameters (twist, several studies, where the position of cations is correlated roll, tilt, shift, slide, rise) derived from the MD simulation with changes in the groove widths, being part of a complex (including terminal bases) is (35.2, 2.8, 0.2, 0.0, −0.2, 3.3), structured network involving cations, water and DNA in- which compares extremely well with the results derived from teractions (22,73). While J&C and S&D models correctly NMR ensembles (PDB ID: 1NAJ (35.9, 2.3, 0.0, 0.0, −0.2, predict the position of K ions in both grooves (S&D be- 3.2)) or X-Ray crystallography (PDB IDs: 1BNA, 2BNA, ing the more accurate), B&R only reproduces the cations 7BNA and 9BNA: (35.0, −0.3, −0.2, −0.1, −0.2, 3.3), con- found in the major groove, while a systematic shift of the firming the ability of PARMBSC1 to reproduce the overall density clouds is observed in the minor groove (Figure 5). conformational properties of DNA (see Table 2 for more de- When the ionic strength is increased (only studied for the tails) in simulations that are at least an order of magnitude S&D model), the general 3D distribution of K does not longer than today’s ‘state-of-the-art’. change dramatically (see Supplementary Figure S14), ex- This long trajectory is also able to capture some subtle cept for the overall increase in ion density that is particularly details, such as the lower  values sampled by guanosines visible in the major groove and at the groove edges close compared with the other nucleotides, sugar phase angle dis- to the phosphate groups where new sites are populated. tributions sampling the correct South–South East regions The general good agreement between the densities coming (Supplementary Table S11), the sequence variability of he- from the free dynamics in solution and the co-crystallized lical twist (7,51)and BI/BII populations (Supplementary 4060 Nucleic Acids Research, 2016, Vol. 44, No. 9 Figure 4. Average K populations inside and outside the DNA grooves. Populations inside the major and the minor groove (left two panels), and outside both grooves (right two panels). The populations were measured for each bps removing the terminal ones (see ‘Materials and Methods’ section). See Supplementary Figure S1 for the CHC partitioning scheme used to divide the grooves. Figure 5. Potassium distributions along the helix. Cartesian K isomolarity surfaces at 5.0 M reconstructed from the CHC histograms with respect to the average structure (shown as a silver surface). For comparison purposes, neutral systems have been overlapped with the Tl cations (red spheres) that co-crystallized with the DNA (PDB ID: 1JGR). Note that thallium cations are used as a replacement of potassium in diffraction experiments (1). The distribution with the J&J parameters are not shown since any visible density was observed at 5.0 M concentration. Nucleic Acids Research, 2016, Vol. 44, No. 9 4061 Figure 6. Descriptors of the quality of the simulation: (A) Root mean square deviation (RMSD) of all the heavy atoms of the Drew–Dickerson dodecamer (DDD) respect to the average experimental value. In blue, RMSD against an average of the X-ray structures (PDB IDs: 1BNA, 2BNA, 7BNA and 9BNA); In orange, RMSD against an average of the NMR ensemble with 5 structures (PDB ID: 1NAJ). For the sake of clarity, running averages every 20 ns are shown in dark orange and dark blue, for X-ray and NMR respectively. (B) Same than (A) but without considering the capping base pairs (i.e. removing all the heavy atoms of base pairs C1:G34 and G12:C13). (C) Evolution of the total number of Watson–Crick hydrogen bonds (Hbonds) with time. Considering a perfect interaction between the 12 bp would lead to a total of 32 Hbonds (light blue dashed line). Without considering the capping base pairs (light red), the ideal total number of Hbonds is 26 (light red dashed line). Hbonds were considered formed if the distance between the donor–acceptor atoms was ≤3.5 A. (D) Sequence averaged twist for all the base pair steps (bps), excluding the terminals, with time. The average MD value is shown with a black line, while the experimental references are shown in dark red and orange dashed lines, for X-ray and NMR respectively. Figure S17). Backbone torsions follow the expected behav- plexes (78) are also detected here (Figure 7), thus improving ior for a duplex B-DNA, preferentially exploring the canon- on PARMBSC0 behavior. As shown in Supplementary Fig- ical B-DNA substate characterize by  in gauche− (g−)and ure S18,  transitions are common (on average 460 tran- in gauche+ (g+) (Supplementary Figure S18 and Supple- sitions per s per nucleotide) and fast (average residence mentary Table S12), with  in trans (t) and  in g− (i.e. the times being around 3 ps); although only 0.01% of the non- BI state). canonical  states have non-negligible survival times of Very interestingly, despite the canonical  state be- up to 1.2 ns (averaging across the four nucleotide types). ing the most populated (in agreement with previous Long-lived -flips of around 100 ns from the g+ to the non- PARMBSC0 simulations (10,22)), all the non-canonical canonical t state where not observed. Note that these flips conformations found in experimental protein–DNA com- were recently suggested to be a source of convergence prob- 4062 Nucleic Acids Research, 2016, Vol. 44, No. 9 Table 2. Sequence-averaged conformational parameters obtained from the 10 s Drew–Dickerson dodecamer simulation b c Parameter Average SD Range Minimum Maximum NMR Xray Shear 0.00 0.30 6.43 − 3.62 2.81 0.00 0.03 Stretch 0.02 0.12 3.27 − 0.78 2.48 − 0.29 0.19 Stagger 0.10 0.38 4.79 − 2.15 2.66 0.02 0.21 Buckle 0.0 9.7 92.9 − 48.3 48.0 0.0 − 0.5 Propeller − 9.2 8.4 81.6 − 49.6 39.2 − 17.4 − 14.4 Opening 1.3 4.0 71.1 − 29.4 56.8 1.1 1.6 Xdisp − 0.58 1.05 10.42 − 6.05 4.36 − 0.01 − 0.15 Ydisp 0.00 0.84 9.16 − 4.59 4.58 0.02 0.52 Inclination 2.2 5.7 56.0 − 26.1 31.0 1.7 − 0.6 Tip 0.1 6.9 58.6 − 41.7 41.2 0.0 − 2.6 Shift − 0.01 0.81 7.31 − 3.71 3.63 0.00 − 0.07 Slide − 0.24 0.53 5.50 − 3.17 2.34 − 0.22 0.14 Rise 3.32 0.29 3.32 1.96 5.30 3.20 3.35 Tilt − 0.1 4.7 44.6 − 22.6 23.0 0.0 − 0.4 Roll 1.5 5.5 60.1 − 31.8 31.7 2.3 − 0.7 Twist 34.3 5.5 52.3 3.4 57.1 36.0 35.2 − 72.6 18.2 314.4 − 60.2 − 57.5 166.8 21.7 251.1 171.0 166.4 55.6 23.2 243.6 49.6 48.3 135.2 14.5 119.7 125.7 126.3 − 159.9 23.5 169.1 − 170.8 − 164.3 − 111.4 36.1 203.1 − 103.5 − 112.1 − 111.7 16.2 138.6 − 111.5 − 113.5 Phase 152.0 27.2 267.3 135.0 135.7 Amplitude 41.3 6.5 61.4 34.0 41.1 Capping base pairs were removed from the analysis. Computed from the ensemble of structures (PDB ID: 1NAJ). Computed from the X-ray structures with PDB IDs: 1BNA, 2BNA, 7BNA and 9BNA. For the dihedral angles only the Watson strand was considered. lems when using PARMBSC0 (10). As expected (7,22), C (Supplementary Table S13). Note also that the weighted- and G nucleotides show longer-lived and more frequent  average twist obtained with the BIC method (82) is higher transitions than A or T (Supplementary Figures S18). How- with PARMBSC1 as a consequence of the higher popula- ever, the occurrence of non-canonical  states is not coop- tion of the high twist state (0.66 versus 0.52 in PARMBSC0; erative and does not lead to the destructuring of the dou- Supplementary Table S13). PARMBSC1 is able to correctly ble helix that was found with older force fields ( 18). On av- reproduce this complex choreography: twist is correlated erage, at any given moment, less than one (0.86) of the 24 with ,withBI/BII, with the formation of the CH-O in- nucleotides is in an unusual  state. Extrapolating to poly- teraction and with the slide polymorphism in the neighbor- meric DNA implies that 3.6% of nucleotides will exhibit an ing step (Supplementary Figure S20). This is expected to unusual  conformation. This could be a factor favoring be particularly important in understanding indirect recog- recognition by specific proteins, given that crystal structures nition of DNA by proteins (83,84,85), and also the mech- of protein-DNA complexes show a significant percentage of anism of DNA intercalation by small organic compounds such states (10%, (67)). (46,86). As expected from previous experimental and theoretical We have also used our long (10 s) DDD simulation to studies (7,8,22,79,80), C·G base pairs show a significant investigate base opening events. It is experimentally known propensity for BI/BII transitions (Figure 7), as evidenced that base opening (understood as conformational states by the concerted changes in  and  from t to g− and from where bases are unpaired for significant periods of time) g− to t respectively (Supplementary Figure S19). BI/BII happens on the millisecond timescale (87), at least for cod- relative populations are notably improved with respect to ing nucleobases (88), and thus far beyond our simulation PARMBSC0, and now reproduce more accurately NMR window. Terminal base pairs, where stacking interactions experiments (Table 3 and Supplementary Figure S17). As are weaker should show slightly higher opening frequencies, previously suggested (8,46,81), BI/BII polymorphism is but available experimental data do not support the presence directly connected to two sources of structural polymor- of long-lived open states for terminal base pairs (20)and phism, namely, the twist bimodality found for CG steps and strongly warn against long-lived non-canonical conforma- the slide polymorphism observed for RpR steps. Concerted tions of terminal base pairs stabilized by interactions with movements of the backbone and the bases allow the for- the DNA grooves, as found in previous simulations with mation of an intra-molecular hydrogen bond of the type other force fields ( 20,89). The PARMBSC1 10 s trajectory C8H8-O3 between RpR steps, that was proved to be casu- shows that, as expected, the terminal C·G pairs are more ally connected to BII populations (8,46). These concerted labile that the central ones and it is not rare to lose some movements seems to be linked to twist polymorphism, the of the hydrogen bonds for short periods of time (see Figure low twist state being driven by the presence of cations specif- 6). Fraying events are common, and unusual arrangements ically binding in the minor groove of CG steps (46). Indeed, of the terminal bases are visited, but they quickly revert to a systematic increase in the low twist state is observed upon canonical Watson–Crick pairing (see Figure 8)asexperi- increasing the amount of added K+Cl− from 0.15 to 2.0M mentally expected (20). The tWC/SE state (where cytosine Nucleic Acids Research, 2016, Vol. 44, No. 9 4063 Table 3. Percentage of BI substates obtained with PARMBSC1 compared with PARMBSC0 and NMR experiments Bps PARMBSC0 PARMBSC1 NMR1 NMR2 GC 48 73 75 53 CG 87 78 75 66 GA 53 45 53 44 AA 91 82 75 63 AT 99 98 100 78 TT 100 99 99 93 TC 98 93 89 92 CG 79 62 75 77 GC 71 61 79 35 BI percentage for the bps in the DD dodecamer, obtained by averaging the difference between  and  angles at the 3 -junction of the Watson strand of each base pair. NMR1 and NMR2 are values obtained using phosphorus chemical shifts as detailed in the works of Schwieters et al. (NMR1 (80)) and Tian et al. (NMR2, (79)), respectively. is turned around the glycosidic bond from the anti to syn conformation to form a non-WC pair resembling the trans Watson–Crick/Sugar Edge C·G pair well-known in RNA −5 structures (90)), now occurs very rarely (probability <10 and with ps lifetimes), in line with NMR experiments, but in contrast to earlier simulations (7,89). The time evolution of the base pair opening parameter shows that most of the fraying is due to transient sampling of large opening angles (Figure 8). We observe very few events where the glycosidic torsion ( ) goes from anti to syn in the terminal cytosine (one such event is highlighted by the blue circle in Figure 8). These rare and reversible events are connected with the for- mation of aO2···5 OH intra-cytosine hydrogen bond, which in turn stabilizes the anomalous tWC/SE conformation. Our simulations suggest that rare and short-lived tWC/SE conformations do not affect neighboring base steps and have no impact on DNA structure and dynamics on multi- microsecond timescales. In addition, no through-the-groove interactions between terminal and inner bases are observed in our long trajectory. Finally, the 10 s trajectory allowed us to analyze conver- gence issues in unprecedented detail; significantly extend- ing previous studies (10,91,92). For this reason, we per- formed principle component analysis on segments of 1, 2 and 5 s extracted from the 10 s trajectory. Visual inspec- tion of Supplementary Figure S21 (supplementary mate- rial) shows slight divergence between 1 s segments, but no significant differences are observed between the 2 and 5 s segments. The smoothed histograms of the main principal components clearly overlap, suggesting that DNA is sam- pling the same conformational space. Similarly, differences in entropy values for segments of 2 s are smaller than 2% with respect to the entropy of the whole 10 s trajectory, independently of the method used to calculate the entropy (see Supplementary Figure S22 in the supplementary mate- rial). This analysis suggest that PARMBSC1 is able to sam- ple in 2 s, what the user can expect to see in terms of con- formational ensembles in 10 s(91), and for most purposes (ion distribution being an exception) 2 s trajectories can be considered to be converged. Figure 7. Major substates of the backbone observed during the simula- tion of DDD. (A) Scatter plot of  and  angles grouped by nucleobase. To obtain the distribution of A in the -plane the dynamics of the nucle- CONCLUSIONS obases A5 and A6 were considered together (similarly: C3/C9, G4/G10 and T7/T8 were used to build the C, G and T ensembles respectively). (B) The availability of PARMBSC1, a new and accurate force Same as (A), but for  and  angles. The global percentages (considering field for DNA, has allowed us to explore the long timescale both strands) of the major canonical states are shown in white. dynamics of the DDD, a prototypical B-DNA duplex, in aqueous solution. An unprecedented degree of sampling (involving more than 43 s of simulation, including a sin- 4064 Nucleic Acids Research, 2016, Vol. 44, No. 9 Figure 8. Analysis of the base-pair fraying at the ends of the dodecamer. Note that we used the same metrics described elsewhere (78) to analyze the fraying of DDD simulated with previous force fields. ( A) Formation of an anomalous intra-cytosine hydrogen bond observed with PARMBSC0 and PARMBSC0- OL4 (one formation event is highlighted by a blue circle). (B) Time evolution of the opening parameter. (C)  angle during simulation is shown in red, light blue, dark green and light green for C1, G12, C13 and G24 respectively. (D) Mass-weighted RMSD of the capping base pairs respect to the first structure of the simulation. (E) Representative structures of the four mayor conformations found are depicted bellow. A similar behavior was observed in the other simulations performed changing the environment conditions (data not shown). gle 10 s trajectory) has allowed us to test the impact of FUNDING different solvent and ion models, and of different salts, on MINECO Severo Ochoa Award of Excellence (Government DNA and to characterize the ion atmosphere around the of Spain) (to IRB Barcelona); Spanish Ministry of Science double helix. We have also analyzed the detailed interplay [BIO2012-32868, BFU2014-61670-EXP to M.O.]; Catalan between ions and local and global conformational changes, SGR (to M.O.); Instituto Nacional de Bioinformatica ´ (to the prevalence of non-harmonic distortions and we have ob- M.O.); European Research Council (ERC SimDNA) (to tained reliable estimates for conformational jumps that can M.O.); H2020 program (MuG and BioExcel projects) (to be important in explaining the specific binding of proteins M.O.); Czech Republic Grant Agency [14-21893S to T.D., or ligands to DNA. Lastly, the simulations of the DDD with F.L.]; ANR grant CHROME [ANR-12-BSV5-0017-01 to PARMBSC1 are shown to accurately reproduce experimen- R.L.]. Funding for open access charge: European Research tal data and to represent a clear improvement over the re- Council (ERC SimDNA). sults obtained with earlier force fields. Conflict of interest statement. None declared. SUPPLEMENTARY DATA Supplementary Data are available at NAR Online. Nucleic Acids Research, 2016, Vol. 44, No. 9 4065 REFERENCES 25. Howerton,S., Sines,C., VanDerveer,D. and Williams,L.D. (2001) Locating monovalent cations in the grooves of B-DNA. Biochemistry, 1. Hud,N.V. (2008) Nucleic Acid-Metal Ion Interactions. Biomolecular 40, 10023–10031. Sciences. The Royal Society of Chemistry, Cambridge. 26. Smith,D.E. and Dang,L.X. (1994) Computer simulations of NaCl 2. Geggier,S. and Vologodskii,A. (2010) Sequence dependence of DNA association in polarizable water. J. Chem. Phys., 100, 3757–3766. bending rigidity. Proc. Natl. Acad. Sci. U.S.A., 107, 15421–15426. 27. Joung,I.S. and Cheatham,T.E. (2008) Determination of alkali and 3. Huguet,J.M., Bizarro,C. V, Forns,N., Smith,S.B., Bustamante,C. and halide monovalent ion parameters for use in explicitly solvated Ritort,F. (2010) Single-molecule derivation of salt dependent biomolecular simulations. J. Phys. Chem. B, 112, 9020–9041. base-pair free energies in DNA. Proc. Natl. Acad. Sci. U.S.A., 107, 28. Jensen,K.P. and Jorgensen,W.L. (2006) Halide, ammonium, and 15431–15436. alkali metal ion parameters for modeling aqueous solutions. J. Chem. 4. Heng,J.B., Aksimentiev,A., Ho,C., Marks,P., Grinkova,Y.V., Theory Comput., 2, 1499–1509. Sligar,S., Schulten,K. and Timp,G. (2006) The electromechanics of 29. Beglov,D. and Roux,B. (1994) Finite representation of an infinite DNA in a synthetic nanopore. Biophys. J., 90, 1098–1106. bulk system: Solvent boundary potential for computer simulations. J. 5. Strick,T.R., Dessinges,M.-N., Charvin,G., Dekker,N.H., Chem. Phys., 100, 9050–9063. Allemand,J.-F., Bensimon,D. and Croquette,V. (2003) Stretching of 30. Shields,G.C., Laughton,C.A. and Orozco,M. (1997) Molecular macromolecules and proteins. Reports Prog. Phys., 66, 1–45. dynamics simulations of the d(T·A·T) triple helix. J. Am. Chem. Soc., 6. Per ´ ez,A., Lankas,F., Luque,F.J. and Orozco,M. (2008) Towards a 119, 7463–7469. molecular dynamics consensus view of B-DNA flexibility. Nucleic 31. Berendsen,H.J.C., Postma,J.P.M., van Gunsteren,W.F., DiNola,A. Acids Res., 36, 2379–2394. and Haak,J.R. (1984) Molecular dynamics with coupling to an 7. Drsa ˇ ta,T., Per ´ ez,A., Orozco,M., Morozov,A. V, Sponer,J. and external bath. 81, 3684–3690. Lankas,F ˇ . (2013) Structure, stiffness and substates of the 32. Mor,A., Ziv,G. and Levy,Y. (2008) Simulations of proteins with dickerson-drew dodecamer. J. Chem. Theory Comput., 9, 707–721. inhomogeneous degrees of freedom: The effect of thermostats. J. 8. Pasi,M., Maddocks,J.H., Beveridge,D., Bishop,T.C., Case,D.A., Comput. Chem., 29, 1992–1998. Cheatham,T., Dans,P.D., Jayaram,B., Lankas,F., Laughton,C. et al. 33. Nose,S ´ . and Klein,M.L. (2006) Constant pressure molecular (2014) ABC: a systematic microsecond molecular dynamics study of dynamics for molecular systems. Mol. Phys., 50, 1055–1076. tetranucleotide sequence effects in B-DNA. Nucleic Acids Res., 42, 34. Ryckaert,J.-P., Ciccotti,G. and Berendsen,H.J. (1977) Numerical 12272–12283. integration of the cartesian equations of motion of a system with 9. Lankas,F., Sponer,J., Hobza,P. and Langowski,J. (2000) constraints: molecular dynamics of n-alkanes. J. Comput. Phys., 23, Sequence-dependent elastic properties of DNA. J. Mol. Biol., 299, 327–341. 695–709. 35. Darden,T., York,D. and Pedersen,L. (1993) Particle mesh Ewald: an 10. Drsa ˇ ta,T. and Lankas,F ˇ . (2015) Multiscale modelling of DNA N·log(N) method for Ewald sums in large systems. J. Chem. Phys., 98, mechanics. J. Phys. Condens. Matter, 27, 323102. 10089–10092. 11. Per ´ ez,A., Luque,F.J. and Orozco,M. (2012) Frontiers in molecular 36. Salomon-Ferrer,R., Gotz,A.W ¨ ., Poole,D., Le Grand,S. and dynamics simulations of DNA. Acc. Chem. Res., 45, 196–205. Walker,R.C. (2013) Routine microsecond molecular dynamics 12. Arcella,A., Dreyer,J., Ippoliti,E., Ivani,I., Portella,G., Gabelica,V., simulations with AMBER on GPUs. 2. Explicit solvent particle mesh Carloni,P. and Orozco,M. (2015) Structure and dynamics of Ewald. J. Chem. Theory Comput., 9, 3878–3888. oligonucleotides in the gas phase. Angew. Chem. Int. Ed. Engl., 54, 37. Case,D.A., Babin,V., Berryman,J.T., Betz,R.M., Cai,Q., Cerutti,D.S., 467–471. Cheatham,T.E. III, Darden,T.A., Duke,R.E., Gohlke,H. et al. (2014) 13. Portella,G., Germann,M.W., Hud,N.V. and Orozco,M. (2014) MD AMBER. University of California, San Francisco. and NMR analyses of choline and TMA binding to duplex DNA: on 38. Roe,D.R. and Cheatham,T.E. (2013) PTRAJ and CPPTRAJ: the origins of aberrant sequence-dependent stability by alkyl cations software for processing and analysis of molecular dynamics trajectory in aqueous and water-free solvents. J. Am. Chem. Soc., 136, data. J. Chem. Theory Comput., 9, 3084–3095. 3075–3086. 39. Hospital,A., Faustino,I., Collepardo-Guevara,R., Gonzalez,C ´ ., 14. Arcella,A., Portella,G., Collepardo-Guevara,R., Chakraborty,D., Gelp´ı,J.L. and Orozco,M. (2013) NAFlex: a web server for the study Wales,D.J. and Orozco,M. (2014) Structure and properties of DNA in of nucleic acid flexibility. Nucleic Acids Res., 41, W47–W55. apolar solvents. J. Phys. Chem. B, 118, 8540–8548. 40. Lavery,R., Moakher,M., Maddocks,J.H., Petkeviciute,D. and 15. Dans,P.D., Walther,J., Gomez,H. ´ and Orozco,M. (2016) Multiscale Zakrzewska,K. (2009) Conformational analysis of nucleic acids simulation of DNA. Curr. Opin. Struct. Biol., 37, 29–45. revisited: Curves+. Nucleic Acids Res., 37, 5917–5929. 16. Sponer,J., Cang,X. and Cheatham,T.E. (2012) Molecular dynamics 41. Zuo,X., Cui,G., Merz,K.M., Zhang,L., Lewis,F.D. and Tiede,D.M. simulations of G-DNA and perspectives on the simulation of nucleic (2006) X-ray diffraction “fingerprinting” of DNA structure in acid structures. Methods, 57, 25–39. solution for quantitative evaluation of molecular dynamics 17. Sim,A.Y.L., Minary,P. and Levitt,M. (2012) Modeling nucleic acids. simulation. Proc. Natl. Acad. Sci. U.S.A., 103, 3534–3539. Curr. Opin. Struct. Biol., 22, 273–278. 42. Park,S., Bardhan,J.P., Roux,B. and Makowski,L. (2009) Simulated 18. Varnai,P ´ . and Zakrzewska,K. (2004) DNA and its counterions: a x-ray scattering of protein solutions using explicit-solvent models. J. molecular dynamics study. Nucleic Acids Res., 32, 4269–4280. Chem. Phys., 130, 134114. 19. Per ´ ez,A., Marchan,I., ´ Svozil,D., Sponer,J., Cheatham,T.E., 43. Nguyen,H.T., Pabit,S.A., Meisburger,S.P., Pollack,L. and Case,D.A. Laughton,C.A. and Orozco,M. (2007) Refinement of the AMBER (2014) Accurate small and wide angle x-ray scattering profiles from force field for nucleic acids: improving the description of atomic models of proteins and nucleic acids. J. Chem. Phys., 141, alpha/gamma conformers. Biophys. J., 92, 3817–3829. 22D508. 20. Ivani,I., Dans,P.D., Noy,A., Per ´ ez,A., Faustino,I., Hospital,A., 44. Zuo,X. and Tiede,D.M. (2005) Resolving conflicting crystallographic Walther,J., Andrio,P., Goni,R., ˜ Balaceanu,A. et al. (2015) Parmbsc1: and NMR models for solution-state DNA with solution X-ray a refined force field for DNA simulations. Nat. Methods, 13, 55–58. diffraction. J. Am. Chem. Soc., 127, 16–17. 21. Drew,H.R., Wing,R.M., Takano,T., Broka,C., Tanaka,S., Itakura,K. 45. Lavery,R., Maddocks,J.H., Pasi,M. and Zakrzewska,K. (2014) and Dickerson,R.E. (1981) Structure of a B-DNA dodecamer: Analyzing ion distributions around DNA. Nucleic Acids Res., 42, conformation and dynamics. Proc. Natl. Acad. Sci. U.S.A., 78, 8138–8149. 2179–2183. 46. Dans,P.D., Faustino,I., Battistini,F., Zakrzewska,K., Lavery,R. and 22. Perez,A., Luque,F.J. and Orozco,M. (2007) Dynamics of B-DNA on Orozco,M. (2014) Unraveling the sequence-dependent polymorphic the microsecond time scale. J. Am. Chem. Soc., 129, 14739–14745. behavior of d(CpG) steps in B-DNA. Nucleic Acids Res., 42, 23. Trieb,M., Rauch,C., Wellenzohn,B., Wibowo,F., Loerting,T. and 11304–11320. Liedl,K.R. (2004) Dynamics of DNA: B I and B II Phosphate 47. Pasi,M., Maddocks,J.H. and Lavery,R. (2015) Analyzing ion Backbone Transitions. J. Phys. Chem. B, 108, 2470–2476. distributions around DNA: sequence-dependence of potassium ion 24. Arnott,S. and Hukins,D.W. (1972) Optimised parameters for A-DNA distributions from microsecond molecular dynamics. Nucleic Acids and B-DNA. Biochem. Biophys. Res. Commun., 47, 1504–1509. Res., 43, 2412–2423. 4066 Nucleic Acids Research, 2016, Vol. 44, No. 9 48. Schlitter,J. (1993) Estimation of absolute and relative entropies of 73. Shui,X., Sines,C.C., McFail-Isom,L., VanDerveer,D. and macromolecules using the covariance matrix. Chem. Phys. Lett., 215, Williams,L.D. (1998) Structure of the potassium form of 617–621. CGCGAATTCGCG: DNA deformation by electrostatic collapse 49. Andricioaei,I. and Karplus,M. (2001) On the calculation of entropy around inorganic cations. Biochemistry, 37, 16877–16887. from covariance matrices of the atomic fluctuations. J. Chem. Phys., 74. Drew,H., Samson,S. and Dickerson,R.E. (1982) Structure of a 115, 6289–6292. B-DNA dodecamer at 16 K. Proc. Natl. Acad. Sci. U.S.A., 79, 50. Hess,B. (2000) Similarities between principal components of protein 4040–4044. 75. Westhof,E. (1987) Re-refinement of the B-dodecamer dynamics and random diffusion. Phys. Rev. E. Stat. Phys. Plasmas. d(CGCGAATTCGCG) with a comparative analysis of the solvent in Fluids. Relat. Interdiscip. Topics, 62, 8438–8448. it and in the Z-hexamer d(5BrCG5BrCG5BrCG). J. Biomol. Struct. 51. Per ´ ez,A., Blas,J.R., Rueda,M., Lopez-Bes,J ´ .M., de la Cruz,X. and Dyn., 5, 581–600. Orozco,M. (2005) Exploring the essential dynamics of B-DNA. J. 76. Holbrook,S., Dickerson,R. and Kim,S.-H. (1985) Anisotropic Chem. Theory Comput., 1, 790–800. thermal-parameter refinement of the DNA dodecamer 52. Orozco,M., Per ´ ez,A., Noy,A. and Luque,F.J. (2003) Theoretical CGCGAATTCGCG by the segmented rigid-body method. Acta methods for the simulation of nucleic acids. Chem.Soc.Rev., 32, Crystallogr. B, 41, 255–262. 350–364. 77. Wu,Z., Delaglio,F., Tjandra,N., Zhurkin,V. and Bax,A. (2003) 53. Noy,A. and Golestanian,R. (2012) Length Scale Dependence of Overall structure and sugar dynamics of a DNA dodecamer from DNA Mechanical Properties. Phys. Rev. Lett., 109, 228101. homo- and heteronuclear dipolar couplings and (31)P chemical shift 54. Zheng,G., Czapla,L., Srinivasan,A.R. and Olson,W.K. (2010) How anisotropy. J. Biomol. Nmr, 26, 297–315. stiff is DNA? Phys. Chem. Chem. Phys., 12, 1399–1406. 78. Varnai,P. (2002) alpha/gamma transitions in the B-DNA backbone. 55. Mecklin,C.J. (2007) Shapiro-Wilk Test for Normality. In: Salkind,NJ Nucleic Acids Res., 30, 5398–5406. and Rasmussen,K (eds). Encyclopedia of Measurement and Statistics. 79. Tian,Y., Kayatta,M., Shultis,K., Gonzalez,A., Mueller,L.J. and SAGE Publications, Inc., Thousand Oaks, pp. 884–887. Hatcher,M.E. (2009) 31P NMR investigation of backbone dynamics 56. R Core Team. (2013) R: A language and environment for statistical in DNA binding sites. J. Phys. Chem. B, 113, 2596–2603. computing. R Foundation for Statistical Computing, Vienna. 80. Schwieters,C.D. and Clore,G.M. (2007) A physical picture of atomic 57. Wickham,H. (2009) ggplot2. Springer, NY. motions within the Dickerson DNA dodecamer in solution derived 58. Humphrey,W., Dalke,A. and Schulten,K. (1996) VMD: visual from joint ensemble refinement against NMR and large-angle X-ray molecular dynamics. J. Mol. Graph., 14, 33–38. scattering data. Biochemistry, 46, 1152–1166. 59. Pettersen,E.F., Goddard,T.D., Huang,C.C., Couch,G.S., 81. Dans,P.D., Per ´ ez,A., Faustino,I., Lavery,R. and Orozco,M. (2012) Greenblatt,D.M., Meng,E.C. and Ferrin,T.E. (2004) UCSF Exploring polymorphisms in B-DNA helical conformations. Nucleic Chimera–a visualization system for exploratory research and Acids Res., 40, 10668–10678. analysis. J. Comput. Chem., 25, 1605–1612. 82. Schwarz,G., Annals,T. and Mar,N. (1978) Estimating the dimension 60. Savelyev,A. and MacKerell,A.D. (2015) Differential impact of the of a model. Ann. Stat., 6, 461–464. monovalent ions Li(+), Na(+), K(+), and Rb(+) on DNA 83. Djuranovic,D. and Hartmann,B. (2004) DNA fine structure and conformational properties. J. Phys. Chem. Lett., 6, 212–216. dynamics in crystals and in solution: the impact of BI/BII backbone 61. Jorgensen,W.L., Chandrasekhar,J., Madura,J.D., Impey,R.W. and conformations. Biopolymers, 73, 356–368. Klein,M.L. (1983) Comparison of simple potential functions for 84. Wecker,K. (2002) The role of the phosphorus BI-BII transition in simulating liquid water. J. Chem. Phys., 79, 926–935. protein-DNA recognition: the NF-kappaB complex. Nucleic Acids 62. Berendsen,H.J.C., Grigera,J.R. and Straatsma,T.P. (1987) The missing Res., 30, 4452–4459. term in effective pair potentials. J. Phys. Chem., 91, 6269–6271. 85. Madhumalar,A. and Bansal,M. (2005) Sequence preference for 63. Noy,A., Soteras,I., Luque,F.J. and Orozco,M. (2009) The impact of BI/BII conformations in DNA: MD and crystal structure data monovalent ion force field model in nucleic acids simulations. Phys. analysis. J. Biomol. Struct. Dyn., 23, 13–27. Chem. Chem. Phys., 11, 10596–10607. 86. Frederick,C.A., Williams,L.D., Ughetto,G., Van der Marel,G.A., Van 64. Per ´ ez,A., Luque,F.J. and Orozco,M. (2012) Frontiers in molecular Boom,J.H., Rich,A. and Wang,A.H.J. (1990) Structural comparison dynamics simulations of DNA. Acc. Chem. Res., 45, 196–205. of anticancer drug-DNA complexes: adriamycin and daunomycin. 65. Savelyev,A. and MacKerell,A.D. (2015) Differential impact of the + + + + Biochemistry, 29, 2538–2549. monovalent ions Li ,Na ,K , and Rb on DNA conformational 87. Fei,J. and Ha,T. (2013) Watching DNA breath one molecule at a time. properties. J. Phys. Chem. Lett., 6, 212–216. Proc. Natl. Acad. Sci. U.S.A., 110, 17173–17174. 66. Levitt,M. (2001) The birth of computational structural biology. Nat. 88. Cubero,E., Sherer,E.C., Luque,F.J., Orozco,M. and Laughton,C.A. Struct. Biol., 8, 392–393. (1999) Observation of spontaneous base pair breathing events in the 67. Bixon,M. and Lifson,S. (1967) Potential functions and conformations molecular dynamics simulation of a difluorotoluene-containing DNA in cycloalkanes. Tetrahedron, 23, 769–784. oligonucleotide. J. Am. Chem. Soc., 121, 8653–8654. 68. Mazur,A.K. and Maaloum,M. (2014) Atomic force microscopy study ´ ˇ ˇ of DNA flexibility on short length scales: smooth bending versus 89. Zgarbova,M., Otyepka,M., Sponer,J., Lankas,F. and Jurecka,P. kinking. Nucleic Acids Res., 42, 14006–14012. (2014) Base pair fraying in molecular dynamics simulations of DNA and RNA. J. Chem. Theory Comput., 10, 3177–3189. 69. Noy,A. and Golestanian,R. (2010) The chirality of DNA: elasticity 90. Leontis,N.B., Stombaugh,J. and Westhof,E. (2002) The cross-terms at base-pair level including A-tracts and the influence of non-Watson-Crick base pairs and their associated isostericity ionic strength. J. Phys. Chem. B, 114, 8022–8031. matrices. Nucleic Acids Res., 30, 3497–3531. 70. de Leeuw,S.W., Perram,J.W. and Smith,E.R. (1980) Simulation of 91. Galindo-Murillo,R., Roe,D.R. and Cheatham,T.E. (2014) On the electrostatic systems in periodic boundary conditions. I. Lattice sums absence of intrahelical DNA dynamics on the s to ms timescale. and dielectric constants. Proc.R.Soc.AMath.Phys. Eng. Sci., 373, Nat. Commun., 5, 5152. 27–56. 92. Galindo-Murillo,R., Roe,D.R. and Cheatham,T.E. (2015) 71. Sagui,C. and Darden,T.A. (1999) Molecular dynamics simulations of Convergence and reproducibility in molecular dynamics simulations biomolecules: long-range electrostatic effects. Annu. Rev. Biophys. of the DNA duplex d(GCACGAACGAACGAACGC). Biochim. Biomol. Struct., 28, 155–179. Biophys. Acta, 1850, 1041–1058. 72. Shui,X., McFail-Isom,L., Hu,G.G. and Williams,L.D. (1998) The B-DNA dodecamer at high resolution reveals a spine of water on sodium. Biochemistry, 37, 8341–8355. http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png Nucleic Acids Research Oxford University Press

Loading next page...
 
/lp/oxford-university-press/long-timescale-dynamics-of-the-drew-dickerson-dodecamer-2oE4CjMC5n

References (90)

Publisher
Oxford University Press
Copyright
© The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
ISSN
0305-1048
eISSN
1362-4962
DOI
10.1093/nar/gkw264
pmid
27084952
Publisher site
See Article on Publisher Site

Abstract

4052–4066 Nucleic Acids Research, 2016, Vol. 44, No. 9 Published online 15 April 2016 doi: 10.1093/nar/gkw264 Long-timescale dynamics of the Drew–Dickerson dodecamer 1,2 1,2,3 1,2 4 4,5 Pablo D. Dans , Linda Danilane ¯ , Ivan Ivani ,Toma ´sD ˇ rsata ˇ , Filip Lankas ˇ , 1,2 1,2 1,2 1,2 Adam Hospital ,Jur ¨ gen Walther , Ricard Illa Pujagut , Federica Battistini , Josep 6 7 1,2,6,* Lluis Gelp´ ı , Richard Lavery and Modesto Orozco Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac 10-12, 08028 Barcelona, Spain, Joint BSC-IRB Research Program in Computational Biology, Baldiri Reixac 10-12, 08028 Barcelona, Spain, School of Chemistry, University of East Anglia (UEA), Norwich Research Park, Norwich NR4 7TJ, UK, Institute of Organic Chemistry and Biochemistry, Academy of Sciences of the Czech Republic, Flemingovo nam ´ 2, 166 10 Prague, Czech Republic, Laboratory of Informatics and Chemistry, University of Chemistry and Technology Prague, Technicka´ 5, 166 28 Prague, Czech Republic, Department of Biochemistry and Molecular Biology, University of Barcelona, 08028 Barcelona, Spain and Bases Moleculaires ´ et Structurales des Systemes ` Infectieux, UniversiteL ´ yonI/CNRS UMR 5086, IBCP, 7 Passage du Vercors, Lyon 69367, France Received December 29, 2015; Revised March 25, 2016; Accepted March 31, 2016 ABSTRACT to due to the presence of ligands or of changes in the envi- ronment (1). Clearly, DNA structure should be explained in We present a systematic study of the long-timescale terms of conformational ensembles rather than in terms of dynamics of the Drew–Dickerson dodecamer (DDD: individual structures. d(CGCGAATTGCGC) ) a prototypical B-DNA duplex. Recently experimental techniques (2–5) are providing in- Using our newly parameterized PARMBSC1 force valuable information on DNA dynamics, however most of field, we describe the conformational landscape of what we know about the sequence-dependent flexibility of DNA comes from atomistic molecular dynamics (MD) sim- DDD in a variety of ionic environments from minimal + − + − ulations (6–10). As computer power increases and the reli- salt to 2 M Na Cl or K Cl . The sensitivity of the ability of force fields improve, more reliable information is simulations to the use of different solvent and ion derived from atomistic MD simulations (11). Such simula- models is analyzed in detail using multi-microsecond tions have revealed the extent, and the complexity, of DNA simulations. Finally, an extended (10 s) simulation movements and their tight coupling to the nature and dy- is used to characterize slow and infrequent confor- namics of the environment (8,12–14). Unfortunately, MD mational changes in DDD, leading to the identifica- simulations are extremely dependent on the quality of the tion of previously uncharacterized conformational force field ( 11,15–17) and, as simulations become longer, states of this duplex which can explain biologically errors induced by force fields accumulate, generating erro- relevant conformational transitions. With a total of neous patterns of flexibility ( 18–20). Continuous refinement more than 43 s of unrestrained molecular dynamics of the force field is therefore required in order to profit from simulation, this study is the most extensive investi- computational improvements and to gain better insight into the structure and dynamics of DNA. With this aim in mind, gation of the dynamics of the most prototypical DNA we have recently developed the PARMBSC1 force field, a duplex. new functional with an excellent ability to describe a vari- ety of DNA structures on the microsecond timescale (20). INTRODUCTION Here we use PARMBSC1 (20) to make a detailed explo- The static picture of DNA derived from the early X-Ray ration the dynamics of the best known fragment of DNA: studies is now challenged by a myriad of experimental and the Drew–Dickerson dodecamer (DDD, (21)). DDD is an theoretical studies which show DNA to be a highly flexi- ideal model system: (i) it contains a biologically relevant se- ble entity, undergoing many conformational alterations and quence that fits well into the canonical B-form of DNA, (ii) even modifications of its covalent structure. Simple inspec- it has been extensively studied experimentally (135 struc- tion of the Protein Data Bank (PDB) illustrates how differ- tures with the DDD sequence are available in the PDB, ent sequences adopt different conformations, but also how some of them solved at very high resolution) and (iii) it identical sequences can be found in different conformations To whom correspondence should be addressed.Tel: +34 93 4037156; Fax: +34 93 4037156; Email: [email protected] C The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact [email protected] Nucleic Acids Research, 2016, Vol. 44, No. 9 4053 has also been widely studied by means of nanosecond-to- mostat was also used in combination with PARMBSC1, microsecond MD simulations (7,22,23). In summary, DDD giving results without perceptible differences (20) (data not is the best-known model system of DNA, and its analysis shown). Center of mass motion was removed every 10 ps to is likely to produce results that can be extrapolated to any limit build up of the translational kinetic energy of the so- canonical B-DNA. lute. SHAKE (34) was used to keep all bonds involving hy- In a first step, we evaluate the impact on DNA of a wide drogen at their equilibrium values, which allowed us to use a variety of solvent and ion models. In a second step, we an- 2 fs step for the integration of Newton equations of motion. alyze the impact that changes in ionic strength can have on Long-range electrostatic interactions were accounted for by the collected conformational samples. Finally, we explore in using the Particle Mesh Ewald method (35) with standard detail the long-timescale dynamics of DNA by using many defaults and a real-space cutoff of 10 A. The PARMBSC1 multi-microsecond trajectories and one extended (10 s) force field ( 20) was used to represent DNA interactions. All single trajectory. Our study reveals that the main confor- simulations were carried out using the PMEMD CUDA mational characteristics of DNA are quite insensitive to the code module (36)ofAMBER 14 (37). nature of the models used to describe the solvent and ion en- + − vironment. Changes due to the nature of the salt (Na Cl Analysis + − or K Cl ), or to the ionic strength, are also quite mod- est. PARMBSC1 provides very stable conformational sam- During production runs, data was typically collected every plings, which agree well with experimental information on 1 ps, which allowed us to study infrequent, but fast move- this duplex, but also highlight unusual anharmonic defor- ments. All the trajectories were pre-processed with the CPP- mations of DNA that can explain some biologically rele- TRAJ (38) module of the AMBERTOOLS 15 package (37), vant transitions. Overall, in the framework of fixed-charge the NAFlex server (39) and tools developed in the group all-atom force fields, we present here the broadest and con- (http://mmb.irbbarcelona.org/www/tools). DNA helical pa- ceivably the most accurate study of the multi-microsecond rameters and backbone torsion angles associated with the timescale dynamics of duplex B-DNA to date. each base pair (bp) and base pair step (bps) were measured with the CURVES+ and CANAL programs (40). The sub- states of the torsion angles of the backbone (, ,  and ) MATERIALS AND METHODS were categorized following the standard definition: gauche System set-up positive (g+) = 60 ± 40 degrees; trans (t) = 180 ± 40 de- grees; and gauche negative (g−) = 300 ± 40 degrees. For Starting geometries for all systems used either Arnott-B the analysis of the vast majority of helical parameters we DNA canonical values (24) or the high resolution X-Ray took advantage of the palindromic nature of the DDD se- structure of DDD with PDB ID: 1JGR (25). The systems quence considering both strands independently, or as an av- were then solvated by adding TIP3P or SPC/Ewatersto erage between the Watson and Crick strands. For compar- a truncated octahedral box and neutralized by adding 22 + + ison with the data available in the experimental databases, Na or K . Both Smith-Dang (S&D; (26)) and Joung- which were obtained in different environments, we built a Cheatham (J&C; (27)) ion models were considered and ex- + − + − single theoretical conformational space containing almost tra salt (Na Cl or K Cl ) was added up to a chosen ionic 40 million structures taken from all the independent trajec- strength (0.15, 0.5 and 2.0 M added salt). Counterions were ˚ tories and constituting an aggregated simulation time of 43 initially placed randomly, at a minimum distance of 5 A ˚ s. from the solute and 3.5 A from one another. For simula- tions involving potassium, two extra ion parameterizations Experimental structures of the Drew–Dickerson docecamer. were tested: Jensen-Jorgensen (J&J; (28)) and Beglov-Roux The experimental conformational space of DDD was de- (B&R; (29)). All the systems were energy minimized, ther- fined as a set of experimental structures in the PDB with the malized and pre-equilibrated using our standard multi-step sequence: d(CpGpCpGpApApTpTpCpGpCpG) ; see Sup- protocol (22,30) followed by 50 ns of equilibration. All the plementary Table S1 for a detailed list. The final ensemble systems were then simulated (production runs) on the mi- contained structures of DDD either isolated or in complex crosecond timescale (see Table 1). with small organic compounds (Supplementary Table S1). Both ligands and those sequences containing non-canonical Simulation details covalent modifications were removed. After this selection All systems were simulated in the isothermal-isobaric en- procedure, the remaining 93 structures were analyzed with semble (P = 1atm; T = 298 K) using the Berendsen al- CURVES+ (40) and used as a reference conformational en- gorithm (31) to control the temperature and the pressure, semble. with a coupling constant of 5 ps. Although this has been the standard protocol adopted by the ABC consortium (8), and Solution X-ray scattering profiles from MD simulations. many others, for the simulation of short B-DNA sequences, We computed SAXS/WAXS spectra from MD, with readers should be aware that the Berendsen thermostat may PARMBSC1, by taking 1000 structures from the last mi- + − produce a non-uniform temperature distribution. While this crosecond of the simulation with 0.15 M Na Cl in TIP3P was demonstrated for proteins (32), the compact structure water, and generating 100 spectra, each being the average of of the DDD dodecamer and the weak coupling of the ther- 10 snapshots. The conditions in that simulation are the clos- mostat (5 ps) are likely to minimize such effects in our sim- est to the experimental ones, obtained in 0.10 M of added + − ulations. In a previous work, the Nose–Hoo ´ ver (33)ther- Na Cl plus 0.05 M of Tris·HCl (41). With an estimated 4054 Nucleic Acids Research, 2016, Vol. 44, No. 9 Table 1. Overall simulation information for the systems studied Initial Solvent Number of Number of System name structure model waters Ion type Ion model ions Total time Sample PARMBSC1 J&C Na neutral fiber SPC /E 4968 Na+ J&C 22 2 s1ps S&D Na neutral fiber SPC /E 4968 Na+ S&D 22 2 s1ps S&D Na neutral fiber TIP3P 4998 Na+ S&D 22 2 /10 s1/20 ps J&C Na neutral fiber TIP3P 4970 Na+ J&C 22 2 s1ps S&D NaCl 0.15M fiber TIP3P 5324 Na+ /Cl− S&D 36/14 2 s1ps S&D NaCl 0.5M fiber TIP3P 5118 Na+ /Cl− S&D 64/42 3 s1ps S&D NaCl 2.0M fiber TIP3P 5095 Na+ /Cl− S&D 162/140 2 s1ps J&C neutral 1JGR SPC/E 5037 K+ J&C 22 3 s1ps S&D neutral 1JGR SPC/E 5187 K+ S&D 22 3 s1ps S&D neutral TIP 1JGR TIP3P 5187 K+ S&D 22 2 s1ps S&D 0.15M 1JGR SPC/E 5159 K+/Cl− S&D 36/14 5 s1ps S&D 0.5M 1JGR SPC/E 5118 K+/Cl− S&D 64/42 3 s1ps S&D 2.0M 1JGR SPC/E 5095 K+/Cl− S&D 162/140 2 s1ps J&J neutral 1JGR SPC/E 8609 K+ J&J 22 1 s1ps B&R neutral 1JGR SPC/E 4993 K+ B&R 22 1 s1ps PARMBSC0 S&D neutral TIP 1BNA TIP3P 4998 Na+ S&D 22 4 s1ps S&D 0.15M fiber SPCE 5044 K+ /Cl− S&D 36/14 2.4 s10ps J&C 0.15M fiber SPCE 5046 K+ /Cl− J&C 36/14 0.6 s10ps S&D NaCl 0.15M fiber SPCE 5044 Na+ /Cl− S&D 36/14 0.6 s10ps J&C NaCl 0.15M fiber SPCE 5049 Na+ /Cl− J&C 36/14 0.6 s10ps resolution of 2 A, this experimental WAXS spectrum is, as units of molarity as detailed elsewhere (45). Special atten- far as we know, the most accurate available for the DDD se- tion was paid to the convergence of the ion population both quence (41). To measure the intensities we used the method inside and outside the DNA major and minor grooves for developed by Park et al. (42), which was implemented in each bps as previously described (47). AmberTools by Case group. Conceptually, X-ray scattering compares the scattering intensity from the sample of inter- Similarities, global and local flexibility in Cartesian and He- est, in this case the full solvated DNA, to a ‘blank’ with just lical spaces. Deformation modes were determined from solvent present, and reports the difference, or ‘excess’ inten- a principal component analysis of the collected simu- sity. Consequently, we simulated a water box with 0.15 M lations using PCASUITE (http://mmb.pcb.ub.es/software/ + − of added Na Cl (50 ns of production run), with the same pcasuite/pcasuite.html). DNA entropy values were ob- settings mentioned above, and used it as the ‘blank’ sample. tained from trajectories using the Schlitter (48) and the Only waters and ions within 10 A distance from the nearest Andricioaei–Karplus (49) methods for all heavy atoms (ex- DNA atom were considered to build the spectra, and hy- cluding terminal base-pairs). Similarity indices were calcu- drogen atoms from the DNA were explicitly considered. In lated using Hess metrics (50), and energy-weighted similar- addition, we used the recent RISM model (43) to compute ities (9). Eigenvalues (in A ) were computed by diagonal- the WAXS spectra of the experimental structures 1BNA (X- izing the covariance matrix and were ordered according to ray), 1GIP (nuclear magnetic resonance; NMR) and the av- their contribution to the total variance. Self-similarities of erage structure from the MD. 1GIP is known to be the ex- the first 10 eigenvalues were computed by comparing the perimental structure that best matches the experimental so- first and second halves of a given trajectory. Relative sim- lution scattering profile ( 44). The distribution function of ilarities were computed as described in our previous work waters and ions computed with RISM also considered a (6,51,52). Stiffness constants were determined using base + − TIP3P solution with 0.15 of added Na Cl . pair step helical stiffness matrices, and base-resolution stiff- ness matrices, always obtained from the inversion of co- variance matrices derived from the atomistic simulations Analysis of the cations. The new CANION module from (10). Persistence lengths (in nm) were obtained according CURVES+ (45–47) was used to determine the position of to (53), considering all possible DNA sub-fragments. The each cation in curvilinear helicoidal coordinates for each sequence used to calculate the persistence lengths was arti- snapshot of the simulations with respect to the instanta- ficially extended by taking the inner 8 bp of the DDD se- neous helical axis. Given a distance D along the helical axis, quence and multiplying this segment by 20 to create a 160 ion distributions were computed for each bps (defined here bp oligomer: (CGAATTCG) . The calculations were ex- as N-0.2 ≤ D ≤ N+1.2 for a generic bps N pN ) inside 20 i i+1 ˚ ecuted on ensembles of 10 structures generated by an in- the grooves (distance from the axis R ≤ 10.25 A), divid- house implementation of Olson’s Monte Carlo procedure ing the contribution between the minor groove (A = 33– ◦ ◦ ◦ (54). As discussed elsewhere (53,54), persistence length is a 147 ) and the major groove (polar angle A = 33 to 0 to macroscopic descriptor of the polymeric flexibility of du- 147 ), as shown in Supplementary Figure S1. We analyzed plex DNA that can be compared with experimental mea- the ion distribution in one-dimensional (R, D or A) and sures. two-dimensional (RA, DA, DR) curvilinear helicoidal co- ordinates. Three-dimensional distributions were also recon- structed in Cartesian coordinates using an average structure Sampling of extreme cases and anharmonic distortions. for the DNA oligomers obtained with CPPTRAJ (38)from Certain fluctuations in the global structure of the DNA can- the full-length simulations. Ion densities were obtained in not be reasonably explained in the harmonic regime. Oth- Nucleic Acids Research, 2016, Vol. 44, No. 9 4055 ers, while harmonic, represent extreme cases found at the dimensions and torsion angles sampled in the simulations margins of the distribution of sampled conformations. The in all cases matched the values expected from experimental latter were detected by looking at the distribution of defor- structures. A more detailed comparison with a large num- mation energies, calculated for a reduced set of DNA con- ber of high resolution X-ray and solution NMR structures formations taken from each MD simulation (one structure is carried out below. every 2 ns), with respect to a reference state defined by the As already suggested by previous studies (7), the impact MD-derived basepair step stiffness matrix (6)and theaver- of using two different solvent models (SPC/E and TIP3P) is age DNA conformation. We approximate the distribution negligible in terms of the structural properties of DNA. We of deformation energies to a normal distribution, and con- analyzed in detail the six helical base pair step parameters sider extreme deformations to be those structures with en- along the DDD sequence (Figure 1), and the complete set ergies above either two or three standard deviations from of 16 helical parameters and 8 torsion angles (see Supple- + − + − the average. Anharmonicity was evaluated by applying the mentary Table S3 (Na Cl ) and S4 (K Cl )). The impact Shapiro–Wilk test (55), since none of the reduced sets of de- of changing the ion force-field parameters also led to very formation energies obtained for each trajectory had more small differences in terms of the global properties of DNA than 5000 values. Furthermore, the complete ensemble of (see Figure 1 and supplementary Tables S3 and 4), in good deformation energies (combining all the trajectories) was agreement with previous PARMBSC0 (19) results (7,63). + − + − analyzed graphically by means of Q-Q and box plots, char- The use of Na Cl or K Cl has also little impact on the acterizing the structures sampled by the force field beyond DNA structure, again in good agreement with previous sim- the harmonic approximation. In these analyses the terminal ulations (46,63). Finally, increasing the ionic strength (here base pairs were not considered. we tested ∼0.15, ∼0.5 and ∼2 M), also seemed to produce little effect (see Figure 1) on the average structure of DNA (for local effects linked to ionic strength see the discussion Statistics, graphics and molecular plots below). The statistical analysis, including the Bayesian Information As the MD conformational ensembles appeared to be Criterion (BIC) and linear correlations, as well as associated quite robust to changes in the solvent or ionic atmosphere, graphics, were obtained with the R 3.0.1 statistical package we combined all the trajectories to create a 43 smeta- (56) and the ggplot2 library (57). The molecular plots were trajectory from which a ‘theoretical’ conformational space generated using either VMD 1.9 (58), or the UCSF Chimera of B-DNA can be defined and compared with that derived package version 1.8.1 (59). from experimental structures (see ‘Materials and Meth- ods’ section and Supplementary Table S1), which were RESULTS AND DISCUSSION also solved in different environments (93 structures, which thanks to the palindromic nature of DDD sequence provide Environmental impacts on DNA 186 experimental estimates of helical parameters for each MD trajectories are dependent on the solvent and ion envi- type of base pair step in the sequence: CpG, GpC, GpA, ronment in two different ways: one legitimate, since changes ApA, ApT and TpT). As shown in Figure 2,noexperi- in solvent, ionic strength or the nature of the ions should mental conformation lies outside the ‘theoretical’ confor- impact simulations, sometimes in a dramatic way (13,60), mational space derived from our simulations. Furthermore, and one illegitimate, linked to the uncertainties in the force experimentally observed conformations lie in the regions of fields used to represent water or ions. Before analyzing the higher density in the MD-derived sampling (see Figure 2 for a comparison of the rotational space, and Supplementary detailed dynamics of DNA, it is therefore necessary to eval- Figure S2 for the translational space). DDD actually cov- uate the uncertainties introduced into simulations by the ion and/or solvent models used. For water, we considered ers quite a wide conformational space, as reflected by the the two most popular three-point models: TIP3P (61)and variety of experimental structures, well reproduced by the + − + − SPC/E(62), while for salt (Na Cl or K Cl ) we con- MD simulations. sidered models by Joung-Cheatham (J&C), Smith-Dang Although X-ray and NMR derived structures have been (S&D), Jensen-Jorgensen (J&J) and Beglov-Roux (B&R) considered for the last 20 years the gold-standard for force + − (the last two only for K Cl , see ‘Materials and Methods’ field comparison ( 64), they both suffer from some limita- section). All trajectories involved at least 2–3 s of simula- tions when it comes to represent the structure of the DNA in tion and were extended up to 5 s when ion convergence solution (65). Crystal packing and crystallization artifacts was not clear. in X-ray techniques, or user-biased integration of the peaks Consistent with the general article describing the parame- and refinements based on all-atom force fields in NMR ex- terization and validation of the PARMBSC1 force field ( 20), periments, are just some of the known limitations. To com- all the simulations yielded average RMSDs in the range plement this data, Small-Angle X-ray Scattering (SAXS) 1.9–2.3 A with respect to Arnott B-DNA values (1.6–2.0 and Wide-Angle X-ray Scattering (WAXS) are able to de- ˚ ˚ A if terminal base pairs were excluded), ∼2.2 Awithre- liver information about the shape and size of the molecules spect to the X-Ray structures (PDB IDs: 1BNA, 2BNA, in solution. At high resolution, structural polymorphism 7BNA and 9BNA) and ∼1.8 A with respect to the ensemble such as the B-DNA/B’-DNA can be detected (41), although of NMR structures in the PDB ID: 1NAJ. About 91–98% the spatial averaging carried out to derive the profiles also of the Watson–Crick hydrogen bonds were maintained dur- leads to a loss of information in SAXS/WAXS compared ing the multi-microsecond simulations explored in this work to crystallography. To complement our findings, we com- (see Supplementary Table S2). Helical parameters, groove pared our simulations to the high-resolution (2 A) WAXS 4056 Nucleic Acids Research, 2016, Vol. 44, No. 9 Figure 1. Averaged base pair step helical parameters along sequence for all the simulation performed with PARMBSC1. See Table 1 for a detailed de- scription of the simulated systems. Translational parameters (shift, slide and rise) are reported in A, and rotational ones (tilt, roll and twist) in degrees. The terminal base pairs were removed from the analysis. spectrum obtained for DDD by Zuo and Tiede (41). Nev- forming very well. Further improvements are likely to re- ertheless, the reader should be aware that comparisons be- quire the inclusion of new factors such as polarization. tween theoretically-derived and experimental spectra have to be made with caution, due to the problems in generat- Similarities, global and local flexibility ing profiles from structural models (specially related to the different way to treat the solvent; see ‘Materials and Meth- Processing of the covariance matrix obtained from atom- ods’ section), the different conditions in the simulation and istic MD simulations provides a direct measure of DNA experiments, and the lack of definition in certain regions flexibility in Cartesian space, which can be described in of the spectra. With these cautions in mind, it seems clear terms of essential deformation movements and quasi- from Figure 3 that PARMBSC1 is able to overcome some harmonic entropies (see ‘Materials and Methods’ section of the deviations from experiment described previously us- and reference (51)). These estimates cannot be directly com- ing PARMBSC0 (60), and provide spectra that fit well the pared with experimental observables, but are very useful experimental ones. Quantitative comparison of peak loca- in determining the similarity between deformation patterns tion reveals that PARMBSC1 recovers the first peak (P1) and stiffnesses derived from different force fields ( 6). As −1 near q ≈ 0.45 A , which was reported to be absent in shown in Supplementary Figure S3, the dynamics of the PARMBSC0 simulation (65). The major deviation from ex- central 10 bp of DDD obtained with PARMBSC1 have periment was found at wide angles (P5), where PARMBSC1 a very high (>75%) similarity with respect to a reference is slightly shifted respect to the experimental value, but simulation using PARMBSC0. This similarity increases to where the resolution of both theory (43) and experiment is more than 90% if Boltzmann indices are considered (Sup- also lower and peak location is not so clear. It is worth not- plementary Figure S3D). We can conclude that the nature ing that in general PARMBSC1 fits the experimental pro- and the magnitude of the principal deformations of DNA files with a quality similar or better than the best experi- are very similar with both force fields. This is confirmed by mentally derived structures (by NMR or X-Ray), even in an analysis of the Cartesian entropies and the stiffnesses the most complicated region (P1–P3) that reflects the struc- associated with the main deformation movements (Supple- ture of the sugar-phosphate backbone (see Supplementary mentary Table S6). Helical stiffness analysis provides an Table S5). alternative picture of DNA dynamics by considering lo- In summary, PARMBSC1 trajectories reproduce experi- cal perturbations of helical parameters (10). Results at the mental observables accurately and seem robust with respect base pair level (Supplementary Figure S4) and at the base to the (somewhat arbitrary) selection of ion and solvent pair step level (Supplementary Figure S5) show the expected force fields. In terms of general DNA structure, the trajecto- sequence dependence (6) and confirm the similarity be- ries are also quite insensitive to ionic strength (over a ‘phys- tween PARMBSC0 and PARMBSC1 results. Finally, poly- + − iological’ range) and to the nature of the salt (Na Cl or meric MD-stiffnesses derived from the extension of DDD + − K Cl ). These results suggest that globally, despite the use to a very long duplex (see ‘Materials and Methods’ section) of simple additive potentials (66,67) PARMBSC1 is per- demonstrate that PARMBSC1 also reproduces the persis- Nucleic Acids Research, 2016, Vol. 44, No. 9 4057 Figure 2. Comparison at the bps level between the theoretical and experimental rotational spaces. Rotational parameters (tilt, roll and twist) are reported in degrees. All distinct bps found in DDD are shown (removing the ends): GC (first column), CG (second column), GA (third column), AA (fourth column) and AT (fifth column). Smoothed 2D densities, estimated by fitting the observed distributions to a bivariated normal kernel (evaluated on a square grid of 90 × 90 bins), are depicted by coloring the points coming from the MD simulations with a color gradient from low (blue) to high (red) density. Four iso-density curves are shown in white, and are quantified on the bottom right side of each plot. Experimental conformations are shown as black dots (supplementary Table S1). tence length of duplex DNA with values ranging from 48 to clearly showing that PARMBSC1 simulations significantly 57 nm depending on simulation conditions (Supplementary sample extreme conformations, more frequently than ex- Table S6), compared to experimental estimates of around 50 pected from the harmonic regime. Furthermore, we applied nm (68). the Shapiro–Wilk test to check whether the sets of distor- To further investigate the capacity of MD to sample ex- tion energies could be drawn from a normally distributed treme conformations and also structural distortions beyond population. For all the trajectories, we rejected the null hy- the harmonic regime, we fitted the deformation energies (see pothesis with P < 0.05, supporting the deviation from nor- ‘Materials and Methods’ section) with a normal distribu- mality. We also analyzed the complete space of deforma- −1 tion obtaining an average () of 1.8 kcal mol with a stan- tion energies (combining the results obtained separately for dard deviation () of 0.4 for the meta-trajectory (note that each trajectory), using a graphical approximation. The dis- we obtained the same  and  for the single 10 s trajec- tribution, Q-Q plot, and boxplot presented in Supplemen- tory). If the movements of the DNA could indeed be de- tary Figure S6, clearly supports the presence of anharmonic scribed by the harmonic regime, one would expect that the distortions related with highly bent structures. tails of the distribution beyond  ± 2 would account for We saw above that solvent and ion force-field models, in- + − + − 4.56% of the total probability distribution. Following the cluding both Na Cl or K Cl , had little impact on the same reasoning, the probability that a normal deviate would global structure or flexibility of DDD. This may seem rea- lie beyond  ± 3 is at most 0.27%. Counting the num- sonable given the ‘physiological’ conditions range used in ber of times these extreme regions are sampled in the 10 s this work (see Noy and Golestanian (69)). However, while long trajectory led to 5.64 and 1.32% beyond the  ± 2 simulations carried out with minimal (neutralizing) ionic and  ± 3 limits respectively. Using the complete meta- strength seem to lead to shorter persistence lengths (Supple- trajectory that describes 43 s of DDD in different envi- mentary Table S6), there is no systematic trend relating flex- ronments we obtained 8.91% ( ± 2) and 2.41% ( ± 3), ibility and ionic strength. Note that the use of the AMBER 4058 Nucleic Acids Research, 2016, Vol. 44, No. 9 solvent or ionic atmosphere. We investigated this possibility in more detail by looking at the interactions of DNA with ions. The first point that becomes evident when looking at the trajectories is that while DNA structure is reasonably well converged in several hundred ns (46), the ionic envi- ronment may require significantly more time to converge, as suggested from earlier simulations (47). This is indeed what the analysis of ion population at the base pair step level (see ‘Materials and Methods’ section) shows (see Sup- + − plementary Figures S7–10 for the analysis of K Cl . Simi- + − lar profiles were obtained for Na Cl ). It is also clear that the convergence of the ion distribution depends on the ion force field (Supplementary Figures S7 and 8) and also on the region of DNA that is analyzed. As an example, ions rep- resented with the J&J model converge quickly (200 ns) in- side the grooves and around the DNA, whereas the J&C ion atmosphere is not fully converged in the 3 s studied here (Supplementary Figures S7 and 8). It is also clear that con- vergence is in general faster in external regions (around the phosphates, Supplementary Figures S7 and 9), than within the grooves and notably within the narrow minor groove where saw tooth-like curves can be observed (Supplemen- tary Figures S8 and 10). In these cases, convergence is not fully guaranteed even after 5 s (see the 150 mM S&D simu- lation in Supplementary Figure S10). It is worth noting that convergence problems do not decrease when ionic strength is increased, despite the fact that more ions are available in the DNA environment, indicating that it is not a sim- ple statistical problem. Indeed, the saw tooth-like popu- lation curves of S&D ions inside the minor groove (espe- cially at the central AT step) in the minimal salt simula- tions are present, and sometimes even amplified, in simu- lations at higher ionic strength (see Supplementary Figure S10). This suggests that ions visiting some narrow regions in Figure 3. Solution scattering profiles. ( A) Solution interference patterns the grooves may be frustrated (47), trapped in an oscillatory computed with the RISM approach (43) for the DDD crystal (PDB ID: regime between two different substates. Ions with long resi- 1BNA, green), the NMR (PDB ID:1GIP, yellow) and the average structure dence times inside the grooves, could also explain part of the from the MD simulation (PARMBSC1, blue), compared to the experimen- tal profile (red). Vertical dotted lines in red represent the peaks determined oscillatory regime. Thus, using the S&D 0.15 M simulation experimentally. (B) Scattering profiles obtained from the MD simulation we compared the volume of the groove, the time evolution of with PARMBSC1 using the method from Park et al.(42) (see ‘Materials + K ions visiting the minor groove and the average residence and Methods’ section), compared to the experimental result. The positions at A T and C G , which are the two most populated bps 6 7 3 4 of the peaks are reported in Supplementary Table S5. Note that the abso- at physiological concentration (Supplementary Table S7). lute intensities were accordingly shifted to a common origin to maximize the overlap. The data to produce the experimental curve was a courtesy of The average volume of the minor groove is significantly nar- 3 3 Prof David Tiede (41). ˚ ˚ rower for AT (193 A ) than for GC (239 A ) as previously re- ported (7,22). We also found that the average residence time of K inside the minor groove was 108 ps for AT versus 50 implementation of PME to treat long-range electrostatics ps for CG (if we consider an ion to be present when it stays precludes performing simulations at very low ionic strength at least 20 consecutive ps inside the groove (46)). Indeed, (net-charged systems), where the connection between global K ions are able to remain within the AT step for several flexibility and ionic strength could become significant. The hundreds of ns (Supplementary Figure S11). During these reason is the implicit presence of a net-neutralizing plasma long periods there is a higher probability of simultaneously that appears due to the omission of the zeroeth-order term finding two cations inside the narrow groove of AT, com- in the reciprocal Ewald sum (70,71). pared to CG. Based on the visual inspection of a single ex- tended trajectory, this double occupancy seems to produce Ion atmosphere an imbalance that triggers the release of both ions from the Previous sections have shown that the global structure and groove within a few ps. This could explain the oscillatory dynamics of DNA is not dramatically dependent on the sol- ion population at the AT step, as it indeed occurs at each vent or ion force field, the nature of the monovalent cations of the sawtooth-like peak we observed in the AT time series + + (Na or K ), or the ionic strength (within the range stud- (Supplementary Figure S11). Nevertheless, a more system- ied). However, this robustness of DNA to environmental atic approach with statistical support should be undertaken conditions does not preclude local changes linked to the Nucleic Acids Research, 2016, Vol. 44, No. 9 4059 to confirm these findings. Similar events are not seen in the cations ‘fixed’ in the crystal cell, and the high concentration minor groove of CG steps (Supplementary Figure S12). at which the cations were found inside the grooves of some While remaining cautious with respect to convergence specific bps (up to two orders of magnitude higher respect problems, we can reach some general conclusions on the to the physiological background), make us think that these impact of the ion force field on ion populations around cations reached their final position in the crystal following DNA. Of the four K models tested, the Lorentz-Berthelot a clear sequence-dependent pattern. (LB) implementation of J&J is the one showing the weak- As discussed below, differences in cation population or est affinity for DNA (Figure 4 and Supplementary Figure density do not lead to significant structural or dynamic dif- S8), leading to very low ion populations inside the grooves ferences in DNA. However, a detailed analysis does show and failing to explain the regions of high cation density local changes linked to ion populations in the grooves. As an found experimentally (25,72,73) in the minor groove of the example, the J&J model which showed the weaker affinity AATT segment (note that this different behavior could be for DNA leads to wider grooves (very visible in the AATT due to the conversion from geometric to LB combination segment, see Supplementary Figure S14), while the J&C rules used to build the Lennard-Jones potential in AMBER model, with the strongest DNA affinity, leads to narrower (37,63), since these parameters were created to work with minor grooves in the central AATT segment. A clear corre- another van der Waals functional (28)). J&C is the one with lation is also observed between increasing ion concentration the strongest DNA affinity, possibly explaining its severe and the width of both DNA grooves (Supplementary Fig- convergence problems in regions with narrow grooves. Fi- ure S16). With the S&D model, the average minor groove ˚ ˚ nally, the S&D and B&R models (the two most used in width changes from 4.06 A (neutral system) to 3.88 A(2.0 −5 DNA simulations), at a first glance, give similar ion dis- M system; standard error of 2 × 10 A). The absolute dif- tributions (see Figure 4 and Supplementary Figure S8). A ference between these two extreme conditions is small in ab- more detailed picture of ion environment can be obtained solute terms, but is enough to add extra structural frustra- by looking at 3D density plots (see ‘Materials and Meth- tion to K+ ions entering the minor groove. In general, an ods’ section) such as those shown for B&R, J&C and S&D increased population of ions attached to DNA leads to an in Figure 5. increase in the local stiffness associated to the central AATT Looking at the minimal salt trajectories, differences in tract, but the differences are evident only when the ‘extreme’ ionic atmosphere are especially visible at the edges of the J&J and J&C models are compared (Supplementary Figure groove, around the phosphate groups (where only the J&C S15). In contrast to groove geometry, no noticeable changes model generates ion density when analyzing the 1.5 M iso- were found in BI/BII populations. Overall, it is clear that B- molarity surface, Supplementary Figure S13) and in the DNA is more affected by the choice of ion force fields than central minor groove region, where the J&C model tends by the bulk ion concentration (Supplementary Figures S15 to concentrate all ion density in the AT and CG steps. In and 16), but these effects remain relatively mild. contrast the J&J model predicts a low ion concentration of ions anywhere in the groove, while S&D and B&R models Long-timescale dynamics of DNA show a more homogeneous distribution of ions along the groove (Figure 4, see Supplementary Tables S7–10). Look- Results above demonstrate that, in general terms, the tra- ing at high isomolarity surfaces (5 M) shows that not all jectories obtained from MD simulations are robust with the parameterizations are able to reproduce the location respect to choices of water or ion force fields, the ionic of the K ions that co-crystallized with the DNA in the strength or the nature of the salt. We next decided to ex- high-resolution X-ray experiment (25). It could be argued tend one of our trajectories (S&D, minimal salt (22,46)) to that those co-crystallized cations reached their final loca- 10 s to explore slow conformational changes that might tion in the crystal cell due to packing or crystallization ef- be not visible in shorter simulations. The entire 10 s DDD fects. Nevertheless, the cations are found buried inside the trajectory samples canonical B-DNA conformations which grooves of the DNA in close interaction with the bases. are very close to both X-Ray (21,74–76) and NMR (77) The analysis of their precise location as been the subject of structures (Figure 6). The average helical parameters (twist, several studies, where the position of cations is correlated roll, tilt, shift, slide, rise) derived from the MD simulation with changes in the groove widths, being part of a complex (including terminal bases) is (35.2, 2.8, 0.2, 0.0, −0.2, 3.3), structured network involving cations, water and DNA in- which compares extremely well with the results derived from teractions (22,73). While J&C and S&D models correctly NMR ensembles (PDB ID: 1NAJ (35.9, 2.3, 0.0, 0.0, −0.2, predict the position of K ions in both grooves (S&D be- 3.2)) or X-Ray crystallography (PDB IDs: 1BNA, 2BNA, ing the more accurate), B&R only reproduces the cations 7BNA and 9BNA: (35.0, −0.3, −0.2, −0.1, −0.2, 3.3), con- found in the major groove, while a systematic shift of the firming the ability of PARMBSC1 to reproduce the overall density clouds is observed in the minor groove (Figure 5). conformational properties of DNA (see Table 2 for more de- When the ionic strength is increased (only studied for the tails) in simulations that are at least an order of magnitude S&D model), the general 3D distribution of K does not longer than today’s ‘state-of-the-art’. change dramatically (see Supplementary Figure S14), ex- This long trajectory is also able to capture some subtle cept for the overall increase in ion density that is particularly details, such as the lower  values sampled by guanosines visible in the major groove and at the groove edges close compared with the other nucleotides, sugar phase angle dis- to the phosphate groups where new sites are populated. tributions sampling the correct South–South East regions The general good agreement between the densities coming (Supplementary Table S11), the sequence variability of he- from the free dynamics in solution and the co-crystallized lical twist (7,51)and BI/BII populations (Supplementary 4060 Nucleic Acids Research, 2016, Vol. 44, No. 9 Figure 4. Average K populations inside and outside the DNA grooves. Populations inside the major and the minor groove (left two panels), and outside both grooves (right two panels). The populations were measured for each bps removing the terminal ones (see ‘Materials and Methods’ section). See Supplementary Figure S1 for the CHC partitioning scheme used to divide the grooves. Figure 5. Potassium distributions along the helix. Cartesian K isomolarity surfaces at 5.0 M reconstructed from the CHC histograms with respect to the average structure (shown as a silver surface). For comparison purposes, neutral systems have been overlapped with the Tl cations (red spheres) that co-crystallized with the DNA (PDB ID: 1JGR). Note that thallium cations are used as a replacement of potassium in diffraction experiments (1). The distribution with the J&J parameters are not shown since any visible density was observed at 5.0 M concentration. Nucleic Acids Research, 2016, Vol. 44, No. 9 4061 Figure 6. Descriptors of the quality of the simulation: (A) Root mean square deviation (RMSD) of all the heavy atoms of the Drew–Dickerson dodecamer (DDD) respect to the average experimental value. In blue, RMSD against an average of the X-ray structures (PDB IDs: 1BNA, 2BNA, 7BNA and 9BNA); In orange, RMSD against an average of the NMR ensemble with 5 structures (PDB ID: 1NAJ). For the sake of clarity, running averages every 20 ns are shown in dark orange and dark blue, for X-ray and NMR respectively. (B) Same than (A) but without considering the capping base pairs (i.e. removing all the heavy atoms of base pairs C1:G34 and G12:C13). (C) Evolution of the total number of Watson–Crick hydrogen bonds (Hbonds) with time. Considering a perfect interaction between the 12 bp would lead to a total of 32 Hbonds (light blue dashed line). Without considering the capping base pairs (light red), the ideal total number of Hbonds is 26 (light red dashed line). Hbonds were considered formed if the distance between the donor–acceptor atoms was ≤3.5 A. (D) Sequence averaged twist for all the base pair steps (bps), excluding the terminals, with time. The average MD value is shown with a black line, while the experimental references are shown in dark red and orange dashed lines, for X-ray and NMR respectively. Figure S17). Backbone torsions follow the expected behav- plexes (78) are also detected here (Figure 7), thus improving ior for a duplex B-DNA, preferentially exploring the canon- on PARMBSC0 behavior. As shown in Supplementary Fig- ical B-DNA substate characterize by  in gauche− (g−)and ure S18,  transitions are common (on average 460 tran- in gauche+ (g+) (Supplementary Figure S18 and Supple- sitions per s per nucleotide) and fast (average residence mentary Table S12), with  in trans (t) and  in g− (i.e. the times being around 3 ps); although only 0.01% of the non- BI state). canonical  states have non-negligible survival times of Very interestingly, despite the canonical  state be- up to 1.2 ns (averaging across the four nucleotide types). ing the most populated (in agreement with previous Long-lived -flips of around 100 ns from the g+ to the non- PARMBSC0 simulations (10,22)), all the non-canonical canonical t state where not observed. Note that these flips conformations found in experimental protein–DNA com- were recently suggested to be a source of convergence prob- 4062 Nucleic Acids Research, 2016, Vol. 44, No. 9 Table 2. Sequence-averaged conformational parameters obtained from the 10 s Drew–Dickerson dodecamer simulation b c Parameter Average SD Range Minimum Maximum NMR Xray Shear 0.00 0.30 6.43 − 3.62 2.81 0.00 0.03 Stretch 0.02 0.12 3.27 − 0.78 2.48 − 0.29 0.19 Stagger 0.10 0.38 4.79 − 2.15 2.66 0.02 0.21 Buckle 0.0 9.7 92.9 − 48.3 48.0 0.0 − 0.5 Propeller − 9.2 8.4 81.6 − 49.6 39.2 − 17.4 − 14.4 Opening 1.3 4.0 71.1 − 29.4 56.8 1.1 1.6 Xdisp − 0.58 1.05 10.42 − 6.05 4.36 − 0.01 − 0.15 Ydisp 0.00 0.84 9.16 − 4.59 4.58 0.02 0.52 Inclination 2.2 5.7 56.0 − 26.1 31.0 1.7 − 0.6 Tip 0.1 6.9 58.6 − 41.7 41.2 0.0 − 2.6 Shift − 0.01 0.81 7.31 − 3.71 3.63 0.00 − 0.07 Slide − 0.24 0.53 5.50 − 3.17 2.34 − 0.22 0.14 Rise 3.32 0.29 3.32 1.96 5.30 3.20 3.35 Tilt − 0.1 4.7 44.6 − 22.6 23.0 0.0 − 0.4 Roll 1.5 5.5 60.1 − 31.8 31.7 2.3 − 0.7 Twist 34.3 5.5 52.3 3.4 57.1 36.0 35.2 − 72.6 18.2 314.4 − 60.2 − 57.5 166.8 21.7 251.1 171.0 166.4 55.6 23.2 243.6 49.6 48.3 135.2 14.5 119.7 125.7 126.3 − 159.9 23.5 169.1 − 170.8 − 164.3 − 111.4 36.1 203.1 − 103.5 − 112.1 − 111.7 16.2 138.6 − 111.5 − 113.5 Phase 152.0 27.2 267.3 135.0 135.7 Amplitude 41.3 6.5 61.4 34.0 41.1 Capping base pairs were removed from the analysis. Computed from the ensemble of structures (PDB ID: 1NAJ). Computed from the X-ray structures with PDB IDs: 1BNA, 2BNA, 7BNA and 9BNA. For the dihedral angles only the Watson strand was considered. lems when using PARMBSC0 (10). As expected (7,22), C (Supplementary Table S13). Note also that the weighted- and G nucleotides show longer-lived and more frequent  average twist obtained with the BIC method (82) is higher transitions than A or T (Supplementary Figures S18). How- with PARMBSC1 as a consequence of the higher popula- ever, the occurrence of non-canonical  states is not coop- tion of the high twist state (0.66 versus 0.52 in PARMBSC0; erative and does not lead to the destructuring of the dou- Supplementary Table S13). PARMBSC1 is able to correctly ble helix that was found with older force fields ( 18). On av- reproduce this complex choreography: twist is correlated erage, at any given moment, less than one (0.86) of the 24 with ,withBI/BII, with the formation of the CH-O in- nucleotides is in an unusual  state. Extrapolating to poly- teraction and with the slide polymorphism in the neighbor- meric DNA implies that 3.6% of nucleotides will exhibit an ing step (Supplementary Figure S20). This is expected to unusual  conformation. This could be a factor favoring be particularly important in understanding indirect recog- recognition by specific proteins, given that crystal structures nition of DNA by proteins (83,84,85), and also the mech- of protein-DNA complexes show a significant percentage of anism of DNA intercalation by small organic compounds such states (10%, (67)). (46,86). As expected from previous experimental and theoretical We have also used our long (10 s) DDD simulation to studies (7,8,22,79,80), C·G base pairs show a significant investigate base opening events. It is experimentally known propensity for BI/BII transitions (Figure 7), as evidenced that base opening (understood as conformational states by the concerted changes in  and  from t to g− and from where bases are unpaired for significant periods of time) g− to t respectively (Supplementary Figure S19). BI/BII happens on the millisecond timescale (87), at least for cod- relative populations are notably improved with respect to ing nucleobases (88), and thus far beyond our simulation PARMBSC0, and now reproduce more accurately NMR window. Terminal base pairs, where stacking interactions experiments (Table 3 and Supplementary Figure S17). As are weaker should show slightly higher opening frequencies, previously suggested (8,46,81), BI/BII polymorphism is but available experimental data do not support the presence directly connected to two sources of structural polymor- of long-lived open states for terminal base pairs (20)and phism, namely, the twist bimodality found for CG steps and strongly warn against long-lived non-canonical conforma- the slide polymorphism observed for RpR steps. Concerted tions of terminal base pairs stabilized by interactions with movements of the backbone and the bases allow the for- the DNA grooves, as found in previous simulations with mation of an intra-molecular hydrogen bond of the type other force fields ( 20,89). The PARMBSC1 10 s trajectory C8H8-O3 between RpR steps, that was proved to be casu- shows that, as expected, the terminal C·G pairs are more ally connected to BII populations (8,46). These concerted labile that the central ones and it is not rare to lose some movements seems to be linked to twist polymorphism, the of the hydrogen bonds for short periods of time (see Figure low twist state being driven by the presence of cations specif- 6). Fraying events are common, and unusual arrangements ically binding in the minor groove of CG steps (46). Indeed, of the terminal bases are visited, but they quickly revert to a systematic increase in the low twist state is observed upon canonical Watson–Crick pairing (see Figure 8)asexperi- increasing the amount of added K+Cl− from 0.15 to 2.0M mentally expected (20). The tWC/SE state (where cytosine Nucleic Acids Research, 2016, Vol. 44, No. 9 4063 Table 3. Percentage of BI substates obtained with PARMBSC1 compared with PARMBSC0 and NMR experiments Bps PARMBSC0 PARMBSC1 NMR1 NMR2 GC 48 73 75 53 CG 87 78 75 66 GA 53 45 53 44 AA 91 82 75 63 AT 99 98 100 78 TT 100 99 99 93 TC 98 93 89 92 CG 79 62 75 77 GC 71 61 79 35 BI percentage for the bps in the DD dodecamer, obtained by averaging the difference between  and  angles at the 3 -junction of the Watson strand of each base pair. NMR1 and NMR2 are values obtained using phosphorus chemical shifts as detailed in the works of Schwieters et al. (NMR1 (80)) and Tian et al. (NMR2, (79)), respectively. is turned around the glycosidic bond from the anti to syn conformation to form a non-WC pair resembling the trans Watson–Crick/Sugar Edge C·G pair well-known in RNA −5 structures (90)), now occurs very rarely (probability <10 and with ps lifetimes), in line with NMR experiments, but in contrast to earlier simulations (7,89). The time evolution of the base pair opening parameter shows that most of the fraying is due to transient sampling of large opening angles (Figure 8). We observe very few events where the glycosidic torsion ( ) goes from anti to syn in the terminal cytosine (one such event is highlighted by the blue circle in Figure 8). These rare and reversible events are connected with the for- mation of aO2···5 OH intra-cytosine hydrogen bond, which in turn stabilizes the anomalous tWC/SE conformation. Our simulations suggest that rare and short-lived tWC/SE conformations do not affect neighboring base steps and have no impact on DNA structure and dynamics on multi- microsecond timescales. In addition, no through-the-groove interactions between terminal and inner bases are observed in our long trajectory. Finally, the 10 s trajectory allowed us to analyze conver- gence issues in unprecedented detail; significantly extend- ing previous studies (10,91,92). For this reason, we per- formed principle component analysis on segments of 1, 2 and 5 s extracted from the 10 s trajectory. Visual inspec- tion of Supplementary Figure S21 (supplementary mate- rial) shows slight divergence between 1 s segments, but no significant differences are observed between the 2 and 5 s segments. The smoothed histograms of the main principal components clearly overlap, suggesting that DNA is sam- pling the same conformational space. Similarly, differences in entropy values for segments of 2 s are smaller than 2% with respect to the entropy of the whole 10 s trajectory, independently of the method used to calculate the entropy (see Supplementary Figure S22 in the supplementary mate- rial). This analysis suggest that PARMBSC1 is able to sam- ple in 2 s, what the user can expect to see in terms of con- formational ensembles in 10 s(91), and for most purposes (ion distribution being an exception) 2 s trajectories can be considered to be converged. Figure 7. Major substates of the backbone observed during the simula- tion of DDD. (A) Scatter plot of  and  angles grouped by nucleobase. To obtain the distribution of A in the -plane the dynamics of the nucle- CONCLUSIONS obases A5 and A6 were considered together (similarly: C3/C9, G4/G10 and T7/T8 were used to build the C, G and T ensembles respectively). (B) The availability of PARMBSC1, a new and accurate force Same as (A), but for  and  angles. The global percentages (considering field for DNA, has allowed us to explore the long timescale both strands) of the major canonical states are shown in white. dynamics of the DDD, a prototypical B-DNA duplex, in aqueous solution. An unprecedented degree of sampling (involving more than 43 s of simulation, including a sin- 4064 Nucleic Acids Research, 2016, Vol. 44, No. 9 Figure 8. Analysis of the base-pair fraying at the ends of the dodecamer. Note that we used the same metrics described elsewhere (78) to analyze the fraying of DDD simulated with previous force fields. ( A) Formation of an anomalous intra-cytosine hydrogen bond observed with PARMBSC0 and PARMBSC0- OL4 (one formation event is highlighted by a blue circle). (B) Time evolution of the opening parameter. (C)  angle during simulation is shown in red, light blue, dark green and light green for C1, G12, C13 and G24 respectively. (D) Mass-weighted RMSD of the capping base pairs respect to the first structure of the simulation. (E) Representative structures of the four mayor conformations found are depicted bellow. A similar behavior was observed in the other simulations performed changing the environment conditions (data not shown). gle 10 s trajectory) has allowed us to test the impact of FUNDING different solvent and ion models, and of different salts, on MINECO Severo Ochoa Award of Excellence (Government DNA and to characterize the ion atmosphere around the of Spain) (to IRB Barcelona); Spanish Ministry of Science double helix. We have also analyzed the detailed interplay [BIO2012-32868, BFU2014-61670-EXP to M.O.]; Catalan between ions and local and global conformational changes, SGR (to M.O.); Instituto Nacional de Bioinformatica ´ (to the prevalence of non-harmonic distortions and we have ob- M.O.); European Research Council (ERC SimDNA) (to tained reliable estimates for conformational jumps that can M.O.); H2020 program (MuG and BioExcel projects) (to be important in explaining the specific binding of proteins M.O.); Czech Republic Grant Agency [14-21893S to T.D., or ligands to DNA. Lastly, the simulations of the DDD with F.L.]; ANR grant CHROME [ANR-12-BSV5-0017-01 to PARMBSC1 are shown to accurately reproduce experimen- R.L.]. Funding for open access charge: European Research tal data and to represent a clear improvement over the re- Council (ERC SimDNA). sults obtained with earlier force fields. Conflict of interest statement. None declared. SUPPLEMENTARY DATA Supplementary Data are available at NAR Online. Nucleic Acids Research, 2016, Vol. 44, No. 9 4065 REFERENCES 25. Howerton,S., Sines,C., VanDerveer,D. and Williams,L.D. (2001) Locating monovalent cations in the grooves of B-DNA. Biochemistry, 1. Hud,N.V. (2008) Nucleic Acid-Metal Ion Interactions. Biomolecular 40, 10023–10031. Sciences. The Royal Society of Chemistry, Cambridge. 26. Smith,D.E. and Dang,L.X. (1994) Computer simulations of NaCl 2. Geggier,S. and Vologodskii,A. (2010) Sequence dependence of DNA association in polarizable water. J. Chem. Phys., 100, 3757–3766. bending rigidity. Proc. Natl. Acad. Sci. U.S.A., 107, 15421–15426. 27. Joung,I.S. and Cheatham,T.E. (2008) Determination of alkali and 3. Huguet,J.M., Bizarro,C. V, Forns,N., Smith,S.B., Bustamante,C. and halide monovalent ion parameters for use in explicitly solvated Ritort,F. (2010) Single-molecule derivation of salt dependent biomolecular simulations. J. Phys. Chem. B, 112, 9020–9041. base-pair free energies in DNA. Proc. Natl. Acad. Sci. U.S.A., 107, 28. Jensen,K.P. and Jorgensen,W.L. (2006) Halide, ammonium, and 15431–15436. alkali metal ion parameters for modeling aqueous solutions. J. Chem. 4. Heng,J.B., Aksimentiev,A., Ho,C., Marks,P., Grinkova,Y.V., Theory Comput., 2, 1499–1509. Sligar,S., Schulten,K. and Timp,G. (2006) The electromechanics of 29. Beglov,D. and Roux,B. (1994) Finite representation of an infinite DNA in a synthetic nanopore. Biophys. J., 90, 1098–1106. bulk system: Solvent boundary potential for computer simulations. J. 5. Strick,T.R., Dessinges,M.-N., Charvin,G., Dekker,N.H., Chem. Phys., 100, 9050–9063. Allemand,J.-F., Bensimon,D. and Croquette,V. (2003) Stretching of 30. Shields,G.C., Laughton,C.A. and Orozco,M. (1997) Molecular macromolecules and proteins. Reports Prog. Phys., 66, 1–45. dynamics simulations of the d(T·A·T) triple helix. J. Am. Chem. Soc., 6. Per ´ ez,A., Lankas,F., Luque,F.J. and Orozco,M. (2008) Towards a 119, 7463–7469. molecular dynamics consensus view of B-DNA flexibility. Nucleic 31. Berendsen,H.J.C., Postma,J.P.M., van Gunsteren,W.F., DiNola,A. Acids Res., 36, 2379–2394. and Haak,J.R. (1984) Molecular dynamics with coupling to an 7. Drsa ˇ ta,T., Per ´ ez,A., Orozco,M., Morozov,A. V, Sponer,J. and external bath. 81, 3684–3690. Lankas,F ˇ . (2013) Structure, stiffness and substates of the 32. Mor,A., Ziv,G. and Levy,Y. (2008) Simulations of proteins with dickerson-drew dodecamer. J. Chem. Theory Comput., 9, 707–721. inhomogeneous degrees of freedom: The effect of thermostats. J. 8. Pasi,M., Maddocks,J.H., Beveridge,D., Bishop,T.C., Case,D.A., Comput. Chem., 29, 1992–1998. Cheatham,T., Dans,P.D., Jayaram,B., Lankas,F., Laughton,C. et al. 33. Nose,S ´ . and Klein,M.L. (2006) Constant pressure molecular (2014) ABC: a systematic microsecond molecular dynamics study of dynamics for molecular systems. Mol. Phys., 50, 1055–1076. tetranucleotide sequence effects in B-DNA. Nucleic Acids Res., 42, 34. Ryckaert,J.-P., Ciccotti,G. and Berendsen,H.J. (1977) Numerical 12272–12283. integration of the cartesian equations of motion of a system with 9. Lankas,F., Sponer,J., Hobza,P. and Langowski,J. (2000) constraints: molecular dynamics of n-alkanes. J. Comput. Phys., 23, Sequence-dependent elastic properties of DNA. J. Mol. Biol., 299, 327–341. 695–709. 35. Darden,T., York,D. and Pedersen,L. (1993) Particle mesh Ewald: an 10. Drsa ˇ ta,T. and Lankas,F ˇ . (2015) Multiscale modelling of DNA N·log(N) method for Ewald sums in large systems. J. Chem. Phys., 98, mechanics. J. Phys. Condens. Matter, 27, 323102. 10089–10092. 11. Per ´ ez,A., Luque,F.J. and Orozco,M. (2012) Frontiers in molecular 36. Salomon-Ferrer,R., Gotz,A.W ¨ ., Poole,D., Le Grand,S. and dynamics simulations of DNA. Acc. Chem. Res., 45, 196–205. Walker,R.C. (2013) Routine microsecond molecular dynamics 12. Arcella,A., Dreyer,J., Ippoliti,E., Ivani,I., Portella,G., Gabelica,V., simulations with AMBER on GPUs. 2. Explicit solvent particle mesh Carloni,P. and Orozco,M. (2015) Structure and dynamics of Ewald. J. Chem. Theory Comput., 9, 3878–3888. oligonucleotides in the gas phase. Angew. Chem. Int. Ed. Engl., 54, 37. Case,D.A., Babin,V., Berryman,J.T., Betz,R.M., Cai,Q., Cerutti,D.S., 467–471. Cheatham,T.E. III, Darden,T.A., Duke,R.E., Gohlke,H. et al. (2014) 13. Portella,G., Germann,M.W., Hud,N.V. and Orozco,M. (2014) MD AMBER. University of California, San Francisco. and NMR analyses of choline and TMA binding to duplex DNA: on 38. Roe,D.R. and Cheatham,T.E. (2013) PTRAJ and CPPTRAJ: the origins of aberrant sequence-dependent stability by alkyl cations software for processing and analysis of molecular dynamics trajectory in aqueous and water-free solvents. J. Am. Chem. Soc., 136, data. J. Chem. Theory Comput., 9, 3084–3095. 3075–3086. 39. Hospital,A., Faustino,I., Collepardo-Guevara,R., Gonzalez,C ´ ., 14. Arcella,A., Portella,G., Collepardo-Guevara,R., Chakraborty,D., Gelp´ı,J.L. and Orozco,M. (2013) NAFlex: a web server for the study Wales,D.J. and Orozco,M. (2014) Structure and properties of DNA in of nucleic acid flexibility. Nucleic Acids Res., 41, W47–W55. apolar solvents. J. Phys. Chem. B, 118, 8540–8548. 40. Lavery,R., Moakher,M., Maddocks,J.H., Petkeviciute,D. and 15. Dans,P.D., Walther,J., Gomez,H. ´ and Orozco,M. (2016) Multiscale Zakrzewska,K. (2009) Conformational analysis of nucleic acids simulation of DNA. Curr. Opin. Struct. Biol., 37, 29–45. revisited: Curves+. Nucleic Acids Res., 37, 5917–5929. 16. Sponer,J., Cang,X. and Cheatham,T.E. (2012) Molecular dynamics 41. Zuo,X., Cui,G., Merz,K.M., Zhang,L., Lewis,F.D. and Tiede,D.M. simulations of G-DNA and perspectives on the simulation of nucleic (2006) X-ray diffraction “fingerprinting” of DNA structure in acid structures. Methods, 57, 25–39. solution for quantitative evaluation of molecular dynamics 17. Sim,A.Y.L., Minary,P. and Levitt,M. (2012) Modeling nucleic acids. simulation. Proc. Natl. Acad. Sci. U.S.A., 103, 3534–3539. Curr. Opin. Struct. Biol., 22, 273–278. 42. Park,S., Bardhan,J.P., Roux,B. and Makowski,L. (2009) Simulated 18. Varnai,P ´ . and Zakrzewska,K. (2004) DNA and its counterions: a x-ray scattering of protein solutions using explicit-solvent models. J. molecular dynamics study. Nucleic Acids Res., 32, 4269–4280. Chem. Phys., 130, 134114. 19. Per ´ ez,A., Marchan,I., ´ Svozil,D., Sponer,J., Cheatham,T.E., 43. Nguyen,H.T., Pabit,S.A., Meisburger,S.P., Pollack,L. and Case,D.A. Laughton,C.A. and Orozco,M. (2007) Refinement of the AMBER (2014) Accurate small and wide angle x-ray scattering profiles from force field for nucleic acids: improving the description of atomic models of proteins and nucleic acids. J. Chem. Phys., 141, alpha/gamma conformers. Biophys. J., 92, 3817–3829. 22D508. 20. Ivani,I., Dans,P.D., Noy,A., Per ´ ez,A., Faustino,I., Hospital,A., 44. Zuo,X. and Tiede,D.M. (2005) Resolving conflicting crystallographic Walther,J., Andrio,P., Goni,R., ˜ Balaceanu,A. et al. (2015) Parmbsc1: and NMR models for solution-state DNA with solution X-ray a refined force field for DNA simulations. Nat. Methods, 13, 55–58. diffraction. J. Am. Chem. Soc., 127, 16–17. 21. Drew,H.R., Wing,R.M., Takano,T., Broka,C., Tanaka,S., Itakura,K. 45. Lavery,R., Maddocks,J.H., Pasi,M. and Zakrzewska,K. (2014) and Dickerson,R.E. (1981) Structure of a B-DNA dodecamer: Analyzing ion distributions around DNA. Nucleic Acids Res., 42, conformation and dynamics. Proc. Natl. Acad. Sci. U.S.A., 78, 8138–8149. 2179–2183. 46. Dans,P.D., Faustino,I., Battistini,F., Zakrzewska,K., Lavery,R. and 22. Perez,A., Luque,F.J. and Orozco,M. (2007) Dynamics of B-DNA on Orozco,M. (2014) Unraveling the sequence-dependent polymorphic the microsecond time scale. J. Am. Chem. Soc., 129, 14739–14745. behavior of d(CpG) steps in B-DNA. Nucleic Acids Res., 42, 23. Trieb,M., Rauch,C., Wellenzohn,B., Wibowo,F., Loerting,T. and 11304–11320. Liedl,K.R. (2004) Dynamics of DNA: B I and B II Phosphate 47. Pasi,M., Maddocks,J.H. and Lavery,R. (2015) Analyzing ion Backbone Transitions. J. Phys. Chem. B, 108, 2470–2476. distributions around DNA: sequence-dependence of potassium ion 24. Arnott,S. and Hukins,D.W. (1972) Optimised parameters for A-DNA distributions from microsecond molecular dynamics. Nucleic Acids and B-DNA. Biochem. Biophys. Res. Commun., 47, 1504–1509. Res., 43, 2412–2423. 4066 Nucleic Acids Research, 2016, Vol. 44, No. 9 48. Schlitter,J. (1993) Estimation of absolute and relative entropies of 73. Shui,X., Sines,C.C., McFail-Isom,L., VanDerveer,D. and macromolecules using the covariance matrix. Chem. Phys. Lett., 215, Williams,L.D. (1998) Structure of the potassium form of 617–621. CGCGAATTCGCG: DNA deformation by electrostatic collapse 49. Andricioaei,I. and Karplus,M. (2001) On the calculation of entropy around inorganic cations. Biochemistry, 37, 16877–16887. from covariance matrices of the atomic fluctuations. J. Chem. Phys., 74. Drew,H., Samson,S. and Dickerson,R.E. (1982) Structure of a 115, 6289–6292. B-DNA dodecamer at 16 K. Proc. Natl. Acad. Sci. U.S.A., 79, 50. Hess,B. (2000) Similarities between principal components of protein 4040–4044. 75. Westhof,E. (1987) Re-refinement of the B-dodecamer dynamics and random diffusion. Phys. Rev. E. Stat. Phys. Plasmas. d(CGCGAATTCGCG) with a comparative analysis of the solvent in Fluids. Relat. Interdiscip. Topics, 62, 8438–8448. it and in the Z-hexamer d(5BrCG5BrCG5BrCG). J. Biomol. Struct. 51. Per ´ ez,A., Blas,J.R., Rueda,M., Lopez-Bes,J ´ .M., de la Cruz,X. and Dyn., 5, 581–600. Orozco,M. (2005) Exploring the essential dynamics of B-DNA. J. 76. Holbrook,S., Dickerson,R. and Kim,S.-H. (1985) Anisotropic Chem. Theory Comput., 1, 790–800. thermal-parameter refinement of the DNA dodecamer 52. Orozco,M., Per ´ ez,A., Noy,A. and Luque,F.J. (2003) Theoretical CGCGAATTCGCG by the segmented rigid-body method. Acta methods for the simulation of nucleic acids. Chem.Soc.Rev., 32, Crystallogr. B, 41, 255–262. 350–364. 77. Wu,Z., Delaglio,F., Tjandra,N., Zhurkin,V. and Bax,A. (2003) 53. Noy,A. and Golestanian,R. (2012) Length Scale Dependence of Overall structure and sugar dynamics of a DNA dodecamer from DNA Mechanical Properties. Phys. Rev. Lett., 109, 228101. homo- and heteronuclear dipolar couplings and (31)P chemical shift 54. Zheng,G., Czapla,L., Srinivasan,A.R. and Olson,W.K. (2010) How anisotropy. J. Biomol. Nmr, 26, 297–315. stiff is DNA? Phys. Chem. Chem. Phys., 12, 1399–1406. 78. Varnai,P. (2002) alpha/gamma transitions in the B-DNA backbone. 55. Mecklin,C.J. (2007) Shapiro-Wilk Test for Normality. In: Salkind,NJ Nucleic Acids Res., 30, 5398–5406. and Rasmussen,K (eds). Encyclopedia of Measurement and Statistics. 79. Tian,Y., Kayatta,M., Shultis,K., Gonzalez,A., Mueller,L.J. and SAGE Publications, Inc., Thousand Oaks, pp. 884–887. Hatcher,M.E. (2009) 31P NMR investigation of backbone dynamics 56. R Core Team. (2013) R: A language and environment for statistical in DNA binding sites. J. Phys. Chem. B, 113, 2596–2603. computing. R Foundation for Statistical Computing, Vienna. 80. Schwieters,C.D. and Clore,G.M. (2007) A physical picture of atomic 57. Wickham,H. (2009) ggplot2. Springer, NY. motions within the Dickerson DNA dodecamer in solution derived 58. Humphrey,W., Dalke,A. and Schulten,K. (1996) VMD: visual from joint ensemble refinement against NMR and large-angle X-ray molecular dynamics. J. Mol. Graph., 14, 33–38. scattering data. Biochemistry, 46, 1152–1166. 59. Pettersen,E.F., Goddard,T.D., Huang,C.C., Couch,G.S., 81. Dans,P.D., Per ´ ez,A., Faustino,I., Lavery,R. and Orozco,M. (2012) Greenblatt,D.M., Meng,E.C. and Ferrin,T.E. (2004) UCSF Exploring polymorphisms in B-DNA helical conformations. Nucleic Chimera–a visualization system for exploratory research and Acids Res., 40, 10668–10678. analysis. J. Comput. Chem., 25, 1605–1612. 82. Schwarz,G., Annals,T. and Mar,N. (1978) Estimating the dimension 60. Savelyev,A. and MacKerell,A.D. (2015) Differential impact of the of a model. Ann. Stat., 6, 461–464. monovalent ions Li(+), Na(+), K(+), and Rb(+) on DNA 83. Djuranovic,D. and Hartmann,B. (2004) DNA fine structure and conformational properties. J. Phys. Chem. Lett., 6, 212–216. dynamics in crystals and in solution: the impact of BI/BII backbone 61. Jorgensen,W.L., Chandrasekhar,J., Madura,J.D., Impey,R.W. and conformations. Biopolymers, 73, 356–368. Klein,M.L. (1983) Comparison of simple potential functions for 84. Wecker,K. (2002) The role of the phosphorus BI-BII transition in simulating liquid water. J. Chem. Phys., 79, 926–935. protein-DNA recognition: the NF-kappaB complex. Nucleic Acids 62. Berendsen,H.J.C., Grigera,J.R. and Straatsma,T.P. (1987) The missing Res., 30, 4452–4459. term in effective pair potentials. J. Phys. Chem., 91, 6269–6271. 85. Madhumalar,A. and Bansal,M. (2005) Sequence preference for 63. Noy,A., Soteras,I., Luque,F.J. and Orozco,M. (2009) The impact of BI/BII conformations in DNA: MD and crystal structure data monovalent ion force field model in nucleic acids simulations. Phys. analysis. J. Biomol. Struct. Dyn., 23, 13–27. Chem. Chem. Phys., 11, 10596–10607. 86. Frederick,C.A., Williams,L.D., Ughetto,G., Van der Marel,G.A., Van 64. Per ´ ez,A., Luque,F.J. and Orozco,M. (2012) Frontiers in molecular Boom,J.H., Rich,A. and Wang,A.H.J. (1990) Structural comparison dynamics simulations of DNA. Acc. Chem. Res., 45, 196–205. of anticancer drug-DNA complexes: adriamycin and daunomycin. 65. Savelyev,A. and MacKerell,A.D. (2015) Differential impact of the + + + + Biochemistry, 29, 2538–2549. monovalent ions Li ,Na ,K , and Rb on DNA conformational 87. Fei,J. and Ha,T. (2013) Watching DNA breath one molecule at a time. properties. J. Phys. Chem. Lett., 6, 212–216. Proc. Natl. Acad. Sci. U.S.A., 110, 17173–17174. 66. Levitt,M. (2001) The birth of computational structural biology. Nat. 88. Cubero,E., Sherer,E.C., Luque,F.J., Orozco,M. and Laughton,C.A. Struct. Biol., 8, 392–393. (1999) Observation of spontaneous base pair breathing events in the 67. Bixon,M. and Lifson,S. (1967) Potential functions and conformations molecular dynamics simulation of a difluorotoluene-containing DNA in cycloalkanes. Tetrahedron, 23, 769–784. oligonucleotide. J. Am. Chem. Soc., 121, 8653–8654. 68. Mazur,A.K. and Maaloum,M. (2014) Atomic force microscopy study ´ ˇ ˇ of DNA flexibility on short length scales: smooth bending versus 89. Zgarbova,M., Otyepka,M., Sponer,J., Lankas,F. and Jurecka,P. kinking. Nucleic Acids Res., 42, 14006–14012. (2014) Base pair fraying in molecular dynamics simulations of DNA and RNA. J. Chem. Theory Comput., 10, 3177–3189. 69. Noy,A. and Golestanian,R. (2010) The chirality of DNA: elasticity 90. Leontis,N.B., Stombaugh,J. and Westhof,E. (2002) The cross-terms at base-pair level including A-tracts and the influence of non-Watson-Crick base pairs and their associated isostericity ionic strength. J. Phys. Chem. B, 114, 8022–8031. matrices. Nucleic Acids Res., 30, 3497–3531. 70. de Leeuw,S.W., Perram,J.W. and Smith,E.R. (1980) Simulation of 91. Galindo-Murillo,R., Roe,D.R. and Cheatham,T.E. (2014) On the electrostatic systems in periodic boundary conditions. I. Lattice sums absence of intrahelical DNA dynamics on the s to ms timescale. and dielectric constants. Proc.R.Soc.AMath.Phys. Eng. Sci., 373, Nat. Commun., 5, 5152. 27–56. 92. Galindo-Murillo,R., Roe,D.R. and Cheatham,T.E. (2015) 71. Sagui,C. and Darden,T.A. (1999) Molecular dynamics simulations of Convergence and reproducibility in molecular dynamics simulations biomolecules: long-range electrostatic effects. Annu. Rev. Biophys. of the DNA duplex d(GCACGAACGAACGAACGC). Biochim. Biomol. Struct., 28, 155–179. Biophys. Acta, 1850, 1041–1058. 72. Shui,X., McFail-Isom,L., Hu,G.G. and Williams,L.D. (1998) The B-DNA dodecamer at high resolution reveals a spine of water on sodium. Biochemistry, 37, 8341–8355.

Journal

Nucleic Acids ResearchOxford University Press

Published: May 19, 2016

There are no references for this article.