TY - JOUR AB - Introduction While aging is a fundamental characteristic of living systems, its underlying principles are still to be fully deciphered. Recent observations of ageing in unicellular models, in the absence of genetic or environmental variability, have paved the way for new quantitative experimental systems to address ageing's underlying molecular mechanisms [1], [2]. Further, the notion of aging was extended beyond asymmetrically dividing unicellular organisms such as the budding yeast Saccharomyces cerevisiae or the bacterium Caulobacter crescentus (where a clear morphological difference and the existence of a juvenile phase distinguish the aging mother cell from its daughter cells [3], [4]) to symmetrically dividing bacteria. This refined the definition of aging, making functional asymmetry the minimal requirement for a system to age [5]. Specifically, Escherichia coli and Bacillus subtilis were shown to age, as observed by loss of fitness at small generation scale (<10) [6]–[11] ([8] for B. subtilis) and increased probability of death at larger generation scale (up to 250 generations) [12]. Age in this system was defined as the number of consecutive divisions during which a cell has inherited the older cellular pole [7]; the sibling that inherits the older cell pole was shown to grow slower than the newer pole sibling. From a cellular viewpoint, aging is arguably due to the accumulation of damage over time that degrades cellular functions, ultimately affecting the survival of the organism [1], [2]. In the case of E. coli, a significant portion of the age-related fitness loss is accounted for by the presence of protein aggregates that accumulate in the bacterial older poles [7], [9], [10]. Such accumulation is reminiscent of many known age-related protein folding diseases [1]. Preferential sequestration of damaged proteins is also observed in S.
cerevisiae between the bud and the mother cell [13]–[15] and between specific intracellular compartments in yeast and mammalian cells [16], [17]. Therefore spatial localization, i.e. the non-homogeneous distribution of damaged protein aggregates in the cytoplasm, has been postulated to be an optimized strategy allowing cell populations to maintain large growth rates in the face of the accumulation of damage that accompanies metabolism during cell life [14], [18], [19]. These results suggest that spatial localization of damaged protein aggregates could represent an ageing process conserved across different living kingdoms. Given the documented link between protein aggregation and ageing, the short life-span of E. coli, the ease of quantifying large numbers of individuals, and its accessibility to molecular biology and genetics may make this bacterium a relevant model system to elucidate the role of protein aggregation in ageing. A first obstacle along this path is to understand the mechanisms by which cells can localize protein aggregates at specific locations within their intracellular space. Generally, thermal agitation and the resulting diffusion (Brownian movement) of proteins forbid localization in space on long timescales, since diffusion is a mixing process that will render every accessible position equiprobable. Inside eukaryotic cells, active mechanisms such as directed transport or sub-compartmentalization by internal membranes make it possible to counteract the homogenizing effects of diffusion. It has however been known since the 1952 seminal paper by Alan Turing [20] that subtle interactions between chemical reactions and diffusion can spontaneously lead to steady states with non-uniform spatial extension. This is also true for bacteria, as exemplified by the spatial oscillations in the minCDE system [21] or in the case of diffusion-trapping coupling [22]. Recently, the importance of precise sub-cellular localization of proteins within bacteria has become apparent [23]–[25].
In the absence of a general cytoskeleton-based directed, active transport mechanism or internal membranes, this would favor diffusion-reaction based localization within bacteria (see however [26]–[28]). Specifically, it is still unclear whether, for single-cell organisms, the preferential localization mechanism of damaged proteins is based on active directed transport or passive Brownian diffusion. In S. cerevisiae, initial reports implicated active directed transport (actin cytoskeleton) or sub-compartmentalization (membrane tethering) in the segregation of molecular damage (damaged proteins, episomal DNA) in the mother cell [13], [29]. Yet, more recent reports contradict the need for directed transport, e.g. on the actin cable, and favor diffusion-based localization [16], [30], [31]. In E. coli, protein aggregates have consistently been reported to localize in the cell poles and in the middle of the cell [7], [9], [10]. The number of distinct aggregates per cell seems to depend on the cellular environment. In non-stressed conditions, at most one aggregate per cell is observed, with rare cases (<4%) of two foci per cell detected [6]; under heat shock, most cells contain two or three aggregates [8], [9]. In heat-shock conditions, Winkler et al. [9] concluded in favor of a passive Brownian motion of the protein aggregates. This study also pointed out that one simple possible passive aggregate localization mechanism may be based on spatially non-homogeneous macromolecular crowding. Indeed, in healthy cells, the bacterial chromosome spontaneously condenses [32], thus delineating a restricted sub-region of the cell called the “nucleoid”, where molecular crowding is much larger than in the rest of the cytoplasm [33]. Macromolecular crowding then alternates along the cell long axis between low intensity zones (cytosol) and large intensity ones (nucleoid).
Monte-Carlo dynamics modeling suggests that such non-homogeneous spatial distribution of the molecular crowding may be sufficient to localize large proteins to the cell poles [34]. In line with this proposal are experimental reports that the observed aggregates preferentially localize in the nucleoid-free regions of the cell [7], [9], i.e. precisely in the regions of alleged lower macromolecular crowding. In spite of these hints though, whether the transport of the aging-related protein aggregates in E. coli is of a directed active nature or purely passive Brownian origin remains elusive, since contradictory results indicate that this process would include ATP-dependent stages [9]. Here, our aim is to determine whether the movement of aging-related protein aggregates in E. coli is purely diffusive (Brownian) or includes some active process (ATP-dependent, directed transport or membrane tethering). To this aim, we devised an integrated approach combining time-lapse fluorescence microscopy of E. coli cells in vivo, open-source automated image analysis, and individual-based modeling. Our results strongly indicate that purely diffusive pattern of aggregates mobility combined with nucleoid occlusion underlie their accumulation in polar and mid-cell positions. Results Trajectory analysis of single protein aggregates In vivo analysis of individual trajectories of proteins of interest (or aggregates thereof) is a powerful method to determine whether the movement of the target protein is of Brownian nature or additionally exhibits further ingredients (active directed transport, caging or corralling effects, transient trapping, anomalous sub-diffusion) [15], [35]–[40]. Here, we focused on naturally forming protein aggregates tethered with the small heat-shock protein IbpA in E. coli whose spatio-temporal dynamics have been implicated in aging of the bacteria [7], [13]. To characterize the motion of IbpA-tethered aggregates in single E. 
coli cells, we monitored intracellular trajectories of single foci of IbpA-YFP fusion proteins [7], [41] in non-stressed conditions (37°C, in LB medium, see Materials and Methods). For the automatic quantification of the resulting time-lapse fluorescence microscopy movies, we developed dedicated image analysis and tracking software tools (see Materials and Methods). This software suite performs automatic segmentation and tracking of the cells (Fig. 1A). Moreover, it automatically detects the fluorescent aggregate foci and monitors their movements relative to the cell in which they are located, with sub-pixel resolution. Figure 1. Localization of the detected aggregates in the cells. (A) In each image of the time-lapse fluorescence movies, the bacterial cells are automatically isolated (each individual cell is given a unique random color). The aggregates appearing during the movie are automatically detected and their trajectory within the cell quantified (internal trajectories). (B) By convention, we referred to the projection of the aggregate location on the long axis of the cell as the x-component and that along the short axis as the y-component. (C) Histogram of the x-component of the initial position of the trajectories (total of 1,644 trajectories). Since the cell length at the start of the trajectory is highly variable, the x-component was rescaled by division by the cell half-length. After this normalization, the cell poles are located at −1.0 and 1.0 respectively, for every trajectory. (D) Experimentally measured positions of the aggregates detected in the poles (both poles pooled, n = 9,242 points). The green-dashed curves in (D–F) locate the 2d projection of the 3d semi-ellipsoid that was used to approximate the cell pole. (E) Synthetic data for bulk positions: 10,000 3d positions were drawn uniformly at random in the 3d semi-ellipsoid pole.
The figure shows the corresponding 2d projections. (F) Synthetic data for membrane positions: 10,000 3d positions were drawn uniformly at random on the external boundary (membrane) of the 3d semi-ellipsoid pole. The figure shows the corresponding 2d projections. (G) To quantify panels D–F, the correlation function ρ(s) was computed as the density of positions located within crescent D(s) (gray). See text for more detail. (H–I) Local density of aggregate positions ρ(s) in the synthetic (H) and experimental (I) data shown in E (bulk, blue), F (membrane, red) and D (experimental, orange). The dashed black line shows the local density computed for 10,000 synthetic 2d positions that were drawn uniformly at random in the 2d semi-ellipse resulting from the 2d projection of the 3d pole ellipsoid (green dashed curve in D–F). https://doi.org/10.1371/journal.pcbi.1003038.g001 Localization of protein aggregates is non-homogeneous along the cells. Detectable protein aggregates (in the form of localized fluorescence foci) were observed in about half of the cells monitored (54%; Ncells = 1625 recorded in 72 independent movies), in agreement with previous experimental reports [7], [13]. No further foci were detected by doubling the exposure time (see Materials and Methods). This suggests that smaller undetected aggregates that may exist either diffuse faster than the acquisition time and are therefore not recorded as “localized”, or alternatively, that they merge into bigger, detectable aggregates before full maturation of the fluorophore (≤7.5 min) [42]. Cells in non-stressed conditions tend to exhibit smaller copy numbers of distinct protein aggregates than in heat-shocked conditions (compare e.g. [7] with [9], [10]). Accordingly, in our hands, nearly all the foci-containing cells (98.8%) displayed a single fluorescence focus within the imaging time, while the remaining cells had at most two foci.
The distribution of the (initial) spatial localization of the aggregates is displayed in Fig. 1. As a convention, we denote the long axis of the bacterial cell as its x-axis and the short one as its y-axis (Fig. 1B). In this figure, we express the aggregate position relative to the cell center of mass, and rescale it to the [−1, +1]² range. Thus the two bacterial poles correspond to x = −1 and x = +1 in this relative scale. The histogram of the location of the aggregate at the starting point of each trajectory is shown in Fig. 1C. The distribution is highly non-homogeneous, with most of the aggregates localized at one of the two poles, and the others mainly around the middle of the cell. This distribution is similar to the results obtained in [7] (Fig. 2E–F in [7]), except that here, since we do not differentiate between old and new poles, the amplitudes of the polar modes in Fig. 1C are roughly symmetrical. Similar distributions were also obtained in heat-shock conditions [9], [10]. Figure 2. Single-aggregate tracking analysis inside E. coli cells. Coordinates along the x- and y-axis are shown in red and black, respectively. Low frequency sampling trajectories (LF) are displayed using full lines and high frequency ones (HF) using open symbols. Light red and black swaths indicate + and −95% confidence intervals for the x- and y-axis data, respectively (for clarity, − and + intervals for the x- and y-axis data, respectively, are omitted). (A) Corrected mean displacement, where uc(t) is the applied correction. For the y-component, the correction is the time-average of the y-coordinate. For the x-component, the applied correction accounts for cell growth, where L(t) is the cell half-length at time t and Δt is the time interval between two consecutive images. (B) Corresponding mean squared displacements.
The inset shows a magnification of the HF results and their close-to-linear behavior for the first 10–15 seconds (dashed line). https://doi.org/10.1371/journal.pcbi.1003038.g002 The marked localization of the aggregates suggests they might be tethered to the membrane at the poles (and center) at some point of the aggregation process, thus restricting their motion. Fig. 1D shows the location of the polar aggregates at first detection (both poles were pooled). Because these experimental results are two-dimensional projections of three-dimensional positions, one cannot directly determine whether the aggregates are bound to the cell membrane or spread in the three-dimensional cytoplasmic bulk. To resolve this, we generated 10⁴ aggregate positions (uniformly) at random in a volume of the same shape and dimensions as the cell pole. Fig. 1E shows the two-dimensional projection of these positions when the proteins were randomly located in the three-dimensional bulk, whereas Fig. 1F shows the two-dimensional projections when the proteins were randomly located on the cytoplasmic membrane enclosing the bulk. To quantify these plots, we analyzed the local density of protein positions in the two-dimensional projections. Assuming the 2d projection of the pole is a semi-ellipse of radii ax and ay (green dashed shapes in Fig. 1 D–G), its area is πaxay/2. The area of the elementary semi-elliptic crescent D(s) (gray in Fig. 1G) delimited by the semi-ellipse of radii s·ax and s·ay (0 < […] > 150 generations. Under these conditions, many of the ageing cells indeed accumulate clearly visible aggregates (Fig. S4), pointing to the validity of our approach of using the IbpA-yfp system for better detection [7].
The mechanism described here for IbpA-yfp tethered aggregates can be generalized, as ample evidence exists for polar localization of aggregates resulting from heterologous over-expression of proteins, streptomycin treatment [7 and ref therein], large protein assemblies of fluorescently-labelled protein fusions (due to avidity of low multimerization propensity of some fluorescent proteins and independent of the diffusive positioning of the native proteins studied [13]), and large RNA-protein assemblies [45], [46]. In all cases, given the non-specific nature of hydrophobic interactions governing aggregate assembly, it is unsurprising that co-localization may occur amongst different aggregated polypeptides and chaperones, based on the common diffusive mechanism of polar accumulation described here. Moreover, our recent work demonstrates that large engineered RNA assemblies also accumulate in the cells' poles [56, (electron microscope images therein)]. Therefore, the polar localization pattern of slowly diffusing elements in bacteria is not limited to large, purely protein assemblies. We propose that it might be a more general process concerning other cell constituents, such as nucleic acids. Materials and Methods Bacterial strains The sequenced wild-type E. coli strain, MG1655 [57], was modified to express an improved version of the YFP fluorescent protein fused to the C terminus of IbpA [35] under the control of the endogenous chromosomal ibpA promoter, resulting in the MGAY strain. E. coli strains were grown in Luria-Bertani (LB) broth medium half salt at 37°C. For more information about the cloning, see [7], S.I. Fluorescence time-lapse microscopy setup After an overnight growth at 37°C, MGAY cultures were diluted 200 times. When the cells reached an absorbance of 0.2 (600 nm), they were placed on a microscope slide that was layered with a 1.5% agarose pad containing LB half salt medium.
The agarose pad was covered with a cover-slide, the border of which was then sealed with nail polish oil. Cells were left to recover for 1 hour before observation using a Nikon automated microscope (ECLIPSE Ti, Nikon INTENSILIGHT C-HGFIE, 100× objective) and the Metamorph software (Molecular Devices, Roper Scientific), at 37°C. Phase contrast and fluorescence images (25% lamp energy, 1 second illumination for LF movies and 600 milliseconds for HF movies) were sampled at two different time-scales. For low-frequency (LF) movies, images were taken every 3 seconds for a total of 5 minutes, while for high-frequency (HF) movies, fluorescence images were taken about every 0.60 seconds for a total of 2 min (and phase contrast images were sampled about every 7 fluorescence images). The fluorescence excitation light energy level used here is 5-fold higher than previously described [7] to allow a proportional decrease of exposure time, enabling a higher temporal resolution. Under these conditions, doubling the exposure time did not result in further detection of fluorescent foci, yet resulted in accelerated bleaching that prevented consecutive time-lapse imaging of the observed foci. Image analysis and aggregate tracking Phase contrast images were analyzed by the customized software “Cellst” [58] for cell segmentation and single cell lineage reconstruction. Phase contrast images were denoised using the flatten background filter of the Metamorph software for long movies or a mixed denoising algorithm [58] for fast movies. The mixed denoising algorithm combines two well-known image denoising methods: NL-means denoising [59], which is patch-based, and Total Variation denoising [60], [61], which is used as regularization. The Cellst software was used to automatically segment the cells on most of the images, although, when necessary, manual corrections were applied. At the end of the whole segmentation and tracking process, Cellst also calculates the lineage of every cell in the movie.
The fluorescent protein aggregates were detected by another customized software. Detection of each spot was realized using the a contrario methodology based on a circular patch model with a central zone of detection and an external zone of context. The patch radius was then optimized to best match the spot. The energy of each spot was computed in the following way: the total image was modeled as a sum of a constant background and 2D circular Gaussian curves, centered on the maximal intensity pixel of the detected spots, with a deviation of 3 pixels. Minimizing the quadratic deviation between the image and the model made it possible to calculate the Gaussian coefficients. These coefficients were considered the energy values of each spot. The coordinates of the detected spots were then refined to subpixel resolution. This was achieved by computing a weighted average of the coordinates of the pixels in a circular neighborhood of the detected spot. The weights were given by the intensity values of the pixels, from which the local background was subtracted. The local background was computed as the median value of the pixel intensity in the neighborhood. Only pixels with an intensity greater than the median value were considered in the weighted average. This algorithm has been tested on both synthetic and real images and showed a precision of 1/10 of a pixel on very poorly contrasted spots. After detection and localization, the movements of the fluorescent aggregates were tracked and quantified by a third customized software named “aggtracker”, based on the cell lineage and the detected spots. The algorithm uses the lineage and cell information to ensure that an aggregate is consistently tracked only through points that lie inside the same cell. The output of this software is a set of time-series for the coordinates x and y (in pixels) of each fluorescence spot, as well as the affiliation of the spot to the cell it is in.
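The sub-pixel refinement step described above can be illustrated with a minimal Python sketch. It is not the authors' software: the function name `refine_spot_center`, the default neighborhood radius of 3 pixels, and the fallback when no pixel exceeds the median are assumptions made for illustration only.

```python
import numpy as np

def refine_spot_center(image, row, col, radius=3):
    """Intensity-weighted centroid around a detected spot (illustrative sketch).

    image    : 2d array of pixel intensities
    row, col : integer pixel coordinates of the detected spot
    radius   : radius of the circular neighborhood in pixels (assumed value)
    """
    rows, cols = np.indices(image.shape)
    # Circular neighborhood of the detected spot
    mask = (rows - row) ** 2 + (cols - col) ** 2 <= radius ** 2
    values = image[mask].astype(float)
    # Local background = median intensity in the neighborhood
    weights = values - np.median(values)
    # Only pixels brighter than the median contribute to the average
    keep = weights > 0
    if not np.any(keep):
        return float(row), float(col)  # flat neighborhood: keep the pixel position
    w = weights[keep]
    r = np.sum(rows[mask][keep] * w) / np.sum(w)
    c = np.sum(cols[mask][keep] * w) / np.sum(w)
    return float(r), float(c)
```

Applied to a synthetic Gaussian spot whose true center lies between pixels, this estimator returns non-integer coordinates close to the true center; the precision actually reached on real images depends on contrast and noise.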
The last step consisted in the projection of the coordinates of the fluorescence spot from their initial absolute values in the image (in pixels) to their values along the long and short axes of the 2d image of the cell. To this aim, we used active skeletons. A skeleton represents an object by a median line (the center line in the case of a tubular bacterium). Here we used one active skeleton for each cell, providing the long axis of the cell image (the median line) and its short axis (along the skeleton width). Active skeletons were adapted to bacteria in order to optimize the position of the skeleton in the image of the cell. The coordinates of the fluorescence spots were then expressed as the coordinates of the center of the fluorescence spot in the basis composed of the active skeletons that localize the cell long and short axes. We exploited the simple shape of the skeleton to estimate the total cell width and length as those of the respective skeleton. As a convention, we refer below to the aggregate coordinate along the long axis as the x-coordinate and that along the short axis as the y-coordinate. In order to improve precision, aggregate trajectories made of fewer than 10 successive images in the movies were not taken into account further. In total, we obtained 1644 aggregate trajectories. Individual-based modeling of protein aggregation To simulate the diffusion and aggregation process of proteins in a single cell, we used a 3d individual-based lattice-free model. Each protein p was explicitly modeled as a sphere of radius rp centered at coordinates (xp, yp, zp) in the 3d intracellular space of the cell. We simulated the diffusion of proteins in the cell and their aggregation upon encounter under conditions as realistic as possible. In particular, the radius of the protein aggregates explicitly increased, and their diffusion coefficient decreased accordingly, as they grew. Moreover, we explicitly modeled the larger molecular crowding in the nucleoids. Details of the simulations are as follows.
The bacterial cell was simulated as a 3d square cylinder with width and depth 1.0 µm [62] and length 4.0 µm (chosen to correspond to a bacterial cell just before division) and reflective boundaries. Note that we also ran simulations with more realistic cell shapes (i.e. spherical caps at cell ends) and did not find significant differences compared to square cylinders (except for the much higher computation cost with spherical caps). Within each cell, we also explicitly modeled the larger molecular crowding found in the nucleoids. Indeed, in healthy cells, the bacterial chromosome condenses into a restricted sub-region of the cell called the “nucleoid”, where molecular crowding is much larger than in the rest of the cytoplasm [33]. To model this increased molecular crowding in the nucleoids, we placed at random (with uniform probability) 50,000 bulky, immobile, impenetrable and unreactive obstacles (radius 10 nm) in the region of the cell where a nucleoid is expected. Because cell cycle and DNA replication in E. coli are not synchronized, roughly 75% of the cells in exponential phase contain two nucleoids [48]. We thus explicitly positioned two nucleoids within the cell. The location and size of the two nucleoids were estimated from DAPI-stained inverted phase contrast images of the nucleoids found in [49]. Both nucleoids were 3d square cylinders of length 1220 nm (along the cell long axis) and width and height 532 nm. Each nucleoid started at 540 nm from each cell pole and was centered on the cell long axis. The volume occupied by the two nucleoid regions thus formed is about 12%, which is consistent with the literature [63], [64]. Each simulation was initialized by positioning Np individual IbpA-YFP proteins (monomers) at non-overlapping randomly chosen (with uniform probability) locations in the free intracellular space of the cell (i.e. the whole interior of the cell minus the space occupied by the obstacles in the nucleoids).
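As an illustration of the geometry and initialization described above, the following Python sketch places obstacles in two box-shaped nucleoids and draws non-overlapping monomer positions by rejection sampling. It is a simplified sketch under stated assumptions, not the authors' code: the function names are hypothetical, the obstacles are split evenly between the two nucleoids and are allowed to overlap one another, and monomer centers are kept one radius away from the cell walls.

```python
import numpy as np

# Dimensions in micrometers, long axis first (values taken from the text)
CELL = np.array([4.0, 1.0, 1.0])
NUCLEOID_SIZE = np.array([1.220, 0.532, 0.532])
NUCLEOID_OFFSET = 0.540  # distance between each pole and its nucleoid

def nucleoid_boxes():
    """Return (low corner, high corner) for the two nucleoid boxes."""
    lo_yz = (CELL[1:] - NUCLEOID_SIZE[1:]) / 2  # centered on the long axis
    first_lo = np.array([NUCLEOID_OFFSET, *lo_yz])
    second_lo = np.array([CELL[0] - NUCLEOID_OFFSET - NUCLEOID_SIZE[0], *lo_yz])
    return [(lo, lo + NUCLEOID_SIZE) for lo in (first_lo, second_lo)]

def place_obstacles(n_total, rng):
    """Draw obstacle centers uniformly inside the nucleoids (even split assumed)."""
    boxes = nucleoid_boxes()
    per_box = n_total // len(boxes)
    return np.vstack([rng.uniform(lo, hi, size=(per_box, 3)) for lo, hi in boxes])

def place_monomers(n, r0, obstacles, r_obs, rng):
    """Rejection-sample n monomer centers overlapping neither obstacles nor each other."""
    placed = []
    while len(placed) < n:
        p = rng.uniform(r0, CELL - r0)
        if np.any(np.linalg.norm(obstacles - p, axis=1) < r0 + r_obs):
            continue  # overlaps an obstacle
        if placed and np.any(np.linalg.norm(np.array(placed) - p, axis=1) < 2 * r0):
            continue  # overlaps a previously placed monomer
        placed.append(p)
    return np.array(placed)
```

A call such as `place_monomers(100, 0.003, place_obstacles(50_000, rng), 0.010, rng)` with `rng = np.random.default_rng()` would follow the radii quoted in the text (3 nm monomers, 10 nm obstacles); the brute-force distance checks are kept for readability rather than speed.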
At each time step, each molecule is independently allowed to diffuse over a distance d that depends on the protein diffusion constant Dp, according to d = (6 Dp Δt)^(1/2), where Δt is the time step, in agreement with basic Brownian motion. Note that Dp itself depends on the aggregate size rp (see below). The new position of the protein (x′,y′,z′) was then computed by drawing two random real numbers, θ and c, uniformly distributed in [0, 2π] and [−1,1], respectively, and using spherical coordinates: x′ = x(t)+d sin(acos(c)) cos(θ); y′ = y(t)+d sin(acos(c)) sin(θ) and z′ = z(t)+d c, where (x(t),y(t),z(t)) is the initial position of the protein. If the protein in this new position (x′,y′,z′) overlaps with any of the immobile obstacles (i.e. if there exists at least one obstacle such that the distance between the obstacle center and (x′,y′,z′) is smaller than the sum of their radii), the attempted movement is rejected, i.e. (x(t+Δt),y(t+Δt),z(t+Δt)) = (x(t),y(t),z(t)). This classical approximation of the aggregate reflection by the static obstacles is not expected to change the simulation results significantly, but it drastically reduces the computation load. If no obstacle overlaps, the movement is accepted, i.e. (x(t+Δt),y(t+Δt),z(t+Δt)) = (x′,y′,z′). After each molecule has moved once, the algorithm searches for overlaps between proteins. Two proteins are overlapping whenever the distance between their centers is smaller than the sum of their radii. Each overlapping pair was allowed to aggregate with (uniform) probability pag (irrespective of their size). In our simulations, pag was varied between 0.1 and 1.0 (limited at the lower bound by the simulation time needed to score enough aggregation events). To model the aggregation of two overlapping proteins, we could not, for computation time reasons, keep track of the shape of the aggregates (i.e. the individual location of each protein in the aggregates).
Instead, we used the simplifying hypothesis that all along the simulation, the aggregates maintain a spherical shape with constant internal density. It follows that the radius of an aggregate C, born out of the aggregation of two aggregates A and B of respective sizes rA and rB, is rC = (rA³ + rB³)^(1/3). Upon aggregation, we thus remove the aggregates A and B from the cell, and add a new aggregate with size rC, centered at the center of mass of the two former aggregates A and B. Finally, to set the diffusion constant of the aggregates, we used the classical Stokes-Einstein relation for a Newtonian fluid, where the diffusion constant is inversely proportional to the radius. In our case, this leads to Dp = D0 r0/rp, where r0 and D0 are the radius and diffusion constant, respectively, of individual (monomeric) IbpA-YFP molecules. Note that this relation could be violated for large molecules in the cytoplasm of E. coli [52], [53]. In a subset of simulations, we used power law relations, such as Dp ∝ rp⁻⁶, as suggested in [51], without noticeable change in our results (except that the time needed to reach a given threshold aggregate size was increased). Note that aggregation was considered irreversible in our model (i.e. aggregates never break down into smaller pieces). This is in agreement with our experimental observations, where we never measured decay of foci fluorescence. The diffusion constant of the 26-kDa GFP (radius 2 nm) in E. coli cytoplasm is around 7.0 µm²/s and that of the GFP-MBP fusion (72 kDa) around 2.5 µm²/s [37]. Using these data and the Stokes-Einstein relation, combined with our constant-density sphere hypothesis, led to estimates of the radius and diffusion coefficient of the individual (monomeric) 39 kDa IbpA-YFP fusion of r0 = 3 nm and D0 = 4.4 µm²/s. The value of the time step Δt has to be small enough so that proteins cannot jump over each other during a single time step, meaning that the distance diffused during a single time step is limited by d < 4r0.
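The move, rejection and merging rules described above can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' implementation: all function names are hypothetical, lengths are in µm and times in seconds, and the reflective cell boundaries and the pairwise aggregation search are deliberately left out.

```python
import numpy as np

def step_length(D, dt):
    """Distance moved in one time step, d = (6 D dt)^(1/2)."""
    return np.sqrt(6.0 * D * dt)

def propose_move(pos, D, dt, rng):
    """Fixed-length jump in a uniformly random 3d direction."""
    d = step_length(D, dt)
    theta = rng.uniform(0.0, 2.0 * np.pi)
    c = rng.uniform(-1.0, 1.0)
    s = np.sin(np.arccos(c))
    return pos + d * np.array([s * np.cos(theta), s * np.sin(theta), c])

def move(pos, radius, D, dt, obstacles, r_obs, rng):
    """Accept the proposed move unless it overlaps an immobile obstacle."""
    new = propose_move(pos, D, dt, rng)
    dist = np.linalg.norm(obstacles - new, axis=1)
    if np.any(dist < radius + r_obs):
        return pos  # rejected: the protein stays where it was
    return new

def merged_radius(r_a, r_b):
    """Radius of the sphere of constant density formed by merging A and B."""
    return (r_a ** 3 + r_b ** 3) ** (1.0 / 3.0)

def diffusion_constant(r_p, r0, D0):
    """Stokes-Einstein scaling, D_p = D0 * r0 / r_p."""
    return D0 * r0 / r_p

def max_time_step(r0, D0):
    """Largest dt satisfying d < 4 r0 for monomers, i.e. dt < (8/3) r0^2 / D0."""
    return 8.0 * r0 ** 2 / (3.0 * D0)
```

With the monomer values quoted in the text (r0 = 0.003 µm, D0 = 4.4 µm²/s), `step_length(4.4, 1e-6)` is about 0.005 µm and `max_time_step(0.003, 4.4)` is about 5.5×10⁻⁶ s.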
Using the definition for d above, one then has Δt < (8/3) r0²/D0 ≈ 5 µs. Here, we used Δt = 1 µs, yielding d = 5 nm for monomeric proteins. Every simulation was run for a total of 2×10⁶ time steps. The translation of this value into real time is hardly possible, since we have no indication of the experimental value of the aggregation probability per encounter pag (see above), even less so of its dependence on the aggregate size. A lower bound can be estimated at 2 seconds of real time (for 2×10⁶ time steps) if the aggregation is always diffusion limited (i.e. pag = 1). On general grounds however, the experimental value of pag can be expected to be smaller, so that the 2×10⁶ simulation time steps would correspond to more than this minimal value of 2 seconds of real time. For the results to be statistically significant, we ran nrun simulations for each parameter and condition, with different realizations of the random processes (initial location, random choice of the positions or of the aggregation events) and averaged the results over these nrun simulations. In the results presented here we used nrun = 10³. Fitting procedure for the aggregate radius, diffusion constant and cell dimensions The data from the LF movies were partitioned into 5 classes based on the aggregate fluorescence intensity at the beginning of the measured trajectory, yielding 5 pairs of experimental curves for the mean-squared displacement (one along x and one along y for each class), where i = {1,…,5} indexes the intensity class. Corresponding theoretical values were obtained by individual-based simulations of confined random walks similar to those described above but modified as follows: the cells, of dimensions LX (length) and LY = LZ = LYZ (height and width), were devoid of nucleoids and aggregation (aggregation probability pag = 0), and we used N = 5,000 IbpA-YFP proteins. Each 12-tuple of parameters {LX, LYZ, ri, Di} (i = 1,…,5) yields, for each class, two theoretical curves, one for the mean-squared displacement along x and one along y.
The aim of the fitting procedure is to minimize the distance between the experimental and theoretical curves, ie to minimize the cost function:where the indices j are over the N time steps. The formulation of this cost function corresponds to the traditional least squares, so that the optimization procedure actually looks for best fits in the least-square sense (minimization of the squared residuals between the theoretical predictions and experimental observations). To minimize automatically the cost function F, thus adjusting the theoretical to the experimental curves, we used the C++ implementation of the evolutionary strategy algorithm CMA-ES [39] with population size 12 and 400 generations. Bacterial strains The sequenced wild-type E. coli strain, MG1655 [57] was modified to express an improved version of the YFP fluorescent protein fused to the C terminus of IbpA [35] under the control of the endogenous chromosomal ibpA promoter resulting in the MGAY strain. E. coli strains were grown in Luria-Bertani (LB) broth medium half salt at 37°C. For more information about the cloning, see [7], S.I. Fluorescence time-lapse microscopy setup After an overnight growth at 37°C, MGAY cultures were diluted 200 times. When the cells reached an absorbance 0.2 (600 nm), they were placed on microscope slide that was layered with a 1.5% agarose pad containing LB half salt medium. The agarose pad was covered with a cover-slide, the boarder of which was then sealed with nail polish oil. Cells were let to recover for 1 hour before observation using Nikon automated microscope (ECLIPSE Ti, Nikon INTENSILIGHT C-HGFIE, 100× objective) and the Metamorph software (Molecular Devices, Roper Scientific), at 37°C. Phase contrast and fluorescence images (25% lamp energy, 1 second illumination LF movies and 600 milliseconds for HF movies) were sampled at two different time-scales. 
For low-frequency (LF) movies, images were taken every 3 seconds for a total of 5 minutes, while for high-frequency (HF) movies, fluorescence images were taken about every 0.60 seconds for a total of 2 minutes (and phase contrast images were sampled about every 7 fluorescence images). The fluorescence excitation light energy used here was 5-fold higher than previously described [7], allowing a proportional decrease of the exposure time and hence a higher temporal resolution. Under these conditions, doubling the exposure time did not reveal additional fluorescent foci but accelerated bleaching, which prevented consecutive time-lapse imaging of the observed foci.
Image analysis and aggregate tracking
Phase contrast images were analyzed with the customized software “Cellst” [58] for cell segmentation and single-cell lineage reconstruction. Phase contrast images were denoised using the flatten background filter of the Metamorph software for long movies or a mixed denoising algorithm [58] for fast movies. The mixed denoising algorithm combines two well-known image denoising methods: NL-means denoising [59], which is patch-based, and Total Variation denoising [60], [61], which is used as regularization. The Cellst software was used to segment the cells automatically on most of the images; manual corrections were applied when necessary. At the end of the whole segmentation and tracking process, Cellst also computes the lineage of every cell in the movie. The fluorescent protein aggregates were detected by a second customized software. Each spot was detected using the a contrario methodology, based on a circular patch model with a central zone of detection and an external zone of context. The patch radius was then optimized to best match the spot.
The energy of each spot was computed in the following way: the total image was modeled as the sum of a constant background and 2D circular Gaussian curves, centered on the maximal intensity pixel of the detected spots, with a deviation of 3 pixels. Minimizing the quadratic deviation between the image and the model yielded the Gaussian coefficients, which were taken as the energy values of the spots. The coordinates of the detected spots were then refined to subpixel resolution by computing a weighted average of the coordinates of the pixels in a circular neighborhood of the detected spot. The weights were given by the intensity values of the pixels, from which the local background was subtracted. The local background was computed as the median value of the pixel intensity in the neighborhood. Only pixels with an intensity greater than the median value were considered in the weighted average. This algorithm was tested on both synthetic and real images and showed a precision of 1/10 of a pixel on spots with very poor contrast. After detection and localization, the movements of the fluorescent aggregates were tracked and quantified by a third customized software named “aggtracker”, based on the cell lineage and the detected spots. The algorithm uses the lineage and cell information to ensure that an aggregate is tracked only through points that lie inside the same cell. The output of this software is a time series of the coordinates x and y (in pixels) of each fluorescence spot, together with the affiliation of the spot to the cell it is in. The last step consisted in projecting the coordinates of the fluorescence spots from their initial absolute values in the image (in pixels) to their values along the long and short axes of the 2d image of the cell. To this aim, we used active skeletons. A skeleton represents an object by a median line (the center line in the case of a tubular bacterium).
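The sub-pixel refinement described above (median background, then an intensity-weighted average over a circular neighborhood) can be illustrated by the following minimal sketch. It is not the authors' software: the function name `refine_spot`, the list-of-rows image layout and the choice of neighborhood radius are assumptions made for illustration only.

```python
from statistics import median

def refine_spot(image, cx, cy, radius):
    """Refine a detected spot centre (cx, cy) to sub-pixel resolution.

    image is a list of rows of pixel intensities (hypothetical layout).
    The local background is the median intensity of the circular
    neighbourhood; only pixels brighter than it contribute, weighted by
    their background-subtracted intensity.
    """
    pixels = []
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                pixels.append((x, y, value))
    background = median(v for _, _, v in pixels)
    weighted = [(x, y, v - background) for x, y, v in pixels if v > background]
    total = sum(w for _, _, w in weighted)
    if total == 0:
        # Flat neighbourhood: nothing to refine, keep the initial guess.
        return float(cx), float(cy)
    x_ref = sum(x * w for x, _, w in weighted) / total
    y_ref = sum(y * w for _, y, w in weighted) / total
    return x_ref, y_ref
```

For instance, on a flat image containing two equally bright adjacent pixels, the refined centre falls halfway between them.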
Here we used one active skeleton for each cell, providing the long axis of the cell image (the median line) and its short axis (along the skeleton width). Active skeletons were adapted to bacteria in order to optimize the position of the skeleton in the image of the cell. The coordinates of the center of each fluorescence spot were then expressed in the basis formed by the active skeleton, which localizes the cell long and short axes. We exploited the simple shape of the skeleton to estimate the total cell width and length as those of the respective skeleton. By convention, we refer below to the aggregate coordinate along the long axis as the x-coordinate and that along the short axis as the y-coordinate. In order to improve precision, aggregate trajectories made of fewer than 10 successive images were discarded. In total, we obtained 1644 aggregate trajectories.
Individual-based modeling of protein aggregation
To simulate the diffusion and aggregation of proteins in a single cell, we used a 3d individual-based lattice-free model. Each protein p was explicitly modeled as a sphere of radius rp centered at coordinates (xp, yp, zp) in the 3d intracellular space of the cell. We simulated the diffusion of proteins in the cell and their aggregation upon encounter under conditions as realistic as possible. In particular, the radius of the protein aggregates increased, and their diffusion coefficient decreased accordingly, as they grew. Moreover, we explicitly modeled the larger molecular crowding in the nucleoids. Details of the simulations are as follows. The bacterial cell was simulated as a 3d square cylinder with width and depth 1.0 µm [62] and length 4.0 µm (chosen to correspond to a bacterial cell just before division) and reflective boundaries. Note that we also ran simulations with more realistic cell shapes (i.e.
spherical caps at cell ends) and did not find significant differences compared to square cylinders (except for the much higher computation cost with spherical caps). Within each cell, we also explicitly modeled the larger molecular crowding found in the nucleoids. Indeed, in healthy cells, the bacterial chromosome condenses into a restricted sub-region of the cell called the “nucleoid”, where molecular crowding is much larger than in the rest of the cytoplasm [33]. To model this increased molecular crowding in the nucleoids, we placed at random (with uniform probability) 50,000 bulky, immobile, impenetrable and unreactive obstacles (radius 10 nm) in the region of the cell where a nucleoid is expected. Because cell cycle and DNA replication in E. coli are not synchronized, roughly 75% of the cells in exponential phase contain two nucleoids [48]. We thus explicitly positioned two nucleoids within the cell. The location and size of the two nucleoids were estimated from DAPI-stained inverted phase contrast images of the nucleoids found in [49]. Both nucleoids were 3d square cylinders of length 1220 nm (along the cell long axis) and width and height 532 nm. Each nucleoid started at 540 nm from the nearest cell pole and was centered on the cell long axis. The volume occupied by the two nucleoid regions thus formed is about 12%, which is consistent with the literature [63], [64]. Each simulation was initialized by positioning Np individual IbpA-YFP proteins (monomers) at non-overlapping, randomly chosen (with uniform probability) locations in the free intracellular space of the cell (i.e. the whole interior of the cell minus the space occupied by the obstacles in the nucleoids). At each time step, each molecule is independently allowed to diffuse over a distance d that depends on the protein diffusion constant Dp, according to d = (6 Dp Δt)^(1/2), where Δt is the time step, in agreement with basic Brownian motion. Note that Dp itself depends on the aggregate size rp (see below).
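The nucleoid geometry given above can be made concrete with a short sketch that builds the two nucleoid boxes and scatters obstacle centres uniformly inside them. This is only an illustration under stated assumptions: the names are hypothetical, obstacle centres (not whole spheres) are confined to the boxes, obstacles may overlap each other, and the number of obstacles is passed per nucleoid because the text does not say whether 50,000 is a total or a per-nucleoid count.

```python
import random

CELL_LENGTH, CELL_WIDTH = 4000.0, 1000.0   # nm, square-cylinder cell
NUC_LENGTH, NUC_WIDTH = 1220.0, 532.0      # nm, size of each nucleoid
NUC_OFFSET = 540.0                         # nm, distance from the nearest pole

def nucleoid_boxes():
    """Return, for each nucleoid, its (x, y, z) coordinate ranges in nm."""
    lo = (CELL_WIDTH - NUC_WIDTH) / 2.0    # centred on the cell long axis
    hi = lo + NUC_WIDTH
    left = (NUC_OFFSET, NUC_OFFSET + NUC_LENGTH)
    right = (CELL_LENGTH - NUC_OFFSET - NUC_LENGTH, CELL_LENGTH - NUC_OFFSET)
    return [(x_range, (lo, hi), (lo, hi)) for x_range in (left, right)]

def place_obstacles(n_per_nucleoid, rng):
    """Draw obstacle centres uniformly at random inside each nucleoid box."""
    obstacles = []
    for box in nucleoid_boxes():
        for _ in range(n_per_nucleoid):
            obstacles.append(tuple(rng.uniform(a, b) for a, b in box))
    return obstacles
```

A call such as `place_obstacles(100, random.Random(1))` returns 200 centres, all lying inside one of the two boxes.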
The new position of the protein (x′,y′,z′) was then computed by drawing two random real numbers, θ and c, uniformly distributed in [0, 2π] and [−1,1], respectively, and using spherical coordinates: x′ = x(t) + d sin(acos(c)) cos(θ), y′ = y(t) + d sin(acos(c)) sin(θ) and z′ = z(t) + d c, where (x(t),y(t),z(t)) is the initial position of the protein. If the protein in this new position (x′,y′,z′) overlaps with any of the immobile obstacles (i.e. if there exists at least one obstacle such that the distance between the obstacle center and (x′,y′,z′) is smaller than the sum of their radii), the attempted movement is rejected, i.e. (x(t+Δt),y(t+Δt),z(t+Δt)) = (x(t),y(t),z(t)). This classical approximation of the reflection of the aggregates by the static obstacles is not expected to change the simulation results significantly, but it drastically reduces the computation load. If no obstacle overlaps, the movement is accepted, i.e. (x(t+Δt),y(t+Δt),z(t+Δt)) = (x′,y′,z′). After each molecule has moved once, the algorithm searches for overlaps between proteins. Two proteins overlap whenever the distance between their centers is smaller than the sum of their radii. Each overlapping pair was allowed to aggregate with (uniform) probability pag (irrespective of their size). In our simulations, pag was varied between 0.1 and 1.0 (the lower bound being set by the simulation time needed to score enough aggregation events). To model the aggregation of two overlapping proteins, we could not, for computation time reasons, keep track of the shape of the aggregates (i.e. the individual location of each protein in the aggregates). Instead, we used the simplifying hypothesis that, all along the simulation, the aggregates maintain a spherical shape with constant internal density. It follows that the radius of an aggregate C, born out of the aggregation of two aggregates A and B of respective sizes rA and rB, is rC = (rA³ + rB³)^(1/3).
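The move, rejection and merging rules described above can be summarized in a minimal sketch. It is an illustration rather than the authors' code: function names are hypothetical, the reflective cell boundaries are left out, and the centre of mass is weighted by r³ on the assumption that mass is proportional to volume at constant density.

```python
import math
import random

def propose_move(pos, d, rng):
    """Displace pos by a step of length d in a uniformly random direction."""
    theta = rng.uniform(0.0, 2.0 * math.pi)
    c = rng.uniform(-1.0, 1.0)
    s = math.sin(math.acos(c))
    x, y, z = pos
    return (x + d * s * math.cos(theta), y + d * s * math.sin(theta), z + d * c)

def distance(a, b):
    """Euclidean distance between two 3d points."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))

def diffuse(pos, radius, d, obstacles, obstacle_radius, rng):
    """Attempt one diffusion step; reject it if it overlaps any obstacle."""
    new = propose_move(pos, d, rng)
    for centre in obstacles:
        if distance(new, centre) < radius + obstacle_radius:
            return pos          # rejected: the particle stays in place
    return new                  # accepted

def merge(pos_a, r_a, pos_b, r_b):
    """Fuse aggregates A and B into one sphere of conserved volume."""
    r_c = (r_a ** 3 + r_b ** 3) ** (1.0 / 3.0)
    w_a, w_b = r_a ** 3, r_b ** 3   # mass assumed proportional to volume
    centre = tuple((w_a * u + w_b * v) / (w_a + w_b)
                   for u, v in zip(pos_a, pos_b))
    return centre, r_c
```

In a full simulation loop, `merge` would be applied to each overlapping pair with probability pag after all particles have moved once.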
Upon aggregation, we thus remove the aggregates A and B from the cell and add a new aggregate of size rC, centered at the center of mass of the two former aggregates A and B. Finally, to set the diffusion constant of the aggregates, we used the classical Stokes-Einstein relation for a Newtonian fluid, in which the diffusion constant of a particle is inversely proportional to its radius. In our case, this leads to Dp = D0 r0/rp, where r0 and D0 are the radius and diffusion constant, respectively, of individual (monomeric) IbpA-YFP molecules. Note that this relation could be violated for large molecules in the cytoplasm of E. coli [52], [53]. In a subset of simulations, we used power law relations, such as Dp ∝ rp⁻⁶, as suggested in [51], without noticeable change in our results (except that the time needed to reach a given threshold aggregate size was increased). Note that aggregation was considered irreversible in our model (i.e. aggregates never break down into smaller pieces). This is in agreement with our experimental observations, where we never measured a decay of foci fluorescence. The diffusion constant of the 26-kDa GFP (radius 2 nm) in E. coli cytoplasm is around 7.0 µm²/s and that of the GFP-MBP fusion (72 kDa) around 2.5 µm²/s [37]. Combining these data with the Stokes-Einstein relation and our hypothesis of spherical aggregates of constant density led to estimates of the radius and diffusion coefficient of the individual (monomeric) 39 kDa IbpA-YFP fusion of r0 = 3 nm and D0 = 4.4 µm²/s. The value of the time step Δt has to be small enough that proteins cannot jump over each other during a single time step, meaning that the distance diffused during a single time step is limited by d < 4r0. Using the definition for d above, one then has Δt < (8/3) r0²/D0 ≈ 5 µs. Here, we used Δt = 1 µs, yielding d = 5 nm for monomeric proteins. Every simulation was run for a total of 2×10⁶ time steps.
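The numerical values quoted above can be checked with a few lines of arithmetic. The sketch below uses the stated r0 = 3 nm, D0 = 4.4 µm²/s and Δt = 1 µs; the function and constant names are illustrative and not taken from the original simulation code.

```python
import math

R0_UM = 0.003   # monomer radius r0 = 3 nm, expressed in µm
D0 = 4.4        # monomer diffusion constant, µm²/s
DT = 1e-6       # time step used in the simulations, s
N_STEPS = 2_000_000

def diffusion_constant(rp_um):
    """Stokes-Einstein scaling Dp = D0 * r0 / rp."""
    return D0 * R0_UM / rp_um

def max_time_step(r_um, d_um2_s):
    """Largest step satisfying d < 4 r with d = sqrt(6 D dt)."""
    return (8.0 / 3.0) * r_um ** 2 / d_um2_s

def step_length(d_um2_s, dt):
    """Distance diffused during one time step, d = sqrt(6 D dt), in µm."""
    return math.sqrt(6.0 * d_um2_s * dt)

dt_max = max_time_step(R0_UM, D0)        # about 5.5e-6 s, i.e. roughly 5 µs
d_monomer_nm = 1000.0 * step_length(D0, DT)   # about 5.1 nm
total_time = N_STEPS * DT                # 2 s of diffusion-limited real time
```

With these values the chosen step of 1 µs sits well below the bound, and the monomer step length stays under 4 r0 = 12 nm.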
Translating this value into real time is difficult, since we have no indication of the experimental value of the aggregation probability per encounter pag (see above), and even less of its dependence on the aggregate size. A lower bound of 2 seconds of real time (for 2×10⁶ time steps) is obtained if aggregation is always diffusion limited (i.e. pag = 1). On general grounds, however, the experimental value of pag can be expected to be smaller, so that the 2×10⁶ simulation time steps would correspond to more than this minimal value of 2 seconds of real time. For the results to be statistically significant, we ran nrun simulations for each parameter and condition, with different realizations of the random processes (initial location, random choice of the positions or of the aggregation events), and averaged the results over these nrun simulations. In the results presented here we used nrun = 103.
Fitting procedure for the aggregate radius, diffusion constant and cell dimensions
The data from the LF movies were partitioned into 5 classes based on the aggregate fluorescence intensity at the beginning of the measured trajectory, yielding 5 pairs of experimental mean-squared displacement curves, where i = {1,…,5} indexes the intensity class. Corresponding theoretical values were obtained by individual-based simulations of confined random walks similar to those described above but modified as follows: the cells, of dimensions LX (length) and LY = LZ = LYZ (height and width), were devoid of nucleoids and of aggregation (aggregation probability pag = 0), and we used N = 5,000 IbpA-YFP proteins. Each 12-tuple of parameters {LX, LYZ, ri, Di} yields the corresponding theoretical mean-squared displacement curves. The aim of the fitting procedure is to minimize the distance between the experimental and theoretical curves, i.e. to minimize a cost function F that sums the squared residuals between the two, where the indices j run over the N time steps.
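A cost function of this kind can be sketched as a plain sum of squared residuals over intensity classes, curves and time steps. The exact expression, including any weighting or normalization, is not recoverable from the text, so the form below is an assumption, as are the function name and the data layout (one pair of curves per class).

```python
def least_squares_cost(theory, experiment):
    """Sum of squared residuals between theoretical and experimental curves.

    Both arguments map an intensity class i to a pair of curves, each
    curve being a list of values sampled at the same time steps j.
    (Hypothetical layout chosen for this illustration.)
    """
    total = 0.0
    for i, exp_pair in experiment.items():
        for th_curve, exp_curve in zip(theory[i], exp_pair):
            total += sum((t - e) ** 2 for t, e in zip(th_curve, exp_curve))
    return total
```

An optimizer such as CMA-ES would then repeatedly simulate the theoretical curves for candidate parameter sets and keep those that lower this value.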
The formulation of this cost function corresponds to traditional least squares, so that the optimization procedure looks for best fits in the least-squares sense (minimization of the squared residuals between the theoretical predictions and the experimental observations). To minimize the cost function F automatically, thus adjusting the theoretical to the experimental curves, we used the C++ implementation of the evolutionary strategy algorithm CMA-ES [39] with population size 12 and 400 generations.
Supporting Information
Figure S1. Mean displacements of single aggregates. The figure shows the time evolution of the mean displacement, averaged over the trajectories. Coordinates along the x and y-axis are shown in red and black, respectively. Low frequency sampling trajectories (LF) are displayed using full lines and high frequency ones (HF) using open symbols. The inset schematizes the increase of the cell half-length during growth that dominates the movement along the x-axis. https://doi.org/10.1371/journal.pcbi.1003038.s001 (EPS)
Figure S2. Diffusion measurements clustered by cell length. Trajectories from the LF movies (Fig. 2) were clustered into 4 classes corresponding to the cell size at the time of measurement: L≤3.4 µm (light blue), 3.4 µm