PLoS Biology

journal article

Open Access Collection

Gap junctions allow transfer of metabolites between germ cells and somatic cells to promote germ cell growth in the Drosophila ovary

Vachias, Caroline;Tourlonias, Camille;Grelée, Louis;Gueguen, Nathalie;Renaud, Yoan;Venugopal, Parvathy;Richard, Graziella;Pouchin, Pierre;Brasset, Emilie;Mirouse, Vincent

2025 PLoS Biology

doi: 10.1371/journal.pbio.3003045

pmid: 39965028

Introduction

Gap junctions are channels between adjacent cells. Each cell harbors a hemichannel made of six subunits, connexins in vertebrates and innexins in other animal phyla where gap junctions are found. These two protein families are very different in terms of primary sequence, but similar in terms of conformation [1–4]. Hemichannels can be homomeric or heteromeric in composition and channels can be formed from two identical or different hemichannels, leading to a large range of channel combinations with different regulations or solute specificities. Gap junctions generally assemble in plaques that contain many channels, and each channel allows direct communication, and thus molecular flows, between the cytoplasm of two cells. However, due to the channel size and conformation, only passive transfer of small molecules and ions, up to approximately 1 kDa, including second messengers (e.g., inositol trisphosphate and calcium), is allowed. Gap junction proteins also participate in other cellular mechanisms, such as cell migration [5,6]. Moreover, many molecules that can diffuse through gap junctions are linked to energetic metabolism (e.g., glucose and pyruvate) and to anabolic metabolism (e.g., amino acids). However, despite many studies on these metabolite flows, their functional relevance has been elusive. Such metabolite exchanges could promote cell growth. Available genetic data suggest that the growth of mammalian oocytes could be an illustration of such a mechanism. Indeed, mutation of connexin 37 (Cx37) in germ cells or of Cx43 in somatic granulosa cells, which surround and are in contact with the oocyte, strongly impairs oocyte and follicle growth [7,8]. Moreover, amino acid import and pyruvate production are more efficient in follicle cells, and granulosa cells are required for an effective uptake of some amino acids, including alanine and proline, by the oocyte [9–12].
These data suggest a potential metabolic flow towards the oocyte via the gap junctions. Nonetheless, it is not known: (i) to what extent this putative transfer of metabolites explains the impact of gap junctions on oocyte growth; and (ii) whether such a mechanism is coupled with follicle cell growth and, more generally, is integrated in the genetic programme controlling follicle development.

A major feature of oocytes is their large size throughout animal evolution, despite important variation among species [13,14]. Their size prefigures the early embryo size and their content will determine the early embryo development success rate and quality [15,16]. Hence, oocyte growth is an essential step that requires robust underlying mechanisms [17,18]. Oocyte development and growth usually occur in ovarian follicles where germ cells are surrounded by follicle cells (i.e., somatic epithelial cells). Drosophila oogenesis takes place in a structure called the ovariole, in which follicles continuously arise and develop from the anterior to the posterior end. Each ovary is subdivided into about 16 ovarioles (Fig 1A). Follicles bud from a structure called the germarium that contains germ cells and somatic stem cells. Follicle development is divided into 14 morphological stages during which the oocyte volume increases 1000 times in about 3 days before being fertilized and laid [19]. At stage 1, a follicle contains a germline cyst of 16 interconnected cells: 15 nurse cells that will grow by endoreplication and 1 oocyte at the posterior end, blocked in early meiosis. Each germline cyst is encapsulated by the follicular epithelium that is initially composed of about 30 cells. These cells proliferate, giving 800 cells at stage 6, and then they undergo endoreplication [20].
Both germ cell growth and follicle cell growth are cell-autonomously sensitive to the usual systemic growth signals, such as the Insulin Receptor/Phosphatidylinositol 3 Kinase (InR/PI3K) and Target of Rapamycin (TOR) pathways [18,21–23]. These pathways also affect, non-cell-autonomously, the growth of the adjacent tissue in both directions: the soma influences germline growth and vice versa [18,22–25]. The most striking illustration of this reciprocal coordination is observed when Pten, an InR/PI3K repressor, is mutated in follicle somatic cells. The faster growth of these somatic cells induces the faster growth of the surrounded wild-type germ cells. Regulation of somatic cell growth by germline growth involves the Hippo pathway and its modulation by tension induced on the epithelium due to the germline cyst volume increase [26]. However, it is not known how the follicle somatic cells control germ cell growth.

Fig 1. Soma-germline gap junctions are required for germline growth. (A) Scheme of an ovariole, oriented from the anterior (A) to the posterior (P) end, as all the subsequent images. The ovariole starts with the germarium from which young follicles bud before undergoing massive growth until the formation of an egg at the posterior end. In each follicle, somatic follicle cells (in purple) surround a germline cyst with the nurse cells (pink) and the oocyte (in blue). (B) Sagittal view and (C) maximum intensity projection images of a stage 7 follicle after immunostaining for Inx2 (green in B and C, white in B′ and C′) and Inx4 (magenta in B and C, white in B″ and C″). (D, E) Higher magnification of the insets in B and C to illustrate the formation of plaques at the germline-soma interface. (F) Immunostaining for Inx2 (green in F and F′, white in F‴) and Inx4 (magenta in F′, white in F″) in a follicle containing an Inx2 RNAi-expressing clone in follicle cells marked by GFP expression (white in F).
(G) Maximum intensity projection of immunostaining for Inx2 (green in G, white in G″ and G‴) and Inx4 (magenta in G, white in G′) in follicles containing Inx4 RNAi-expressing germline clones visualized by Inx4 absence. Note the absence of Inx2 plaques at germline contacts at higher magnification (G‴) and the growth defect of Inx4 RNAi follicles (arrow) compared with the wild-type younger follicle (arrowhead) (G″). (H) Somatic Inx2A mutant clones, marked by the absence of RFP expression, induce a germline growth defect when they cover the whole epithelium (arrow) compared with the wild-type younger follicle (n − 1) (arrowhead). (I) Quantification of the ratio between the volume of the follicle showing a growth defect fully covered by Inx2 mutant cells, and the volume of the n − 1 follicle of the same ovariole (n = 10 control and 7 Inx2 mutant follicles, Mann–Whitney test). (J) Quantification of stage 1−8 follicles positive or negative for EdU in the germline (GL) (n = 118 for control and n = 112 for Inx2 RNAi, Fisher’s exact test). (K, L) Quantification of (K) cell surface and (L) EdU-positive cells in small mutant Inx2 clones compared with the surrounding wild-type cells (n = 20, 20 clones or groups of wild-type cells for K and L, Welch’s t-test). (M) Quantification of the volume ratio between a follicle with Inx4 RNAi in the germline cyst and the wild-type n − 1 follicle of the same ovariole (n = 13 for control and n = 17 for Inx4 RNAi, Mann–Whitney test). For all graphs, data are the mean ± SD. ***p < 0.01, ****p < 0.0001. Scale bars: 50 μm in B, G, H and 10 μm in E, F. The raw data underlying this figure can be found in S1 Data. https://doi.org/10.1371/journal.pbio.3003045.g001

Here, we tested the hypothesis of an evolutionary conservation of gap junction function between germ cells and follicle cells to regulate Drosophila female germ cell growth and explored the underlying mechanism. Several reports indicate that gap junctions are present during fly oogenesis.
In germ cells, innexin 4 (Inx4) forms plaques at the contact with follicle cells [27–29]. Several innexins, including Inx2, are expressed in follicle cells and partially colocalize with or are juxtaposed to Inx4 [28,29]. Inx4 is required for germ cell survival before follicle formation, and Inx2 is involved in the germline cyst encapsulation by follicle cells during follicle budding [27,30,31]. Inx2 and Inx4 are also required for morphogenetic events during oogenesis, such as the stretching of anterior follicle cells and the migration of border cells during stage 9 [6,29]. Moreover, Inx4 and Inx2 form gap junctions in the testis where they are required at different steps of sperm development [27,32,33]. We found that Inx4/Inx2 gap junctions are required for female germline growth once follicles are formed. Moreover, follicle cells specifically express genes involved in amino acid biology, and the import of some amino acids in follicle cells is necessary for germline growth. This germline growth control by somatic follicle cells is linked to the formation of processing bodies (P-bodies) and can be partially bypassed by the direct expression of a putative amino acid transporter in the germline. Moreover, gap junction assembly is controlled by the InR/PI3K pathway, thus connecting growth coordination between these two cell types with systemic growth control.

Results

Inx2 and Inx4 form gap junctions at the soma–germline interface that are required for germline growth

As gap junctions are required for oocyte growth in mammals, we asked whether they could play a similar role in Drosophila oogenesis [7,8]. Therefore, we first characterized the gap junction composition and profile in somatic and germ cells, focusing on stages 1–8 when follicle development is globally limited to growth without major morphogenetic changes.
As previously described [27–29], we detected Inx2 and Inx4 forming plaques at the interface between germ and somatic cells, these plaques often being larger in the anterior part of the follicle. Inx2 also formed plaques on the lateral domain of follicle cells (Fig 1B–1E). Inx2 expression and plaque size tended to increase as follicles developed (S1A Fig). However, plaques tended to disappear specifically from the oocyte cortex around stage 8 and were completely absent at this position from stage 9 onwards (S1B Fig). The cytoplasmic signal localization suggested that Inx2 and Inx4 were mainly expressed in the soma and germline, respectively. Accordingly, Inx2 knockdown (Inx2 RNAi) induction in somatic clones led to the cell-autonomous loss of Inx2 staining, whereas induction of Inx4 knockdown (Inx4 RNAi) in the germline led to the loss of Inx4 staining (Fig 1F, 1G). Notably, we did not detect Inx4 at the contact of Inx2 RNAi follicle cells, confirming published results (Fig 1F‴) [29]. Similarly, when Inx4 was knocked down in germline cysts, we did not detect Inx2 plaques at the germline contact, although Inx2 protein expression was increased in follicle cells and still localized at the lateral domain of the cells (Fig 1G‴). This last observation could suggest a feedback mechanism between gap junction formation and Inx2 expression. Conversely, the loss of Inx2 in the germline or Inx4 in the soma did not induce visible defects (S1C, S1D Fig). Altogether, these data clearly established the presence of gap junction plaques composed of Inx4 in the germline and of Inx2 in follicle cells, and the Inx4-Inx2 interdependence for plaque assembly. We then tested whether these gap junctions influenced germline growth by generating large Inx2 mutant clones. In such conditions, we observed follicles that were smaller than the younger follicle at their anterior, a phenomenon never observed in wild-type ovarioles.
The quantification of the ratio between the volumes of the defective growth follicle and the follicle at its anterior confirmed these observations (Fig 1H, 1I). We observed this phenotype with three different alleles, and only when all (or almost all) epithelial cells of a follicle harbored the mutated Inx2 while the anterior one was wild-type or contained a smaller percentage of mutant cells, whereas similar cases with control clones had no visible effect (S1E Fig). Constitutive Inx2 RNAi in the somatic lineage completely blocked oogenesis and precluded follicle observations. Therefore, we added a constitutive Gal80ts transgene to allow temporal control of the RNAi construct (18 °C then 48 h at 30 °C). Inx2 knockdown induced by RNAi in follicle cells for 48 h led to a strong reduction of the proportion of EdU-positive germline cysts, indicating that the growth defect is associated with an effect on endoreplication (Fig 1J). Because it is known that germline and somatic growth influence each other, we aimed to determine whether the effect of Inx2 on germline growth was due to a cell-autonomous effect on somatic growth [18,22,24,26]. We analyzed follicle cell growth in smaller clones that do not induce a germline defect, to avoid feedback between tissues. We did not observe any difference in cell size or in the proportion of cells in S phase (EdU-positive) between such mutant cells and the surrounding wild-type cells, indicating that Inx2 did not influence somatic growth in a cell-autonomous manner (Fig 1K, 1L). Of note, follicle cells express other innexins at their lateral domain, suggesting that functional gap junctions between follicle cells may still exist in the absence of Inx2 and may mask a potential communication between these cells that influences their growth [28]. Nonetheless, these results suggest that Inx2-dependent gap junctions are specifically required for germline growth.
We could not confirm this hypothesis using germline null Inx4 mutant clones because in these mutant germline cysts development is blocked very early in the germarium [27]. Therefore, we generated Inx4 RNAi clones in the germline, directly detected by the absence of Inx4 expression (Fig 1G). Inx4 RNAi cysts were not larger than the younger wild-type follicle, indicating defective germline growth (Fig 1G, 1M). Altogether, these data indicate that Inx2 and Inx4 form homomeric and heterotypic gap junctions between somatic and germ cells that are required for germ cell growth.

Genes implicated in amino acid metabolism are enriched in somatic follicular cells

Gap junction requirement for germ cell growth supports a model in which metabolites diffuse from follicle cells to germ cells. However, the direct identification of such metabolites is technically challenging. We hypothesized that if follicle cells produce or import metabolites not present in the germline, we might identify enzymes or transporters that are involved in this process and that are more highly expressed in follicle cells. To this aim, we performed translating ribosome affinity purification (TRAP), an approach that allows identification of the tissue-specific translatome. We used nanos:Gal4VP16 and trafficjam:Gal4 drivers to express a Green Fluorescent Protein-tagged ribosomal protein (UASp:RPL10a-GFP) specifically in the germline and in somatic cells, respectively (Fig 2A). Then, we immunoprecipitated the GFP-tagged polysomes and isolated and sequenced the associated mRNAs under translation. Our biological replicates were highly reproducible (S1 Table and S2A Fig). To extract genes with a specific or strongly enriched somatic expression, we used a fold-change of 5 between soma and germline as a cut-off, which gave a list of 811 genes (Figs 2B and S2B). We then selected genes involved in small molecule metabolic processes, reducing the list to 52 genes.
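The enrichment filter described above (a soma-to-germline fold-change cut-off of 5, together with the p-value threshold of 0.05 given in the Fig 2 legend) can be illustrated with a minimal sketch. This is not the authors' analysis pipeline, which is not described in this section. The record layout, the pseudocount, the gene names, and all numerical values below are hypothetical and serve only to show the logic of the selection.

```python
from typing import NamedTuple


class GeneRecord(NamedTuple):
    gene: str        # hypothetical gene identifier
    soma: float      # normalized TRAP signal in somatic cells (made-up units)
    germline: float  # normalized TRAP signal in germline cells (made-up units)
    p_value: float   # significance of the soma vs germline difference


def soma_enriched(records, min_fold=5.0, max_p=0.05, pseudocount=1.0):
    """Return genes whose soma/germline ratio exceeds min_fold with p < max_p.

    The pseudocount is an assumption of this sketch; it avoids division by
    zero for genes that are undetected in the germline.
    """
    selected = []
    for r in records:
        fold = (r.soma + pseudocount) / (r.germline + pseudocount)
        if fold > min_fold and r.p_value < max_p:
            selected.append(r.gene)
    return selected


# Toy data, invented for illustration only.
records = [
    GeneRecord("geneA", 600.0, 50.0, 0.001),  # enriched and significant
    GeneRecord("geneB", 120.0, 100.0, 0.20),  # similar in both tissues
    GeneRecord("geneC", 90.0, 0.0, 0.01),     # undetected in the germline
    GeneRecord("geneD", 400.0, 20.0, 0.30),   # enriched but not significant
]
print(soma_enriched(records))
```

In this toy example only geneA and geneC pass both criteria. A second filter on functional annotation (here, small molecule metabolic processes) would then be applied to the resulting list.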
The whole ecdysone synthesis pathway (six genes), which is exclusively active in follicle cells, was enriched in follicle cells, thus validating our TRAP approach [34]. Moreover, six genes encoding enzymes involved in amino acid biosynthesis were strongly enriched in follicle cells (Fig 2B). By analyzing this gene ontology class, we found that 29 enzymes were expressed in both cell types, but none was germline-specific (Fig 2C). Similarly, among the 60 amino acid transporters encoded by the fly genome, 21 were expressed in both tissues, 12 were specifically expressed in the soma, and none in the germline alone (Fig 2B, 2D). Interrogating the Fly Cell Atlas for the 18 genes involved in amino acid synthesis or transport and enriched in the soma in the TRAP experiments indicated that they were all detected in a higher proportion of “ovariole somatic cells” than “female germ cells” [35]. This suggests that the differences observed between the translatomes of soma and germline already exist at the level of their transcriptomes. These data indicated that among the genes strongly enriched in follicle cells there is a strong bias for genes involved in amino acid synthesis or import. Given the importance of these metabolites for growth, they are good candidates to be transferred through gap junctions.

Fig 2. TRAP analysis identifies soma-specific expression of amino acid metabolism and transport genes. (A) Representative images of follicles expressing RPL10a-GFP in the soma (driven by tj:Gal4) or the germline (driven by nos:Gal4VP16). After polysome immunoprecipitation (IP), mRNAs were sequenced. (B) The expression of 811 genes was significantly enriched in somatic cells compared with germline cells (fold-change > 5, p-value < 0.05). After filtering for genes involved in small-molecule metabolism or transport, genes related to amino acid biology were identified.
(C) Distribution (soma and/or germline) of genes encoding proteins implicated in amino acid biosynthesis or transport according to their expression profile. No gene of these classes was enriched exclusively in the germline. The raw data underlying this figure can be found in S1 Table. https://doi.org/10.1371/journal.pbio.3003045.g002

A putative amino acid transporter is required in follicle cells for germline growth

We performed a reverse genetic screen by inducing the silencing (RNAi) in follicle cells of the genes involved in amino acid synthesis or transport identified by TRAP, and then looking for ovary growth defects. For each amino acid, many redundancies may occur between anabolic pathways or transporters that may preclude the observation of clear phenotypes. Moreover, although all the available RNAi lines for these genes were tested, we could not exclude that some were not efficient enough. However, the knockdown of one of the 18 tested genes, CG43693, reproducibly induced an ovary growth defect in independent RNAi lines (Fig 3A–3C). We named this gene cochonnet (coch) for reasons explained below. The ovary growth defect observed upon coch knockdown is associated with a change in follicle stage distribution, with the accumulation of the younger stages (1–9) and a depletion of the more mature ones (10–14) (Fig 3D). Although uncharacterized in Drosophila, Coch unambiguously belongs to the well-characterized SLC36 family of amino acid transporters, and is especially close to mammalian SLC36A1 and SLC36A4, which are involved in the transport of non-polar amino acids such as proline, alanine, or tryptophan [36–38].

Fig 3. Import of some amino acids in follicle cells promotes germline growth. (A, B) Representative images of ovaries in (A) control and (B) after CG43693/coch knockdown by RNAi in somatic cells.
(C) Ovary size (surface) quantification in control and after coch somatic knockdown using two RNAi lines (Ordinary one-way ANOVA + Dunnett’s multiple comparisons test). (D) Stage distribution per ovariole in control or coch knockdown conditions. We regrouped follicle stages in four categories: early: 1−6, intermediate: 7−9, late: 10−12, mature: 13−14 (n = 43 for controls and n = 95 for coch RNAi, two-way ANOVA and Šídák’s multiple comparisons test). (E, F) In situ hybridization analysis of coch expression in (E) a whole ovariole and in (F) stage 4 and 7 follicles. (G) Expression of endogenous Coch protein tagged with GFP from germarium to stage 8. (H) Higher magnification of follicle cells showing Coch-GFP localization in the cell cortex. (I) Coch mutant clones marked by the absence of RFP (shown in I′). When all somatic cells of a follicle are mutated, follicle growth is affected (arrow) as seen when compared with the younger follicle (arrowhead). (J) Quantification of the ratio between the volume of follicle with a growth defect and fully covered by mutant cells and the n − 1 follicle in the same ovariole (n = 10 control and 6 coch mutant follicles, Mann–Whitney test). (K, L) Quantification of (K) cell surface and (L) EdU-positive cells in small coch mutant clones compared with the surrounding wild-type cells (n = 12 control and 12 mutant clones or groups of wild-type cells, Welch’s t-test). For all graphs, data are the mean ± SD. **p < 0.01, ***p < 0.001. Scale bars: 500 μm in A, B, 50 μm in D, E, F, H and 10 μm in G. The raw data underlying this figure can be found in S1 Data. https://doi.org/10.1371/journal.pbio.3003045.g003

RNA-FISH indicated that coch was specifically expressed in follicle cells, as no expression was detected in germline cells, and confirmed its effective knockdown by RNAi in the somatic lineage (Figs 3E, 3F and S3A).
Moreover, we detected coch mRNA in all follicle cells and at all stages of oogenesis, and its expression progressively increased with the stage (Fig 3E). This mRNA was enriched at the apical side of the follicle cell, fitting with a previous observation that many mRNAs, including this one, are localized at this domain in follicle cells [39]. A role in importing amino acids from the hemolymph would require a cell membrane localization, while some amino acid transporters are specific to some intracellular vesicular compartments. Therefore, to determine Coch protein localization, we generated a GFP knockin with the Minos-mediated integration cassette (MiMIC) system to insert in-frame an EGFP-encoding exon [40]. This insertion was in the non-conserved N-terminal part common to five of the seven described coch isoforms. Importantly, it tags isoforms RA and RB that were expressed in follicle cells according to our TRAP mRNA sequence data. The coch-GFP line was homozygous viable and fertile without visible phenotype. Moreover, ovary size was normal when coch-GFP was in trans with a deficiency covering the gene, indicating that the insertion did not affect protein function (S3B Fig). We observed a strong GFP signal that tended to increase throughout oogenesis in follicle cells, as previously observed for the mRNA, but no signal in the germline (Fig 3G). The GFP signal decorated the whole cortex of follicle cells, though it appeared more enriched at the lateral and apical domains (Fig 3H). Of note, septate junctions are not mature from stages 1 to 8, and the epithelium is therefore not impermeable, suggesting that solute import via Coch can occur from any domain [41]. Moreover, Coch-GFP subcellular localization was not affected in cells mutant for Inx2, though its expression appeared slightly lower in such a context (S3D Fig).
Thus, both Coch tissue distribution, specific to somatic cells and present at all stages, and subcellular localization, at the cell cortex, were in agreement with the hypothesis that it is implicated in the import of some amino acids to promote follicle growth. We then tested whether Coch was required for follicle growth. The Minos element insertion (CG43693MI101960) used for the GFP knockin initially contains an exon with STOP codons in different frames. As it is located in the protein N-terminus, it should induce a null mutation of the affected isoforms, which include the ones expressed in follicle cells. This allele was sublethal when homozygous or in trans with a deficiency covering the gene. In both cases, the ovaries of these flies were strongly atrophied, fitting with a role for this gene in follicle growth, though we could not detect a difference in the proportion of EdU-positive cysts (Figs S3B, 3C). Moreover, replacing the STOP codon-containing exon with the GFP exon suppressed the ovary growth defect. We generated mutant clones in the follicular epithelium. Analysis of small clones that did not cover the whole epithelium did not show any difference in cell size and proliferation, suggesting that follicle cell growth was not affected (Fig 3K, 3L). Conversely, when the whole epithelium of a follicle was mutated, such follicles were smaller than (or of the same size as) the younger ones (Fig 3I, 3J). As this growth defect led to tiny and round follicles, we called this gene cochonnet (coch) after the nickname of the small wooden ball used with larger metal boules for pétanque, a traditional game in the South of France. Importantly, these experiments indicated that coch expression in somatic follicle cells was required for germline growth.

Genetic evidence of a metabolic transfer between soma and germline cells

Our results support a model in which some amino acids imported in follicle cells diffuse through gap junctions to sustain germline growth.
In this case, germline expression of coch should rescue genetic conditions in which it is absent in somatic follicle cells. Importantly, it is known that the follicular epithelium is permeable at least until stage 8, and thus that metabolites from the hemolymph can directly reach the oocyte surface [41]. To test this hypothesis, we combined the UAS/Gal4 system with the QUAS/QF system [42]. We generated a MatTub:QF driver and a QUASp promoter, inspired by the UASp promoter, for proper expression in the germline [43]. A QUASp:GFP transgene driven by MatTub:QF was expressed once the germline cyst is formed in the germarium and in all the subsequent stages, starting slightly earlier than what has been described for MatTub:Gal4Vp16, indicating that both the driver and the QUASp promoter were functional (S4A Fig). Expression of the QUASp:coch transgene in the germline with the same driver (MatTub>coch) had a slight negative effect on ovary size (Fig 4A–4C, 4F). However, when it was combined with somatic knockdown of coch (tj>cochRNAi), it rescued the effect of the latter on ovary size (Fig 4A, 4D–4F). Importantly, since the MatTub:QF driver is expressed only from late stages of the germarium, the observed rescue cannot be due to an effect during gonad development. However, such rescue is not quantitatively observed in the stage distribution, suggesting that the latter is less sensitive than ovary size for probing growth modification (S4B Fig). We also quantified egg laying in these different genetic backgrounds. This experiment confirmed both the slight deleterious effect of germline coch expression and its ability to rescue coch knockdown in the follicle cells (Fig 4G). Altogether, these genetic data demonstrated that the requirement of coch expression in follicular cells for germline growth can be partially compensated by its germline expression, suggesting a direct metabolic exchange between these cell types through the gap junctions.
To further explore this idea, we attempted similar experiments combining the ectopic expression of coch in the germline with Inx2 knockdown. Of note, in the conditions used for these experiments (18 °C then 48 h at 30 °C), which were imposed by the temporal control of Inx2 RNAi expression, we did not observe the significant negative effect of germline ectopic coch expression that we observed before (Fig 4F, 4H). However, it tended to rescue the ovary size of Inx2 knockdown (Fig 4H). Interestingly, this result fits with a model in which the solutes imported via Coch in the somatic cells reach the germline through Inx2-dependent gap junctions.

Fig 4. Ectopic coch expression in the germline compensates for its absence in the soma or for the absence of gap junctions. (A) Scheme representing coch expression in green in the different genotypes used in this figure. (B–E) Representative images of a full ovary from (B) a control female, (C) a MatTub>coch female, (D) a tj>coch RNAi female, and (E) a tj>coch RNAi, MatTub>coch female. (F) Quantification of ovary size (surface) in the indicated genotypes (n = 14, 10, 19, and 27 ovaries, Ordinary one-way ANOVA + Tukey’s multiple comparisons test). (G) Quantification of egg laying per female and per hour, from four independent experiments with 10 females, for the indicated genotypes. (H) Quantification of ovary size (surface) in the indicated genotypes. As we noted a slight variability between two independent experiments, although with the same tendencies, data were analyzed using two-way ANOVA plus Šídák’s multiple comparisons test (n = 30, 26, 22, 30 ovaries). For all graphs, data are the mean ± SD. *p = 0.0378, **p < 0.01, ***p < 0.001, ****p < 0.0001. Scale bars: 500 μm. The raw data underlying this figure can be found in S1 Data.
https://doi.org/10.1371/journal.pbio.3003045.g004

Blocking gap junctions or amino acid import induces processing bodies

Previous studies indicated that somatic growth impairment affects germ cell development [18,22–24]. Moreover, the mRNA binding protein Me31B, which is typically concentrated where the anterior–posterior axis determinants localize in the oocyte, becomes enriched in large cytoplasmic structures in the germline when somatic growth is impaired [16,44]. These condensates are reminiscent of P-bodies and stress granules observed upon various stresses, including amino acid deprivation [45]. In follicles, their formation can be induced by reducing protein availability in fly food, and they are more easily seen at stage 9 in Me31B-GFP-expressing follicles (Fig 5A, 5B) [44]. Therefore, we set up a semi-automated method for their quantification by measuring the fluorescence fraction found in condensates in stage 9 nurse cells (Fig 5C). P-body formation is usually induced by inhibitory phosphorylation of the eukaryotic translation initiation factor 2 subunit alpha (eIF2α) on serine 51 (S51), an event also known to cause a translation arrest [45]. Overexpression in the germline of a transgene mimicking this phosphorylated form (eIF2α-S51D), but not wild-type eIF2α, strongly induced P-body formation, although we did not quantify them because these follicles never reached stage 9 (Fig 5D, 5E). Importantly, eIF2α-S51D overexpression was sufficient to completely block germline growth, suggesting a link between P-body formation and growth control (Fig 5E). As P-bodies potentially repress germline growth, we tested whether the absence of gap junctions between germ cells and follicle cells or defective amino acid import in the follicle cells could induce their formation. We knocked down Inx2 and coch in follicle cells using tj:Gal4 in the presence of Me31B-GFP.
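The quantity used above for P-body quantification, the fraction of Me31B-GFP fluorescence found in condensates, can be illustrated with a minimal sketch. The segmentation procedure actually used by the authors is not described in this section, so the rule applied here (a pixel belongs to a condensate when its intensity exceeds a multiple of the median intensity of the nurse cell region) is only an assumption, as are the function name, the `threshold_factor` parameter, and the toy pixel values.

```python
from statistics import median


def condensate_fraction(pixels, threshold_factor=3.0):
    """Fraction of total fluorescence carried by bright, condensate-like pixels.

    pixels: background-subtracted intensities from a nurse cell region.
    threshold_factor: hypothetical parameter; pixels brighter than
    threshold_factor * median are counted as belonging to condensates.
    """
    values = [float(p) for p in pixels]
    total = sum(values)
    if total <= 0:
        return 0.0  # no signal in the region
    cutoff = threshold_factor * median(values)
    in_condensates = sum(v for v in values if v > cutoff)
    return in_condensates / total


# Toy regions, invented for illustration: diffuse signal vs two bright spots.
diffuse = [10] * 10
with_spots = [10] * 8 + [100, 100]
print(condensate_fraction(diffuse))
print(round(condensate_fraction(with_spots), 3))
```

A ratio of this kind is independent of the overall Me31B-GFP expression level, which makes follicles of different brightness comparable. The choice of threshold would need to be validated against manually annotated images.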
To be able to recover and observe stages 8–9, Inx2 RNAi was induced only during a short period of time (14 h, at 30 °C). We also attempted Inx4 knockdown by inducing RNAi specifically in the germline. The coch, Inx2 and Inx4 knockdowns strongly induced the formation of P-bodies when compared to their respective controls, suggesting that the observed growth limitation was linked to their formation and a potential arrest of translation (Fig 5F–5L). We also noticed that the aspect of P-bodies may differ from one follicle to another, being more dotty or more flaky, though we could not find an explanation for this difference as it did not appear to be linked to the genotype or the amount of P-bodies. Finally, since ectopic coch expression in the germline tended to rescue ovary size in somatic Inx2 knockdown, we asked whether a similar effect could be observed on P-bodies. Interestingly, we found a significant reduction in P-bodies compared to Inx2 knockdown (Fig 5J). These results provide another indirect argument in favor of a model based on the exchange across gap junctions of solutes that are initially imported by Coch in the somatic cells.

Fig 5. Defective gap junctions or amino acid import induces P-body formation. (A, B) Endogenous GFP-tagged Me31B protein expression in stage 9 follicles in (A) well-fed conditions or (B) after 15 h of protein starvation. (C) P-body quantification in normal and starved conditions (n = 16, 21). (D, E) Me31B-GFP expression in ovariole germline that overexpresses (D) wild-type eIF2α or (E) eIF2α-S51D (inhibitory phosphorylation). (F–I) Representative images of Me31B-GFP expression in stage 9 follicles in control (F) or after RNAi-based knockdown in somatic cells of (G) Inx2, (H) coch and (I) in the germ cells of Inx4.
(J) P-body quantification in follicles from control females, MatTub>coch females, tj>Inx2 RNAi females, and tj>Inx2 RNAi, MatTub>coch females (n = 10, 10, 9, 10 follicles; one-way ANOVA plus Tukey’s multiple comparisons test). (K, L) P-body quantification in follicles after RNAi-based knockdown in somatic cells of coch (K) and in germline cells of Inx4 (L) (K, control n = 24, coch RNAi n = 24; L, control n = 27, Inx4 RNAi n = 26; Mann–Whitney test). For all graphs, data are the mean ± SD, ***p < 0.001, ****p < 0.0001, unpaired t-test. Scale bars: 50 μm. The raw data underlying this figure can be found in S1 Data. https://doi.org/10.1371/journal.pbio.3003045.g005

We hypothesized that the non-cell autonomous effect of coch and Inx2 RNAi on the formation of P-bodies could be a consequence of the deprivation of some amino acids. P-body or stress granule formation in response to a lack of some amino acids can occur independently or dependently of eIF2α S51 phosphorylation. In the first case, which is not the most frequently described, it is associated with the repression of the TOR pathway [46]. However, induction of Tor RNAi in the fly germline does not induce P-body formation, excluding this possibility [16]. In the second case, eIF2α is phosphorylated by GCN2, which acts as an indirect sensor of amino acid availability. We generated null mutant alleles for gcn2 that were homozygous viable, like similar alleles described during the course of this project [47–49]. Protein deprivation in gcn2 transheterozygous females still led to P-body formation in the germ cells (S5A–S5E Fig). Moreover, an indirect reporter of GCN2 activity and eIF2α-S51 phosphorylation (ATF4-GFP) showed no signal in the germline of wild-type flies, even after protein deprivation (S5F, S5G Fig). Lastly, we tested more directly the role of gcn2 in the germline when amino acid import in follicle cells is impaired (tj>cochRNAi).
However, gcn2 mutant germline clones did not increase the growth rate of wild-type or coch RNAi follicles (S5H, S5I Fig). These results strongly argued against the involvement of GCN2 as a germ cell growth repressor when the availability of the amino acids provided by Coch is reduced. Thus, being independent of TOR and GCN2, the mechanism leading to P-body formation in the female germline seems unusual and will require further investigation. Altogether, these data indicate that although the two well-established pathways involved in amino acid sensing do not seem implicated, P-body formation acts as a metabolic stress sensor linked to the control of germ cell growth by gap junctions and the import of some amino acids in follicle cells.

Gap junction assembly links intrinsic growth control and systemic control

Published data indicated that InR/PI3K or Tor pathway inhibition in follicle cells induces the formation of P-bodies in the germline [16]. This result was reproduced by silencing (RNAi) akt, an essential component of this pathway, in follicle cells (Fig 6A–6C). Since somatic activity of the InR/PI3K pathway also strongly influences germline growth, we asked whether there was a link between InR/PI3K pathway activity in follicle cells and their ability to transfer metabolites to the germline. Coch-GFP expression level and localization were similar in wild-type follicle cells and in follicle cells with a PI3K gain of function or mutated for akt, suggesting no impact on this specific actor of amino acid import in follicle cells (S6A, S6B Fig). Conversely, in akt mutant cells, Inx2 was almost undetectable (Fig 6D), whereas in Pten mutant cells Inx2 expression was strongly increased and plaque size was increased (Fig 6E). These data indicated that Inx2 expression is sensitive to InR/PI3K pathway gain and loss of function.
Moreover, we observed that Inx2 protein level was also sensitive to the loss but not the gain of function of the Tor pathway, suggesting that it is not equally controlled by all the pathways modulating cell growth (S6C, S6D Fig). We also observed that Inx2 mRNA level was strongly increased in Pten mutant follicle cells, suggesting that the observed effects on protein level and plaque assembly were due to upregulation of Inx2 gene expression (Fig 6F). These observations supported a model in which the non-cell autonomous effect of the InR/PI3K pathway from somatic cells to germ cells is mediated by gap junctions. To test this hypothesis, we performed an epistasis experiment. As previously described [18], large Pten mutant clones in the follicular epithelium accelerated germline growth in a non-cell autonomous manner (Fig 6G). This effect was abrogated upon Inx2 RNAi induction in Pten mutant cells (Fig 6H). Thus, gap junctions are required for germline growth control via InR/PI3K activity in the follicular epithelium. We also tested whether Inx2 overexpression could be sufficient to rescue the knockdown of the InR/PI3K pathway in the somatic cells. In this genetic combination, we observed neither an increase in ovary size nor a decrease in P-bodies compared to the akt knockdown (S6E, S6F Fig). Thus, the control of Inx2 expression is not sufficient to explain the non-cell-autonomous effect on germline growth of the InR/PI3K pathway in the follicle cells. Nonetheless, altogether, our data link the systemic control of somatic cells to the growth coordination between somatic and germline cells via the modulation of gap junction assembly.

Fig 6. Gap junctions are controlled by the InR/PI3K pathway. (A, B) Representative images of Me31B-BFP expression in stage 9 follicles from (A) a control female and (B) a tj:Gal4, Tub:Gal80ts>akt RNAi female.
(C) P-body quantification (fluorescence intensity) in the indicated genotypes (n = 8 and 9 follicles, unpaired t-test). Data are the mean ± SD. ****p < 0.0001. (D) Inx2 expression in an aktq mutant clone (marked by the absence of GFP expression). (E, F) Inx2 protein (E) and Inx2 mRNA expression by FISH (F) in a Ptendj189 mutant clone marked by the absence of RFP expression. (G) Large MARCM Ptendj189 mutant clone marked by the presence of GFP expression showing faster follicle growth. (H) Large MARCM Ptendj189 mutant clone in which faster follicle growth was abolished after expression of an RNAi against Inx2. (I) Quantification of the volume ratio between the older fully wild-type follicle and follicles containing a majority of Pten mutant cells and expressing or not an RNAi against Inx2 (n = 10 and 9, Mann–Whitney test). Scale bars: 50 μm in A, B, G, H and 20 μm in D, E, F. Data are the mean ± SD. ****p < 0.0001. The raw data underlying this figure can be found in S1 Data. https://doi.org/10.1371/journal.pbio.3003045.g006

Inx2 and Inx4 form gap junctions at the soma–germline interface that are required for germline growth

As gap junctions are required for oocyte growth in mammals, we asked whether they could play a similar role in Drosophila oogenesis [7,8]. Therefore, we first characterized the gap junction composition and profile in somatic and germ cells, focusing on stages 1–8, when follicle development is globally limited to growth without major morphogenetic changes. As previously described [27–29], we detected Inx2 and Inx4 forming plaques at the interface between germ and somatic cells, these plaques often being larger in the anterior part of the follicle. Inx2 also formed plaques on the lateral domain of follicle cells (Fig 1B–1E). Inx2 expression and plaque size tended to increase as follicles developed (S1A Fig).
However, plaques tended to disappear specifically from the oocyte cortex around stage 8 and were completely absent at this position from stage 9 onwards (S1B Fig). The cytoplasmic signal localization suggested that Inx2 and Inx4 were mainly expressed in the soma and germline, respectively. Accordingly, Inx2 knockdown (Inx2 RNAi) induction in somatic clones led to the cell-autonomous loss of Inx2 staining, whereas induction of Inx4 knockdown (Inx4 RNAi) in the germline led to the loss of Inx4 staining (Fig 1F, 1G). Notably, we did not detect Inx4 at the contact with Inx2 RNAi follicle cells, confirming published results (Fig 1F‴) [29]. Similarly, when Inx4 was knocked down in germline cysts, we did not detect Inx2 plaques at the germline contact, although Inx2 protein expression was increased in follicle cells and still localized at the lateral domain of the cells (Fig 1G‴). This last observation could suggest a feedback mechanism between gap junction formation and Inx2 expression. Conversely, the loss of Inx2 in the germline or of Inx4 in the soma did not induce visible defects (S1C, S1D Fig). Altogether, these data clearly established the presence of gap junction plaques composed of Inx4 in the germline and of Inx2 in follicle cells, and the Inx4-Inx2 interdependence for plaque assembly. We then tested whether these gap junctions influenced germline growth by generating large Inx2 mutant clones. In such conditions, we observed follicles that were smaller than the younger follicle at their anterior, a phenomenon never observed in wild-type ovarioles. The quantification of the ratio between the volume of the growth-defective follicle and that of the follicle at its anterior confirmed these observations (Fig 1H, 1I).
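The volume ratio used above as a growth readout can be sketched as follows. The text does not state how follicle volumes were computed, so approximating each follicle as a prolate ellipsoid measured by its length and width is an assumption made here for illustration, and the dimensions below are hypothetical values in micrometers.

```python
import math


def ellipsoid_volume(length, width):
    """Volume of a prolate ellipsoid: one long axis, two equal short axes.

    The ellipsoid approximation is an assumption, not the study's method.
    """
    return (4.0 / 3.0) * math.pi * (length / 2.0) * (width / 2.0) ** 2


def growth_ratio(older, younger):
    """Volume of the older follicle divided by that of the younger follicle
    located at its anterior. Each argument is a (length, width) tuple."""
    return ellipsoid_volume(*older) / ellipsoid_volume(*younger)


# Hypothetical measurements (length, width) in micrometers.
wild_type_ratio = growth_ratio(older=(60.0, 40.0), younger=(40.0, 30.0))
mutant_ratio = growth_ratio(older=(30.0, 30.0), younger=(40.0, 30.0))
```

In a normally growing ovariole the older follicle is larger and the ratio exceeds 1. A ratio at or below 1 corresponds to the situation described in the text, where a follicle is no larger than the younger one at its anterior.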
We observed this phenotype with three different alleles, and only when all (or almost all) epithelial cells of a follicle harbored the mutated Inx2 while the anterior follicle was wild-type or contained a smaller percentage of mutant cells, whereas similar cases with control clones had no visible effect (S1E Fig). Constitutive Inx2 RNAi in the somatic lineage completely blocked oogenesis and precluded follicle observations. Therefore, we added a constitutive Gal80ts transgene to allow temporal control of the RNAi construct (18 °C then 48 h at 30 °C). Inx2 knockdown induced by RNAi in follicle cells for 48 h led to a strong reduction in the proportion of EdU-positive germline cysts, indicating that the growth defect is associated with an effect on endoreplication (Fig 1J). Because it is known that germline and somatic growth influence each other, we aimed to determine whether the effect of Inx2 on germline growth was due to a cell-autonomous effect on somatic growth [18,22,24,26]. We analyzed follicle cell growth in smaller clones that do not induce a germline defect, to avoid feedback between tissues. We did not observe any difference in cell size or in the proportion of cells in S phase (EdU-positive) between mutant and wild-type cells, indicating that Inx2 did not influence somatic growth in a cell-autonomous manner (Fig 1K, 1L). Of note, follicle cells express other innexins at their lateral domain, suggesting that functional gap junctions between follicle cells may still exist in the absence of Inx2 and may mask a potential communication between these cells that influences their growth [28]. Nonetheless, these results suggest that Inx2-dependent gap junctions are specifically required for germline growth. We could not confirm this hypothesis using germline null Inx4 mutant clones because in these mutant germline cysts development is blocked very early in the germarium [27].
Therefore, we generated Inx4 RNAi clones in the germline, directly detected by the absence of Inx4 expression (Fig 1G). Inx4 RNAi cysts were not larger than the younger wild-type follicle, indicating defective germline growth (Fig 1G, 1M). Altogether, these data indicate that Inx2 and Inx4 form homomeric and heterotypic gap junctions between somatic and germ cells that are required for germ cell growth.

Genes implicated in amino acid metabolism are enriched in somatic follicular cells

Gap junction requirement for germ cell growth supports a model in which metabolites diffuse from follicle cells to germ cells. However, the direct identification of such metabolites is technically challenging. We hypothesized that if follicle cells produce or import metabolites not present in the germline, we might identify enzymes or transporters that are involved in this process and that are more highly expressed in follicle cells. To this end, we performed translating ribosome affinity purification (TRAP), an approach that allows identification of the tissue-specific translatome. We used nanos:Gal4VP16 and trafficjam:Gal4 drivers to express a Green Fluorescent Protein-tagged ribosomal protein (UASp:RPL10a-GFP) specifically in the germline and in somatic cells, respectively (Fig 2A). Then, we immunoprecipitated the GFP-tagged polysomes and isolated and sequenced the associated mRNAs under translation. Our biological replicates were highly reproducible (S1 Table and S2A Fig). To extract genes with a specific or strongly enriched somatic expression, we used a fold-change of 5 between soma and germline as a cut-off, which gave a list of 811 genes (Figs 2B and S2B). We then selected genes involved in small molecule metabolic processes, reducing the list to 52 genes. The whole ecdysone synthesis pathway (six genes), which is exclusively active in follicle cells, was enriched in follicle cells, thus validating our TRAP approach [34].
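The enrichment filter described above (soma over germline fold-change above 5, with the p-value threshold of 0.05 given in the Fig 2 legend) can be sketched as a simple selection step. This is only an illustration of the cut-off logic, not the analysis pipeline used in the study: the function name, the input layout, and the gene names and values are all hypothetical.

```python
def soma_enriched(genes, min_fold=5.0, alpha=0.05):
    """Return names of genes enriched in the soma.

    genes: iterable of (name, soma_level, germline_level, p_value) tuples
    (hypothetical layout). A gene passes if soma/germline > min_fold and
    p_value < alpha.
    """
    hits = []
    for name, soma, germline, p_value in genes:
        if germline > 0:
            fold = soma / germline
        else:
            # No germline signal: treat any somatic signal as infinite enrichment.
            fold = float("inf") if soma > 0 else 0.0
        if fold > min_fold and p_value < alpha:
            hits.append(name)
    return hits


# Hypothetical expression values, for illustration only.
example = [
    ("geneA", 120.0, 10.0, 0.001),  # 12-fold, significant: kept
    ("geneB", 40.0, 10.0, 0.001),   # 4-fold: below the cut-off
    ("geneC", 90.0, 10.0, 0.2),     # 9-fold but not significant
    ("geneD", 50.0, 0.0, 0.01),     # soma-specific: kept
]
selected = soma_enriched(example)
```

A strict cut-off of this kind favors specificity: genes expressed in both tissues at comparable levels are excluded even if they matter functionally, which is why the later comparison of shared versus soma-specific genes is informative.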
Moreover, six genes encoding enzymes involved in amino acid biosynthesis were strongly enriched in follicle cells (Fig 2B). By analyzing this gene ontology class, we found that 29 enzymes were expressed in both cell types, but none was germline-specific (Fig 2C). Similarly, among the 60 amino acid transporters encoded by the fly genome, 21 were expressed in both tissues, 12 were specifically expressed in the soma, and none in the germline alone (Fig 2B, 2D). Interrogating the Fly Cell Atlas for the 18 genes involved in amino acid synthesis or transport and enriched in the soma in the TRAP experiments indicated that they were all detected in a higher proportion of “ovariole somatic cells” than “female germ cells” [35]. This suggests that the differences observed in the translatomes of soma and germline are already present at the level of their transcriptomes. These data indicated that among the genes strongly enriched in follicle cells there is a strong bias for genes involved in amino acid synthesis or import. Given the importance of these metabolites for growth, they are good candidates to be transferred through gap junctions.

Fig 2. TRAP analysis identifies soma-specific expression of amino acid metabolism and transport genes. (A) Representative images of follicles expressing RPL10a-GFP in the soma (driven by tj:Gal4) or the germline (driven by nos:Gal4VP16). After polysome immunoprecipitation (IP), mRNAs were sequenced. (B) The expression of 811 genes was significantly enriched in somatic cells compared with germline cells (fold-change > 5, p-value < 0.05). After filtering for genes involved in small-molecule metabolism or transport, genes related to amino acid biology were identified. (C) Distribution (soma and/or germline) of genes encoding proteins implicated in amino acid biosynthesis or transport according to their expression profile.
No gene of these classes was enriched exclusively in the germline. The raw data underlying this figure can be found in S1 Table. https://doi.org/10.1371/journal.pbio.3003045.g002

A putative amino acid transporter is required in follicle cells for germline growth

We performed a reverse genetic screen by inducing the silencing (RNAi) in follicle cells of the genes involved in amino acid synthesis or transport identified by TRAP, and then looking for ovary growth defects. For each amino acid, many redundancies may occur between anabolic pathways or transporters, which may preclude the observation of clear phenotypes. Moreover, although all the available RNAi lines for these genes were tested, we could not exclude that some were not efficient enough. However, the knockdown of one of the 18 tested genes, CG43693, reproducibly induced an ovary growth defect in independent RNAi lines (Fig 3A–3C). We named this gene cochonnet (coch) for reasons explained below. The ovary growth defect observed upon coch knockdown is associated with a change in follicle stage distribution, with an accumulation of the younger stages (1–9) and a depletion of the more mature ones (10–14) (Fig 3D). Although uncharacterized in Drosophila, Coch unambiguously belongs to the well-characterized SLC36 family of amino acid transporters, and is especially close to mammalian SLC36A1 and SLC36A4, which are involved in the transport of non-polar amino acids such as proline, alanine, or tryptophan [36–38].

Fig 3. Import of some amino acids in follicle cells promotes germline growth. (A, B) Representative images of ovaries in (A) control and (B) after CG43693/coch knockdown by RNAi in somatic cells. (C) Ovary size (surface) quantification in control and after coch somatic knockdown using two RNAi lines (ordinary one-way ANOVA + Dunnett’s multiple comparisons test). (D) Stage distribution per ovariole in control or coch knockdown conditions.
We regrouped follicle stages in four categories: early: 1−6, intermediate: 7−9, late: 10−12, mature: 13−14 (n = 43 for controls and n = 95 for coch RNAi, two-way ANOVA and Šídák’s multiple comparisons test). (E, F) In situ hybridization analysis of coch expression in (E) a whole ovariole and in (F) stage 4 and 7 follicles. (G) Expression of endogenous Coch protein tagged with GFP from germarium to stage 8. (H) Higher magnification of follicle cells showing Coch-GFP localization in the cell cortex. (I) coch mutant clones marked by the absence of RFP (shown in I'). When all somatic cells of a follicle are mutated, follicle growth is affected (arrow), as seen when compared with the younger follicle (arrowhead). (J) Quantification of the ratio between the volume of a follicle with a growth defect and fully covered by mutant cells and the n − 1 follicle in the same ovariole (n = 10 control and 6 coch mutant follicles, Mann–Whitney test). (K, L) Quantification of (K) cell surface and (L) EdU-positive cells in small coch mutant clones compared with the surrounding wild-type cells (n = 12 control and 12 mutant clones or groups of wild-type cells, Welch’s t-test). For all graphs, data are the mean ± SD. **p < 0.01, ***p < 0.001. Scale bars: 500 μm in A, B, 50 μm in D, E, F, H and 10 μm in G. The raw data underlying this figure can be found in S1 Data. https://doi.org/10.1371/journal.pbio.3003045.g003

RNA-FISH indicated that coch was specifically expressed in follicle cells, as no expression was detected in germline cells, and confirmed its effective knockdown by RNAi in the somatic lineage (Figs 3E, 3F and S3A). Moreover, we detected coch mRNA in all follicle cells and at all stages of oogenesis, and its expression progressively increased with the stage (Fig 3E). This mRNA was enriched at the apical side of the follicle cell, fitting with a previous observation that many mRNAs, including this one, are localized at this domain in follicle cells [39].
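The four stage categories defined in the Fig 3 legend (early: 1 to 6, intermediate: 7 to 9, late: 10 to 12, mature: 13 to 14) amount to a simple binning of follicle stages. The sketch below shows that binning for one ovariole; the function names and the example stage list are hypothetical, and the statistical comparison between genotypes is not reproduced here.

```python
# Stage boundaries taken from the Fig 3 legend (inclusive ranges).
STAGE_CATEGORIES = (
    ("early", 1, 6),
    ("intermediate", 7, 9),
    ("late", 10, 12),
    ("mature", 13, 14),
)


def stage_category(stage):
    """Map a follicle stage (1 to 14) to its category name."""
    for name, first, last in STAGE_CATEGORIES:
        if first <= stage <= last:
            return name
    raise ValueError(f"stage {stage} is outside the 1 to 14 range")


def stage_distribution(stages):
    """Count the follicles of one ovariole in each category."""
    counts = {name: 0 for name, _, _ in STAGE_CATEGORIES}
    for stage in stages:
        counts[stage_category(stage)] += 1
    return counts


# Hypothetical ovariole containing six follicles.
distribution = stage_distribution([1, 3, 7, 9, 10, 14])
```

Comparing such per-ovariole counts between control and knockdown conditions is what reveals a shift toward younger stages.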
A role in importing amino acids from the hemolymph would require a cell membrane localization, whereas some amino acid transporters are specific to intracellular vesicular compartments. Therefore, to determine Coch protein localization, we generated a GFP knockin with the Minos-mediated integration cassette (MiMIC) system to insert an EGFP-encoding exon in-frame [40]. This insertion was in the non-conserved N-terminal part common to five of the seven described coch isoforms. Importantly, it tags isoforms RA and RB, which were expressed in follicle cells according to our TRAP mRNA sequence data. The coch-GFP line was homozygous viable and fertile without visible phenotype. Moreover, ovary size was normal when coch-GFP was in trans with a deficiency covering the gene, indicating that the insertion did not affect protein function (S3B Fig). We observed a strong GFP signal that tended to increase throughout oogenesis in follicle cells, as previously observed for the mRNA, but no signal in the germline (Fig 3G). The GFP decorates the whole cortex of follicle cells, though it appears more enriched at the lateral and apical domains (Fig 3H). Of note, septate junctions are not mature from stages 1 to 8, and the epithelium is therefore not impermeable, suggesting that solute import via Coch can occur from any domain [41]. Moreover, Coch-GFP subcellular localization was not affected in cells mutant for Inx2, though its expression appeared slightly lower in such a context (S3D Fig). Thus, both Coch tissue distribution, specific to somatic cells and present at all stages, and subcellular localization, at the cell cortex, were in agreement with the hypothesis that it is implicated in the import of some amino acids to promote follicle growth. We then tested whether Coch was required for follicle growth. The Minos element insertion (CG43693MI101960) used for the GFP knockin initially contains an exon with STOP codons in different frames.
As it is located in the protein N-terminus, it should induce a null mutation of the affected isoforms, which include the ones expressed in follicle cells. This allele was sublethal when homozygous or in trans with a deficiency covering the gene. In both cases, the ovaries of these flies were strongly atrophied, fitting with a role for this gene in follicle growth, though we could not detect a difference in the proportion of EdU-positive cysts (Figs S3B, 3C). Moreover, replacing the STOP codon-containing exon with the GFP exon suppressed the ovary growth defect. We generated mutant clones in the follicular epithelium. Analysis of small clones that did not cover the whole epithelium did not show any difference in cell size and proliferation, suggesting that follicle cell growth was not affected (Fig 3K, 3L). Conversely, when the whole epithelium of a follicle was mutated, such follicles were smaller than (or of the same size as) the younger ones (Fig 3I, 3J). As this growth defect led to tiny and round follicles, we called this gene cochonnet (coch) after the nickname of the small wooden ball used with larger metal boules for pétanque, a traditional game in the South of France. Importantly, these experiments indicated that coch expression in somatic follicle cells was required for germline growth.

Genetic evidence of a metabolic transfer between soma and germline cells

Our results support a model in which some amino acids imported in follicle cells diffuse through gap junctions to sustain germline growth. In this case, germline expression of coch should rescue genetic conditions in which it is absent in somatic follicle cells. Importantly, it is known that the follicular epithelium is permeable at least until stage 8, and thus that metabolites from the hemolymph can directly reach the oocyte surface [41]. To test this hypothesis, we combined the UAS/Gal4 system with the QUAS/QF system [42].
We generated a MatTub:QF driver and a QUASp promoter, inspired by the UASp promoter, for proper expression in the germline [43]. A QUASp:GFP transgene driven by MatTub:QF was expressed once the germline cyst is formed in the germarium and in all the subsequent stages, starting slightly earlier than what has been described for MatTub:Gal4VP16, indicating that both the driver and the QUASp promoter were functional (S4A Fig). Expression of the QUASp:coch transgene in the germline with the same driver (MatTub>coch) had a slight negative effect on ovary size (Fig 4A–4C, 4F). However, when it was combined with somatic knockdown of coch (tj>cochRNAi), it rescued the effect of the latter on ovary size (Fig 4A, 4D–4F). Importantly, since the MatTub:QF driver is expressed only from late stages of the germarium, the observed rescue cannot be due to an effect during gonad development. However, such a rescue is not quantitatively observed in the stage distribution, suggesting that the latter is less sensitive than ovary size for probing growth modification (S4B Fig). We also quantified egg laying in these different genetic backgrounds. This experiment confirmed both the slight deleterious effect of germline coch expression and its ability to rescue coch knockdown in the follicle cells (Fig 4G). Altogether, these genetic data demonstrated that the requirement of coch expression in follicular cells for germline growth can be partially compensated by its germline expression, suggesting a direct metabolic exchange between these cell types through the gap junctions. To further explore this idea, we attempted similar experiments combining the ectopic expression of coch in the germline with the Inx2 knockdown. Of note, in the conditions used for these experiments (18 °C then 48 h at 30 °C), due to the temporal control of Inx2 RNAi expression, we did not observe the significant negative effect of ectopic germline coch expression that we had observed before (Fig 4F, 4H).
However, it tends to rescue the ovary size of Inx2 knockdown (Fig 4H). Interestingly, this result fits with a model in which the solutes imported via Coch in the somatic cells reach the germline through Inx2-dependent gap junctions. Download: PPT PowerPoint slide PNG larger image TIFF original image Fig 4. Ectopic coch expression in the germline compensates its absence in the soma or the one of gap junction. (A) Scheme representing coch expression in green in the different genotypes used on this figure. (B–E) Representative images of a full ovary from (B) a control female, (C) a MatTub>coch female, (D) a tj>coch RNAi female, a MatTub>coch female and (E) a tj>coch RNAi, MatTub>coch female. (F) Quantification of ovary size (surface) in the indicated genotypes (n = 14, 10, 19, and 27 ovaries, Ordinary one-way ANOVA + Tukey’s multiple comparisons test). (G) Quantification of egg laying per female and per hour, from four independent experiments with 10 females, for the indicated genotypes. (H) Quantification of ovary size (surface) in the indicated genotypes. As we pointed out, a slight variability between two independent experiments, but still with the same tendencies, data were analyzed using two-way ANOVA plus Šídák’s multiple comparisons test (n = 30, 26, 22, 30 ovaries). For all graphs, data are the mean ± SD., *p = 0.0378, **p < 0.01, ***p < 0.001, ****p < 0.0001. Scale bars: 500 μm. The raw data underlying this figure can be found in S1 Data. https://doi.org/10.1371/journal.pbio.3003045.g004 Blocking gap junctions or amino acid import induces processing bodies Previous studies indicated that somatic growth impairment affects germ cell development [18,22–24]. Moreover, the mRNA binding protein Me31B, which is typically concentrated where the anterior–posterior axis determinants localize in the oocyte, becomes enriched in large cytoplasmic structures in the germline when somatic growth is impaired [16,44]. 
These condensates are reminiscent of P-bodies and stress granules observed upon various stresses, including amino acid deprivation [45]. In follicles, their formation can be induced by reducing protein availability in fly food and are more easily seen at stage 9 in Me31B-GFP-expressing follicles (Fig 5A, 5B) [44]. Therefore, we set up a semi-automated method for their quantification by measuring the fluorescence fraction found in condensates in stage 9 nurse cells (Fig 5C). P-body formation is usually induced by inhibitory phosphorylation of the eukaryotic translation initiation factor 2 subunit alpha (eIF2α) on serine 51 (S51), an event also known to cause a translation arrest [45]. Overexpression in the germline of a transgene mimicking this phosphorylated form (eIF2α-S51D), but not wild-type eIF2α, strongly induced P-body formation, although we did not quantify them because these follicles never reached stage 9 (Fig 5D, 5E). Importantly, eIF2α-S51D overexpression was sufficient to completely block germline growth, suggesting a link between P-body formation and growth control (Fig 5E). As P-bodies potentially repress germline growth, we tested whether the absence of gap junctions between germ cells and follicle cells or defective amino acid import in the follicle cells could induce their formation. We knocked down Inx2 and coch in follicle cells using tj:Gal4 in the presence of Me31B-GFP. To be able to recover and observe stages 8–9, Inx2 RNAi was induced only during a short period of time (14 h, at 30 °C). We also attempted Inx4 knockdown by inducing RNAi specifically in the germline. Coch, Inx2 and Inx4 knockdowns strongly induced the formation of P-bodies when compared to their respective control, suggesting that the observed growth limitation was linked to their formation and a potential arrest of translation (Fig 5F–5L). 
We also noticed that P-body aspect may differ from a follicle to another, being more dotty or more flaky, though we could not find an explanation for this difference as it appeared not linked to the genotype or the amount of P-bodies. Finally, since ectopic coch expression in the germline tended to rescue ovary size in somatic Inx2 knockdown, we asked whether a similar effect could be observed on P-bodies. Interestingly, we found a significant reduction in P-bodies compared to Inx2 knockdown (Fig 5J). These results provide another indirect argument in favor of a model based on the exchange of solutes across gap junction that are initially imported by Coch in the somatic cells. Download: PPT PowerPoint slide PNG larger image TIFF original image Fig 5. Defective gap junctions or amino acid import induces P-body formation. (A, B) Endogenous GFP-tagged Me31B protein expression in stage 9 follicles in (A) well-fed conditions or (B) after 15 h of protein starvation. (C) P-body quantification in normal and starved conditions (n = 16, 21). (D, E) Me31B-GFP expression in ovarioles germline that overexpresses (D) wild-type eIF2α or (E) or eIF2α-S51D (inhibitory phosphorylation). (F–I) Representative images of Me31B-GFP expression in stage 9 follicles in control (F) or after RNAi-based knockdown in somatic cells of (G) Inx2, (H) coch and (I) in the germ cells of Inx4. (J) P-body quantification in follicles from control females, MatTub>coch females, tj>Inx2 RNAi females, and tj>Inx2 RNAi, MatTub>coch females (n = 10, 10, 9, 10 follicles and one-way ANOVA plus Tukey’s multiple comparisons test). (K, L) P-body quantification in follicles after RNAi-based knockdown in somatic cells of coch (K) and in germline cells of Inx4 (L) (K, control n = 24, coch RNAi n = 24, L, control n = 27, Inx4 RNAi n = 26 and Mann–Whitney test). For all graphs, data are the mean ± SD, ***p < 0.001, ****p < 0.0001, Unpaired t-test. Scale bars: 50 μm. 
The raw data underlying this figure can be found in S1 Data. https://doi.org/10.1371/journal.pbio.3003045.g005 We hypothesized that the non-cell autonomous effect of coch and Inx2 RNAi on the formation of P-bodies could a be consequence of the deprivation for some amino acids. P-bodies or stress granules formation in response to a lack of some amino acids can occur independently or dependently of eIF2α S51 phosphorylation. In the first case, which is not the most frequently described, it is associated with the repression of the TOR pathway [46]. However, induction of Tor RNAi in the fly germline does not induce P-body formation, excluding this possibility [16]. In the second case, eIF2α is phosphorylated by GCN2 that acts as an indirect sensor of amino acid availability. We generated null mutant alleles for gcn2 that were homozygous viable, as similar alleles described during the course of this project [47–49]. Protein deprivation in gcn2 transheterozygous females still led to P-body formation in the germ cells (S5A–S5E Fig). Moreover, an indirect reporter of GCN2 activity and eIF2α−S51 phosphorylation (ATF4-GFP) showed no signal in the germline of wild-type flies, even after protein deprivation (S5F, S5G Fig). Lastly, we tested more directly gcn2 role in the germline when amino acid import in follicle cells is impaired (tj>cochRNAi). However, gcn2 mutant germline clones did not increase the growth rate of wild-type and coch RNAi follicles (S5H, S5I Fig). These results strongly argued against the implication of GCN2 as a germ cell growth repressor when the availability of the amino acids provided by Coch is reduced. Thus, being independent of TOR and GCN2, the mechanism leading to P-bodies formation in the female germline seems unusual and will require further investigation. 
Altogether, these data indicate that although the two well-established pathways involved in amino acid sensing do not seem to be implicated, P-body formation acts as a metabolic stress sensor linked to the control of germ cell growth by gap junctions and the import of some amino acids in follicle cells.

Gap junction assembly links intrinsic growth control and systemic control

Published data indicated that InR/PI3K or Tor pathway inhibition in follicle cells induces the formation of P-bodies in the germline [16]. This result was reproduced by RNAi silencing of akt, an essential component of this pathway, in follicle cells (Fig 6A–6C). Since somatic activity of the InR/PI3K pathway also strongly influences germline growth, we asked whether there was a link between InR/PI3K pathway activity in follicle cells and their ability to transfer metabolites to the germline. Coch-GFP expression level and localization were similar in wild-type follicle cells and in follicle cells with a PI3K gain of function or mutated for akt, suggesting no impact on this specific actor of amino acid import in follicle cells (S6A, S6B Fig). Conversely, in akt mutant cells, Inx2 was almost undetectable (Fig 6D), whereas in Pten mutant cells Inx2 expression was strongly increased and plaque size was increased (Fig 6E). These data indicated that Inx2 expression is sensitive to both gain and loss of function of the InR/PI3K pathway. Moreover, we observed that Inx2 protein level was also sensitive to the loss but not the gain of function of the Tor pathway, suggesting that it is not equally controlled by all the pathways modulating cell growth (S6C, S6D Fig). We also observed that Inx2 mRNA level was strongly increased in Pten mutant follicle cells, suggesting that the observed effects on protein level and plaque assembly were due to upregulation of Inx2 gene expression (Fig 6F).
These observations supported a model in which the non-cell autonomous effect of the InR/PI3K pathway from somatic cells to germ cells is mediated by gap junctions. To test this hypothesis, we performed an epistasis experiment. As previously described [18], large Pten mutant clones in the follicular epithelium accelerated germline growth in a non-cell autonomous manner (Fig 6G). This effect was abrogated upon Inx2 RNAi induction in Pten mutant cells (Fig 6H). Thus, gap junctions are required for germline growth control via InR/PI3K activity in the follicular epithelium. We also tested whether Inx2 overexpression could be sufficient to rescue the knockdown of the InR/PI3K pathway in the somatic cells. In this genetic combination, we observed neither an increase in ovary size nor a decrease in P-bodies compared to the akt knockdown (S6E, S6F Fig). Thus, the control of Inx2 expression is not sufficient to explain the non-cell-autonomous effect on germline growth of the InR/PI3K pathway in the follicle cells. Nonetheless, altogether, our data link the systemic control of somatic cells to the growth coordination between somatic and germline cells via the modulation of gap junction assembly.

Fig 6. Gap junctions are controlled by the InR/PI3K pathway. (A, B) Representative images of Me31B-BFP expression in stage 9 follicles from (A) a control female and (B) a tj:Gal4, Tub:Gal80ts>akt RNAi female. (C) P-body quantification (fluorescence intensity) in the indicated genotypes (n = 8 and 9 follicles, unpaired t-test). Data are the mean ± SD. ****p < 0.0001. (D) Inx2 expression in an aktq mutant clone (marked by the absence of GFP expression). (E, F) Inx2 protein (E) and Inx2 mRNA expression by FISH (F) in a Ptendj189 mutant clone marked by the absence of RFP expression. (G) Large MARCM Ptendj189 mutant clone marked by the presence of GFP expression showing faster follicle growth.
(H) Large MARCM Ptendj189 mutant clone in which faster follicle growth was abolished after expression of an RNAi against Inx2. (I) Quantification of the volume ratio between the older fully wild-type follicle and follicles containing a majority of Pten mutant cells, expressing or not an RNAi against Inx2 (n = 10 and 9, Mann–Whitney test). Scale bars: 50 μm in A, B, G, H and 20 μm in D, E, F. Data are the mean ± SD. ****p < 0.0001. The raw data underlying this figure can be found in S1 Data. https://doi.org/10.1371/journal.pbio.3003045.g006

Discussion

In this article, we show that gap junctions participate in the control of cell growth, leading to the intuitive proposal that they allow a metabolic flow between cells. In accordance with this hypothesis, we identified a putative amino acid transporter, coch, specifically expressed in follicle cells and required for germline cell growth. Though a flux of amino acids from somatic cells to germ cells cannot be directly visualized, coch expression in the germline can partially bypass its own silencing, or that of Inx2, in the soma, suggesting a functional metabolic exchange between these cell types through gap junctions. Moreover, direct genetic induction of P-bodies in the germline is sufficient to block germline growth, and both defective gap junctions and defective amino acid import in somatic cells induce P-bodies in the germ cells, a feature usually associated with an arrest of translation. However, this effect is not mediated by GCN2, and it is therefore unclear by which mechanism P-bodies are induced in such a situation. Thus, the most plausible explanation for all our data is a model in which gap junctions allow the transfer from somatic cells to germ cells of metabolites, such as amino acids, that germ cells cannot directly produce or import, thereby promoting germ cell growth through translation control. Inx2 and Inx4 have various functions during Drosophila oogenesis [6,27,29,31].
However, our data showing their involvement in germ cell growth can be more easily compared with the role of gap junctions in mammalian follicles, where Cx37 is expressed in the oocyte and Cx43 in follicle cells [7,8]. This indicates that the requirement of gap junctions for oocyte growth is a conserved feature throughout evolution. Gap junctions are present in most animal tissues and can directly connect different cell types, for instance, neurons and glial cells. Therefore, their involvement in growth control might be more general. We found that Inx2 expression and the subsequent formation of plaques were strongly regulated by the InR/PI3K pathway in a cell autonomous manner, providing an effective mechanism to link metabolic flow with systemic growth signals. Since InR/PI3K pathway activity in the follicle cells controls both somatic growth and germline growth, ensuring their coordination, its impact on gap junctions likely participates in this coordination. Our results also established that Coch, a putative amino acid transporter of the SLC36A family, must be expressed in somatic cells for germline growth. Amino acid availability and cell growth control are usually linked by the TOR pathway [50]. However, this pathway is unlikely to be implicated in the mechanism described here. Indeed, first, when germline growth is blocked due to alteration of somatic cell growth upon loss of akt, which also regulates gap junction assembly, the TOR pathway is still active in the germline [18]. Moreover, overactivation of the TOR pathway in the germline in such conditions does not suppress growth inhibition. Finally, TOR inhibition in the Drosophila female germline does not induce P-body formation [16]. Therefore, we tested the involvement of the other well-described amino acid sensing pathway, which relies on GCN2. Our results also excluded this mechanism as an explanation for P-body formation and growth inhibition in the absence of proline import in somatic cells or of gap junctions.
Therefore, more studies are needed to determine the precise mechanism underlying germline growth control. Several arguments suggest that amino acids imported by Coch might just be the tip of the iceberg of the metabolic cooperativity between follicle cells and germ cells. Inx2 mutant clones have a stronger impact on germline growth than coch mutant clones. Probably as a consequence of this difference in requirement, temporally controlled Inx2 knockdown induces an acute and strong arrest of germline growth, associated with a cell cycle arrest, while coch mutation has a milder effect that may delay each cell-cycle phase without significantly affecting their respective proportions. In agreement with this idea, when somatic clones for coch induce smaller follicles, this is also associated with smaller nurse cell nuclei, which strongly suggests a delay in progression through the rounds of endoreplication. This suggests that other metabolites from follicle cells may also promote germline growth. Accordingly, our translatome analysis showed that different genes involved in amino acid synthesis or import were specifically expressed in follicle cells, but none in germ cells. The absence of phenotype when these genes, except coch, were knocked down could be explained by multiple potential levels of redundancy: among synthesis pathways, among transporters, and between synthesis and import. Nonetheless, detailed analysis of the genes overexpressed in follicle cells compared with germ cells suggests that several amino acids (e.g., proline, tryptophan, valine, tyrosine) could flow from the soma to the germline. Moreover, in the mouse, the alanine transporter SLC38A3 is strongly enriched in granulosa cells, which are required for efficient alanine import in the oocyte [11]. Thus, altogether these data suggest that gap junction-dependent amino acid flow is not restricted to the amino acids imported via Coch.
Besides, glucose intake and pyruvate production are more efficient in granulosa cells, suggesting that follicle cells could provide energetic molecules to the oocyte [12]. In line with this observation in mammals, TRAP data indicated that the transporter for trehalose (tret1-1), the circulating sugar in insects, was expressed in somatic cells and not in the germline, whereas germline development is highly dependent on sugar and the pentose phosphate pathway [51]. Thus, a contribution of energetic metabolism to the soma–germline cooperativity via gap junctions will be an interesting avenue for future investigations. Moreover, it has already been shown that calcium can be exchanged between germ cells and follicle cells by Inx2/Inx4 gap junctions, and a potential role in the control of germ cell growth for this ion, which can act as a second messenger, cannot currently be excluded [29]. Finally, the fact that Inx2 expression in follicle cells is not sufficient to explain the non-cell-autonomous effect on germline growth of the InR/PI3K pathway in the follicle cells strongly suggests that this pathway controls other genes involved in the exchanges between the two tissues. One might ask what the evolutionary advantage of this metabolic cooperativity is compared with direct import or synthesis in the germline. Three main hypotheses could be proposed. First, metabolites might not be able to directly reach the oocyte membrane due to the presence of the follicular epithelium. However, in follicles from stages 1–8, septate junctions are immature, and the epithelium is not impermeable [41]. Accordingly, our results showing a partial rescue of coch absence in the soma by its ectopic expression in the germline imply that its solutes can reach the oocyte. Second, female germ cells undergo massive growth and there is a non-linear relation between volume and cell surface increases.
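The non-linear relation between surface and volume can be made explicit with a short worked scaling. This is only an illustration under a simplifying assumption that is ours and not the authors': the growing cell is treated as a sphere of radius r (the symbols r, S, V and k are introduced here for the example).

```latex
% Idealised spherical cell of radius r (illustrative assumption)
\begin{align}
  S &= 4\pi r^{2}, \qquad V = \tfrac{4}{3}\pi r^{3}, \\
  \frac{S}{V} &= \frac{3}{r}, \qquad S = (36\pi)^{1/3}\, V^{2/3}.
\end{align}
% Scaling the radius by a factor k multiplies V by k^3 but S only by k^2,
% so the membrane area available per unit volume falls by a factor k.
```

For instance, a tenfold increase in radius multiplies the volume by 1,000 but the surface by only 100, so each unit of cytoplasm is served by ten times less membrane.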
Consequently, surface exchange with the extracellular medium might not allow sufficient metabolic import and may require the support of follicle cells. This hypothesis was proposed to explain the gap junction requirement for mammalian oocyte growth [52]. It could also explain why germline growth, but not somatic growth, is sensitive to the loss of coch, as the follicular epithelium has a relatively constant height and thus increases much less in volume than the germline. However, in this case, both tissues should be able to import the required metabolites, and our data do not support such a model because follicle cells expressed a whole set of anabolic enzymes and transporters that were not expressed in the germline. Alternatively, such a mechanism may provide a protective effect for the germline. It is well established that cell metabolic activity can induce stress, for instance through the production of reactive oxygen species (ROS). However, the germline must be protected, especially its DNA content, which is transmitted to the embryo. A recent study demonstrated that mammalian oocytes block ROS production by suppressing mitochondrial complex I [53]. Thus, a gap junction-mediated metabolic exchange might allow externalizing the stress-generating metabolic activity to follicle cells, which will die a few days later anyway. Notably, coch ectopic expression in germ cells had a slight but significant negative impact on ovary size and egg laying, suggesting that its presence in germ cells is detrimental to follicle development. Although the exact reason for this deleterious effect is unknown, this observation fits with a model in which externalization of metabolic import and activity could facilitate proper oocyte development. However, Coch is a transporter, and not an enzyme, and thus the protective effect might not be directly due to amino acid exclusion, but to one of the many possible downstream metabolic activities.
Altogether, our data indicate that gap junctions and a metabolic flow are essential for cell growth and that gap junction assembly can be modulated to adjust cell growth rate. Moreover, they suggest that amino acids, and potentially other metabolites, are important actors in this mechanism ending with the formation of P-bodies, opening a large field for investigation to obtain a comprehensive view of this metabolic cooperativity.

Materials and methods

Fly genetics and handling

All fly stocks used are detailed in S2 Table. The final genotypes, temperature and heat-shock conditions are in S3 Table. gcn2 null alleles were generated by inducing indel mutations using an available gRNA line. Alleles are described in S2 Table. Unless specified, flies were kept on a corn meal-based medium with 80 g/L fresh yeast. Protein starvation was performed on grape juice agar plates. Coch-EGFP was obtained by inserting the EGFP-FlAsH-StrepII-TEV-3xFlag cassette in the Minos element insertion MI01960, as previously described [40]. For egg laying quantification, 10 females of each genotype were placed on a fruit juice agar plate with liquid yeast for a few hours, then the number of eggs per hour and per female was calculated, and the experiments were repeated four times.

TRAP experiments

Ovaries of 100 females of each genotype (tj>RPL10-GFP or Nos>RPL10-GFP) were dissected on ice. Then, TRAP was performed as described in [54]. Briefly, after homogenization, ovary extracts were preabsorbed with magnetic beads. Immunoprecipitation was performed with anti-GFP antibodies already coupled with magnetic beads before mRNA extraction. Tissue specificity and enrichment of the extracted mRNAs were checked by Reverse Transcription-qPCR with GFP (enrichment), traffic-jam (soma specific), Ago3 and Aubergine (germline specific) primers. mRNA libraries were made with the Nugen Ovation 1–16 droso Universal RNA-seq kit according to the manufacturer’s instructions. Sequencing was performed by Fasteris.
Data were deposited on GEO (GSE230452) and described in S1 Table.

Molecular cloning and transgenesis

The QF sequence was amplified from the pAttB-QF-sv40 vector and cloned in the vector that contains the alpha4-tubulin promoter (MatTub). The pQUASp vector was constructed from pUAST in which the promoter was replaced by QUAS sites and the minimal P-element promoter was amplified from the pUASp vector. Then, this vector was used to clone EGFP or coch coding sequence and P-element insertions were generated. eIF2α and eIF2α-S51D coding sequences were cloned in the pUASz vector, and transgenes were inserted at the AttP40 landing site. All vectors and new Drosophila lines can be provided upon request.

Immunostaining, FISH, EDU incorporation, imaging and quantitative analyses

Resources and reagents are listed in S2 Table. Immunostaining was performed as described in Vachias and colleagues (2014). Stellaris SM-FISH oligo-probes against Inx2 and CG43693/coch mRNAs were produced by Biosearch Technologies. FISH was performed according to the manufacturer’s instructions. Images were acquired on a Zeiss LSM800 confocal microscope or a Zeiss Cell observer spinning disc microscope. P-bodies were quantified using a homemade macro initially designed to quantify basement membrane fibrils [55]. Quantifications were done on 5 μm z-stack projections of spinning-disc images acquired with a ×20 lens. After manual selection of a ROI corresponding to nurse cells, P-bodies were detected by keeping objects with a minimal size of 5 adjacent pixels and a minimal fluorescence intensity of ×1.75 of the mean intensity. Then, the total fluorescence contained in these P-bodies was quantified and reported as a fraction of the total fluorescence of the ROI. For EDU incorporation, ovaries were incubated with 10 μM EDU in complemented Schneider medium for 15 min. After fixation, staining was performed according to the manufacturer’s instructions (Kit EDU C10638, Thermo Fisher).
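The P-body quantification described above (size filter of 5 adjacent pixels, intensity threshold of 1.75 times the mean, signal reported as a fraction of the ROI total) can be sketched in a few lines of Python. This is a minimal illustration and not the authors' macro: the function name `pbody_fraction` is hypothetical, the mean is assumed to be the mean intensity of the ROI, and "adjacent" is assumed to mean 4-connected neighbours.

```python
from collections import deque


def pbody_fraction(image, roi, min_pixels=5, intensity_factor=1.75):
    """Fraction of ROI fluorescence contained in detected P-bodies.

    image: 2D list of non-negative intensities (a z-stack projection).
    roi:   2D list of booleans, same shape (True = nurse-cell region).
    Assumptions (not from the source): threshold is relative to the ROI
    mean, and objects are 4-connected groups of bright pixels.
    """
    rows, cols = len(image), len(image[0])
    roi_pixels = [(r, c) for r in range(rows) for c in range(cols) if roi[r][c]]
    total = sum(image[r][c] for r, c in roi_pixels)
    if not roi_pixels or total == 0:
        return 0.0
    threshold = intensity_factor * total / len(roi_pixels)

    # Candidate pixels: inside the ROI and at or above the threshold.
    bright = {(r, c) for r, c in roi_pixels if image[r][c] >= threshold}

    seen = set()
    pbody_signal = 0.0
    for start in bright:
        if start in seen:
            continue
        # Breadth-first flood fill collects one connected object.
        component = []
        queue = deque([start])
        seen.add(start)
        while queue:
            r, c = queue.popleft()
            component.append((r, c))
            for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if nb in bright and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        # Keep only objects that reach the minimal size.
        if len(component) >= min_pixels:
            pbody_signal += sum(image[r][c] for r, c in component)
    return pbody_signal / total
```

With a real image, the ROI mask would come from the manual selection of the nurse-cell region; choosing 8-connectivity instead of 4-connectivity would merge diagonally touching pixels into one object and could change which objects pass the size filter.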
Mutant-control follicle volume ratios were calculated after estimating the follicle volume as a spheroid ( = 4/3π(length × width²)). Mutant and control follicles in position 2–4 of the ovariole starting from the anterior were analyzed. Cell size was automatically determined after cell segmentation using Tissue Analyzer [56]. Statistical analyses were performed with Prism. For all experiments, the minimum sample size is indicated in the figure legends. For each experiment, multiple females were dissected. Randomization or blinding was not performed. The sample normality was calculated using the D’Agostino and Pearson normality test. Statistical tests and size samples are indicated in figure legends. Figures were prepared using ScientiFig [57].

Supporting information

S1 Fig. (A) Maximum intensity projection images of an ovariole after immunostaining for Inx2 (green in A, white in A′) and Inx4 (magenta in A, white in A″). Note the progressive increase of Inx2 expression. (B) Control stage 9 follicle (no clone) showing that gap junction plaques are absent at oocyte-follicle cells interface. (C) Follicle containing a germline mutant clone for Inx2 (absence of GFP white in C and C′) and showing no growth defect and no impact on gap junction plaques while Inx2 staining (green in C, white in C″) is lost in the somatic clone observed on the same follicle. (D) Somatic Inx4 RNAi clone showing no impact on plaque formation (Inx4 staining green in C, white in C″). (E) Control FRT19A RFP minus clone covering the whole epithelium of a follicle and showing no growth defect when compared to surrounding follicles.
Scale bars 50 μm in A, 10 μm in all the other panels. https://doi.org/10.1371/journal.pbio.3003045.s001 (PDF)

S2 Fig. (A) Color-coded graph representing Spearman correlation coefficients between the different TRAP samples. (B) Z-score hierarchical clustering heat map visualization. Only significantly differentially expressed genes are reported. High expression correlation can be noticed across condition replicates. The raw data underlying this figure can be found in S1 Table. https://doi.org/10.1371/journal.pbio.3003045.s002 (PDF)

S3 Fig. (A) RNAi clone against CG43693/coch stained by FISH against CG43693/coch showing its disappearance from these cells, confirming probe specificity and RNAi efficiency. (B) Quantification of ovary size (mm2) for the indicated genotypes (n = 13, 14, 12, 9, one-way ANOVA plus Dunnett’s multiple comparisons test). Data are the mean ± SD. ****p < 0.0001. (C) Quantification of stage 1–8 follicles positive or negative for EdU in the germ cells (GL) (n = 85 for control and n = 143 for coch mutant flies, Fisher’s exact test). (D) Coch-GFP in an Inx2 mutant clone (RFP negative cells). Scale bars 10 μm. The raw data underlying this figure can be found in S1 Data. https://doi.org/10.1371/journal.pbio.3003045.s003 (PDF)

S4 Fig. (A) Ovariole expressing a QUASP:GFP transgene under the control of the MatTub:QF driver. (B) Stage distribution per ovariole in indicated genotypes. We grouped follicle stages in four categories: early: 1–6, intermediate: 7–9, late: 10–12, mature: 13–14 (n = 43 for controls and n = 95 for coch RNAi, n = 58 for qUAS:coch and n = 58 for cochRNAi, qUAS:coch, two-way ANOVA and Šídák’s multiple comparisons test). The raw data underlying this figure can be found in S1 Data. https://doi.org/10.1371/journal.pbio.3003045.s004 (PDF)

S5 Fig.
(A–D) Representative images of Me31B-BFP expression in stage 9 follicles from a: (A) control female, (B) starved control female, (C) gcn2 transheterozygous mutant female, and (D) protein starved gcn2 transheterozygous mutant female. (E) P-body quantification (fluorescence intensity) in follicles of the indicated genotypes and conditions. Data are the mean ± SD. (F, G) Absence of Atf4-GFP protein expression used as a read-out of GCN2 activity in (F) normal and (G) protein-starved conditions (green in F and G, white in F′ and G′). (H, I) Ovarioles with germline gcn2 mutant clones marked by the absence of RFP expression (magenta) and stained for F-actin (green) in (H) wild-type background and (I) in flies harboring coch RNAi in follicle cells. The raw data underlying this figure can be found in S1 Data. https://doi.org/10.1371/journal.pbio.3003045.s005 (PDF)

S6 Fig. (A) Coch-GFP expressing follicle with akt mutant clones (RFP-negative cells). (B) Coch-GFP expressing follicle with flip-out clones that express a constitutively active form of PI3K (RFP-positive cells). (C, D) Inx2 staining in follicles containing mutant clones for (C) Tor or (D) Tsc1. (E, F) Quantification of (E) ovary size and (F) P-bodies in the indicated genotypes (E: n = 28, 19, 18, 32 and one-way ANOVA plus Tukey’s multiple comparisons test, F: n = 10, 10, 10, 9, Kruskal–Wallis test plus Dunn’s multiple comparisons test). For all graphs, data are the mean ± SD, ***p < 0.001, ****p < 0.0001. The raw data underlying this figure can be found in S1 Data. https://doi.org/10.1371/journal.pbio.3003045.s006 (PDF)

S1 Table. Excel file of soma versus germline TRAP analysis. https://doi.org/10.1371/journal.pbio.3003045.s007 (XLSX)

S2 Table. Reagents and resources. https://doi.org/10.1371/journal.pbio.3003045.s008 (DOCX)

S3 Table. Genotypes and specific conditions. (h: hours; HS: heat-shock, GP: grape juice agar plate). https://doi.org/10.1371/journal.pbio.3003045.s009 (DOCX)

S1 Data.
Numerical data supporting figures (except Figs 2 and S2). https://doi.org/10.1371/journal.pbio.3003045.s010 (XLSX)

Acknowledgments

We are grateful to G. Junion, P. Phelan, G. Tanentzapf, and H.D. Ryoo for sharing antibodies or fly stocks. We also thank the CLIC facility (Clermont Imagerie Confocale).

journal article

Open Access Collection

Detailed characterisation of the trypanosome nuclear pore architecture reveals conserved asymmetrical functional hubs that drive mRNA export

Gabiatti, Bernardo Papini;Krenzer, Johanna;Braune, Silke;Krüger, Timothy;Zoltner, Martin;Kramer, Susanne

2025 PLoS Biology

doi: 10.1371/journal.pbio.3003024 pmid: 39899609

Introduction Nuclear pores penetrate the double-membrane of the nucleus and serve as an essential gateway for the exchange of proteins, RNAs and ribosomes between the nucleoplasm and the cytoplasm. They are among the largest macromolecular complexes in nature with more than 500 copies of approximately 30 different nucleoporins (NUPs) that form 8 identical protomers (spokes) [1–4]. Each spoke is connected to the nuclear envelope (NE) as well as to the neighbouring spokes, resulting in multiple concentric rings: the inner ring (IR) at the centre of the pore is flanked by 2 outer rings (ORs) at the cytoplasmic site (cytoplasmic ring) and nuclear site (nuclear ring). The outer rings are composed of large Y-shaped protein complexes, called the Nup84 complex in yeast and the NUP107 complex in humans [4]. This pore core structure is extended by the nuclear basket at the nuclear site and the cytoplasmic filaments at the cytoplasmic site. NUPs can be divided into 3 classes: (i) structured NUPs that form the scaffold of the pore, with structural features being limited to beta propellers, coiled coil and alpha-helical solenoids; (ii) pore membrane proteins (POMs) that anchor the pore in the nuclear envelope via transmembrane regions; and (iii) non-structured, intrinsically disordered NUPs, that contain FG (phenylalanine and glycine) repeat motifs and provide a diffusion barrier at the central channel of the pore by phase separation [5]. While smaller molecules can pass by diffusion, the transport of larger molecules, such as most RNAs, ribonucleoprotein particles (RNPs), pre-ribosomes and most proteins, requires energy and depends on transporters. Protein transport, as well as the transport of micro-RNAs and tRNAs is mediated by importins and exportins of the karyopherin family [6]. 
These transporters recognise and bind nuclear localisation signals (NLSs) or nuclear export signals (NESs) of their cargo and shuttle it within the phase-separated central channel of the pore by interacting with the FG-repeat NUPs [6]. This transport is energised by the RanGTP/RanGDP gradient across the nuclear envelope maintained by the chromatin-bound guanine nucleotide exchange factor RCC1 and the cytoplasmic-localised proteins Ran-GTPase-activating protein RanGAP and RanBP1 [6]. Importins bind cargo and Ran-GTP mutually exclusively, while exportins bind Ran-GTP and cargo cooperatively, thus allowing selective release of cargo either in the nucleus or cytoplasm, respectively, driven by GTP hydrolysis cycles of Ran [6]. The vast majority of messenger ribonucleoprotein particles (mRNPs) are not exported by karyopherins but use the heterodimeric Mex67/Mtr2 complex (NXF1 or TAP/NXT1 in humans) instead [7–9]. The energy is provided by at least 2 RNA helicases, Sub2 and Dbp5 in yeast and UAP56 and DDX19 in humans, that assemble and disassemble the Mex67/Mtr2/mRNA export complex in the nucleus and in the cytoplasm, respectively, in events known as nuclear and cytoplasmic mRNP remodelling [10]. In opisthokonts, 2 asymmetric pore components, the basket and the cytoplasmic filaments, ensure directionality of RNP transport. The basket of Saccharomyces cerevisiae consists of Nup1, Nup2, Nup60, Mlp1/2, and Pml39 and in metazoans of NUP153 (orthologue to yeast Nup1/60), NUP50 (orthologue to yeast Nup2), TPR (orthologue to yeast Mlp1/2), and ZC3H1 (orthologue to yeast Pml39). Nup60 anchors the yeast nuclear basket to the Y-complex of the nuclear outer ring [11]. Nup1 and NUP153 anchor the TREX-2 (transcription and export complex 2) complex to the pore in yeast [12] and humans [13], respectively. TREX-2 in yeast/human consists of the large scaffolding protein Sac3/GANP bound to Thp1/PCID2, Sem1/DSS1, and Sus1/ENY2, and, in yeast only, to Cdc31.
This complex functions in recruiting the mRNP to the pore by direct interaction of the Sac3 N-terminal region with the mRNA-loaded Mex67 [14]. The cytoplasmic filaments are heterotrimeric complexes of Nup82/NUP88, Nup159/NUP214, and Nsp1/NUP62 (yeast/human); Nup159/NUP214 recruit the mRNA remodelling helicase Dbp5/Ddx19 with its cofactor Gle1 [15,16]. The structure and composition of nuclear pore complexes have been characterised in a range of organisms, including S. cerevisiae [17,18], humans [19], the thermophilic fungus Chaetomium thermophilum [20], and the alga Chlamydomonas reinhardtii [21]. While the general structure of the pore, in particular the inner ring structure, is highly conserved, the more peripheral structures can differ between organisms and even within the same organism [22]. Yeast, for example, has up to 3 pore variants that differ in the number of nuclear outer rings and in the presence or absence of a basket [23–26]. Trypanosomes separated from the main eukaryotic branches very early and their nuclear pore architecture is thus an important stepping-stone towards a better understanding of pore evolution. In particular, structural differences would help to unravel which pore features constitute organism-specific adaptations and which were present in the LECA (last common eukaryotic ancestor) [27,28]. In Trypanosoma brucei, 22 NUPs were initially identified based mostly on predicted structural similarities to human and yeast NUPs, as the sequences are poorly conserved, and pore localisation was confirmed by GFP-tagging [29]. This served as the foundation for a hallmark follow-up study that defined the sub-complexes, quaternary structure, and pore-associated proteins by a large set of immunoprecipitations with multiple baits from cryomilled samples, combined with immunogold electron localisation and in silico prediction tools [30].
As in all other eukaryotes studied so far, the inner ring is mostly conserved [22,30,31], with the one exception of the membrane anchoring mechanism: T. brucei lacks orthologues of all pore membrane proteins (POMs) of opisthokonts. Instead, TbNUP65 has evolved a C-terminal transmembrane helix to connect to the nuclear envelope [30], replacing the amphipathic lipid-packing sensor (ALPS) motif used by its opisthokont orthologues ScNup53/59 and HsNUP35. Two additional POM candidates with transmembrane helices were recently identified within the interactome of lamin-like proteins [32]. The structured outer ring complex (Y-complex) was clearly defined in multiple affinity purifications to consist of TbNUP158, TbSEC13, TbNUP41, TbNUP82, TbNUP89, TbNUP132, TbNUP152, and, likely, TbNUP109 [30]. This complex, named the NUP89 complex, is the equivalent of the yeast outer ring complex Nup84 (NUP107 in humans) and is mostly conserved, with some lineage-specific variations in the β-propeller proteins [30]. Three FG-NUPs, NUP64, NUP75, and NUP98, are unique to trypanosomes and part of one complex with unknown localisation [30]. There were 2 major unexpected outcomes from this study: (i) no asymmetrically localised NUPs were identified, with the exception of the basket proteins NUP110 and NUP92, suggested as putative homologues of yeast Mlp1 and Mlp2. Even TbNUP76, which was co-isolated with TbMEX67 and has structural homology to the cytoplasmic site-specific yeast Nup82 that functions in mRNA export [33,34], was predicted at both outer rings by immunogold labelling; (ii) the authors could not identify any homologue of the cytoplasmic mRNA remodelling enzyme, the DEAD-box RNA helicase Dbp5. Instead, they found that the conserved mRNA transporter TbMEX67 co-purified under high-stringency conditions with TbRan, TbRanBP1, and a putative T. brucei RanGAP, indicating that mRNA export may be fuelled by the Ran system. 
Meanwhile, many additional nuclear pore-localised proteins have been identified, primarily by the genome-wide localisation database TrypTag [35]; most of these remain functionally uncharacterised. We were puzzled by the absence of asymmetric NUPs at the outer rings, as such NUPs are viewed as key determinants of the directed transport of macromolecules. We therefore revisited the ultrastructure of the trypanosome nuclear pore using a novel, powerful combination of expansion microscopy and proximity labelling techniques. Our approach indeed identified a set of asymmetric components, and we employed these as markers to map all 75 nuclear pore-localised proteins reported by TrypTag [35]. Altogether, we provide an updated, comprehensive map of the pore and its associated proteins, including proteins of the Ran GTPase transport system. We describe many novel proteins at the nuclear site of the pore, most of them trypanosome-unique, including 3 potential TREX-2 complex proteins. We find the NUP76 complex proteins, NUP76, NUP140, and NUP149, exclusively at the cytoplasmic site and demonstrate a conserved function of NUP76 in mRNA export, while NUP140 and NUP149 are unique to trypanosomes and lack any conserved binding site for Dbp5, consistent with the absence of this RNA helicase. Our data, combined with the data of [30], support a model of the trypanosome pore with a conserved core structure, but with a fundamentally different mRNA remodelling platform at the cytoplasmic site and many trypanosome-unique proteins at the basket site that await functional characterisation. Materials and methods Bioinformatics All sequences were retrieved from TriTrypDB between 2021 and 2024 [36]. InterPro was used for domain search based on sequence [37]. Homology search based on primary and predicted secondary alignments was done with Phyre2 [38]. Tertiary alignments of Trypanosomatid-optimised predicted AlphaFold2 models [39] were carried out with Foldseek [40]. 
Foldseek searches were performed on the web server (https://search.foldseek.com) covering all available databases (AlphaFold/Proteome v4, AlphaFold/Swiss-Prot v4, BMFD 20240623, CATH50 4.3.0, Mgnify-ESM30 v1, PDB100 20240101, and GMGCL 2204) with Mode 3Di/A. The Foldseek outputs were retrieved, including the superimposed structures and the values for sequence identity, RMSD (root mean square deviation), TM (template modelling) score, and qTM and tTM (TM scores normalised by query and template length, respectively). All structures were predicted using AlphaFold2-Multimer-v2.3.1 [41,42] through ColabFold version 1.5.3 with MMseqs2 (UniRef+Environmental), 5 recycles, and 5 models (doi:10.1038/s41592-022-01488-1). Predicted structures were visualised with ChimeraX [43]. Heatmaps were generated with the ComplexHeatmap package [44] in R. The t test difference values from the affinity purifications (detailed below) were used as input and clustered with a Pearson distance method (option cluster_rows = TRUE, cluster_columns = FALSE). The t test difference values are represented as a colour scale, and the colouring was done by the package. The pLDDT plots of local prediction confidence over the protein length, shown next to the heatmaps, were retrieved from the Trypanosomatid-optimised AlphaFold2 database [39]. Trypanosoma cells Trypanosoma brucei Lister 427 procyclic cells in logarithmic growth were used for all experiments. Cells were grown at 27°C and 5% CO2 in SDM-79 supplemented with 5% (v/v) FCS, 75 μg/ml hemin, and appropriate drugs [45]. Drugs used for transgenic cells were G418 disulfate (15 μg/ml), blasticidin S (10 μg/ml), puromycin dihydrochloride (1 μg/ml), hygromycin B (25 μg/ml), and phleomycin (2.5 μg/ml); these concentrations were used for maintenance and doubled during the actual selection process after transfection. 
Growth was measured over 5 days by sub-culturing cells daily to 10^6 cells/ml and measuring densities 24 h later using a Coulter Counter Z2 particle counter (Beckman Coulter). Transgenic trypanosomes were generated by standard procedures. Endogenous tagging with TurboID-Ty1, TurboID-HA, 3xHA, 4xTy1, and OsAID-3xHA was done using a PCR-based method and the pPOTv7 system [46]. TurboID-Ty1, 3xHA, and 4xTy1 customisations of the pPOTv7 were made in this work; 25 μl of PCR reaction (PrimeSTAR MAX (Takara)) was used for transfections. The PCR product was precipitated with isopropanol, washed once with 70% ethanol in a sterile hood, resuspended in 10 μl of sterile ddH2O, and mixed with 10^7 cells in 400 μl of transfection buffer [47]. Transfections were performed with Amaxa Nucleofector IIb (Lonza Cologne AG, Germany, program X-001) using BTX electroporation cuvettes (45–0125). Cells were recovered in 25 ml SDM-79 supplemented with 20% FCS for 18 h and diluted 1:4 in 75 ml SDM-79 supplemented with 20% FCS. Relevant drugs were added and cells plated in four 24-well plates (1 ml/well). Drug-resistant populations were analysed after 10 days and confirmed with western blotting or diagnostic PCR from genomic DNA [48]. For ectopic, inducible expression of MEX67 fused to TurboID-Ty1 at the C-terminus, its open reading frame was cloned in frame with TurboID-Ty1 in a genetic cassette containing an EP procyclin promoter controlled by a 2× Tet operator and flanked by sequences for integration into the rRNA locus [49]. For auxin-inducible degron experiments, 50 μM 5-Ph-IAA (MedChem Express, HY-134653) was added to the cultured cells from a 50 mM stock in DMSO. The auxin system was kindly provided by the laboratory of Mark Carrington (University of Cambridge, United Kingdom) and is described in [50]. Plasmids and PCR products All plasmids and PCR products used in this study are listed in S1 Table. 
pPOTv7 variants for TurboID-Ty1, 3xHA, and 4xTy1 tagging were generated by sub-cloning the respective tag sequence into the BamHI/HindIII sites. pPOTv7 OsAID-3xHA was generated in [50]. Note that the TurboID-Ty1 and TurboID-HA tags are referred to only as TurboID in the text and figures to avoid confusion; the Ty1 and HA tags were solely used to control cell lines by western blot, not for imaging. Western blot and antibodies Western blots were done using standard methods. Primary antibodies used for detection of proteins were rat anti-HA (3F10, Roche) (1:1,000), anti-Ty1/BB2 ([51] hybridoma supernatant 1:1,000) and anti-T. brucei PFRA/B (L13D6) (1:10,000) [52]. Secondary antibodies were IRDye 680 RD and 800 CW goat anti-rat and goat anti-mouse (LI-COR) (1:30,000). Biotinylated proteins were detected with Streptavidin-IRDye 680 LT (LI-COR) (1:10,000). Blots were scanned with the Odyssey Infrared Imaging System (LI-COR Biosciences, Lincoln, Nebraska, United States of America). Protein retention expansion microscopy (proExM) and ultrastructural expansion microscopy (UExM) The proExM and UExM methods were performed as previously described in [53], with the following minor modifications for UExM: the primary and secondary antibody labelling reactions were done in 6-well plates with 1 ml of antibody diluted in PBS-T (PBS with 0.1% Tween20 (v/v)) containing 3% BSA (w/v). Primary antibodies were incubated overnight at 37°C with agitation and secondary antibodies for 3 h at 37°C with agitation. The plates were slightly tilted to ensure that the whole gel piece was covered. Microscopy and quantification of microscopy data For all fluorescence microscopy experiments, images were acquired using a fully automated iMIC microscope (TILL Photonics) equipped with a 100×, 1.4 numerical aperture objective (Olympus, Japan) and a sensicam qe CCD camera (PCO, Germany). Z-stacks (75, 100, or 150 slices, 140 nm step size) were recorded. 
Exposure times ranged between 50 and 100 ms for DAPI and between 400 and 800 ms for all other fluorophores. Image stacks were deconvolved with the Huygens Essential software v24.04 (SVI, Hilversum, the Netherlands). Deconvolution parameters were kept constant for all images, except for the number of iterations, which was optimised depending on the signal intensity and background. To correct aberrations due to the refractive index mismatches occurring at different depths of the gel specimens, the varPSF function of Huygens Essential was used, which calculates the PSFs according to depth position. After deconvolution, images were corrected for the chromatic shift aberration between the green and red channel in 3 dimensions. The Huygens Chromatic Aberration Corrector was used with image stacks of TetraSpeck fluorescent microspheres (T7279, Thermo Fisher Scientific) as template. Fiji [54] was used for figure generation. Streptavidin affinity purification and LC-MS/MS analysis Affinity purification of biotinylated proteins followed by tryptic digest and peptide preparation was done as described [55], except that 1 mM biotin was added to the on-beads tryptic digests to improve the elution. Eluted peptides were analysed by liquid chromatography coupled to tandem mass spectrometry (LC-MS/MS) on an Ultimate3000 nano rapid separation LC system (Dionex) coupled to an Orbitrap Fusion mass spectrometer (Thermo Fisher Scientific). Spectra were processed using the intensity-based label-free quantification (LFQ) in MaxQuant version 2.1.3.0 [56,57], searching the T. brucei brucei 927 annotated protein database (release 64) from TriTrypDB [58]. Analysis was done using Perseus [59] essentially as described in [60]. Briefly, known contaminants, reverse hits (decoy sequences for calculating the false discovery rate (FDR)) and hits only identified by a modification site were filtered out. 
LFQ intensities were log2-transformed and missing values imputed from a normal distribution of intensities around the detection limit of the mass spectrometer. A Student’s t test was used to compare the LFQ intensity values between the duplicate samples of the bait with untagged control (WT parental cells) triplicate samples. The -log10 p-values were plotted versus the t test difference to generate multiple volcano plots (Hawaii plots). Potential interactors were classified according to their position in the Hawaii plot, applying cut-off curves for significant class A (SigA; FDR = 0.01, s0 = 0.1) and significant class B (SigB; FDR = 0.05, s0 = 0.1). The cut-off is based on the FDR and the artificial factor s0, which controls the relative importance of the t test p-value and difference between means (at s0 = 0 only the p-value matters, while at non-zero s0 the difference of means contributes). Perseus was also used for principal component analysis (PCA), the profile plots and to determine proteins with similar distribution in the plot profile using Pearson’s correlation. All proteomics data have been deposited at the ProteomeXchange Consortium via the PRIDE partner repository [61] with the data set identifiers PXD055934 (Nup75, Ran, MEX67), PXD047268 (NUP76, NUP96, NUP110), PXD031245 (NUP158, wt control), and PXD059554 (NUP98). 
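The role of s0 in the comparison of bait and control samples can be illustrated with a minimal sketch. It assumes the SAM-style modified statistic, in which s0 is added to the standard error of a pooled-variance (Student) t test; the function name and the log2 LFQ values are hypothetical, and the permutation-based FDR step that defines the actual cut-off curves in Perseus is not reproduced here.

```python
from math import sqrt
from statistics import mean, variance


def s0_t_statistic(bait, control, s0=0.1):
    """Return (difference of means, s0-adjusted t statistic).

    bait, control: log2 LFQ intensities of replicate samples.
    s0 is added to the standard error (assumed SAM-style form), so small
    differences with tiny variance are down-weighted when s0 > 0.
    """
    n1, n2 = len(bait), len(control)
    diff = mean(bait) - mean(control)
    # Pooled sample variance, as in a two-sample Student's t test
    pooled_var = ((n1 - 1) * variance(bait) + (n2 - 1) * variance(control)) / (n1 + n2 - 2)
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return diff, diff / (se + s0)


# Hypothetical values: duplicate bait samples, triplicate untagged controls
bait_log2 = [28.0, 28.4]
control_log2 = [22.1, 21.9, 22.0]

diff, stat_plain = s0_t_statistic(bait_log2, control_log2, s0=0.0)
_, stat_s0 = s0_t_statistic(bait_log2, control_log2, s0=0.1)
```

With s0 = 0 the statistic reduces to the ordinary t value; with s0 = 0.1 the same difference of means yields a smaller statistic, which is how a non-zero s0 makes the size of the difference count in addition to the p-value.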
Results Expansion microscopy identifies novel asymmetric components of the trypanosome pore We revisited the trypanosomatid nuclear pore architecture with expansion microscopy. Therefore, we expressed target proteins fused to a small peptide epitope-tag (3xHA or 4xTy1) to allow immunofluorescence detection via anti-HA or anti-Ty1. In some experiments, we expressed the target protein fused to the biotin ligase TurboID [62], followed by the detection of the biotinylation of the bait and proximal proteins with fluorescent streptavidin (= streptavidin imaging). We had previously shown that labelling with streptavidin increases the signal with no obvious loss in resolution, which is essential since expansion microscopy causes a massive reduction in antigen density [53]. Even more importantly, streptavidin readily labels proteins within phase-separated areas, such as the nuclear pore central channel, that we found largely inaccessible to antibodies [53]. 
Since TurboID will not only auto-biotinylate the bait but also adjacent proteins, there is the possibility that the streptavidin signal may not reflect the true localisation of the bait. Hence, throughout this work, we have confirmed all major findings derived from streptavidin labelling with orthogonal methods, such as immunofluorescence, mass spectrometry, and/or reciprocal labelling. All fusion proteins were expressed from the endogenous loci to avoid major changes in gene expression. First, we used protein retention expansion microscopy (proExM), a method that expands cells after protein labelling [63]. We experimentally determined the expansion factor as 3.6 and confirmed the isotropic expansion of the nucleus (S1 Fig). We initially concentrated on the proteins of the NUP76 complex (NUP76, NUP140, and NUP149) as these were previously shown to co-precipitate with the trypanosome mRNA export factor MEX67 [30]. These NUPs were fused to either N- or C-terminal peptide tags (NUP76::3xHA, NUP140::4xTy1, 3xHA::NUP149). In the same cell lines, we co-expressed the nuclear basket protein NUP110 with an N-terminal fusion to a different epitope tag (4xTy1 or 3xHA). Upon dual labelling with anti-Ty1 and anti-HA, we carried out expansion and imaging. All 4 proteins resolved as single dots located at the nuclear periphery. The signals from the NUP76-complex proteins were in all cases clearly separated from the NUP110 signal towards the cytoplasmic site of the pore (Figs 1A and S2). Notably, for every dot signal originating from the NUP76 complex we observed a corresponding NUP110 dot, indicating that trypanosomes, unlike yeast [26], do not have basket-less pores. The median, expansion-factor corrected distance to NUP110 was at least 120 nm for all 3 proteins (129±17 nm for NUP76, 120±22 nm for NUP140, and 137±18 nm for NUP149 with n > 100). 
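The expansion-factor correction of these distances amounts to dividing each distance measured in the expanded sample by the expansion factor before taking the median. The following is a minimal sketch of that arithmetic; the function name and the measured values are hypothetical and only the expansion factor of 3.6 is taken from the text.

```python
from statistics import median


def to_pre_expansion_nm(measured_nm, expansion_factor):
    """Convert a distance measured in the expanded gel to its pre-expansion value."""
    if expansion_factor <= 0:
        raise ValueError("expansion factor must be positive")
    return measured_nm / expansion_factor


PROEXM_FACTOR = 3.6  # expansion factor reported for proExM

# Hypothetical dot-to-dot distances measured in expanded samples (nm)
measured = [430.0, 465.0, 500.0]

corrected = [to_pre_expansion_nm(d, PROEXM_FACTOR) for d in measured]
median_corrected = median(corrected)
```

Because the correction is a division by a constant, it can equally be applied to the median of the raw measurements; this holds only if the expansion is isotropic, as was confirmed for the nucleus.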
The large distance of the NUP76 complex proteins to the basket protein NUP110 and the absence of “double-dots” for the NUP76-complex strongly indicate asymmetric localisation of the NUP76 complex exclusively at the cytoplasmic site of the pore. The data disagree with previous observations derived from immuno-electron microscopy that place NUP76 symmetrically at both outer rings [30]. Fig 1. Expansion microscopy identifies novel asymmetric pore proteins. (A) proExM of cell lines co-expressing epitope-tagged versions of NUP76, NUP140, or NUP149 in combination with NUP110. Images were deconvolved (20 iterations for NUP149 and NUP76; 60 iterations for NUP140) and single planes of the nuclei are shown. Further images are shown in S2 Fig. (B, C) UExM of cells expressing proteins fused to TurboID or 3xHA, as indicated. Labelling was done with fluorescent streptavidin and with anti-HA. Images were deconvolved (20, 20, 40 iterations for NUP89/NUP96, NUP89/NUP64, and NUP98/NUP64, respectively). A single plane image of an entire nucleus is shown on the left and 3 enlarged regions from the same or another nucleus are shown on the right (Cy = cytoplasm, Nu = nucleus). For NUP89/NUP96, yellow arrows point to pores that are in a suitable focal plane to see the 2 NUP89 dots sandwiching the NUP96 dot. (D) UExM of lines co-expressing TurboID-tagged versions of NUP76 or NUP149 with NUP64-3xHA. Labelling was done with fluorescent streptavidin and anti-HA. Images were deconvolved with 60 and 20 iterations for NUP76/NUP64 and NUP149/NUP64, respectively. A single plane image of one nucleus is shown. proExM, protein retention expansion microscopy; UExM, ultrastructural expansion microscopy. 
https://doi.org/10.1371/journal.pbio.3003024.g001 We were concerned that the observed sole cytoplasmic localisation of the NUP76 complex proteins might be a technical artifact caused by (i) an insufficient resolution of the proExM method; (ii) a non-isotropic expansion across the nuclear membrane; or (iii) reduced accessibility of the nuclear ring to antibodies in comparison to the cytoplasmic ring. To investigate this, we applied ultrastructural expansion microscopy (UExM) [64], which offers a higher resolution because the antibody labelling is applied after the expansion and the linkage error (distance between the fluorophore of the secondary antibody and the target protein) is therefore not expanded. UExM has been successfully used in trypanosomes [53,65–67], and we achieved an expansion factor of 4.2-fold with isotropic expansion of the nucleus (S1 Fig). To prove that UExM provides the resolution to resolve the nuclear outer ring, inner ring, and cytoplasmic outer ring, we first imaged NUPs that are conserved across eukaryotes. As an outer ring marker, we chose NUP89, the trypanosome orthologue of the outer ring Y-complex component of yeast Nup84/85 (NUP107/75 in humans) [30]. As an inner ring marker, we selected NUP96, the conserved trypanosome orthologue of S. cerevisiae Nic96. We co-expressed NUP89::TurboID with NUP96::3xHA. UExM with fluorescent streptavidin and anti-HA readily resolved NUP89 as double dots at the nuclear periphery, sandwiching the single dot signal of NUP96 (Fig 1B), proving that the resolution of the method is sufficient to distinguish these different subregions of the pore. However, as the NUP96 signal was weak, we searched for a better inner ring marker. We tested NUP64, expressed as a C-terminal 3xHA fusion, a trypanosome-unique FG-repeat NUP previously identified as a multi-complex NUP localised mostly to the centre of the pore [30]. 
To our surprise, the resulting single NUP64 dot signal was not sandwiched by the 2 outer ring dots of NUP89 but instead co-localised solely with the NUP89 dot at the nuclear site of the pore (Fig 1C). Likewise, the streptavidin signal of a C-terminal TurboID fusion of TbNUP98, known to form a complex with NUP64 [30], resolved as single dots that co-localised exclusively with the NUP64::3xHA dots at the nuclear site (Fig 1C). The data indicate an asymmetric, exclusive nuclear site localisation of NUP64 and NUP98 (Fig 1C). Having confirmed that UExM has the resolution to distinguish proteins located at the cytoplasmic site outer ring from proteins located at the nuclear site outer ring, we reassessed the localisation of the NUP76 complex. We confirmed the sole cytoplasmic localisation of the NUP76 complex by co-staining fusions of this complex to either 3xHA (S3 Fig) or TurboID (Fig 1D) with C-terminal 4xTy1 or 3xHA fusions of our newly identified nuclear site marker NUP64. In summary, we discovered 5 asymmetric proteins of the trypanosome nuclear pore complex that were previously assumed to be symmetrically distributed: NUP76/NUP140/NUP149 at the cytoplasmic site and NUP64/NUP98 at the nuclear site. With the exception of NUP76, which is the structural orthologue of yeast Nup82 and human NUP88 [30], all novel asymmetric NUPs are trypanosome specific. A proximity map of the trypanosome nuclear pore The novel availability of asymmetric NUPs prompted us to use mass spectrometry data from TurboID proximity labelling experiments to generate a proximity map of the entire trypanosome nuclear pore. We used 7 available LC-MS/MS data sets from previous streptavidin-affinity purifications, namely N- and C-terminal TurboID fusions of NUP110 (basket), NUP96 (inner ring), and NUP76 (cytoplasmic outer ring) [53] and NUP158 (outer ring) with a C-terminal TurboID fusion [55]. 
In addition, we generated new LC-MS/MS data sets for NUP98, which we have identified to be at the nuclear site of the pore by expansion microscopy, and for NUP75, which was previously identified to be in a complex with NUP98 and NUP64 [30]. Of all the proteins that were labelled by these baits, we initially concentrated on proteins that were previously identified as NUPs, based on predicted structural similarities [29] and/or affinity purification [30]. For each data set, we colour-coded the enrichment of the NUP proteins based on t test difference to a wild-type control. Then, the NUPs were sorted by hierarchical clustering applying a Pearson distance method (Fig 2A). For each NUP, we included the pLDDT plots [39] to indicate confidence of the predicted structure, which in most cases correlates with structured (high confidence) and unstructured (low confidence, mostly FG-repeats) regions. The majority of the NUPs separated into 3 clearly distinct main clusters. The first cluster contained the lamina protein NUP2 [68] and NUP64/NUP98 that localised exclusively to the nuclear site of the pore by expansion microscopy (Fig 1C). The second cluster largely consisted of NUPs previously identified as inner ring NUPs [30] while the third was dominated by NUPs previously assigned to the outer ring [30]. Five NUPs were manually added to the clusters using positional information from [30], because of poor labelling by only 2 baits or fewer (NUP62, NUP119, NUP110, NUP92) or labelling by all baits (NUP132). For 6 proteins, the labelling was too weak (NUP152, SEC13A, SEC13B, NUP41, NUP48) or too diverse (GLE2) to confidently assign the proteins to a certain region of the pore. Fig 2. A proximity map of the trypanosomatid nuclear pore. (A) Mass spectrometry data (t test difference values) from proximity labelling experiments were used to create a heat map of the trypanosome nuclear pore. 
A range of N- or C-terminally tagged TurboID fusions served as baits and the labelling (proximity) of most nuclear pore proteins is shown as a tree. Some NUPs were manually added to the tree using data of [30]. pLDDT plots from Trypanosomatid-optimised AlphaFold2 models [39] are shown on the right. Details on the mass spectrometry data can be found in S1 Table. (B) The model of the trypanosome nuclear pore changes with the discovery of 5 novel asymmetrically localised proteins. NUP, nucleoporin. https://doi.org/10.1371/journal.pbio.3003024.g002

For the vast majority of NUPs, the proximity map confirmed the previous assignments of the NUPs from affinity capture experiments [30]. Only 2 NUPs exhibited ambiguous placement in the proximity map (bold in Fig 2A). The outer ring NUP132 is labelled strongly by all 8 baits, including strong labelling by the basket proteins NUP110 and NUP92, suggesting extensions of NUP132 towards the basket region. Further, the presumed inner ring NUP53a is also labelled by all baits: the labelling by the inner ring bait NUP96 is the strongest, but there is also strong labelling by the cytoplasmic-specific NUP76 and by nuclear site-specific NUPs, indicating that NUP53a may be at the inner ring but reaching out to the outer rings. NUP98 and NUP64 unequivocally grouped with the basket/nuclear site (basket and inner ring) and were labelled by NUP110 and NUP96 baits, but not by the cytoplasmic NUP76, in line with our proExM and UExM data (compare Fig 1B and 1C). To our surprise, NUP75, which shares 46% sequence identity with NUP64 and associates with both NUP98 and NUP64 [30], was placed at the inner ring and was not labelled by NUP110. Moreover, the outer ring protein NUP158 strongly labelled NUP64 and, slightly less, NUP98, but not NUP75, further supporting the absence of NUP75 from the outer rings [53].
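The sorting step behind the proximity map (one t test difference value per bait for each protein, followed by hierarchical clustering with a Pearson distance) can be illustrated with a minimal sketch. The protein names and values below are invented for illustration and are not taken from S1 Table; average linkage is an assumption, since the linkage rule is not stated in the text.

```python
from itertools import combinations
from math import sqrt

# Hypothetical t test difference values (one value per bait, in the order
# NUP110, NUP96, NUP76, NUP98). Invented numbers, NOT data from S1 Table.
PROFILES = {
    "nuclear_A": [6.0, 3.0, 0.2, 7.0],
    "nuclear_B": [5.5, 2.5, 0.1, 6.5],
    "inner_A": [1.0, 7.0, 2.0, 2.5],
    "inner_B": [1.5, 6.0, 2.5, 2.0],
    "cyto_A": [0.2, 2.0, 7.0, 0.1],
    "cyto_B": [0.5, 2.5, 6.0, 0.3],
}


def pearson_distance(x, y):
    """Return 1 - Pearson correlation of two equal-length labelling profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return 1.0 - cov / (sd_x * sd_y)


def average_linkage(profiles, n_clusters):
    """Merge the two closest clusters until n_clusters groups remain.

    Average linkage is an assumption made for this sketch.
    """
    clusters = [[name] for name in profiles]

    def cluster_distance(c1, c2):
        pairs = [(a, b) for a in c1 for b in c2]
        total = sum(pearson_distance(profiles[a], profiles[b]) for a, b in pairs)
        return total / len(pairs)

    while len(clusters) > n_clusters:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: cluster_distance(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return [sorted(c) for c in clusters]


clusters = sorted(average_linkage(PROFILES, 3))
for members in clusters:
    print(members)
```

Because the Pearson distance compares the shape of a labelling profile rather than its absolute intensity, two proteins labelled by the same baits group together even when one is labelled more weakly overall. The same property means that proteins labelled by very few baits carry little shape information, which may be one reason why such proteins were placed manually.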
When NUP75 and NUP98 were used as baits, both showed the strongest labelling with each other and with NUP64 (Fig 2A), consistent with these 3 proteins forming a complex, as previously suggested [30]. Moreover, as expected, neither protein labelled the NUP76 complex, which, incidentally, is orthogonal evidence for the NUP76 complex being cytoplasmic. Interestingly, neither protein labelled NUP110, which was expected for NUP75, but not for NUP98, which is itself labelled by NUP110. Perhaps the C-terminus of NUP98 is distant from NUP110, while the N-terminus is close. Our data suggest a model of an asymmetric NUP98/64/75 complex reaching from the nuclear outer ring to the inner ring, with NUP98 and NUP64 being located at the outer nuclear ring and NUP75 at the inner ring. Data from previous affinity isolation experiments with NUP98, NUP64, and NUP75 show marked differences between the interactomes of NUP98/64 and NUP75, including the exclusive absence of NUP110 from the NUP75 interactome, in full agreement with our data [30]. The outer-ring cluster is divided into 2 subclusters. Proteins of both subclusters are labelled by the inner ring bait NUP96 and by the cytoplasmic-site-specific NUP76. Proteins of the first subcluster (NUP158, NUP53a, and NUP109) are additionally labelled by the nuclear site-specific proteins NUP98 and NUP75 and by NUP158, while proteins of the second subcluster are not (with the one exception of NUP149, which is labelled by NUP158). We believe that this clustering reflects differences in proximity between these 2 protein groups within the outer rings. We can exclude the interpretation that the clustering of NUP89 and NUP82 with the cytoplasmic site-specific NUP76/NUP140/NUP149 proteins means that NUP89 and NUP82 are cytoplasmic-site specific too: NUP89 is present in both outer rings (Fig 1B and 1C) and both NUP89 and NUP82 are weakly labelled by the nucleoplasmic-specific NUP75 and NUP98.
In summary, the proximity map accurately predicts the localisations for the vast majority of NUPs. Importantly, it offers orthogonal (tag-independent) validation of the asymmetric localisation of NUP98 and NUP64 to the nuclear site of the pore and of the NUP76 complex proteins to the cytoplasmic site of the pore, confirming the expansion microscopy data. A new model of the pore, highlighting the asymmetric components, is shown in Fig 2B. Thus, our proximity map has the potential to predict localisations of proteins with sub-pore size resolution, which prompted us to look at all 44 proteins that have nuclear pore localisation according to TrypTag [35] but are not annotated as NUPs.

Mapping the Ran-based mRNA export system to the pore

First, we concentrated on all nuclear pore-localised proteins that are involved in mRNA export: MEX67 [69], Mtr2, MEX67b [70] and, as postulated [30,71], Ran, RanGAP, the putative RanGDP importer NTF2 and 2 Ran-binding proteins, RanBP1 and RanBPL [72]. The proximity map places RanGAP and RanBP1 at the cytoplasmic site of the pore, while RanBPL localisation is predicted at the nucleoplasmic site (Fig 3A). For the transporters MEX67, MEX67b and Ran, the labelling was less confined to a specific site. For Mtr2 and NTF2, we obtained no labelling, likely due to their small size, which is problematic in BioID [55]. As a control, we included data from reciprocal TurboID experiments with MEX67 and Ran as baits (Figs 3A and S4 and S2 Table): both proteins label proteins at both sides of the pore, consistent with shuttling.

Fig 3. Mapping the trypanosome mRNA export system to the nuclear pore. (A) Mass spectrometry data (t test difference values) from proximity labelling experiments were used to map the Ran-based mRNA export system to the trypanosome pore. Details on the mass spectrometry data can be found in S1 Table. (B, C) Ultra-expansion microscopy.
MEX67, Ran, RanBP1, RanGAP, and RanBPL were expressed as C-terminal (MEX67, Ran, RanBP1, RanGAP) or N-terminal (RanBPL) fusions to TurboID in a cell line co-expressing the nuclear-site marker NUP64::3xHA, all from the endogenous locus. Cells were expanded and the proteins detected with streptavidin and anti-HA. Images of single nuclei are shown (single plane of deconvolved Z-stacks with 10 iterations for MEX67, Ran, RanBP1, and RanBPL and 20 iterations for RanGAP). For MEX67, the streptavidin signal was weak and 3 representative nuclei of an overexpression cell line are shown (B, right). For the other proteins, enlarged sections of the nuclear envelope of the same or other nuclei are shown (C, right). (D) Proximity labelling of the nuclear pore by MEX67 and Ran. MEX67 and Ran were expressed as C-terminal TurboID fusions from the endogenous loci and biotinylated peptides were analysed by mass spectrometry. The labelling of NUPs by MEX67 (left) and Ran (right) is shown coloured based on their t test difference values in comparison to wild-type cells. Asymmetric NUPs are marked with a red arrow. (E) Schematic summary of our localisation data of the Ran-based mRNA export system. NUP, nucleoporin. https://doi.org/10.1371/journal.pbio.3003024.g003

Next, we determined the localisation of MEX67, Ran, RanGAP, RanBP1, and RanBPL by UExM, expressing TurboID fusions in a cell line that expressed NUP64::3xHA as a nucleoplasmic-site marker (Fig 3B and 3C). RanGAP and RanBP1 resolved as single dots, well distanced from the NUP64 dots towards the cytoplasmic site, while the RanBPL signals overlapped with the NUP64 signals at the nuclear site, in full agreement with the proximity map. The biotinylation signal of the 2 proteins with suspected shuttling activity, Ran and MEX67, resolved as large cytoplasmic dots and smaller nuclear dots, connected by a string-like signal reaching through the pore.
For MEX67, we observed that these bone-shaped signals were more pronounced when the gene was 8-fold overexpressed from an ectopic locus (Fig 3B, images on the right), which only slightly impaired growth (S5 Fig). For Ran, MEX67, and RanBPL, we observed an additional signal at the nucleolus, which is defined by the reduction in DAPI stain (S6A Fig), and a minor signal in the nucleoplasm. For RanBPL, the nucleolar signal was stronger than the signal at the pores, while for Ran and MEX67 (at endogenous expression levels) the nuclear pore signal was dominant. We attempted to confirm the nucleolar localisation by direct immunofluorescence instead of streptavidin imaging. The nucleolus is challenging to label with antibodies [53], but for MEX67::4xTy1 we obtained a weak, but distinct nucleolar antibody signal (S6B Fig). The functional implications of the nucleolar localisation of Ran, MEX67, and RanBPL are not fully understood in trypanosomes, but the localisation is not unexpected, as in opisthokonts Ran and Mex67 participate in pre-ribosome transport [73]. For the shuttling proteins Ran and MEX67, we confirmed the streptavidin-based imaging data with the LC-MS/MS data obtained after streptavidin enrichment (Figs 3D and S4 and S2 Table). Both MEX67 and Ran strongly labelled FG-NUPs lining the inner pore channel, consistent with their movement across the pore. The NUPs with the strongest labelling were the asymmetric NUPs on both sides of the pore: NUP149 at the cytoplasmic site and NUP98, NUP64 and the lamin-like protein NUP2 at the nuclear site (red arrows in Fig 3D). This strongly suggests that both proteins, MEX67 and Ran, have binding sites at both sides of the trypanosome pore, analogous to human Ran [74–77]. There was weak labelling of structured NUPs, in agreement with the rather poor labelling of MEX67 and Ran by NUP76, NUP96, NUP110, and NUP158 in the heat map (Fig 3A).
Preferential labelling of asymmetric FG NUPs over structured NUPs has also been shown for human karyopherins tagged with the biotin ligase BirA* [78]. In conclusion, our proximity map predicted the localisation of all non-shuttling Ran system components confidently and in agreement with streptavidin imaging in UExM. RanGAP is at the (expected) cytoplasmic site, together with RanBP1. RanBPL had not been previously mapped, but is unequivocally placed at the nuclear site, consistent with its binding preference for RanGTP [72]. The proximity map was unable to categorise the shuttling proteins MEX67 and Ran, likely because non-structured FG-NUPs are poorly represented in our bait repertoire. Direct BioID combined with orthogonal assessment through expansion microscopy was required to confidently place these putative export factors. The derived localisations are summarised in Fig 3E and are consistent with a mechanistically divergent Ran-dependent mRNA export in trypanosomes.

Predicting the position of unknown proteins within the pore

To predict the localisation of the remaining 38 nuclear pore-localised proteins more accurately, we added the proximity labelling data of MEX67 and Ran to our proximity map. Fifteen of the 38 nuclear pore-localised proteins are karyopherins (S7 Fig), five of which have not been previously classified as karyopherins but have unequivocal structural homology to importin and exportin-like folds predicted by Foldseek (S7B Fig); these include a putative orthologue of the importin Hikeshi (Tb927.1.1400), which is specialised in the import of Hsp70-family proteins [79] (S7C Fig). Karyopherins were mostly not labelled, or only poorly labelled, in our proximity map (S7A Fig). The likely reason is their preferred interaction with FG-NUPs rather than structured NUPs, similar to what we observed for MEX67 and Ran (compare Fig 3A).
Exceptions are XPO1 (exportin 1), known to be involved in the transport of both mRNAs and tRNAs [80], which is labelled by all bait proteins, and 2 XPO-like proteins labelled by a subfraction of the baits. An additional 5 proteins with nuclear pore localisation were not labelled by any of the bait proteins (S8 Fig). For three of these proteins, Tb927.11.1000, Tb927.10.12200, and Tb927.10.8160, the reason could be failed detection due to small size [55]. None of these small proteins has homologues outside of trypanosomatids and their function is unknown. Tb927.10.8160 has the strongest nuclear pore localisation ([35], S8B Fig) and high-throughput phenotyping indicates an essential function [81]. The 2 larger proteins (Tb927.1.3230 and Tb927.9.12700) do not have very prominent nuclear pore localisation ([35], S8B Fig). For Tb927.9.12700, biochemical data indicate glycosomal localisation [82] and Tb927.1.3230 could be the trypanosomatid orthologue of the ribosome biogenesis factor Rix7 [83]. Their lack of labelling might thus be due to poor or absent nuclear pore localisation. The remaining 18 proteins were labelled by at least one of the bait proteins (Fig 4A). Strikingly, none were labelled by the cytoplasmic-site marker protein NUP76, suggesting the absence of further proteins with exclusive cytoplasmic localisation, other than the NUP76 complex, RanGAP, and RanBP1. Moreover, not a single protein was exclusively labelled by the outer ring protein NUP158 [55], with the one exception of Tb927.11.13080. The (near) absence of combined labelling by NUP76 and NUP158 suggests that the outer ring proteome may be complete. Instead, these 18 proteins were labelled by baits of the nuclear site, of the inner ring, or both. We present the data as Pearson-distance clusters, with manual placements of proteins with insufficient labelling.
Four proteins are not included in the clustering analysis: for one (SENP) the labelling pattern was too distinct and 3 proteins were only labelled by MEX67 and/or Ran.

Fig 4. Characterisation of unknown nuclear pore proteins and their localisation over the nuclear pore complex. (A) Mass spectrometry data (t test difference values) from a range of proximity labelling experiments were screened for labelling of nuclear pore localised proteins that are not NUPs or karyopherins. All proteins that were labelled by at least one of the bait proteins are shown partially clustered using Pearson correlation. Annotations are explained in the text. pLDDT plots from Trypanosomatid-optimised AlphaFold2 models [39] are shown on the right. Details on the mass spectrometry data can be found in S1 Table. (B–D) Structural alignments of AlphaFold2 models of the T. brucei TREX-2 complex candidates Sac3 (B), Thp1 (C), and Sus1 (D) with PDB structures of the respective TREX-2 complex proteins from other organisms, using Foldseek [40]. The regions of the proteins that were used for the structural alignments are shown in the schematics in orange (T. brucei) and blue (other organisms). The root mean square deviation of atomic positions (RMSD) and the sequence identity are shown for the superimposed regions. NUP, nucleoporin; RMSD, root mean square deviation. https://doi.org/10.1371/journal.pbio.3003024.g004

The majority of these 18 proteins are unique to Kinetoplastida or even to Trypanosomatida and lack functional annotations. Only 2 proteins have readily identifiable homologues outside Kinetoplastida. The first, Tb927.10.9020, has homology to the non-catalytic, substrate-binding subunit of the tRNA methyltransferase Trm6/Gcd10, responsible for adenosine(58)-N(1) methylation, a modification present in many eukaryotic tRNAs [84].
The second protein, Tb927.9.2220, is a SUMO protease of the Ulp/SENP (ubiquitin-like protease/sentrin-specific protease) family with a potential function in resolving stalled DNA replication forks [85]. The remaining 16 proteins include 5 proteins with predicted basket or inner ring localisation that were previously identified as lamina-associated proteins (LAPs), based on their interactions with the lamina-like proteins NUP1 and NUP2 [32]. Two of these LAPs, LAP71 and LAP102, are basket specific in our map, as expected for lamina-associated proteins. Two further LAPs, LAP73 and LAP59, have exclusive inner ring prediction. Of significant interest is the basket/inner ring-predicted LAP173, which has a Sac3/GANP domain and was suggested to be the orthologue of Sac3 and the sole representative of a potential trypanosome TREX-2 complex [32]. Association with MEX67 was observed by affinity purifications [30] and BioID ([55], Fig 4A). In fact, the Sac3/GANP domain of a LAP173 model predicted by a trypanosome-optimised AlphaFold2 [39,41] displays remarkable structural homology to the equivalent region of an experimentally resolved S. cerevisiae Sac3 structure (RMSD 1.12 Å, [86]), despite poor sequence conservation (Fig 4B). Motivated by the presence of a putative Sac3, we used Foldseek [40] to search for structural homologues of the remaining TREX-2 complex components, using AlphaFold2 models as inputs [39,41]. We identified the Csn12-like domain-containing protein Tb927.11.5560 as a putative Thp1 orthologue (Fig 4C), with structural homology to the human Thp1 homologue PCID2 (PDB entry 3T5X; TM score 0.82), while the primary sequence is, again, poorly conserved. Just like Sac3, this Thp1 candidate is placed by our proximity map at both the nuclear site of the pore and the inner ring. Moreover, we identified Tb927.7.5830 as a putative Sus1 orthologue with highest structural similarity to Sus1 of the yeast K. phaffii [87] (Fig 4D).
The Sus1 candidate protein is not labelled by NUPs, presumably due to its small size. However, all 3 trypanosome TREX-2 complex candidates, including Sus1, are strongly labelled by MEX67, a prototypic Sac3 interactor in opisthokonts [88], supportive of a potential role in a trypanosome TREX-2 complex (Fig 4A). In conclusion, our extended proximity map allowed us to map the majority of nuclear pore-localised proteins to a subregion of the pore. We found no evidence for the existence of any further proteins asymmetrically distributed to the cytoplasmic site, indicating that the entire cytoplasmic site-specific proteome of the pore consists of the NUP76 complex, RanGAP, and RanBP1, plus shuttling proteins. Instead, we predict a diverse cohort of proteins with preferential localisation to the basket or nuclear site of the pore, including 3 putative TREX-2 complex proteins with proximity to MEX67, indicative of a conserved function.

The trypanosome NUP76 complex as a cytoplasmic mRNA remodelling hub

We detected the NUP76 complex (NUP76, NUP140, NUP149) exclusively at the cytoplasmic site (Fig 1A and 1D) and a previous study has shown the interaction of this complex with MEX67 under high stringency conditions [30]. In combination, these data suggest that the NUP76 complex is the trypanosome's cytoplasmic mRNP binding hub that serves as a remodelling platform. In opisthokonts, the cytoplasmic mRNP remodelling platform is based on the heterotrimeric complex composed of Nup82/Nup159/Nsp1 in yeast and NUP88/NUP214/NUP62 in human (Fig 5A) [15,16]. The 3 proteins are connected via a C-terminal parallel coiled-coil structure. In yeast, both Nup82 and Nup159 possess N-terminal β-propellers that provide direct binding platforms for Nup145 (which recruits Gle2) and the RNA helicase Dbp5 (recruiting Gle1); in human, the complex is built in the same way from the respective human homologues (Fig 5A). Yeast Nsp1 and Nup159 (NUP62 and NUP214 in human) possess FG repeat regions. T.
brucei NUP76 has previously been suggested to be the Nup82/NUP88 (yeast/human) homologue [30]; indeed, the AlphaFold2 model of NUP76 [39,41] shows an analogous structural organisation with a β-propeller at the N-terminus, interrupted by a long, disordered coil, and a three-helical coiled-coil at the C-terminus (Figs 5B, 5C, and S9). Moreover, T. brucei NUP76 may share its β-propeller interactions with Nup82/NUP88: orthologues of both Nup145N/NUP98 (yeast/human) and Gle2/RAE1 (yeast/human) can be readily identified in trypanosomes [29,30]. However, the 2 remaining TbNUP76 complex components, TbNUP140 and TbNUP149, do not exhibit detectable structural homology to the Nup82/NUP88 partner proteins Nup159/NUP214 (yeast/human) and Nsp1/NUP62 (yeast/human) (Fig 5A–5C), based on AlphaFold2 predictions. TbNUP140 consists almost entirely of FG repeats of the PxFG type, apart from a small N-terminal stretch with a coiled-coil structure that is predicted with low confidence. NUP149 is not FG rich, with only a few FG motifs of the SxFG and FxFG types, but contains 6 zinc fingers interspersed with coils and potentially a small coiled-coil region at the C-terminus [30].

Fig 5. The T. brucei NUP76 complex is only partially conserved. (A) Schematics of the cytoplasmic filament complex from yeast and human (modified from [16], not to scale). The proteins of the trypanosome NUP76 complex are shown for comparison (left). Note that trypanosomes do have orthologues of NUP145N and Gle2, but it is not known whether these directly interact with NUP76. (B) pAE and pLDDT plots of trypanosomatid-optimised AlphaFold2 models of NUP76, NUP140, and NUP149. Each protein is also shown schematically with all predicted domains and, for NUP140 and NUP149, with positions and types of FG repeats. (C) Models of trypanosomatid-optimised AlphaFold2 predictions of NUP76, NUP140, and NUP149.
Structured parts are coloured, disordered regions are shown in grey. NUP, nucleoporin. https://doi.org/10.1371/journal.pbio.3003024.g005

To investigate whether the trypanosome NUP76 complex is involved in mRNA export, we depleted the protein with an auxin-inducible degron system. Both alleles of the NUP76 gene were C-terminally fused to the OsAID-3xHA sequence, in a cell line that expressed the necessary components for the auxin degron system; the cell line was confirmed by diagnostic PCR (S11 Fig). Upon induction with the auxin derivative 5-Ph-IAA, the NUP76::OsAID-3xHA protein was depleted within 2 h (Fig 6A), followed by growth arrest (Fig 6B) and accumulation of poly(A) signal in the nucleus that was saturated 4 h post induction (Figs 6C and S12–S14). This phenotype is similar to the one observed upon Nup82 depletion in yeast [33,34], suggesting that NUP76 is the functional orthologue to yeast Nup82 with a crucial role in mRNA export. In order to limit the possibility that the observed blockade of mRNA export is an indirect effect, i.e., the result of a disrupted pore architecture, we expressed a range of NUPs as N- or C-terminal eYFP fusions in the NUP76 auxin degron cell line to test whether their localisation to the pore is dependent on NUP76 (Figs 6D, 6E and S15). The localisation of the inner ring NUP96 was not affected by NUP76 depletion, suggesting that the overall pore structure remains intact (Fig 6D). Of the (putative) NUP76-associated proteins, only the pore localisation of NUP140 was clearly abrogated upon NUP76 depletion, while NUP149 and Gle2 still localised to the pore (serving as additional controls for pore integrity not being affected). Note that a slightly diminished pore localisation was observed for all 4 proteins, possibly caused by the disrupted mRNA export and general loss in fitness rather than a specific impact on nuclear pore architecture.
Thus, NUP140 localisation to the pore is fully dependent on NUP76, while NUP149 and Gle2 appear to be anchored independently of NUP76.

Fig 6. Depletion of TbNUP76 causes nuclear poly(A) accumulation and loss of NUP140 pore localisation. NUP76 protein was depleted using a degron system based on induction with the auxin derivative 5-Ph-IAA. Both alleles of NUP76 were replaced by NUP76 fused to OsAID-3xHA at the C-terminus. (A) The depletion of the NUP76 protein at 2, 4, and 24 h of induction was monitored on a western blot using anti-HA to detect NUP76 and anti-PFR as loading control. Wild-type (WT) cells served as negative controls. Data of one representative clonal cell line are shown. (B) Growth was monitored over 5 days following NUP76 depletion. Data of 3 independent clonal cell lines are shown. Raw data can be found in S3 Table. (C) In situ hybridisation: cells were probed with oligo dT to monitor mRNA localisation. The DNA is labelled with DAPI. One representative cell is shown for untreated cells and cells after 2 and 4 h of NUP76 depletion (method: sum slices of 75 images recorded at 140 nm distance). Fluorescent profiles and images with more cells are shown in S12–S14 Figs. Representative data of one out of 3 clones are shown. (D, E) An N-terminal eYFP fusion of NUP96 and C-terminal eYFP fusions of NUP140, NUP149, and Gle2 were expressed from endogenous loci in the NUP76 degron cell line. The eYFP fluorescence of 7 randomly selected nuclei is shown before and after induction of NUP76 depletion. Additional nuclei are shown in S15 Fig. NUP, nucleoporin. https://doi.org/10.1371/journal.pbio.3003024.g006

In conclusion, the cytoplasmic site-localised NUP76 complex of trypanosomes, consisting of NUP76, NUP140, and NUP149, is distinct from the cytoplasmic mRNA remodelling complex of yeast and human.
While NUP76 is the likely functional homologue of Nup82/NUP88 from yeast/human, NUP140 and NUP149 are trypanosome-unique with no sequence or structural homology to cytoplasmic site (filament) proteins from opisthokonts. The absence of a Nup159/NUP214 (yeast/human) orthologue in trypanosomes correlates with the absence of orthologues of its interaction partners Dbp5/DDX19 and Gle1/GLE1 (yeast/human), suggestive of significant differences in the mRNA remodelling mechanisms in trypanosomes.

Expansion microscopy identifies novel asymmetric components of the trypanosome pore

We revisited the trypanosomatid nuclear pore architecture with expansion microscopy. To this end, we expressed target proteins fused to a small peptide epitope tag (3xHA or 4xTy1) to allow immunofluorescence detection via anti-HA or anti-Ty1. In some experiments, we expressed the target protein fused to the biotin ligase TurboID [62], followed by the detection of the biotinylation of the bait and proximal proteins with fluorescent streptavidin (= streptavidin imaging). We had previously shown that labelling with streptavidin increases the signal with no obvious loss in resolution, which is essential since expansion microscopy causes a massive reduction in antigen density [53]. Even more importantly, streptavidin readily labels proteins within phase-separated areas, such as the nuclear pore central channel, that we found largely inaccessible to antibodies [53]. Since TurboID will not only auto-biotinylate the bait but also biotinylate adjacent proteins, there is the possibility that the streptavidin signal may not reflect the true localisation of the bait. Hence, throughout this work, we have confirmed all major findings derived from streptavidin labelling with orthogonal methods, such as immunofluorescence, mass spectrometry, and/or reciprocal labelling. All fusion proteins were expressed from the endogenous loci to avoid major changes in gene expression.
First, we used protein retention expansion microscopy (proExM), a method that expands cells after protein labelling [63]. We experimentally determined the expansion factor as 3.6 and confirmed the isotropic expansion of the nucleus (S1 Fig). We initially concentrated on the proteins of the NUP76 complex (NUP76, NUP140, and NUP149) as these were previously shown to co-precipitate with the trypanosome mRNA export factor MEX67 [30]. These NUPs were fused to either N- or C-terminal peptide tags (NUP76::3xHA, NUP140::4xTy1, 3xHA::NUP149). In the same cell lines, we co-expressed the nuclear basket protein NUP110 with an N-terminal fusion to a different epitope tag (4xTy1 or 3xHA). Upon dual labelling with anti-Ty1 and anti-HA, we carried out expansion and imaging. All 4 proteins resolved as single dots located at the nuclear periphery. The signals from the NUP76-complex proteins were in all cases clearly separated from the NUP110 signal towards the cytoplasmic site of the pore (Figs 1A and S2). Notably, for every dot signal originating from the NUP76 complex we observed a corresponding NUP110 dot, indicating that trypanosomes, unlike yeast [26], do not have basket-less pores. The median, expansion-factor corrected distance to NUP110 was at least 120 nm for all 3 proteins (129±17 nm for NUP76, 120±22 nm for NUP140, and 137±18 nm for NUP149 with n > 100). The large distance of the NUP76 complex proteins to the basket protein NUP110 and the absence of “double-dots” for the NUP76 complex strongly indicate asymmetric localisation of the NUP76 complex exclusively at the cytoplasmic site of the pore. The data disagree with previous observations derived from immuno-electron microscopy that place NUP76 symmetrically at both outer rings [30].

Fig 1. Expansion microscopy identifies novel asymmetric pore proteins.
(A) proExM of cell lines co-expressing epitope-tagged versions of NUP76, NUP140, or NUP149 in combination with NUP110. Images were deconvolved (20 iterations for NUP149 and NUP76; 60 iterations for NUP140) and single planes of the nuclei are shown. Further images are shown in S2 Fig. (B, C) UExM of cells expressing proteins fused to TurboID or 3xHA, as indicated. Labelling was done with fluorescent streptavidin and with anti-HA. Images were deconvolved (20, 20, 40 iterations for NUP89/NUP96, NUP89/NUP64, and NUP98/NUP64, respectively). A single plane image of an entire nucleus is shown on the left and 3 enlarged regions from the same or another nucleus are shown on the right (Cy = cytoplasm, Nu = nucleus). For NUP89/NUP96, yellow arrows point to pores that are in a suitable focal plane to see the 2 NUP89 dots sandwiching the NUP96 dot. (D) UExM of lines co-expressing TurboID-tagged versions of NUP76 or NUP149 with NUP64-3xHA. Labelling was done with fluorescent streptavidin and anti-HA. Images were deconvolved with 60 and 20 iterations for NUP76/NUP64 and NUP149/NUP64, respectively. A single plane image of one nucleus is shown. proExM, protein retention expansion microscopy; UExM, ultrastructural expansion microscopy. https://doi.org/10.1371/journal.pbio.3003024.g001

We were concerned that the observed sole cytoplasmic localisation of the NUP76 complex proteins is a technical artifact caused by (i) an insufficient resolution of the proExM method; (ii) a non-isotropic expansion across the nuclear membrane; or (iii) reduced accessibilities of antibodies to the nuclear ring in comparison to the cytoplasmic ring. To investigate, we applied ultrastructural expansion microscopy (UExM) [64], which offers a higher resolution because the antibody labelling is applied after the expansion and the linkage error (distance between the fluorophore of the secondary antibody and the target protein) is therefore not expanded.
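The expansion-factor correction of measured distances and the linkage-error argument above can be made concrete with a small sketch. The expansion factors are the ones reported in this section (3.6 for proExM, 4.2 for UExM); the measured distances and the 15 nm linkage error are hypothetical values chosen for illustration, and the function names are not taken from the authors' analysis.

```python
from statistics import median

PROEXM_FACTOR = 3.6  # proExM expansion factor reported in this section
UEXM_FACTOR = 4.2    # UExM expansion factor reported in this section


def to_pre_expansion_nm(measured_nm, expansion_factor):
    """Rescale a distance measured in the expanded sample to biological scale."""
    if expansion_factor <= 0:
        raise ValueError("expansion factor must be positive")
    return measured_nm / expansion_factor


def effective_linkage_error_nm(linkage_nm, expansion_factor, labelled_before_expansion):
    """Linkage error expressed at the pre-expansion (biological) scale.

    Labelling before expansion (proExM): the offset between fluorophore and
    target is expanded with the sample, so it is unchanged after correction.
    Labelling after expansion (UExM): the offset is not expanded, so it
    shrinks by the expansion factor after correction.
    """
    if labelled_before_expansion:
        return linkage_nm
    return to_pre_expansion_nm(linkage_nm, expansion_factor)


# Hypothetical post-expansion dot-to-dot distances in nm (invented values).
measured = [430.0, 465.0, 470.0, 500.0, 455.0]
corrected = [to_pre_expansion_nm(d, PROEXM_FACTOR) for d in measured]
median_corrected = median(corrected)

# Hypothetical linkage error of 15 nm for an antibody sandwich (assumption).
proexm_error = effective_linkage_error_nm(15.0, PROEXM_FACTOR, True)
uexm_error = effective_linkage_error_nm(15.0, UEXM_FACTOR, False)
print(f"median corrected distance: {median_corrected:.1f} nm")
print(f"linkage error: {proexm_error:.1f} nm (proExM) vs {uexm_error:.1f} nm (UExM)")
```

Under these assumptions, the same physical label contributes several-fold less positional uncertainty when it is applied after expansion, which is the reason given in the text for the higher resolution of UExM.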
UExM has been successfully used in trypanosomes [53,65–67], and we achieved an expansion factor of 4.2-fold with isotropic expansion of the nucleus (S1 Fig). To prove that UExM provides the resolution to resolve the nuclear outer ring, inner ring, and cytoplasmic outer ring, we first imaged NUPs that are conserved across eukaryotes. As an outer ring marker, we chose NUP89, the trypanosome orthologue to the outer ring Y-complex component of yeast Nup84/85 (NUP107/75 in human) [30]. As an inner ring marker, we selected NUP96, the conserved trypanosome orthologue to S. cerevisiae Nic96. We co-expressed NUP89::TurboID with NUP96::3xHA. UExM with fluorescent streptavidin and anti-HA readily resolved NUP89 as double dots at the nuclear periphery, sandwiching the single dot signal of NUP96 (Fig 1B), proving that the resolution of the method is sufficient to distinguish these different subregions of the pore. However, as the NUP96 signal was weak, we searched for a better inner ring marker. We tested NUP64, expressed as C-terminal 3xHA fusion, a trypanosome-unique FG-repeat NUP previously identified as a multi-complex NUP localised mostly to the centre of the pore [30]. To our surprise, the resulting single NUP64 dot signal was not sandwiched by the 2 outer ring dots of NUP89 but instead co-localised solely with the NUP89 dot at the nuclear site of the pore (Fig 1C). Likewise, the streptavidin signal of a C-terminal TurboID fusion of TbNUP98, known to form a complex with NUP64 [30], resolved as single dots that colocalised exclusively with the NUP64::3xHA dots at the nuclear site (Fig 1C). The data indicate an asymmetric, exclusive nuclear site localisation of NUP64 and NUP98 (Fig 1C). Having confirmed that UExM has the resolution to distinguish proteins located at the cytoplasmic site outer ring from proteins located at the nuclear site outer ring, we reassessed the localisation of the NUP76 complex. 
We confirmed the sole cytoplasmic localisation of the NUP76 complex by co-staining fusions of this complex to either 3xHA (S3 Fig) or TurboID (Fig 1D) with C-terminal 4xTy1 or 3xHA fusions of our newly identified nuclear site marker NUP64. In summary, we discovered 5 asymmetric proteins of the trypanosome nuclear pore complex, previously assumed to be symmetrically distributed: NUP76/NUP140/NUP149 at the cytoplasmic site and NUP64/NUP98 at the nuclear site. With the exception of NUP76, which is the structural orthologue to yeast Nup82 and human NUP88 [30], all novel asymmetric NUPs are trypanosome specific.
A proximity map of the trypanosome nuclear pore
The novel availability of asymmetric NUPs prompted us to use mass spectrometry data from TurboID proximity labelling experiments to generate a proximity map of the entire trypanosome nuclear pore. We used 7 available LC-MS/MS data sets from previous streptavidin-affinity purifications, namely N- and C-terminal TurboID fusions of NUP110 (basket), NUP96 (inner ring), and NUP76 (cytoplasmic outer ring) [53] and NUP158 (outer ring) with C-terminal TurboID fusion [55]. In addition, we generated new LC-MS/MS data sets for NUP98, which we had identified at the nuclear site of the pore by expansion microscopy, and for NUP75, which was previously identified to be in a complex with NUP98 and NUP64 [30]. Of all the proteins that were labelled by these baits, we initially concentrated on proteins that were previously identified as NUPs, based on predicted structural similarities [29] and/or affinity purification [30]. For each data set, we colour-coded the enrichment of the NUP proteins based on t test difference to a wild-type control. Then, the NUPs were sorted by hierarchical clustering applying a Pearson distance method (Fig 2A).
For each NUP, we included the pLDDT plots [39] to indicate confidence of the predicted structure, which in most cases correlates with structured (high confidence) and unstructured (low confidence, mostly FG-repeats) regions. The majority of the NUPs separated into 3 clearly distinct main clusters. The first cluster contained the lamina protein NUP2 [68] and NUP64/NUP98 that localised exclusively to the nuclear site of the pore by expansion microscopy (Fig 1C). The second cluster largely consisted of NUPs previously identified as inner ring NUPs [30] while the third was dominated by NUPs previously assigned to the outer ring [30]. Five NUPs were manually added to the clusters using positional information from [30], because of poor labelling by only 2 baits or fewer (NUP62, NUP119, NUP110, NUP92) or labelling by all baits (NUP132). For 6 proteins, the labelling was too weak (NUP152, SEC13A, SEC13B, NUP41, NUP48) or too diverse (GLE2) to confidently assign the proteins to a certain region of the pore.
Fig 2. A proximity map of the trypanosomatid nuclear pore. (A) Mass spectrometry data (t test difference values) from proximity labelling experiments were used to create a heat map of the trypanosome nuclear pore. A range of N- or C-terminally tagged TurboID fusions served as baits and the labelling (proximity) of most nuclear pore proteins is shown as a tree. Some NUPs were manually added to the tree using data of [30]. pLDDT plots from Trypanosomatid-optimised AlphaFold2 models [39] are shown on the right. Details on the mass spectrometry data can be found in S1 Table. (B) The model of the trypanosome nuclear pore changes, with the discovery of 5 novel asymmetrically localised proteins. NUP, nucleoporin. https://doi.org/10.1371/journal.pbio.3003024.g002
For the vast majority of NUPs, the proximity map confirmed the previous assignments of the NUPs from affinity capture experiments [30].
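The sorting step (hierarchical clustering with a Pearson distance) can be sketched in plain Python. This is a minimal illustration, not the authors' pipeline: the average-linkage rule, the function names, and the toy profiles (hypothetical proteins P1 to P4 with made-up enrichment values across 3 baits) are all assumptions.

```python
from itertools import combinations
from statistics import mean


def pearson_distance(x, y):
    """Return 1 - Pearson correlation of two equally long, non-constant profiles."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return 1.0 - cov / (var_x * var_y) ** 0.5


def average_linkage_order(profiles):
    """Agglomerative clustering (average linkage, assumed here).

    profiles: dict mapping protein name -> list of enrichment values, one per bait.
    Returns the protein names in the order in which the clusters were merged.
    """
    clusters = [[name] for name in profiles]

    def cluster_distance(c1, c2):
        # Average pairwise Pearson distance between members of two clusters.
        return mean(pearson_distance(profiles[a], profiles[b]) for a in c1 for b in c2)

    while len(clusters) > 1:
        # Find and merge the two closest clusters.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: cluster_distance(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0]


# Hypothetical enrichment values (rows: proteins, columns: 3 baits).
toy_profiles = {
    "P1": [5, 1, 0],
    "P2": [6, 1, 1],
    "P3": [0, 1, 5],
    "P4": [1, 0, 6],
}
order = average_linkage_order(toy_profiles)
print(order)
```

Because the Pearson distance compares the shape of a labelling profile rather than its magnitude, proteins labelled by the same baits group together even when their absolute enrichment differs.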
Only 2 NUPs exhibited ambiguous placement in the proximity map (bold in Fig 2A). The outer ring NUP132 is labelled strongly by all 8 baits, including strong labelling by the basket proteins NUP110 and NUP92, suggesting extensions of NUP132 towards the basket region. Further, the presumed inner ring NUP53a is also labelled by all baits: the labelling by the inner ring bait NUP96 is the strongest, but there is also strong labelling by the cytoplasmic-specific NUP76 and also by nuclear site-specific NUPs, indicating that NUP53a may be at the inner ring but reaching out to the outer rings. NUP98 and NUP64 unequivocally grouped with the basket/nuclear site (basket and inner ring) and were labelled by NUP110 and NUP96 baits, but not by the cytoplasmic NUP76, in line with our proExM and UExM data (compare Fig 1B and 1C). To our surprise, NUP75, which shares 46% sequence identity with NUP64 and associates with both NUP98 and NUP64 [30], was placed to the inner ring and was not labelled by NUP110. Moreover, the outer ring protein NUP158 strongly labelled NUP64 and slightly less NUP98, but not NUP75, further supporting the absence of NUP75 from the outer rings [53]. When NUP75 and NUP98 were used as baits, both showed the strongest labelling with each other and with NUP64 (Fig 2A), consistent with these 3 proteins forming a complex, as previously suggested [30]. Moreover, as expected, neither protein labelled the NUP76 complex, which, incidentally, is orthogonal evidence for the NUP76 complex being cytoplasmic. Interestingly, neither protein labelled NUP110, which was expected for NUP75, but not for NUP98, which is itself labelled by NUP110. Perhaps, the C-terminus of NUP98 is distant to NUP110, while the N-terminus is close. Our data suggest a model of an asymmetric NUP98/64/75 complex reaching from the nuclear outer ring to the inner ring, with NUP98 and NUP64 being located at the outer nuclear ring and NUP75 at the inner ring. 
Data from previous affinity isolation experiments with NUP98, NUP64, and NUP75 show marked differences between the interactomes of NUP98/64 and NUP75, including the exclusive absence of NUP110 from the NUP75 interactome, in full agreement with our data [30]. The outer-ring cluster is divided into 2 subclusters. Proteins of both subclusters are labelled by the inner ring bait NUP96 and by the cytoplasmic-site-specific NUP76. Proteins of the first subcluster (NUP158, NUP53a, and NUP109) are additionally labelled by the nuclear site-specific proteins NUP98 and NUP75 and by NUP158, while proteins of the second subcluster are not (with the one exception of NUP149, which is labelled by NUP158). We believe that this clustering reflects differences in proximity between these 2 protein groups within the outer rings. We can exclude the interpretation that the clustering of NUP89 and NUP82 with the cytoplasmic site-specific NUP76/NUP140/NUP149 proteins means that NUP89 and NUP82 are cytoplasmic-site specific too: NUP89 is present in both outer rings (Fig 1B and 1C) and both NUP89 and NUP82 are weakly labelled by the nucleoplasmic-specific NUP75 and NUP98. In summary, the proximity map accurately predicts the localisations for the vast majority of NUPs. Importantly, it offers orthogonal (tag-independent) validation of the asymmetric localisation of NUP98 and NUP64 to the nuclear site of the pore and of the NUP76 complex proteins to the cytoplasmic site of the pore, confirming the data of the expansion microscopy. A new model of the pore, highlighting the asymmetric components, is shown in Fig 2B. Thus, our proximity map has the potential to predict localisations of proteins with sub-pore size resolution, which prompted us to look at all 46 proteins that have nuclear pore localisation according to TrypTag [35] but are not annotated as NUPs.
Mapping the Ran-based mRNA export system to the pore
First, we concentrated on all nuclear pore-localised proteins that are involved in mRNA export: MEX67 [69], Mtr2, MEX67b [70] and, as postulated [30,71], Ran, RanGAP, the putative RanGDP importer NTF2 and 2 Ran-binding proteins, RanBP1 and RanBPL [72]. The proximity map places RanGAP and RanBP1 at the cytoplasmic site of the pore, while RanBPL localisation is predicted at the nucleoplasmic site (Fig 3A). For the transporters MEX67, MEX67b and Ran, the labelling was less confined to a specific site. For Mtr2 and NTF2, we obtained no labelling, likely due to their small size, which is problematic in BioID [55]. As a control, we included data of reciprocal TurboID experiments with MEX67 and Ran as baits (Figs 3A and S4 and S2 Table): both proteins label proteins at both sides of the pore, consistent with shuttling.
Fig 3. Mapping the trypanosome mRNA export system to the nuclear pore. (A) Mass spectrometry data (t test difference values) from proximity labelling experiments were used to map the Ran-based mRNA export system to the trypanosome pore. Details on the mass spectrometry data can be found in S1 Table. (B, C) Ultra-expansion microscopy. MEX67, Ran, RanBP1, RanGAP, and RanBPL were expressed as C-terminal (MEX67, Ran, RanBP1, RanGAP) or N-terminal (RanBPL) fusions to TurboID in a cell line co-expressing the nuclear-site marker NUP64::3xHA, all from the endogenous locus. Cells were expanded and the proteins detected with streptavidin and anti-HA. Images of single nuclei are shown (single plane of deconvolved Z-stacks with 10 iterations for MEX67, Ran, RanBP1, and RanBPL and 20 iterations for RanGAP). For MEX67, the streptavidin signal was weak and 3 representative nuclei of an overexpression cell line are shown (B, right). For the other proteins, enlarged sections of the nuclear envelope of the same or other nuclei are shown (C, right).
(D) Proximity labelling of the nuclear pore by MEX67 and Ran. MEX67 and Ran were expressed as C-terminal TurboID fusions from the endogenous loci and biotinylated peptides were analysed by mass spectrometry. The labelling of NUPs by MEX67 (left) and Ran (right) is shown coloured based on their t test difference values in comparison to wild-type cells. Asymmetric NUPs are marked with a red arrow. (E) Schematic summary of our localisation data of the Ran-based mRNA export system. NUP, nucleoporin. https://doi.org/10.1371/journal.pbio.3003024.g003 Next, we determined the localisation of MEX67, Ran, RanGAP, RanBP1, and RanBPL by UExM, expressing TurboID fusions in a cell line that expressed NUP64::3xHA as a nucleoplasmic-site marker (Fig 3B and 3C). RanGAP and RanBP1 resolved as single dots, well distanced from the NUP64 dots towards the cytoplasmic site, while the RanBPL signals overlapped with the NUP64 signals at the nuclear site, in full agreement with the proximity map. The biotinylation signal of the 2 proteins with suspected shuttling activity, Ran and MEX67, resolved as large cytoplasmic dots and smaller nuclear dots, connected by a string-like signal reaching through the pore. For MEX67, we observed that these bone-shaped signals were more pronounced when the gene was 8-fold overexpressed from an ectopic locus (Fig 3B, images on the right) which only slightly impaired growth (S5 Fig). For Ran, MEX67, and RanBPL, we observed an additional signal at the nucleolus, which is defined by the reduction in DAPI stain (S6A Fig), and a minor signal in the nucleoplasm. For RanBPL, the nucleolar signal was stronger than the signal at the pores, while for Ran and MEX67 (at endogenous expression levels) the nuclear pore signal was dominant. We attempted to confirm the nucleolar localisation by direct immunofluorescence instead of streptavidin imaging. 
The nucleolus is challenging to label with antibodies [53], but for MEX67::4xTy1 we could get a weak, but distinct nucleolar antibody signal (S6B Fig). The functional implications of the nucleolar localisation of Ran, MEX67, and RanBPL are not fully understood in trypanosomes, but the localisation is not unexpected, as in opisthokonts Ran and Mex67 participate in pre-ribosome transport [73]. For the shuttling proteins Ran and MEX67, we confirmed the streptavidin-based imaging data by the LC-MS/MS data upon streptavidin enrichment (Figs 3D and S4 and S2 Table). Both MEX67 and Ran strongly labelled FG-NUPs lining the inner pore channel, consistent with their movement across the pore. The NUPs with the strongest labelling were the asymmetric NUPs on both sides of the pore: NUP149 at the cytoplasmic site and NUP98, NUP64 and the lamin-like protein NUP2 at the nuclear site (red arrows in Fig 3D). This suggests that both proteins, MEX67 and Ran, have binding sites at both sides of the trypanosome pore, analogous to human Ran [74–77]. There was weak labelling of structured NUPs, in agreement with the rather poor labelling of MEX67 and Ran by NUP76, NUP96, NUP110, and NUP158 in the heat map (Fig 3A). Preferential labelling of asymmetric FG NUPs over structured NUPs has also been shown for human karyopherins tagged with the biotin ligase BirA* [78]. In conclusion, our proximity map predicted the localisation of all non-shuttling Ran system components confidently and in agreement with streptavidin imaging in UExM. RanGAP is at the (expected) cytoplasmic site, together with RanBP1. RanBPL had not been previously mapped, but is unequivocally placed at the nuclear site, consistent with its binding preference for RanGTP [72]. The proximity map was unable to categorise the shuttling proteins MEX67 and Ran, likely because non-structured FG-NUPs are poorly represented in our bait repertoire.
Direct BioID combined with orthogonal assessment through expansion microscopy was required to confidently place these putative export factors. The derived localisations are summarised in Fig 3E and are consistent with a mechanistically divergent Ran-dependent mRNA export in trypanosomes.
Predicting the position of unknown proteins within the pore
To predict the localisation of the remaining 38 nuclear pore-localised proteins more accurately, we added the proximity labelling data of MEX67 and Ran to our proximity map. Fifteen of the 38 nuclear pore-localised proteins are karyopherins (S7 Fig), five of which have not been previously classified as karyopherins but have unequivocal structural homology to importin and exportin-like folds predicted by Foldseek (S7B Fig); these include a putative orthologue to the importin Hikeshi (Tb927.1.1400) that is specialised in the import of Hsp70-family proteins [79] (S7C Fig). Karyopherins were mostly not or only poorly labelled in our proximity map (S7A Fig). The likely reason is their preferred interaction with FG-NUPs rather than structured NUPs, similar to what we observed for MEX67 and Ran (compare Fig 3A). Exceptions are XPO1 (exportin 1), known to be involved in the transport of both mRNAs and tRNAs [80], which is labelled by all bait proteins, and 2 XPO-like proteins labelled by a subfraction of the baits. An additional 5 proteins with nuclear pore localisation were not labelled by any of the bait proteins (S8 Fig). For three of these proteins, Tb927.11.1000, Tb927.10.12200, and Tb927.10.8160, the reason could be failed detection due to small size [55]. None of these small proteins has homologues outside of trypanosomatids and their function is unknown. Tb927.10.8160 has the strongest nuclear pore localisation ([35], S8B Fig) and high-throughput phenotyping indicates an essential function [81]. The 2 larger proteins (Tb927.1.3230 and Tb927.9.12700) do not have very prominent nuclear pore localisation ([35], S8B Fig).
For Tb927.9.12700, biochemical data indicate glycosomal localisation [82] and Tb927.1.3230 could be the trypanosomatid ortholog of the ribosome biogenesis factor Rix7 [83]. Their lack of labelling might thus be due to poor or absent nuclear pore localisation. The remaining 18 proteins were labelled by at least one of the bait proteins (Fig 4A). Strikingly, none were labelled by the cytoplasmic-site marker protein NUP76, suggesting absence of further proteins with exclusive cytoplasmic localisation, other than the NUP76 complex, RanGAP, and RanBP1. Moreover, not a single protein was exclusively labelled by the outer ring protein NUP158 [55], with the one exception of Tb927.11.13080. The (near) absence of combined labelling by NUP76 and NUP158 suggests that the outer ring proteome may be complete. Instead, these 18 proteins were either labelled by baits of the nuclear site or inner ring or both. We present the data as Pearson-distance clusters, with manual placements of proteins with insufficient labelling. Four proteins are not included in the clustering analysis: for one (SENP) the labelling pattern was too unique and 3 proteins were only labelled by MEX67 and/or Ran.
Fig 4. Characterisation of unknown nuclear pore proteins and their localisation over the nuclear pore complex. (A) Mass spectrometry data (t test difference values) from a range of proximity labelling experiments were screened for labelling of nuclear pore localised proteins that are not NUPs or karyopherins. All proteins that were labelled by at least one of the bait proteins are shown partially clustered using Pearson-correlation. Annotations are explained in the text. pLDDT plots from Trypanosomatid-optimised AlphaFold2 models [39] are shown on the right. Details on the mass spectrometry data can be found in S1 Table. (B–D) Structural alignments of AlphaFold2 models of the T.
brucei TREX-2 complex candidates Sac3 (B), Thp1 (C), and Sus1 (D) with PDB structures of the respective TREX-2 complex proteins from other organisms, using Foldseek [40]. The regions of the proteins that were used for the structural alignments are shown in the schematics in orange (T. brucei) and blue (other organisms). The root mean square deviation of atomic positions (RMSD) and the sequence identity are shown for the superimposed regions. NUP, nucleoporin; RMSD, root mean square deviation. https://doi.org/10.1371/journal.pbio.3003024.g004 The majority of these 18 proteins is unique to Kinetoplastida or even to Trypanosomatida and lack functional annotations. Only 2 proteins have readily identifiable homologues outside Kinetoplastida: Tb927.10.9020 has homology to the non-catalytic, substrate binding subunit of the tRNA methyltransferase Trm6/Gcd10, responsible for adenosine(58)-N(1) methylation, a modification present in many eukaryotic tRNAs [84]. The second protein, Tb927.9.2220, is a SUMO protease of the Ulp/SENP (ubiquitin-like protease/sentrin-specific protease) family with potential function in resolving stalled DNA replication forks [85]. The remaining 16 proteins contain 5 proteins with predicted basket or inner ring localisation that were previously identified as lamina-associated proteins (LAPs), based on their interactions with the lamina-like proteins NUP1 and NUP2 [32]. Two of these LAPs, LAP71 and LAP102, are basket specific in our map, as expected for lamina associated proteins. Two further LAPs, LAP73 and LAP59, have exclusive inner ring prediction. Of significant interest is basket/inner ring-predicted LAP173, which has a Sac3/GANP domain and was suggested to be the orthologue to Sac3 and sole representative of a potential trypanosome TREX-2 complex [32]. Association with MEX67 was observed by affinity purifications [30] and BioID ([55], Fig 4A). 
In fact, the Sac3/GANP domain of a LAP173 model predicted by a trypanosome-optimised AlphaFold2 [39,41] displays remarkable structural homology to the equivalent region of an experimentally resolved S. cerevisiae Sac3 structure (RMSD 1.12Å, [86]), despite poor sequence conservation (Fig 4B). Motivated by the presence of a putative Sac3, we used Foldseek [40] to search for structural homologues of the remaining TREX-2-complex components, using AlphaFold2 models as inputs [39,41]. We identified the Csn12-like domain containing protein Tb927.11.5560 as a putative Thp1 orthologue (Fig 4C), with structural homology to the human Tph1 homologue PCID2 (PDB entry 3T5X; TM score 0.82), while the primary sequence is, again, poorly conserved. Just like Sac3, our proximity map places this Thp1 candidate to both, nuclear site of the pore and inner ring. Moreover, we identified Tb927.7.5830 as a putative Sus1 orthologue with highest structural similarity to Sus1 of the yeast K. phaffii [87] (Fig 4D). The Sus1 candidate protein is not labelled by NUPs, presumably due to its small size. However, all 3 trypanosome TREX-2 complex candidates, including Sus1, are strongly labelled by MEX67, a prototypic Sac3 interactor in ophistokonts [88], supportive of a potential role in a trypanosome TREX-2 complex (Fig 4A). In conclusion, our extended proximity map granted mapping the majority of nuclear pore localised proteins to a subregion of the pore. We found no evidence for the existence of any further proteins asymmetrically distributed to the cytoplasmic-site indicating the entirety of the cytoplasmic site-specific proteome of the pore is the NUP76 complex, RanGAP, and RanBP1, plus shuttling proteins. Instead, we predict a diverse cohort of proteins with preferential localisation to the basket or nuclear site of the pore, including 3 putative TREX-2 complex proteins with proximity to MEX67, indicative of a conserved function. 
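The RMSD values reported for the structural alignments follow the standard definition, which can be sketched as below. This is a minimal illustration that assumes the two structures are already superimposed and that atoms are paired one to one; the alignment and superposition themselves (performed by Foldseek in the text) are not reproduced here, and the coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation (same unit as the input, e.g. Å) between
    two equally long lists of paired, already superimposed 3D coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    # Sum of squared distances between paired atoms, averaged, then rooted.
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


# Hypothetical paired C-alpha positions of two superimposed models.
model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
reference = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]
print(rmsd(model, reference))
```

A low RMSD over the superimposed region indicates similar backbone geometry, which is why it can reveal homology even when sequence identity is poor.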
The trypanosome NUP76 complex as a cytoplasmic mRNA remodelling hub We detected the NUP76 complex (NUP76, NUP140, NUP149) exclusively at the cytoplasmic site (Fig 1A and 1E) and a previous study has shown the interaction of this complex with MEX67 under high stringency conditions [30]. In combination, these data suggest that the NUP76 complex is the trypanosomes cytoplasmic mRNP binding hub that serves as remodelling platform. In opisthokonts, the cytoplasmic mRNP remodelling platform is based on the heterotrimeric complex composed of Nup82/Nup159/Nsp1 in yeast and NUP88/NUP214/NUP62 in human (Fig 5A) [15,16]. The 3 proteins are connected via a C-terminal parallel coiled-coil structure. In yeast, both Nup82 and Nup159 possess N-terminal β-propellers that provide direct binding platforms for Nup145 (which recruits Gle2) and the RNA helicase Dpb5 (recruiting Gle1); in human, the complex is built in the same way from the respective human homologues (Fig 5A). Yeast Nsp1 and Nup159 (NUP62 and NUP214 in human) possess FG repeat regions. T. brucei NUP76 has been previously suggested as Nup82/NUP88 (yeast/human) homologue [30]; indeed, the AlphaFold2 model of NUP76 [39,41] shows an analogous structural organisation with a β-propeller at the N-terminus, interrupted by a long, disordered coil, and a three-helical coiled-coil at the C-terminus (Figs 5B, 5C, and S9). Moreover, T. brucei NUP76 may share its β-propeller interactions with Nup82/NUP88: orthologues to both Nup145N/NUP98 (yeast/human) and Gle2/RAE1 (yeast/human) can be readily identified in trypanosomes [29,30]. However, the 2 remaining TbNUP76 complex components, TbNUP140 and TbNUP149, do not exhibit detectable structural homology to the NUP82/NUP88 partner proteins Nup159/NUP214 (yeast/human) and Nsp1/NUP62 (yeast/human) (Fig 5A–5C) as based on AlphaFold2 predictions. 
TbNUP140 consists almost entirely of FG repeats of the PxFG type, apart from a small N-terminal stretch with a coiled-coil structure that is predicted with low confidence. NUP149 is not FG rich, with only few FG motifs of the SxFG and of the FxFG type but contains 6 zinc fingers sparsed by coils and potentially a small coiled-coil region at the C-terminus [30]. Download: PPT PowerPoint slide PNG larger image TIFF original image Fig 5. The T. brucei NUP76 complex is only partially conserved. (A) Schematics of the cytoplasmic filament complex from yeast and human (modified from [16], not to scale. The proteins of the trypanosome NUP76 complex are shown for comparison (left). Note that trypanosomes do have orthologues to NUP145N and Gle2, but it is not known whether these directly interact with NUP76. (B) pAE and pLDDT plots of trypanosomatid-optimised AlphaFold2 models of NUP76, NUP140, and NUP149. Each protein is also shown schematically with all predicted domains and, for NUP140 and NUP149, with positions and types of FG repeats. (C) Models of trypanosomatid-optimised AlphaFold2 predictions of NUP76, NUP140, and NUP149. Structured parts are coloured, disordered regions are shown in grey. NUP, nucleoporin. https://doi.org/10.1371/journal.pbio.3003024.g005 To investigate whether the trypanosome NUP76 complex is involved in mRNA export, we depleted the protein with an auxin-inducible degron system. Both alleles of the NUP76 gene were C-terminally fused to the OsAID-3xHA sequence, in a cell line that expressed the necessary components for the auxin degron system; the cell line was confirmed by diagnostic PCR (S11 Fig). Upon induction with the auxin derivative 5-Ph-IAA, the NUP76::OsAID-3xHA protein was depleted within 2 h (Fig 6A), followed by growth arrest (Fig 6B) and accumulation of poly(A) signal in the nucleus that was saturated 4 h post induction (Figs 6C and S12–S14). 
This phenotype is similar to the one observed upon Nup82 depletion in yeast [33,34], suggesting that NUP76 is the functional orthologue to yeast Nup82 with a crucial role in mRNA export. In order to limit the possibility that the observed blockade of mRNA export is an indirect effect, i.e., the result of a disrupted pore architecture, we expressed a range of NUPs as N- or C-terminal eYFP fusions in the NUP76 auxin degron cell line to test whether their localisation to the pore is dependent on NUP76 (Figs 6D, 6E and S15). The localisation of the inner ring NUP96 was not affected by NUP76 depletion, suggesting that the overall pore structure remains intact (Fig 6D). Of the (putative) NUP76-associated proteins, only the pore localisation of NUP140 was clearly abrogated upon NUP76 depletion, while NUP149 and Gle2 still localised to the pore (serving as additional controls for pore integrity not being affected). Note that a slightly diminished pore localisation was observed for all 4 proteins, possibly caused by the disrupted mRNA export and general loss in fitness rather than a specific impact on nuclear pore architecture. Thus, NUP140 localisation to the pore is fully dependent on NUP76, while NUP149 and Gle2 appear to be anchored independent of NUP76. Download: PPT PowerPoint slide PNG larger image TIFF original image Fig 6. Depletion of TbNUP76 causes nuclear poly(A) accumulation and loss of NUP140 pore localisation. NUP76 protein was depleted using a degron system based on induction with the auxin derivative 5-Ph-IAA. Both alleles of NUP76 were replaced by NUP76 fused to OsAID-3xHA at the C-terminus. (A) The depletion of the NUP76 protein at 2, 4, and 24 h of induction was monitored on a western blot using anti-HA to detect NUP76 and anti-PFR as loading control. Wild-type (WT) cells served as negative controls. Data of one representative clonal cell line are shown. (B) Growth was monitored over 5 days following NUP76 depletion. 
Data of 3 independent clonal cell lines are shown. Raw data can be found in S3 Table. (C) In situ hybridisation: cells were probed with oligo dT to monitor mRNA localisation. The DNA is labelled with DAPI. One representative cell is shown for untreated cells and cells after 2 and 4 h of NUP76 depletion (method: sum slices of 75 images recorded at 140 nm distance). Fluorescent profiles and images with more cells are shown in S12–S14 Figs. Representative data of one out of 3 clones is shown. (D, E) An N-terminal eYFP fusion of NUP96 and C-terminal eYFP fusions of NUP140, Nup149, and Gle2 were expressed from endogenous loci in the NUP76 degron cell line. The eYFP fluorescence of 7 randomly selected nuclei is shown before and after induction of NUP76 depletion. Additional nuclei are shown in S15 Fig. NUP, nucleoporin. https://doi.org/10.1371/journal.pbio.3003024.g006 In conclusion, the cytoplasmic site-localised NUP76 complex of trypanosomes, consisting of NUP76, NUP140, and NUP149, is distinct from the cytoplasmic mRNA remodelling complex of yeast and human. While NUP76 is the likely functional homologue to Nup82/NUP88 from yeast/human, NUP140 and NUP149 are trypanosome-unique with no sequence or structural homology to cytoplasmic site (filament) proteins from opisthokonts. The absence of a Nup159/NUP214 (yeast/human) orthologue in trypanosomes correlates with the absence of orthologues to its interaction partners Dbp5/DDX19 and Gle1/GLE1 (yeast/human), suggestive of significant mechanistic differences on the mRNA remodelling mechanisms in trypanosomes. Discussion The compartmentalisation of hereditary information in the nucleus necessitated the invention of a gateway allowing mRNPs and a variety of other essential cargos to cross the nuclear envelope. 
The study of nuclear pores in evolutionary divergent eukaryotes, such as the ancient trypanosomes, is fundamental to understand the evolutionary origin of the nucleus and decipher the complexity of the nuclear pore as platform with multiple cargo routes. Our study contributes a roadmap of the trypanosome nuclear pore, reporting conserved and non-conserved features and devising a plethora of new leads for further exploration. This study is an update of the trypanosome nuclear pore composition and is based on the foundations set by earlier studies: In 2009 [29] had identified all trypanosomes NUPs using predicted structural similarities in pre-AlphaFold times. In 2016, Obado and colleagues have used a combination of high stringency precipitation of cryomilled cell material and electron microscopy with immunogold labelling to present the first detailed model of the trypanosome pore [30]. Finally, the genome-wide localisation database TrypTag has provided a comprehensive list of proteins with pore localisation [35]. We took advantage of all these available data to create a revised map of the pore, by combining expansion microscopy with a novel way to globally analyse proximity labelling data. Mostly, our map is in agreement with the earlier map. We did one major correction, which is the placement of the NUP76 complex and the NUP64/NUP98 complex to the cytoplasmic and nuclear site, respectively. The previous symmetric assignment of these complexes was solely based on electron-microscopy detection of immunogold labelled NUP-GFP fusions. This method may be error prone due to large distance of the gold particle to the target protein (GFP+antibody tandem tag) and, even more importantly, it is very difficult to accurately determine the centre of the pore, as membranes are poorly visible. 
In contrast, in expansion microscopy, the localisation is determined in relation to another pore marker protein and does not depend on membrane detection, and, at least in UExM, the localisation error caused by the antibodies and tag is much smaller because the labelling is done after the expansion. We do therefore believe that expansion microscopy reflects the localisation of pore proteins more accurately. Nevertheless, the conflicting results prompted us to confirm our findings with orthogonal methods and we chose a heat map, generated from proximity labelling mass spectrometry data of marker proteins at different positions within the pore. The heat map agreed with all expansion microscopy data, and, importantly, it does not rely on the proteins being modified with a tag, which could affect localisation. Both the heat map and the expansion microscopy have their pitfalls and can potentially create wrong results. However, the combination of the methods increases the confidence, in particular when further combined by targeting multiple subunits of one complex rather than just one, as we did for example for the NUP76 complex. NUP76: A partially conserved cytoplasmic site-specific complex with connections to mRNA export Of the NUPs, only the NUP76 complex (NUP76, NUP140, and NUP149) appears localised specifically to the cytoplasmic site. The NUP76 complex had been previously suggested to be part of the mRNA remodelling platform, as all 3 proteins co-isolated with Mex67 under high stringency conditions [30]. Consistent with this hypothesis, we now show exclusive cytoplasmic localisation of the NUP76 complex (Fig 1) as well as significant structural similarities of NUP76 with the scaffold mRNA remodellers of yeast and human, NUP82 and NUP88, respectively (Figs 5 and S9) and nuclear mRNA accumulation upon NUP76 depletion (Fig 6C). 
While NUP76 is a likely homologue to yeast NUP82 or human NUP88, the other proteins of the NUP76 complex, NUP140 and NUP149, appear unique to trypanosomes and share no similarity with the proteins that interact with yeast Nup82 or human NUP88. They possess no predictable structured elements, with the exception of NUP149, which possesses four zinc finger motifs. Notably, zinc fingers are also present in the human cytoplasmic-filament NUP358 and the nuclear-site localised NUP153, both absent from trypanosomes, where they engage in Ran binding [89,90]. However, the zinc fingers of TbNUP149, confidently predicted by AlphaFold2 and AlphaFold3 [91] as 3 β-hairpin strands with 4 cysteine side chains coordinating a zinc ion, appear to lack obvious sequence or structural homology to the zinc fingers of human NUP358 and NUP153 (S10 Fig), and whether they nevertheless promote Ran binding remains to be investigated. Notably, NUP149 is heavily biotinylated by Ran-TurboID, indicative of a possible interaction (Fig 3B). Even though the NUP76 complex in trypanosomes is different to the complexes from yeast and human, one similarity is worth mentioning: NUP76 depletion disrupts pore localisation of NUP140, but not of NUP149, just like NUP88 depletion in human disrupts pore localisation of NUP214, but not Nup62 [92]. A mechanistically divergent Ran-dependent mRNA export pathway in trypanosomes Apart from the NUP76 complex, we mapped RanGAP and RanBP1 to the cytoplasmic site of the nuclear pore. Their sole cytoplasmic localisation is suggestive of a conserved function of these proteins in triggering GTP hydrolysis of RanGTP and thus disassembly of exportin-cargo-RanGTP and importin-RanGTP complexes. Trypanosome RanGAP is phylogenetically more closely related to a RabGAP [93] but its proposed function as RanGAP [30] is further corroborated by our study. 
Pore-anchoring of RanGAP and the underlying mechanisms vary significantly across species, ranging from a SUMO-dependent interaction with the metazoan-specific RanBP2/NUP358 [75,94–96], through a WPP domain specific to plant RanGAP that interacts with a plant-specific nucleoporin [97,98], to no pore localisation at all in S. cerevisiae and S. pombe [99,100]. The mechanism of pore localisation of trypanosome RanGAP is thus likely unique and may involve interactions with likewise unique trypanosome-specific NUPs such as NUP140 and/or NUP149. Trypanosome RanBP1 consists of a disordered 30-amino acid stretch followed by a conserved RanBP domain (S16A Fig) and it remains unclear whether it has binding sites to the pore or simply concentrates in sites of cargo docking. The lack of a Dbp5 homologue and the association between MEX67 and Ran imply that trypanosomes employ the Ran-GTP gradient for mRNA export [30]. In fact, while Ran is predominantly nuclear localised in humans [101], in trypanosomes we find biotinylated Ran targets on both sites of the pore, possibly reflecting Ran engagement in cargo import and export [101]. This unique dual usage of the Ran pathway for both mRNA and protein cargo presents a formidable challenge for export/import ratio moderation. It is tempting to speculate that RanGTP is anchored at the basket site awaiting the MEX67-bound mRNP, which is then liberated on the cytoplasmic site driven by RanGAP-catalysed GTP hydrolysis. The lack of Dbp5 suggests that ATP-dependent mRNP disassembly at the cytoplasmic site of the pore is dispensable in trypanosomes, implying a fundamentally different mode of interaction between MEX67 and mRNA. Indeed, trypanosome MEX67 uniquely carries a CCCH-type zinc finger instead of the canonical RNA recognition motif containing RNA binding domain (RRM/RBD) found in opisthokonts [102,103], and trypanosomes lack mRNA adaptors (SR proteins) that would require stripping during cytoplasmic remodelling. 
Thus, the exported trypanosome RNP may exhibit lower stability and complexity, making a remodelling RNA helicase redundant. We have identified another potential component of the Ran system at the nuclear site of the pore: RanBPL has a Ran-binding domain which is very similar to the one of cytoplasmic-site localised RanBP1 but has a longer disordered N-terminal stretch (S16 Fig) and was previously characterised as a Ran-binding protein with a clear preference for RanGTP over RanGDP [72]. Thus, RanBPL may be the trypanosome functional counterpart to basket proteins Nup2/NUP50 (yeast/human), which also possess Ran-binding domains. While the multiple roles of Nup2/NUP50 remain largely elusive, one known function is the acceleration of protein import complex disassembly through stimulation of RanGEF/RCC1 activity [104], analogous to the function of RanBP1 as enhancer of RanGAP activity at the cytoplasmic site [105]. In trypanosomes, a RanGEF has not yet been identified and the absence of a detectable RanGEF/RCC1 domain among the proteins biotinylated by Ran indicates that a trypanosome RanGEF/RCC1 is either absent or divergent. Theoretically, RanBPL has the potential to compensate for the absence of the canonical RanGEF/RCC1: instead of directly catalysing the GDP to GTP exchange, RanBPL could act by stabilising RanGTP and preventing GTP hydrolysis, driving the equilibrium towards RanGTP. However, the observation that RanBPL silencing evokes only a mild growth phenotype (Brasseur and colleagues [72]) argues against this hypothesis. Altogether, our study fortifies the hypothesis of a Ran-dependent mRNA export pathway in trypanosomes and opens new avenues for exploration of the underlying molecular mechanisms. Of potential interest in this context is also the hypothetical protein Tb927.3.5370 that is strongly labelled by Ran, but not by any NUPs. 
Newly identified proteins with (predicted) localisation to the nuclear site of the pore While only 5 proteins are specific to the cytoplasmic site, the nuclear site of the pore appears more complex. Next to the previously described basket proteins NUP110 and NUP92 [29,30] and RanBPL (discussed above), we found exclusive nuclear site localisation for the FG nucleoporins NUP64 and NUP98, putative TREX-2-complex proteins and up to 8 further proteins (the number depends on how the threshold is defined) that are mostly unique to trypanosomes. The FG-NUPs NUP64 and NUP98 are unique to trypanosomes and in a complex with NUP75 [30] which appears to extend to the inner ring via NUP75. NUP64 and NUP98 were previously suggested to be the (functional) orthologues of S. cerevisiae Nup1 and Nup60, as they carry the same FG-type and engage in an interaction with the putative Sac3 homologue [30,32]. Our data now show the exclusive nuclear-side position of these NUPs, in full support of this model. The TREX-2 complex was believed to be absent from trypanosomes, with the possible exception of Sac3 [32] (Fig 4B). We have now identified proteins within the cohort of nuclear pore-localised proteins [35] that show structural similarity to Thp1 and Sus1 (Fig 4C and 4D). Moreover, the orthologues to Thp1 and Sac3 have a predicted localisation at the nuclear site of the pore. All 3 trypanosome candidate TREX-2 components now await experimental analysis to understand the mechanistic details of the trypanosome mRNA export platform. The 2 further TREX-2 components, Sem1 and Cdc31 [14], were not identified within the trypanosome nuclear pore-localised proteins. These are either absent from the trypanosome TREX-2 complex, or escaped identification, either due to poor structural conservation or because the proteins were not identified as pore-localised by TrypTag [35]. 
Six of the 8 further proteins with predicted localisation at the nuclear site of the pore are trypanosome-unique with no obvious homologies and further experiments are essential to uncover their functions. Two of the proteins have predicted functions: Tb927.10.9020 is the likely homologue to the non-catalytic subunit of the tRNA methyltransferase TRM6 and was predicted as a basket-specific nuclear pore protein in our heatmap, with minor labelling by the outer ring NUP158 and by MEX67 and Ran (Fig 4A). The protein exhibits strong nuclear pore localisation [35], which is in contrast to the nuclear localisation observed for S. cerevisiae and A. thaliana Gcd10/Trm6 [106,107]. While localisation to the nuclear pore and/or envelope is not unheard of for tRNA modifying enzymes [108,109], this finding requires further investigation, as, conversely, the putative T. brucei homolog of the corresponding catalytic subunit, TRM61/Gcd14 (Tb927.11.11660), localises to the nuclear lumen/nucleoplasm [35]. The other protein with predicted basket localisation, Tb927.9.2220, has homologies to a ubiquitin-like protease/sentrin-specific protease (Ulp/SENP) that may function in resolving stalled DNA replication forks [85]. Both Ulp1 of yeast and SENP2 of human have nuclear pore localisation [110,111] and the latter was localised to the nuclear site of the pore, consistent with our map [110]. TbSENP has a rather unique biotinylation pattern that did not cluster with the biotinylation pattern of any other nuclear pore protein: it is labelled by the basket-localised NUP110 and by NUPs of the inner and outer rings, but neither by the nuclear site-specific NUP98 and NUP75 nor by the cytoplasmic site-specific NUP76. The reason for this unusual labelling pattern remains unknown and requires further investigation. Proteins with inner-ring prediction Two non-NUP proteins have exclusive inner ring prediction: LAP59 and LAP73. 
LAP59 was previously co-isolated with the lamina-like proteins NUP1 and NUP2, is conserved across eukaryotes and the presence of 2 N-terminal transmembrane domains suggests it to be a pore membrane protein (POM) [32]. LAP73 has a divergent NUP35/Nup53 type RNA-binding domain [32] and, interestingly, the T. brucei orthologue to Nup53, TbNup65, is anchored to the nuclear envelope via a trans-membrane helix [30]. This raises the possibility of a nuclear envelope and thus inner ring localisation of LAP73 via binding to TbNUP65. However, immunoprecipitation assays failed to establish an inner ring association with LAP73 [30], albeit it is possible that the interaction is weak and thus exclusively detectable in BioID. Conclusions Our revisited map of the trypanosome nuclear pore conforms to the pattern of conservation at the core scaffold regions and diversity at the borders of the pore [31]. We discovered an asymmetric architecture, confidently placing the NUP76 complex exclusively to the cytoplasmic site and defining the sole localisation of the trypanosomatid-exclusive FG NUPs NUP64 and NUP98, at the basket site. Notably, this corrects the current view of a largely symmetric trypanosome nuclear pore and ultimately supports moderation of directional nucleocytoplasmic transport which is crucially dependent on asymmetric components at the nuclear pore borders in other systems. For the NUP76 complex, our data strongly indicates a crucial function as cytoplasmic mRNP remodelling hub, analogous to the Nup82/NUP88 complex in opisthokonts, while the presence of trypanosome-unique NUP140 and NUP149 implies significant mechanistic difference. Mapping of the export factors Mex67 and Ran elucidated further divergence, supporting a trypanosome-specific, Ran-dependent export system. Lastly, we present a comprehensive assignment of pore localised proteins to subregions of the nuclear pore that resulted in the identification of novel nuclear pore components, including 3 putative members of a trypanosome TREX-2 complex. Altogether, our approach delivers asymmetric and novel nuclear pore components, including positional information, which can now be interrogated for functional roles to explore trypanosome-specific adaptions of nuclear transport, export control, and mRNP remodelling. Supporting information S1 Fig. Establishment and validation of the expansion microscopy protocols. https://doi.org/10.1371/journal.pbio.3003024.s001 (PDF) S2 Fig. Additional proExM images of the NUP76 complex proteins. 
https://doi.org/10.1371/journal.pbio.3003024.s002 (PDF) S3 Fig. UExM images of NUP64-4xTy1 with NUP76 complex proteins tagged with 3xHA (labelled using antibodies). https://doi.org/10.1371/journal.pbio.3003024.s003 (PDF) S4 Fig. Statistical analysis of TurboID experiments. https://doi.org/10.1371/journal.pbio.3003024.s004 (PDF) S5 Fig. Inducible overexpression of MEX67 (growth curves and western blotting to check for expression levels). https://doi.org/10.1371/journal.pbio.3003024.s005 (PDF) S6 Fig. Nucleolus can be identified by reduction in DAPI stain. https://doi.org/10.1371/journal.pbio.3003024.s006 (PDF) S7 Fig. Mapping proteins with the proximity map: Karyopherins. https://doi.org/10.1371/journal.pbio.3003024.s007 (PDF) S8 Fig. Mapping proteins with the proximity map: unlabelled proteins. https://doi.org/10.1371/journal.pbio.3003024.s008 (PDF) S9 Fig. Structures of yeast NUP82, human NUP88 and predicted T. brucei NUP76. https://doi.org/10.1371/journal.pbio.3003024.s009 (PDF) S10 Fig. The NUP149 zinc fingers in comparison to the ones from human NUP153 and NUP358. https://doi.org/10.1371/journal.pbio.3003024.s010 (PDF) S11 Fig. Verification of NUP76 auxin degron cell lines by diagnostic PCR. https://doi.org/10.1371/journal.pbio.3003024.s011 (PDF) S12 Fig. Additional poly(A) FISH images and fluorescence profiles of NUP76 depleted cells. https://doi.org/10.1371/journal.pbio.3003024.s012 (PDF) S13 Fig. Additional poly(A) FISH images and fluorescence profiles of NUP76 depleted cells. https://doi.org/10.1371/journal.pbio.3003024.s013 (PDF) S14 Fig. Additional poly(A) FISH images and fluorescence profiles of NUP76 depleted cells. https://doi.org/10.1371/journal.pbio.3003024.s014 (PDF) S15 Fig. Effect of NUP76 depletion on NUP140, NUP149, Gle2, and NUP96 localisation: additional images. https://doi.org/10.1371/journal.pbio.3003024.s015 (PDF) S16 Fig. AlphaFold2 models of RanBP1 and RanBPL. https://doi.org/10.1371/journal.pbio.3003024.s016 (PDF) S1 Raw Images. 
Contains raw images of all blots. https://doi.org/10.1371/journal.pbio.3003024.s017 (PDF) S1 Table. Oligo sequences. https://doi.org/10.1371/journal.pbio.3003024.s018 (XLSX) S2 Table. Mass spectrometry data. https://doi.org/10.1371/journal.pbio.3003024.s019 (XLSX) S3 Table. Raw data for all graphs. https://doi.org/10.1371/journal.pbio.3003024.s020 (XLSX) S4 Table. Raw data of the FISH profiles (S12–S14 Figs). https://doi.org/10.1371/journal.pbio.3003024.s021 (XLSX) Acknowledgments We are grateful to the OMICS Proteomics BIOCEV core facility for excellent technical service. We would like to thank Eva Kowalinski (EMBL, Grenoble, France) for expert help with AlphaFold and ColabFold. We are grateful to Mark Carrington for providing the highly useful auxin degron system.

journal article

Open Access Collection

The placenta as a cradle, but not source, of blood?

Chen, Julie Y.;Loh, Kyle M.

2025 PLoS Biology

doi: 10.1371/journal.pbio.3003021 pmid: 39913629

Every day, the body produces billions of new blood and immune cells to replace those lost to daily attrition. Manufacturing of new blood and immune cells occurs at a vast scale and an unprecedented pace, and is fueled by blood-forming hematopoietic stem cells (HSCs) [1]. As such, a timeless question in developmental biology concerns how and whence HSCs arise in the embryo [2]. Namely, how is the foundation of the future blood and immune system laid down during early development? Within the embryo’s blood vessels, arterial endothelial cells give rise to HSCs [3], but an important issue remains unresolved. Do all arteries form HSCs? Or are only some arteries ordained with the responsibility of producing HSCs? The first major artery within the embryo, known as the dorsal aorta, likely produces HSCs [2]. However, can other arteries similarly give rise to blood? The quest to define the developmental origins of blood has recently turned to the placenta. Ensconced deep within the mother’s womb, human and mouse embryos cannot breathe, and they thus construct supporting structures—the umbilical cord and placenta—to interface with the mother to acquire oxygen and nutrients, and to discharge carbon dioxide [4]. The embryo is connected via the umbilical cord to the placenta, which in turn is physically adhered to the mother’s uterus. This intimately juxtaposes the respective blood vessels of the embryo and the mother, thereby enabling lifesaving gas and nutrient exchange to occur [4]. Interestingly, the placenta physically harbors HSCs [5,6]. This observation piqued the curiosity of many. Does the placenta merely act as a landing pad for itinerant HSCs that arose from other embryonic locations and traveled through the circulation to take up residence in the placenta? Or alternatively, do HSCs emerge directly from the placenta? 
The placenta is densely infiltrated by arteries that effectuate gas exchange between the embryo and the mother [4], and arteries are known to form HSCs [3]. Indeed, in 2008, Rhodes and colleagues proposed that the placenta might directly generate HSCs [7]. In a recent PLOS Biology study, Chen and Tober and colleagues employ genetic lineage tracing to rigorously address the longstanding question of whether the placenta forms HSCs [8] (Fig 1). The precursor cells of placental vasculature express the Hoxa13 gene [9]. The authors thus employ a Hoxa13-Cre genetic lineage tracing system to permanently label placental vasculature and all their progeny cells. This Hoxa13-Cre system allowed the authors to test if placental vasculature forms HSCs: when cells within the mouse embryo express Hoxa13, they also express Cre recombinase, which permanently labels cells with a fluorescent protein marker [9]. This genetic lineage tracing approach is elegant because placental endothelial cells and all their future progeny cells are permanently labeled: even if cells divide, migrate, or even turn off Hoxa13 expression, they will retain the fluorescent protein marker. Another strength of genetic lineage tracing is that it is non-invasive: living cells within the mouse embryo are genetically labeled, without the need to dissociate or culture the tissue, transplant cells, or physically inject cells with a dye. Fig 1. The placenta is unlikely to generate blood-forming hematopoietic stem cells (HSCs). In mouse embryos, Hoxa13-Cre lineage tracing labels virtually all endothelial cells within the mouse placenta. However, virtually no HSCs are labeled by Hoxa13-Cre. This suggests that the placenta is unlikely to form HSCs. Rather, these results suggest that HSCs are produced elsewhere in the developing embryo and subsequently migrate to the placenta, which serves as a landing pad for HSCs. 
https://doi.org/10.1371/journal.pbio.3003021.g001 The Hoxa13-Cre system labels essentially all placental endothelial cells with the fluorescent protein [9]; in striking contrast, virtually no HSCs within the placenta or other parts of the embryo (namely, the fetal liver or bone marrow) express the fluorescent protein [8] (Fig 1). Placental vasculature is thus unlikely to be a major source of HSCs. Instead, HSCs may predominately arise from alternative cellular sources, perhaps other arteries, including the dorsal aorta. Interestingly, a minute number of HSCs might be labeled by Hoxa13-Cre, which may reflect HSCs arising from umbilical cord endothelial cells [8]. Because Hoxa13-Cre labels both placenta and umbilical cord vasculature [9], it is still unclear whether the putative Hoxa13-Cre-labeled HSCs emanate from the placenta or umbilical cord. Future work may require placenta- versus umbilical cord-specific markers to distinguish the two possibilities. Additionally, the authors show that Runx1, a marker gene of endothelial cells transitioning into blood cells, is not expressed in endothelial cells from mouse or human placentas. This suggests that placental vasculature does not form blood cells in either mouse or human embryos. How can the present study be reconciled with earlier work that suggested a placental origin of HSCs [7]? The present study uses genetic lineage tracing to label placental endothelial cells within their native tissue and to test if they form HSCs in vivo. Meanwhile, a previous study [7] focused on the different, but related, question of what placental cells can do when placed in the admittedly artificial environment of cell culture. In the previous study [7], placentas from Ncx1-knockout mouse embryos—which apparently lack a circulation—were studied to ask if blood and immune cells are directly produced by the placenta, as opposed to arising from other locations and traversing the circulation to enter the placenta. 
The placenta was separated from either wild-type or Ncx1-knockout mouse embryos, dissociated into single cells, and then cultured, whereupon blood and immune cells arose in vitro [7]. However, this approach did not formally demonstrate that the placenta formed HSCs, which would require stringent proof that a single cell—namely, an HSC—produces multiple types of blood and immune cells in vivo [1]. Additionally, it is possible that prior to cell culture, the process of physically separating the placenta from the mouse embryo inadvertently introduced cells from other embryonic tissues, which generated the blood and immune cells observed in the culture. Overall, the present authors’ data support a model wherein HSCs are generated elsewhere within the embryo, and then migrate to the placenta, which may serve as a niche for HSCs. Why should this be the case? Following Theodosius Dobzhansky’s maxim that “nothing in biology makes sense except in the light of evolution,” the authors supply an interesting evolutionary observation [8]. HSCs arose in animal evolution prior to the placenta [10]. Perhaps, HSCs were first produced from an ancestral source (e.g., the dorsal aorta) in early animals, and subsequently, once the placenta was created in later animals, HSCs “learned” to migrate to the placenta and to take up temporary residence there [8]. If so, maybe the placenta serves as the cradle for, but not the fount of, blood.

journal article

Open Access Collection

Bacteriocin-like peptides encoded by a horizontally acquired island mediate Neisseria gonorrhoeae autolysis

Poncin, Katy;McKeand, Samantha A.;Lavender, Hayley;Kurzyp, Kacper;Harrison, Odile B.;Roberti, Annabell;Melia, Charlotte;Johnson, Errin;Maiden, Martin C. J.;Greaves, David R.;Exley, Rachel;Tang, Christoph M.

2025 PLoS Biology

doi: 10.1371/journal.pbio.3003001 pmid: 39908303

Introduction Neisseria gonorrhoeae, the gonococcus, causes the sexually transmitted disease, gonorrhoea, a major worldwide public health concern. This human-specific pathogen colonises the mucosal surfaces of the urogenital tract and other sites where it elicits a pronounced inflammatory response [1]. Both N. gonorrhoeae and the closely related pathogen Neisseria meningitidis undergo autolysis, a form of programmed cell death (PCD), as part of their life cycle during the stationary phase of growth [2]. Although autolysis in Neisseria was described over a century ago [3], little is known about the pathways leading to PCD in these species or indeed in any gram-negative bacterium. Gonococcal autolysis begins with remodelling of peptidoglycan through the activity of enzymes including the lytic transglycosylase, LtgA [4], and subsequent loss of integrity of the outer membrane, through unknown mechanisms, leads to cell death [5]. In contrast, the mechanisms underlying autolysis of Streptococcus pneumoniae are well understood. In this gram-positive invasive pathogen that inhabits the mucosal surface of the upper respiratory tract, activation of pneumococcal LytA, a cell wall-bound amidase, leads to degradation of the peptidoglycan cell wall at teichoic acid-rich areas [6]. Autolysis prevents phagocytosis of S. pneumoniae and liberates the cytotoxin pneumolysin and PAMPs which trigger host cell death and/or responses [7]. Higher rates of PCD in the pneumococcus correlate with the propensity of pneumococcal strains to cause invasive disease [8]. Therefore, PCD could benefit the pneumococcus in vivo by promoting local tissue damage and release of nutrients; however, how suicide evolves in single-celled organisms presents an evolutionary conundrum [9]. Computational studies suggest that PCD in unicellular organisms can only be an adaptive trait if it is a stochastic event [10,11], as in S. 
pneumoniae where lytA expression is subject to ON:OFF switching through phase variation [12]. To identify genes associated with invasive potential, we performed comparative genomic analyses of >24,700 Neisseria genomes on PubMLST (https://pubmlst.org/) to find genetic elements that are present in the invasive species, N. gonorrhoeae and N. meningitidis, but absent from the other members of the genus. This search revealed a ~3.4 kb horizontally acquired island which we designated the nap island. The nap island is found in N. gonorrhoeae, N. meningitidis, and only a few isolates of the noninvasive species Neisseria bergeri and Neisseria lactamica. In the gonococcus, this region encodes 4 peptides; 3 peptides, NapA, NapB, and NapC are cationic, with features of class II microcins, i.e., size <10 kDa with a double glycine (GG) motif [13]; the GG motif is a signal for cleavage by C39 peptidases prior to secretion [14,15]. The nap island also harbours genes encoding a putative regulator (NapR), a C39 peptidase (NapP), and an export channel (NapF) related to FapF in Pseudomonas aeruginosa [16]. We show that, distinct from class II microcins secreted by gram-positive bacteria, the gonococcal peptides do not mediate killing of competitor bacterial species. Instead, the N. gonorrhoeae autolysis peptides (Naps) promote PCD of bacteria specifically in the stationary phase of growth but at no other stage in their life cycle. Furthermore, mutants lacking Naps display reduced death during stationary phase and reduced autolysis following nutrient deprivation. NapC is also cytotoxic to red blood cells (RBCs), so the acquisition of the nap island may help bacteria shape the local environment at mucosal surfaces in vivo. Genomic analyses suggest that NapR and NapC are phase variable, due to the presence of homopolymeric nucleotide tracts of different lengths in their open reading frames. 
This would lead to the stochastic appearance within a clonal population of 2 populations of bacteria, “altruistic” autolytic bacteria and their sibling beneficiaries, indicating that PCD might have evolved in the gonococcus through kin selection. Autolysis would liberate cell contents including nutrients and DNA from bacteria that could be utilised by siblings for growth and genetic diversification, respectively. The pneumococcus, which is a naturally transformable diplococcus, similar to N. gonorrhoeae and N. meningitidis, also displays phase variable PCD [12]. Thus, phase variable PCD may operate at an exquisitely local level with adjacent, non-autolytic, first-degree relatives, i.e., nearest and dearest, being the main beneficiaries of dead/dying cells. The secretion of cytotoxic NapC and the release of PAMPs following cell lysis indicate that the acquisition of the nap island and PCD may have been an important step in the emergence of the invasive phenotype in Neisseria spp. Results Identification and conservation of the nap island We searched for pathogen-specific genes in Neisseria spp. to further understand the basis of the invasive phenotype of some species in this genus that usually colonises mucosal surfaces asymptomatically. Using MaGe [17], we identified a ~3.4 kb region in the gonococcal genome, which we designated the nap island (Figs 1A and S1). This region is found in N. gonorrhoeae, with a related locus present in N. meningitidis. However, the nap island is largely absent from noninvasive Neisseria spp. (S2 Fig). The GC content of the nap island (38%) is significantly lower than that of the rest of the genome (53%, Fig 1A), indicating that it was likely acquired by horizontal gene transfer (HGT) from an extant source. Consistent with this, napC, a gene in the centre of the nap island, encodes a protein with 68.9% identity with a protein from Inoviridae phages (S1 Fig); these phages are associated with Neisseria spp. 
as 57 Inoviridae sequences are found in Neisseria CRISPR spacers (GenBank accession number DAS81996.1). One end of the island is flanked by napP, encoding a potential C39 peptidase predicted to process GG-containing peptides prior to secretion [14]. At the other end of the island, napF encodes a homologue of FapF, an outer membrane protein in Pseudomonas aeruginosa involved in the secretion of amyloidogenic peptides processed by a C39 peptidase [16]. Of note, both napP and napF have a different GC content compared with the rest of the nap island (respectively, 52% and 50%), suggesting that these 2 genes might have a different origin from the central portion of the island. Fig 1. The nap island organisation and the absence of mNap antimicrobial activity. (A) Organisation of the nap island in Neisseria gonorrhoeae FA1090. Grey arrows, flanking genes (NEIS0808, pseudouridine synthase; NEIS0794, histidyl-tRNA synthase); blue arrows, genes coding for GG-containing peptides; orange arrows, genes predicted to be involved in peptide processing and secretion; green arrow, gene of unknown function; pink boxes, Correia elements. Percentage of GC content was plotted with SnapGene Viewer. (B) Table detailing predicted function of genes in the nap locus. Details are available in S1 Fig. (C) Conservation of the nap island and flanking genes among Neisseria spp. Colours are scaled in percent of strain representatives possessing a homologue within each species. The data underlying this figure can be found in S2 Table. (D) mNaps (50 μM) were added to discs which were then placed on lawns of N. gonorrhoeae FA1090 (Ng), N. cinerea (Nc) CCUG 346T or 27178A, L. crispatus (Lc) ATCC 33820, and E. coli (Ec) MG1655 on solid media. Water and polymyxin B (50 μM) were used as negative and positive controls, respectively. Plates were incubated for 24 h and zones of growth inhibition (dotted lines) recorded. 
No growth inhibition was observed for any mNap (n = 3). Source data is in the file S1 Data. (E) MICs were determined by incubating bacteria in rich medium for 24 h with antibiotics (controls) or mNaps, starting from 256 μg/ml to 0.5 μg/ml in 2-fold dilutions. Bacteria were then spotted onto agar plates, which were incubated for 24 h. MBCs were the lowest concentration at which no growth was visible on plates. The mNaps did not inhibit the growth of any strain (shown by /, n = 3). The data underlying this figure can be found in S1 and S2 Data files. https://doi.org/10.1371/journal.pbio.3003001.g001 The gonococcal nap island contains 3 genes (napABC) coding for predicted α-helical cationic peptides with GG-motifs (Fig 1B). After cleavage at the GG sequence, mature mNapA, mNapB, and mNapC are predicted to contain 36, 16, and 31 amino acids, with pIs of 9, 10, and 10.1, respectively (S1 Fig). Downstream of napABC (Figs 1A and S1) is napI, which shares 30% amino acid identity with a domain in LagC, the immunity protein for the bacteriocin, lactococcin G [18] (S1 Fig). napR is upstream of napABC and encodes a peptide with a putative DNA-binding domain (S1 Fig), and so is a potential regulator of the island. Finally, napH is predicted to encode an inner membrane protein with a high methionine content and 3 transmembrane α-helices, as seen in some bacteriophage pore-forming holins [19] (S1 Fig). Of note, napR contains a poly-G tract of 6 to 12 residues in its 5′ region, indicating that it is likely to be phase variable (S1 Fig and S1 Table). In N. gonorrhoeae FA1090, napR has 7 guanines, the most common allele (in 6,548 out of 10,854 isolates), leading to loss of the GG-leader sequence, so the peptide is predicted to be cytoplasmic. In addition, napC is also probably phase variable, as it contains a poly-T tract towards its 5′ end resulting typically in a cationic (9xT) or non-cationic (8xT) peptide (S1 Fig and S1 Table). The distribution of the nap island within Neisseria spp. 
was examined in more detail using PubMLST [20]. Whole genome sequences of >24,700 Neisseria spp. isolates were inspected, with results confirming that most commensal species lack the nap island with 2 exceptions (Fig 1C). The nap island is found in a proportion of isolates of Neisseria bergeri (38/40 have napA) and Neisseria lactamica (138/604 have napA), but no other noninvasive species (Figs 1C and S2; see S2 Table for details). Cationic Naps do not inhibit the growth of competitor bacteria Most class II microcins are antibacterial [21]. To determine whether the cationic Naps possess antibacterial activity, mature versions of the peptides, mNapA, mNapB, and mNapC, were synthesised and their ability to kill competitor bacteria assessed. First, we performed disc diffusion assays against N. gonorrhoeae, Neisseria cinerea, Escherichia coli, and the gram-positive bacterium Lactobacillus crispatus, an inhabitant of the genital tract [22]. As microcins are bactericidal in the nanomolar range [23], assays were performed using discs with 50 μM of each mNap, and with polymyxin B as a positive control. As expected [24], growth of N. cinerea and E. coli was inhibited around discs containing polymyxin B, while there was no clearance of N. gonorrhoeae or L. crispatus around these discs (Fig 1D). In contrast, none of the mNaps inhibited the growth of any strain. We also measured the minimal bactericidal concentration (MBC) of each mNap by broth dilution against the bacteria. Bacteria were incubated with micromolar concentrations of mNaps or polymyxin B for 24 h. Of note, the mNaps failed to inhibit bacterial growth even at the highest concentration (256 μM, Fig 1E). As some bactericidal mechanisms are contact-dependent [25], we performed co-culture experiments to see whether the survival of prey bacteria (i.e., N. cinerea, E. coli, and L. crispatus) was affected in the presence of wild-type N. gonorrhoeae or an isogenic mutant unable to express Naps. 
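The broth dilution readout described for the MBC measurements (2-fold dilutions from 256 down to 0.5, with the MBC taken as the lowest concentration at which no growth was visible, per the Fig 1E legend) can be illustrated with a minimal sketch. The function names and the example readouts below are hypothetical and are not taken from the source data files; the sketch only restates the definition given in the legend.

```python
from typing import Dict, List, Optional


def twofold_series(top: float, bottom: float) -> List[float]:
    """Concentrations from `top` down to `bottom`, halving at each step."""
    if top <= 0 or bottom <= 0 or bottom > top:
        raise ValueError("expected 0 < bottom <= top")
    series = []
    conc = float(top)
    while conc >= bottom:
        series.append(conc)
        conc /= 2.0
    return series


def lowest_concentration_without_growth(growth: Dict[float, bool]) -> Optional[float]:
    """Return the lowest tested concentration with no visible growth.

    `growth` maps each concentration to True if growth was visible after
    spotting onto agar. Returns None when growth was seen at every
    concentration (the case shown as "/" in Fig 1E).
    """
    cleared = [conc for conc, grew in growth.items() if not grew]
    return min(cleared) if cleared else None


series = twofold_series(256, 0.5)

# Hypothetical readout: growth at every concentration, as reported for the mNaps.
mnap_readout = {conc: True for conc in series}

# Hypothetical control readout (illustrative only): no growth at 4 or above.
control_readout = {conc: conc < 4 for conc in series}

print(lowest_concentration_without_growth(mnap_readout))
print(lowest_concentration_without_growth(control_readout))
```

Returning `None` rather than a number keeps the "no inhibition at any concentration tested" outcome distinct from a measured value.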
We constructed a markerless N. gonorrhoeae FA1090 ΔnapRABC mutant (eliminating all Nap peptides) with a pheS* counter-selection marker [26] to avoid fitness costs associated with antibiotic resistance cassettes (S3 Fig). After 3 and 24 h of co-culturing prey and N. gonorrhoeae at a 1:1 ratio, we recovered prey strains by plating to selective media. No significant difference could be observed in the recovery of prey strains in the presence of wild-type or ΔnapRABC N. gonorrhoeae (multiple paired t tests, n = 3, S4 Fig). Taken together, these results demonstrate that Naps do not have detectable antimicrobial activity. Naps mediate N. gonorrhoeae autolysis To better understand the function of the Naps, we examined the regulation of genes encoding nap island peptides as the expression of many microcins and bacteriocins is regulated during growth [13] (Figs 2A and S5A). The mRNA levels of genes encoding the nap peptides were measured by RT-qPCR of bacteria grown in liquid media over 24 h. The mRNA levels for napA, napB, napC, and napR were lowest in the early stationary phase of growth, then rose to their highest levels in the late autolytic phase. The similar gene expression profile of napA, napB, and napC suggests that they are organised in an operon. This is consistent with the absence of a predicted promoter directly upstream of napB (S5B Fig) and the detection of a transcript overlapping all 3 genes in N. gonorrhoeae MS11 [27]. The nap locus of N. gonorrhoeae FA1090 also harbours Correia elements (CEs) in the napR/napA and napI/napH intergenic regions containing potential promoters [28] (Figs 1A and S5B); these CEs harbour integration host factor (IHF) binding sites (S5C Fig), which can influence transcriptional regulation [29]. As nap mRNA levels were elevated during the late stationary phase, we considered whether the Naps are involved in PCD. Gonococcal autolysis involves initial peptidoglycan remodelling by LtgA and other enzymes [4,30], followed by cell lysis. 
As polyamines can increase the resistance of bacteria to cationic peptides [31,32], we assessed the activity of mNaps on gonococcal survival in protein/spermidine-free media. mNaps were added to N. gonorrhoeae at the mid-log phase, and survival examined during the stationary phase until autolysis occurred [33]. N. gonorrhoeae was grown for 3 h in 96-well plates from a starting OD600 of 0.25, then exposed to mNaps. While mNaps had no effect on survival of gonococci during the exponential phase of growth, micromolar concentrations of mNapA and mNapC significantly reduced survival when bacteria reached the late stationary phase of growth (Fig 2B, p < 0.001 versus no peptide). Specifically, mNaps had no effect on N. cinerea at any stage of growth (S6A Fig), while scrambled versions of mNapA and mNapC (mNapASCR and mNapCSCR, respectively) did not affect survival of N. gonorrhoeae (S6B Fig). We also checked whether mNaps interacted with each other in this assay. Mixtures of Naps were tested on N. gonorrhoeae and results compared with effect of individual mNaps. There was no evidence of antagonism or synergy between the Naps in these conditions (S6C Fig). Download: PPT PowerPoint slide PNG larger image TIFF original image Fig 2. Regulation of the nap island and the effect of mNaps on bacterial survival. (A) RT-qPCR of genes in the nap island from N. gonorrhoeae FA1090 grown in GW liquid cultures. Samples were taken from flasks at different times (CFU, left panel) and mRNA levels of target genes determined by RT-qPCR (right panel). Gene expression results were normalised to samples at T0. (B) Bacteria were grown in 96-well plates in protein-free spermidine-free GW medium for 3 h before mNaps were added at various concentrations (arrow). At time points indicated, bacterial viability was determined by plating to solid media and incubation overnight. CFU were counted (dotted lines, limit of detection). 
Bottom panels represent the same data but grouped according to the growth phase (exponential and autolytic). Error bars, SD, two-way ANOVA (n = 4; p ≥ 0.033, not significant, not represented; p < 0.001, ***). The data underlying this figure can be found in S3 Data. https://doi.org/10.1371/journal.pbio.3003001.g002 As a control for autolysis, we next constructed a markerless N. gonorrhoeae FA1090 ΔltgA mutant; ltgA is involved in the first step of gonococcal autolysis [4]. Importantly, both the ΔnapRABC and ΔltgA mutants displayed increased survival during the late stationary/autolytic phase of growth compared with wild-type bacteria, whether in the presence or absence of exogenously added mNaps (Fig 3A, at t = 36 h, p = 0.002, ** for ΔltgA, and p = 0.03, * for ΔnapRABC in the absence of Naps, and p < 0.001, *** for both strains in the presence of mNapA or mNapC, two-way ANOVA), consistent with the Naps contributing to autolysis. Fig 3. The nap island is involved in autolysis. (A) Long-term survival of wild-type N. gonorrhoeae FA1090 and the ΔnapRABC mutant in the presence (arrow) or absence of mNaps (5 μM). The markerless ltgA deletion mutant was a control for a strain with reduced autolysis. Dotted lines, limit of detection. (B) Autolysis in rich medium. Bacteria were streaked on plates, grown overnight then resuspended in GCB at OD540 of ~0.3 and left in cuvettes. After 24 h at room temperature, bacteria were gently resuspended and the OD540 measured. One-way ANOVA with Dunnett’s multiple comparison against the wild-type strain (n = 3; p < 0.033, *; p < 0.002, **; p < 0.001, ***). The data underlying this figure can be found in S4 Data. (C) Autolysis in HEPES buffer without (left panel) or with (right panel) 10 mM MgCl2. A representative experiment is shown (last replicate is in the corresponding Suppl. source data files). 
(D) Transmission electron microscopy images from bacteria in HEPES (T = 75 min). Some wild-type bacteria appear normal (indicated with arrow numbered 1), with many others having a dense (arrow 2) or very dense (arrow 3) cytoplasm, periplasmic expansions (arrow 4), or appearing as ghost cells (arrow 5). Extracellular vesicles (arrow 6) and released cellular content (arrow 7) are also visible. Periplasmic expansions are not observable in the ΔltgA or ΔnapRABC strains. Cytoplasm condensation appears more homogenous in the ΔltgA mutant. The images underlying this figure can be found in https://doi.org/10.6084/m9.figshare.28077749.v1. https://doi.org/10.1371/journal.pbio.3003001.g003 To establish whether the nap island directly contributes to PCD, we examined wild-type, ΔnapRABC, and ΔltgA N. gonorrhoeae in 2 independent assays of autolysis; N. cinerea was included as a control. First, we assessed changes in the OD540 of strains incubated overnight in liquid GCB (Fig 3B); a reduction in OD provides a measure of autolysis [34,35]. While the OD540 of wild-type bacteria fell overnight by 34.3% (±4.9, indicating autolysis), the OD540 of the ΔltgA and ΔnapRABC mutants only dropped by 10.8% (±8.9) and 6.5% (±10.6), respectively (Fig 3B, p < 0.033 and 0.002 versus the wild-type strain), indicating that both mutants exhibit reduced autolysis. In contrast, the OD540 of N. cinerea increased over this time (Fig 3B, +12.4% ± 4), consistent with the lack of PCD in this species. The rate of autolysis of these strains was also estimated by following the OD540 of bacteria incubated in HEPES, as starvation in this buffer can induce autolysis [5]. Again, the ΔltgA and ΔnapRABC mutants exhibited decreased rates of autolysis compared with wild-type bacteria (Fig 3C), as shown by the turbidity of cultures at the mid-time point (75 min, p < 0.033 and p < 0.002, respectively, S7A Fig). Autolysis of 2 strains of N. cinerea was also significantly less than N. 
gonorrhoeae in this assay (Figs 3C and S7C). Consistent with the ability of divalent cations to impair autolysis [34], addition of MgCl2 (final concentration, 10 mM) markedly decreased autolysis (Fig 3C) and abolished any significant difference in autolysis between the wild-type, ΔltgA and ΔnapRABC strains (S7B Fig). To understand the role of NapI during autolysis, we attempted to generate a napI mutant by replacing its open reading frame with a kanamycin resistance cassette. Despite multiple attempts, this was unsuccessful in wild-type bacteria presumably because of the toxicity of Naps in the absence of their immunity protein. Consistent with this, we were able to generate a napI mutant in the ΔnapRABC strain. The napI mutant was incubated in HEPES and the OD540 measured over time. Surprisingly, the ΔnapRABCI mutant was even more resistant to autolysis than the ΔnapRABC strain (S7A Fig), suggesting that NapI might contribute to autolysis or bacterial fitness in the absence of other Naps. Finally, we examined the ultrastructure of the wild-type, ΔltgA, and ΔnapRABC strains by transmission electron microscopy (TEM) after incubation in HEPES buffer to induce autolysis. Before resuspension in HEPES, the strains were indistinguishable with clear nucleoids visible (S8 Fig). After 75 min, wild-type bacteria showed a mix of phenotypes (Fig 3D), some with dense cytoplasm, while other cells displayed an expanded periplasm, or were empty “ghost” cells with evidence of nearby debris. The cytoplasm of the ΔltgA mutant appeared less dense than that of wild-type bacteria, with no visible periplasmic expansions (Fig 3D). Similarly, the ΔnapRABC mutant lacked periplasmic expansions, with many cells having aberrant shapes (Fig 3D). Based on the OD540 of bacteria in buffer, both mutants do undergo autolysis. However, TEM images indicate that cell death proceeds differently for the ΔltgA and ΔnapRABC mutants based on differences seen in the expansion of their periplasm. 
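The OD-based readout used in the autolysis assays above (a fall in OD540 taken as a measure of autolysis) amounts to a percentage change relative to the starting value. The following is a minimal sketch under that assumption; the function name and the example readings are hypothetical and are not values from the study.

```python
def percent_od_change(od_start: float, od_end: float) -> float:
    """Percentage change in optical density relative to the starting value.

    Negative values indicate a fall in OD (read here as autolysis);
    positive values indicate an increase (for example, net growth).
    """
    if od_start <= 0:
        raise ValueError("od_start must be positive")
    return (od_end - od_start) / od_start * 100.0


# Hypothetical readings (illustrative only, not data from the study).
falling = percent_od_change(0.30, 0.20)  # OD fell over the incubation
rising = percent_od_change(0.30, 0.33)   # OD rose over the incubation

print(f"{falling:.1f}% {rising:.1f}%")
```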
Taken together, our data provide evidence that the Naps contribute directly to autolysis of N. gonorrhoeae. NapC induces host erythrocyte lysis As the nap island is largely limited to the gonococcus and meningococcus which can elicit marked inflammatory responses [36], we examined whether the mNaps are toxic to host cells. mNaps were tested for their ability to lyse human RBCs over 1 h at 37°C. Interestingly, mNapC had marked haemolytic activity even at low concentrations (<2 μM, Fig 4A), while neither mNapA nor mNapB was toxic. To check for interactions between the peptides, we added combinations of mNaps in a 1:1 ratio to cells. Surprisingly, mNapA protected RBCs from lysis by mNapC (Fig 4B). To examine whether mNaps are toxic to other host cells, we also measured the release of lactate dehydrogenase (LDH) by THP-1-derived macrophages following exposure to 1 or 5 μM of each mNap (S9 Fig). No cytotoxicity was detected against THP-1 cells with any mNap, suggesting that their effect is cell-type specific. Fig 4. Effect of Naps on human red blood cells and proposed model. (A) Different concentrations of mNaps were added to human RBCs for 1 h and the OD414 measured. Data were normalised to cytotoxicity with melittin (100%), or PBS (0%). Right panel, cytotoxicity induced by 1.56 μM of each mNap (n = 4; p < 0.002, **). (B) mNapA or mNapB were mixed with mNapC at a 1:1 ratio and cytotoxicity measured. Right panel, results with 3.15 μM of each peptide, n = 3, p < 0.033, *, one-way ANOVA with Dunnett’s multiple comparisons. The data underlying this figure can be found in S5 Data. (C) Proposed model of gonococcal autolysis. Peptidoglycan (PG) remodelling enzymes, including LtgA (pink) which localises at the septum, will release PG fragments that are recycled (purple arrows). 
When bacteria reach stationary phase, PG synthesis stops, but LtgA remains active, leading to the condensation of the cytoplasm (Fig 3D). Meanwhile, Naps are processed and secreted by NapP and NapF (orange), respectively, leading to their accumulation in the extracellular environment. Vacuolation is triggered, potentially through the action of inner membrane destabilising actors such as holins/toxins. Eventually, the outer membrane (OM) breaks when a critical concentration of mNaps (blue) is reached, leading to release of cellular contents and the appearance of ghost cells. For diplococci, cells undergoing PCD will benefit first-degree relatives by direct and indirect mechanisms. https://doi.org/10.1371/journal.pbio.3003001.g004 Identification and conservation of the nap island We searched for pathogen-specific genes in Neisseria spp. to further understand the basis of the invasive phenotype of some species in this genus that usually colonises mucosal surfaces asymptomatically. Using MaGe [17], we identified a ~3.4 kb region in the gonococcal genome, which we designated the nap island (Figs 1A and S1). This region is found in N. gonorrhoeae, with a related locus present in N. meningitidis. However, the nap island is largely absent from noninvasive Neisseria spp. (S2 Fig). The GC content of the nap island (38%) is significantly lower than rest of the genome (53%, Fig 1A), indicating that it was likely acquired by horizontal gene transfer (HGT) from an extant source. Consistent with this, napC, a gene in the centre of the nap island, encodes a protein with 68.9% of identity with a protein from Inoviridae phages (S1 Fig); these phages are associated with Neisseria spp. as 57 Inoviridae sequences are found in Neisseria CRISPR spacers (GenBank accession number DAS81996.1). One end of the island is flanked by napP, encoding a potential C39 peptidase predicted to process GG-containing peptides prior to secretion [14]. 
At the other end of the island, napF encodes a homologue of FapF, an outer membrane protein in Pseudomonas aeruginosa involved in the secretion of amyloidogenic peptides processed by a C39 peptidase [16]. Of note, both napP and napF have a different GC content compared with the rest of the nap island (respectively, 52% and 50%), suggesting that these 2 genes might have a different origin from the central portion of the island. Download: PPT PowerPoint slide PNG larger image TIFF original image Fig 1. The nap island organisation and the absence of mNaps antimicrobial activity. (A) Organisation of the nap island in Neisseria gonorrhoeae FA1090. Grey arrows, flanking genes (NEIS0808, pseudouridine synthase; NEIS0794, histidyl-tRNA synthase); blue arrows, genes coding for GG-containing peptides; orange arrows, genes predicted to be involved in peptide processing and secretion; green arrow, gene of unknown function; pink boxes, Correia elements. Percentage of GC content was plotted with Snap Gene Viewer. (B) Table detailing predicted function of genes in the nap locus. Details are available in S1 Fig. (C) Conservation of the nap island and flanking genes among Neisseria spp. Colours are scaled in percent of strain representatives possessing a homologue within each species. The data underlying this figure can be found in S2 Table. (D) mNaps (50 μm) were added to disks which were then placed on lawns of N. gonorrhoeae FA1090 (Ng), N. cinerea (Nc) CCUG 346T or 27178A, L. crispatus (Lc) ATCC 33820, and E. coli (Ec) MG1655 on solid media. Water and polymyxin B (50 μm) were used as negative and positive controls, respectively. Plates were incubated for 24 h and zones of growth inhibition (dotted lines) recorded. No growth inhibition was observed for any mNap (n = 3). Source data is in the file S1 Data. (E) MICs were determined by incubating bacteria in rich medium for 24 h with antibiotics (controls) or mNaps, starting from 256 μg/ml to 0.5 μg/ml in 2-fold dilutions. 
Bacteria were then spotted onto agar plates, which were incubated for 24 h. MBCs were the lowest concentration at which no growth was visible on plates. The mNaps did not inhibit the growth of any strain (shown by /, n = 3). The data underlying this figure can be found in S1 and S2 Data files. https://doi.org/10.1371/journal.pbio.3003001.g001 The gonococcal nap island contains 3 genes (napABC) coding for predicted α-helical cationic peptides with GG-motifs (Fig 1B). After cleavage at the GG sequence, mature mNapA, mNapB, and mNapC are predicted to contain 36, 16, and 31 amino acids, with pIs of 9, 10, and 10.1, respectively (S1 Fig). Downstream of napABC (Figs 1A and S1) is napI which shares 30% of amino acid identity with a domain in LagC, the immunity protein for the bacteriocin, lactococcin G [18] (S1 Fig). napR is upstream of napABC and encodes a peptide with a putative DNA-binding domain (S1 Fig) so is a potential regulator of the island. Finally, napH is predicted to encode an inner membrane protein with a high methionine content with 3 transmembrane α-helices seen in some bacteriophage pore-forming holins [19] (S1 Fig). Of note, napR contains a poly-G tract of 6 to 12 residues in its 5′ region, indicating that it is likely to be phase variable (S1 Fig and S1 Table). In N. gonorrhoeae FA1090, napR has 7 guanines, the most common allele (in 6,548 out of 10,854 isolates), leading to loss of the GG-leader sequence, so the peptide is predicted to be cytoplasmic. In addition, napC is also probably phase-variable, as it contains a poly-T tract towards its 5′ end resulting typically in a cationic (9xT) or non-cationic (8xT) peptide (S1 Fig and S1 Table). The distribution of the nap island within Neisseria spp. was examined in more detail using PubMLST [20]. Whole genome sequences of >24,700 Neisseria spp. isolates were inspected, with results confirming that most commensal species lack the nap island with 2 exceptions (Fig 1C). 
The nap island is found in a proportion of isolates of Neisseria bergeri (38/40 have napA) and Neisseria lactamica (138/604 have napA), but no other noninvasive species (Figs 1C and S2 and S2 Table for details). Cationic Naps do not inhibit the growth of competitor bacteria Most class II microcins are antibacterial [21]. To determine whether the cationic Naps possess antibacterial activity, mature versions of the peptides, mNapA, mNapB, and mNapC, were synthesised and their ability to kill competitor bacteria assessed. First, we performed disc diffusion assays against N. gonorrhoeae, Neisseria cinerea, Escherichia coli, and the gram-positive bacterium Lactobacillus crispatus, an inhabitant of the genital tract [22]. As microcins are bactericidal in the nanomolar range [23], assays were performed using discs with 50 μm of each mNap, and with polymyxin B as a positive control. As expected [24], growth of N. cinerea and E. coli was inhibited around discs containing polymyxin B, while there was no clearance of N. gonorrhoeae or L. crispatus around these discs (Fig 1D). In contrast, none of the mNaps inhibited the growth of any strain. We also measured the minimal bactericidal concentration (MBC) of each mNap by broth dilution against the bacteria. Bacteria were incubated with micromolar concentrations of mNaps or polymyxin B for 24 h. Of note, the mNaps failed to inhibit bacterial growth even at the highest concentration (256 μm, Fig 1E). As some bactericidal mechanisms are contact-dependent [25], we performed co-culture experiments to see whether the survival of prey bacteria (i.e., N. cinerea, E. coli, and L. crispatus) was affected in the presence of wild-type N. gonorrhoeae or an isogenic mutant unable to express Naps. We constructed a markerless N. gonorrhoeae FA1090 ΔnapRABC mutant (eliminating all Nap peptides) with a pheS* counter-selection marker [26] to avoid fitness costs associated with antibiotic resistance cassettes (S3 Fig). 
After 3 and 24 h of co-culturing prey and N. gonorrhoeae at a 1:1 ratio, we recovered prey strains by plating to selective media. No significant difference could be observed in the recovery of prey strains in the presence of wild-type or ΔnapRABC N. gonorrhoeae (multiple paired t tests, n = 3, S4 Fig). Taken together, these results demonstrate that Naps do not have detectable antimicrobial activity. Naps mediate N. gonorrhoeae autolysis To better understand the function of the Naps, we examined the regulation of genes encoding nap island peptides as the expression of many microcins and bacteriocins is regulated during growth [13] (Figs 2A and S5A). The mRNA levels of genes encoding the nap peptides was measured by RT-qPCR of bacteria grown in liquid media over 24 h. The mRNA levels for napA, napB, napC, and napR were lowest in the early stationary phase of growth, then rose to their highest levels in the late autolytic phase. The similar gene expression profile of napA, napB, and napC suggests that they are organised in operon. This is consistent with the absence of a predicted promoter directly upstream of napB (S5B Fig) and the detection of a transcript overlapping all 3 genes in N. gonorrhoeae MS11 [27]. The nap locus of N. gonorrhoeae FA1090 also harbours Correia elements (CEs) in the napR/napA and napI/napH intergenic regions containing potential promoters [28] (Figs 1A and S5B); these CEs harbour integration host factor (IHF) binding sites (S5C Fig), which can influence transcriptional regulation [29]. As nap mRNA levels were elevated during the late stationary phase, we considered whether the Naps are involved in PCD. Gonococcal autolysis involves initial peptidoglycan remodelling by LtgA and other enzymes [4,30], followed by cell lysis. As polyamines can increase the resistance of bacteria to cationic peptides [31,32], we assessed the activity of mNaps on gonococcal survival in protein/spermidine-free media. mNaps were added to N. 
gonorrhoeae at the mid-log phase, and survival examined during the stationary phase until autolysis occurred [33]. N. gonorrhoeae was grown for 3 h in 96-well plates from a starting OD600 of 0.25, then exposed to mNaps. While mNaps had no effect on survival of gonococci during the exponential phase of growth, micromolar concentrations of mNapA and mNapC significantly reduced survival when bacteria reached the late stationary phase of growth (Fig 2B, p < 0.001 versus no peptide). The effect was specific, as mNaps had no effect on N. cinerea at any stage of growth (S6A Fig), and scrambled versions of mNapA and mNapC (mNapASCR and mNapCSCR, respectively) did not affect survival of N. gonorrhoeae (S6B Fig). We also checked whether mNaps interacted with each other in this assay. Mixtures of Naps were tested on N. gonorrhoeae and results compared with the effect of individual mNaps. There was no evidence of antagonism or synergy between the Naps in these conditions (S6C Fig). Fig 2. Regulation of the nap island and the effect of mNaps on bacterial survival. (A) RT-qPCR of genes in the nap island from N. gonorrhoeae FA1090 grown in GW liquid cultures. Samples were taken from flasks at different times (CFU, left panel) and mRNA levels of target genes determined by RT-qPCR (right panel). Gene expression results were normalised to samples at T0. (B) Bacteria were grown in 96-well plates in protein-free spermidine-free GW medium for 3 h before mNaps were added at various concentrations (arrow). At time points indicated, bacterial viability was determined by plating to solid media and incubation overnight. CFU were counted (dotted lines, limit of detection). Bottom panels represent the same data but grouped according to the growth phase (exponential and autolytic). Error bars, SD, two-way ANOVA (n = 4; p ≥ 0.033, not significant, not represented; p < 0.001, ***). The data underlying this figure can be found in S3 Data. 
https://doi.org/10.1371/journal.pbio.3003001.g002 As a control for autolysis, we next constructed a markerless N. gonorrhoeae FA1090 ΔltgA mutant; ltgA is involved in the first step of gonococcal autolysis [4]. Importantly, both the ΔnapRABC and ΔltgA mutants displayed increased survival during the late stationary/autolytic phase of growth compared with wild-type bacteria, whether in the presence or absence of exogenously added mNaps (Fig 3A, at t = 36 h, p = 0.002, ** for ΔltgA, and p = 0.03, * for ΔnapRABC in the absence of Naps, and p < 0.001, *** for both strains in the presence of mNapA or mNapC, two-way ANOVA), consistent with the Naps contributing to autolysis. Fig 3. The nap island is involved in autolysis. (A) Long-term survival of wild-type N. gonorrhoeae FA1090 and the ΔnapRABC mutant in the presence (arrow) or absence of mNaps (5 μM). The markerless ltgA deletion mutant was a control for a strain with reduced autolysis. Dotted lines, limit of detection. (B) Autolysis in rich medium. Bacteria were streaked on plates, grown overnight then resuspended in GCB at OD540 of ~0.3 and left in cuvettes. After 24 h at room temperature, bacteria were gently resuspended and the OD540 measured. One-way ANOVA with Dunnett’s multiple comparison against the wild-type strain (n = 3; p < 0.033, *; p < 0.002, **; p < 0.001, ***). The data underlying this figure can be found in S4 Data. (C) Autolysis in HEPES buffer without (left panel) or with (right panel) 10 mM MgCl2. A representative experiment is shown (last replicate is in the corresponding Suppl. source data files). (D) Transmission electron microscopy images from bacteria in HEPES (T = 75 min). Some wild-type bacteria appear normal (indicated with arrow numbered 1), with many others having a dense (arrow 2) or very dense (arrow 3) cytoplasm, periplasmic expansions (arrow 4), or appearing as ghost cells (arrow 5). 
Extracellular vesicles (arrow 6) and released cellular content (arrow 7) are also visible. Periplasmic expansions are not observable in the ΔltgA or ΔnapRABC strains. Cytoplasm condensation appears more homogenous in the ΔltgA mutant. The images underlying this figure can be found in https://doi.org/10.6084/m9.figshare.28077749.v1. https://doi.org/10.1371/journal.pbio.3003001.g003 To establish whether the nap island directly contributes to PCD, we examined wild-type, ΔnapRABC, and ΔltgA N. gonorrhoeae in 2 independent assays of autolysis; N. cinerea was included as a control. First, we assessed changes in the OD540 of strains incubated overnight in liquid GCB (Fig 3B); a reduction in OD provides a measure of autolysis [34,35]. While the OD540 of wild-type bacteria fell overnight by 34.3% (±4.9, indicating autolysis), the OD540 of the ΔltgA and ΔnapRABC mutants only dropped by 10.8% (±8.9) and 6.5% (±10.6), respectively (Fig 3B, p < 0.033 and 0.002 versus the wild-type strain), indicating that both mutants exhibit reduced autolysis. In contrast, the OD540 of N. cinerea increased over this time (Fig 3B, +12.4% ± 4), consistent with the lack of PCD in this species. The rate of autolysis of these strains was also estimated by following the OD540 of bacteria incubated in HEPES, as starvation in this buffer can induce autolysis [5]. Again, the ΔltgA and ΔnapRABC mutants exhibited decreased rates of autolysis compared with wild-type bacteria (Fig 3C), as shown by the turbidity of cultures at the mid-time point (75 min, p < 0.033 and p < 0.002, respectively, S7A Fig). Autolysis of 2 strains of N. cinerea was also significantly less than N. gonorrhoeae in this assay (Figs 3C and S7C). Consistent with the ability of divalent cations to impair autolysis [34], addition of MgCl2 (final concentration, 10 mM) markedly decreased autolysis (Fig 3C) and abolished any significant difference in autolysis between the wild-type, ΔltgA and ΔnapRABC strains (S7B Fig). 
To understand the role of NapI during autolysis, we attempted to generate a napI mutant by replacing its open reading frame with a kanamycin resistance cassette. Despite multiple attempts, this was unsuccessful in wild-type bacteria presumably because of the toxicity of Naps in the absence of their immunity protein. Consistent with this, we were able to generate a napI mutant in the ΔnapRABC strain. The napI mutant was incubated in HEPES and the OD540 measured over time. Surprisingly, the ΔnapRABCI mutant was even more resistant to autolysis than the ΔnapRABC strain (S7A Fig), suggesting that NapI might contribute to autolysis or bacterial fitness in absence of other Naps. Finally, we examined the ultrastructure of the wild-type, ΔltgA, and ΔnapRABC strains by transmission electron microscopy (TEM) after incubation in HEPES buffer to induce autolysis. Before resuspension in HEPES, the strains were indistinguishable with clear nucleoids visible (S8 Fig). After 75 min, wild-type bacteria showed a mix of phenotypes (Fig 3D), some with dense cytoplasm, while other cells displayed an expanded periplasm, or were empty “ghost” cells with evidence of nearby debris. The cytoplasm of the ΔltgA mutant appeared less dense than wild-type bacteria, with no visible periplasmic expansions (Fig 3D). Similarly, the ΔnapRABC mutant lacked periplasmic expansions (Fig 3D), with many cells having aberrant shapes (Fig 3D). Based on the OD540 of bacteria in buffer, both mutants do undergo autolysis. However, TEM images indicate that cell death proceeds differently for the ΔltgA and ΔnapRABC mutants based on differences seen in the expansion of their periplasm. Taken together, our data provide evidence that the Naps contribute directly to autolysis of N. gonorrhoeae. NapC induces host erythrocyte lysis As the nap island is largely limited to the gonococcus and meningococcus which can elicit marked inflammatory responses [36], we examined whether the mNaps are toxic to host cells. 
mNaps were tested for their ability to lyse human RBCs over 1 h at 37°C. Interestingly, mNapC had marked haemolytic activity even at low concentrations (<2 μM, Fig 4A), while neither mNapA nor mNapB was toxic. To check for interactions between the peptides, we added combinations of mNaps in a 1:1 ratio to cells. Surprisingly, mNapA protected RBCs from lysis by mNapC (Fig 4B). To examine whether mNaps are toxic to other host cells, we also measured the release of lactate dehydrogenase (LDH) by THP-1-derived macrophages following exposure to 1 or 5 μM of each mNap (S9 Fig). No cytotoxicity was detected against THP-1 cells with any mNap, suggesting that their effect is cell-type specific. Fig 4. Effect of Naps on human red blood cells and proposed model. (A) Different concentrations of mNaps were added to human RBCs for 1 h and the OD414 measured. Data were normalised to cytotoxicity with melittin (100%), or PBS (0%). Right panel, cytotoxicity induced by 1.56 μM of each mNap (n = 4; p < 0.002, **). (B) mNapA or mNapB were mixed with mNapC at a 1:1 ratio and cytotoxicity measured. Right panel, results with 3.15 μM of each peptide, n = 3, p < 0.033, *, one-way ANOVA with Dunnett’s multiple comparisons. The data underlying this figure can be found in S5 Data. (C) Proposed model of gonococcal autolysis. Peptidoglycan (PG) remodelling enzymes, including LtgA (pink) which localises at the septum, will release PG fragments that are recycled (purple arrows). When bacteria reach stationary phase, PG synthesis stops, but LtgA remains active, leading to the condensation of the cytoplasm (Fig 3D). Meanwhile, Naps are processed and secreted by NapP and NapF (orange), respectively, leading to their accumulation in the extracellular environment. Vacuolation is triggered, potentially through the action of inner membrane destabilising actors such as holins/toxins. 
Eventually, the outer membrane (OM) breaks when a critical concentration of mNaps (blue) is reached, leading to release of cellular contents and the appearance of ghost cells. For diplococci, cells undergoing PCD will benefit first-degree relatives by direct and indirect mechanisms. https://doi.org/10.1371/journal.pbio.3003001.g004 Discussion Many bacteria undergo PCD when subjected to specific environmental stresses such as nutrient deprivation and oxidative damage [37]. This can be mediated by toxin:antitoxin systems, which can also limit the spread of bacteriophage in a population through a process known as abortive infection [37]. In contrast, notably few bacterial species undergo autolysis as a part of their natural life cycle when they enter the stationary phase of growth. Among bacteria with the potential to cause invasive disease, PCD is a feature of the gonococcus, meningococcus, and pneumococcus. These species share the characteristics of being diplococci that are naturally transformable inhabitants of mucosal surfaces in humans. These features are likely to have contributed to autolysis being an adaptive trait in this specific subset of bacteria. We show that the invasive species of Neisseria have acquired a 3.4 kb island which is necessary for their ability to undergo PCD. The success of the nap island as a selfish genetic element is evidenced by its almost ubiquitous distribution in N. gonorrhoeae and N. meningitidis, indicating that the nap island and thence autolysis promotes the success of these bacteria at mucosal surfaces and their invasive potential. For the gram-positive bacterium S. pneumoniae, the major autolysin LytA is responsible for degradation of cell wall peptidoglycan which is sufficient to cause cell lysis [38]. The situation is more complex for gram-negative bacteria, which possess 2 membranes. Autolysis of pathogenic Neisseria was described over a century ago [3,39]. 
The first step of gonococcal PCD is peptidoglycan remodelling which depends on the activity of lytic transglycosylases such as LtgA [4,5,30,34]. However, the mechanisms underlying subsequent steps in this process have remained obscure. It was nevertheless known that blocking protein synthesis suppresses autolysis [40] without impeding peptidoglycan hydrolysis [5], suggesting that other proteins are involved in autolysis. Here, we demonstrate that Naps mediate the second step of gonococcal autolysis. We show that addition of mNapA and mNapC reduces bacterial survival specifically during the stationary phase of growth when autolysis occurs. In addition, the napRABC mutant lacking all Naps displayed reduced autolysis compared to the wild-type bacteria. As well as promoting PCD, NapC can be toxic to host RBCs. The peptides are encoded by the nap island, a small horizontally acquired genetic element predominantly found in species of Neisseria which have the capacity to cause invasive disease. Aside from mediating the release of bacterial PAMPs on death, mNaps might contribute directly to host:pathogen interactions by being toxic to human cells. mNapC can mediate lysis of RBCs at micromolar concentrations; this is counteracted by mNapA, indicating that the interplay of mNaps influences their effect on human cells. In our work, we could not attribute a function to mNapB. One possibility is that NapB does not regulate autolysis but instead influences other aspects of bacterial life. For example, in S. pneumoniae, competence and fratricide lysis have been functionally linked, since an early competence gene encodes an immunity protein against their own lysins [41]. Since a functional link between the 2 phenomena might also exist in N. gonorrhoeae [42], it might be worth investigating the effect of NapB on competence. 
Additionally, gram-positive bacteria can also produce multiple bacteriocins; in lactic acid bacteria, the diversity of peptides offers an advantage in distinct environments [43]. Similarly, N. gonorrhoeae might possess multiple Naps to respond to different environmental cues leading to PCD +/− toxicity of host cells. Considering that some killing mechanisms require direct bacterial contact with target cells [44], it is also possible that gonococcal Naps have further toxic effects on host cells in the presence of bacteria. Further work is needed to understand the function of the proteins encoded by the nap island. Nevertheless, we propose a model for the role of Naps during autolysis (Fig 4C). N. gonorrhoeae PCD starts with re-modelling of peptidoglycan. Cationic Naps produced in the cytoplasm as immature pre-peptides are cleaved in the periplasm and secreted by NapP and NapF, respectively. The role of the other genes on the nap island is more speculative as they are not always found in nap islands in other species of Neisseria. NapH has features of an inner membrane holin, a class of proteins that allows escape of phages from host cells [45], so might be involved in transporting Naps across the inner membrane. NapR (with a predicted DNA binding domain) could regulate genes on the nap island, while NapI (which has some homology to the immunity protein LagC [18]) could prevent self-intoxication by the cationic Naps. PCD has several potential benefits to bystander bacteria, such as provision of nutrients [46], enhancing biofilm formation, and the release of DNA to promote genetic diversification of naturally competent bacteria, such as the pathogenic Neisseria. In addition, bacterial lysis could perturb host cells to release nutrients, or alter local inflammatory responses [4]. Lysed RBCs would also release iron, which could help bacteria circumvent nutritional immunity [47]. 
Importantly, autolysis is also concomitant with the release of bacterial phospholipids [35,48] and peptidoglycan fragments [40,49], which can limit bacterial growth [50] and reduce host innate immune signalling [51], respectively. In S. pneumoniae, the extent of autolysis is correlated with hyper-virulence [8], while the acquisition of horizontally acquired genomic islands in N. meningitidis is associated with invasive capacity of strains [52]. There is a debate over how autolysis evolves in single-celled organisms [37], as PCD is literally a dead-end for a bacterium. Interestingly, NapR and NapC are likely to be phase variable, based on different lengths of homopolymeric tracts in their open reading frames. Based on sequences of over 10,000 gonococcal isolates in PubMLST, napC seems to be mainly OFF (86% of sequenced N. gonorrhoeae strains, n = 7,773/10,854) with an early stop codon preventing the production of the C-terminal cationic amino acids. Phase variation has implications for how the nap island and PCD could be beneficial to a clonal bacterial population [53], and would generate different subpopulations of gonococci, with some bacteria refractory to PCD and other bacteria undergoing altruistic cell death. A bet-hedging strategy, with bacteria switching between 2 distinct phenotypes, could explain how autolysis and altruism evolved in the gonococcus [37]. It is noteworthy that PCD, which is rare among prokaryotes, has evolved through distinct mechanisms in pathogenic diplococci, which are naturally transformable. The release of DNA on autolysis could allow the transfer of beneficial traits between siblings. PCD might be particularly beneficial for diplococci with a dying cell benefitting their first-degree relatives in immediate proximity, i.e., their nearest and dearest (Fig 4C). 
In summary, our study sheds light on PCD in invasive Neisseria, a fundamental process relevant for gonococcal cell biology; the nap island encodes NapC which has dual activity, triggering autolysis in addition to death of host cells. As well as the direct effect of the Naps, bacterial suicide might also trigger local and systemic inflammation through the release of PAMPs. Thus, the acquisition of the nap island and its success in N. gonorrhoeae and N. meningitidis might have been an important step in the emergence of these species as invasive pathogens, by enabling them to undergo autolysis to manipulate immune responses and the local environment. Materials and methods Bacterial strains and growth Bacterial strains used in this study are listed in S3 Table. N. gonorrhoeae and N. cinerea were grown on GCB agar plates (1.5% wt./vol. proteose peptone number 3, Becton Dickinson, 0.1% starch, 0.4% K2HPO4, 0.1% KH2PO4, 0.5% NaCl, 1% Vitox, Oxoid, 1.5% agar Oxoid) [54] at 37°C with 5% CO2. L. crispatus was grown on MRS agar plates (ATCC medium 416) and incubated at 37°C with 5% CO2 for about 36 h. E. coli was grown on lysogeny broth (LB) agar plates at 37°C. Generation of deletion mutants All primers are shown in S3 Table. For allelic replacement, overlap PCR was performed to obtain resistance cassettes (aph(3)-I or ermC, for kanamycin or erythromycin resistance, respectively) flanked by regions (approximately 1,000 bp) surrounding the target gene. Briefly, overlap PCR consisted in mixing 3 PCR products (“upstream PCR 1” including the START codon of the targeted gene; “downstream PCR 2” including the STOP codon of the targeted gene; “resistance cassette PCR 3”) in equal ratios. 
For markerless deletions, 2 distinct overlap PCR products were generated: (a) a PCR product consisting of homologous regions (“upstream PCR 1” and “downstream PCR 2”) flanking selection and counterselection cassettes controlled by a constitutive promoter (“popaB-kanR-pheS* PCR 3”, S3 Fig); (b) a markerless PCR product consisting of homologous regions directly bound to each other (“upstream PCR 1” and “downstream PCR 2”). N. gonorrhoeae was transformed as previously [54]. For selection erythromycin (0.5 μg ml−1), kanamycin (80 μg ml−1), or 4CP (8 mM) were added to media. Individual transformants were screened by PCR and confirmed by sequencing. Bioinformatic analysis The nap island was identified by MaGe [17] by comparing synteny maps between the reference N. gonorrhoeae FA1090 and N. gonorrhoeae NCCP11945, N. gonorrhoeae FA6140, N. gonorrhoeae 35/02, N. gonorrhoeae PID24-1, N. meningitidis 053442, N. meningitidis FAM18, N. meningitidis MC58, Neisseria lactamica ATCC 23970, N. cinerea ATCC 14685, Neisseria flavescens NRL30031/H210, N. flavescens SK114, Neisseria subflava NJ9703, Neisseria sicca ATCC 29256 and Neisseria mucosa ATCC 25996. Using PubMLST [20], the ORF loci and alleles associated to the nap island were manually curated, as described in https://bigsdb.readthedocs.io/en/latest/curator_guide.html and https://www.youtube.com/watch?v=09g5YdrCtDc. Briefly, using the Neisseria isolates database curator’s interface, the “sequence tags scan” tool allowed us to retrieve alleles in given batches of isolates (see below for isolates selection criteria). New alleles were then validated through sequence alignment and added to the database using Neisseria typing database curator’s interface, “sequences (batch) add” tool. This process was repeated until at least 93.5% of the selected isolates had an allele number attributed to each nap gene. 
To determine gene conservation (Fig 1C), manual strain selection from the isolate collection of each species on PubMLST was done using the following criteria: a complete rMLST to confirm the species, and the number of contigs <500. Gene presence was assessed using “Analysis/Gene Presence” for the different NEIS loci, with pre-set parameters (Min % identity: 70, Min % alignment: 50, BLASTN word size: 20). Gene conservation data are detailed in S2 Table. Reverse transcription followed by quantitative PCR Bacteria were grown in 50 ml of protein- and spermidine-free GW medium [55] in vented flasks (Corning, 431144), by inoculation at OD600nm 0.025 from bacteria grown on GCB plates (T0). RNA was purified from 2 ml of culture pelleted for 4 min at 4,500 rpm and resuspended in TRIzol for 5 min then frozen at −80°C. Tubes were then thawed on ice and mixed with 200 μl of chloroform by vigorous shaking of tubes. After 2 to 3 min incubation at room temperature, tubes were centrifuged at 4°C for 15 min at 12,000 x g and the aqueous phase transferred to 500 μl of ice-cold isopropanol. Tubes were incubated overnight at −20°C, then centrifuged at 4°C for 30 min at 20,000 x g, and pellets were washed with 75% ice cold ethanol, before being resuspended in 80 μl diethylpyrocarbonate-treated H2O. Samples were treated with ezDNase (Invitrogen) and first strand cDNA synthesis was performed with SuperScript IV reverse transcriptase (RT) (Invitrogen) according to manufacturer’s instructions. Note that to obtain specific cDNA, reverse primers (available in S3 Table) were used (alongside the one for the housekeeping gene recA), and a no-RT control was done in parallel. Samples were treated with RNase H for 20 min at 37°C before qPCR was performed using SYBR Green PCR master mix (Applied Biosystems). Primer efficiency was evaluated using serial dilutions to generate a standard curve from PCR products (pre-screen) and mixes of cDNA (to reflect real qPCR conditions). 
The slopes of standard curves and efficiency values for each primer pair were calculated. StepONEPlus real-time PCR software was used to collect RT-qPCR data and the ΔΔCt method [56] was used to analyse the data, using recA Ct values as reference gene and T0 Ct values as reference condition for Fig 2A. Experiments were done with at least 3 biological replicates, each time with duplicate or triplicate technical repeats. mNap synthesis mNaps were synthesised and subjected to HPLC/mass spectrometry by Isca Biochemicals (UK). Note that mNapA was acetylated on its N-terminal side to increase stability. Peptides were diluted in deionised water supplemented with 0.001% trifluoroacetic acid at 10 mg/ml, aliquoted in 10 μl fractions in Protein Lo-bind tubes (Eppendorf), snap frozen with liquid nitrogen, and stored at −80°C. Each aliquot was only used once and immediately upon thawing. Batches of peptides were not stored for longer than 6 months to avoid loss of activity. Disk diffusion and MBC assays Bacteria were freshly harvested from plates and resuspended in GW medium [55] at 10⁸ bacteria/ml then spread on agar plates. L. crispatus was plated on MRS agar plates, while all other bacteria were plated on GW agar plates, prepared by mixing (50:50) 2×-concentrated liquid GW medium (filtered) with warm autoclaved 2% agar in water. Plates were allowed to dry for 10 to 15 min, then 6 mm Whatman paper disks soaked with 10 μl of peptide (50 μM) were added. Polymyxin B (50 μM) and water were used as controls. Plates were incubated for 24 to 48 h at 37°C and 5% CO2 as required. The efficacy of compounds was assessed by the presence of a zone of growth inhibition around disks. All bacteria were harvested from plates and resuspended in FB medium [57]. Peptides were prepared in FB medium as 2×-concentrated solutions at 512 μg/ml and 2-fold dilutions (to 10 μg/ml) were prepared in untreated U-bottom polypropylene 96-well plates (Corning, 3879) to 50 μl per well. 
Penicillin G and polymyxin B were used as controls. Bacteria (50 μl of 10⁵ CFU/ml) were added to each well, then incubated at 37°C with 5% CO2 with shaking at 180 rpm for 24 h, before spotting 10 μl of each well to plates. After 24 h of incubation, MBC values were attributed to the lowest concentration which gave no growth. Co-culture assays Bacteria were harvested from plates, resuspended in FB medium [57], and diluted to 10⁵ CFU (colony-forming unit)/ml. Aliquots of 250 μl of prey bacteria were mixed with 250 μl of either wild-type N. gonorrhoeae FA1090, or the ΔnapRABC mutant, or FB medium alone. Cultures were incubated at 37°C with 5% CO2 with shaking at 180 rpm for 3 and 24 h, before plating to selective agar plates; for N. cinerea, E. coli, and L. crispatus, media were GCB plates supplemented with 0.01% Congo Red [58], LB agar plates, and MRS agar plates, respectively. The number of prey bacteria was normalised to control wells without added gonococci. Growth curves Bacteria were resuspended in liquid medium at OD600 0.025, and 100 μl added to wells of a flat-bottom 96-well plate (Greiner, 655161), and the OD600 monitored with a plate reader (BMG LabTech) at 37°C in 5% CO2 with shaking at 200 rpm. Alternatively, bacteria were resuspended in protein-free spermidine-free GW medium at an OD600 of 0.025, then 90 μl added to wells of untreated U-bottom polypropylene plates (Corning, 3879). After 3 h, 10 μl of peptide was added. Bacterial survival was measured by plating on chocolate agar plates with 5% defibrinated horse blood (E&O labs, PP0100). Autolysis assays Bacteria were resuspended in 1.2 ml liquid GCB at OD540 of ~0.3, then left in cuvettes for 24 h at room temperature (21°C), gently resuspended by pipetting before the OD540 was measured again. 
Additionally, bacteria were grown in liquid GCB (OD600 of ~0.025) for ~4 h to reach mid-exponential growth, pelleted at 4,500 × g for 5 min and resuspended in 700 μl HEPES buffer (50 mM, pH 8.5) to an OD450 of ~0.3 per ml. Aliquots were added to cuvettes prefilled with 300 μl of HEPES buffer (50 mM, pH 8.5). Just after mixing in pre-filled cuvettes, the OD450 was taken (T0, 100%), then again at regular intervals after gentle resuspension. Values are given in percent of initial turbidity. Transmission electron microscopy Samples were prepared as for measuring autolysis in buffer. After 75 min, bacteria were recovered from cuvettes and pelleted at 4,500 × g for 5 min before being resuspended in fixation buffer (2.5% glutaraldehyde, 2% formaldehyde, 0.1 M PIPES buffer (pH 7.2)) and left at room temperature for 1 h then stored at 4°C. After fixation, samples were extensively washed in buffer then pelleted, embedded in agarose and dissected into ≤1 mm³ cubes. Samples were treated with 50 mM glycine in buffer for 15 min then washed ahead of secondary fixation for 1 h at 4°C in 1% osmium tetroxide and 1.5% potassium ferrocyanide in buffer. Samples were washed extensively in water and stained overnight in 0.5% uranyl acetate (aq.) at 4°C. The following day, samples were washed in water then dehydrated step-wise in a series of 30%, 50%, 70%, 80%, 90%, 95%, and 100% ethanol. The dehydrated samples were incubated in a 25% solution of low viscosity resin (Agar) diluted in ethanol, then in a 50% solution overnight. Samples were further infiltrated in 75% and then extensively in 100% resin ahead of embedding and polymerisation. Sections of 90 nm were cut from polymerised sample blocks using a Leica UC7 ultramicrotome, post-stained with lead citrate and imaged on a Thermo Fisher Tecnai T12 TEM at 120 keV (with Gatan OneView CMOS camera). 
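As an illustration of how the turbidity readouts of the autolysis assays can be expressed, the following Python sketch converts a series of OD readings into percent of the initial (T0) value, and computes the percent change between a starting and a final reading, as in the overnight assay in GCB. The function names and the example readings are hypothetical and are not taken from the data or analysis files of this study.

```python
def percent_initial_turbidity(od_readings):
    """Express each OD reading as a percentage of the first (T0) reading."""
    if not od_readings:
        raise ValueError("at least one OD reading is required")
    od_t0 = od_readings[0]
    if od_t0 <= 0:
        raise ValueError("the T0 reading must be positive")
    return [100.0 * od / od_t0 for od in od_readings]


def percent_od_change(od_start, od_end):
    """Percent change in OD between two readings (negative values = drop in turbidity)."""
    if od_start <= 0:
        raise ValueError("the starting OD must be positive")
    return 100.0 * (od_end - od_start) / od_start


if __name__ == "__main__":
    # Hypothetical time course in buffer: T0 followed by later readings.
    print(percent_initial_turbidity([0.30, 0.24, 0.15]))
    # Hypothetical overnight assay: OD before and after 24 h.
    print(percent_od_change(0.30, 0.20))
```

Normalising to the T0 reading of each cuvette makes strains comparable even when their starting turbidities differ slightly.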
Haemolysis assay Human blood (1 ml; K2EDTA; Cambridge Bioscience, United Kingdom) was washed 3 times with 4 ml PBS prior to centrifugation at 700 × g for 8 min as previously [59]; erythrocytes were pelleted at 1,000 × g for 10 min, and diluted to 0.5% v/v. Peptides were added to 96-well V-bottomed polypropylene plates at a starting concentration of 100 μm. Mellitin (2.5 μm) and PBS were added as positive and negative controls. Erythrocytes were incubated at 37°C for 1 h, pelleted at 1,000 × g for 10 min; the OD415 nm of supernatants (60 μl) was read, and results normalised to the controls. LDH release assay THP-1 Dual cells (InvivoGen) were maintained in RPMI1640 medium supplemented with 10% heat inactivated FBS, 25 mM HEPES buffer, 100 U/ml penicillin, and 100 μg/ml streptomycin at 37°C, 5% CO2. Cells were differentiated to monocyte-derived macrophages (MDMs) in 96 plates (105 cells per well in 200 μl media) using 50 nM of phorbol 12-myristate 13-acetate (PMA, MP Biomedicals) for 3 h, followed by 3 days in complete media. When required, cells were activated with LPS from E. coli O111:B4 (100 ng/ml, Sigma) for 1 h before being exposed to mNaps. CytoTox 96 Non-Radioactive Cytotoxicity Assay (Promega) was used to quantify cell death according to the manufacturer’s protocol. After 6 h, 50 μl of culture supernatant was added to an equal volume of Cytotox 96 reagent. Samples were incubated at room temperature protected from light for 30 min. Then, the OD490 using a microplate reader (BMG Labtech PHERAstar FS) and results normalised to controls. Statistical analysis Statistical analyses were performed on GraphPad Prism version 10.0.0, using tests described. Bacterial strains and growth Bacterial strains used in this study are listed in S3 Table. N. gonorrhoeae and N. cinerea were grown on GCB agar plates (1.5% wt./vol. proteose peptone number 3, Becton Dickinson, 0.1% starch, 0.4% K2HPO4, 0.1% KH2PO4, 0.5% NaCl, 1% Vitox, Oxoid, 1.5% agar Oxoid) [54] at 37°C with 5% CO2. 
L. crispatus was grown on MRS agar plates (ATCC medium 416) and incubated at 37°C with 5% CO2 for about 36 h. E. coli was grown on lysogeny broth (LB) agar plates at 37°C. Generation of deletion mutants All primers are shown in S3 Table. For allelic replacement, overlap PCR was performed to obtain resistance cassettes (aph(3)-I or ermC, for kanamycin or erythromycin resistance, respectively) flanked by regions (approximately 1,000 bp) surrounding the target gene. Briefly, overlap PCR consisted in mixing 3 PCR products (“upstream PCR 1” including the START codon of the targeted gene; “downstream PCR 2” including the STOP codon of the targeted gene; “resistance cassette PCR 3”) in equal ratios. For markerless deletions, 2 distinct overlap PCR products were generated: (a) a PCR product consisting of homologous regions (“upstream PCR 1” and “downstream PCR 2”) flanking selection and counterselection cassettes controlled by a constitutive promoter (“popaB-kanR-pheS* PCR 3”, S3 Fig); (b) a markerless PCR product consisting of homologous regions directly bound to each other (“upstream PCR 1” and “downstream PCR 2”). N. gonorrhoeae was transformed as previously [54]. For selection erythromycin (0.5 μg ml−1), kanamycin (80 μg ml−1), or 4CP (8 mM) were added to media. Individual transformants were screened by PCR and confirmed by sequencing. Bioinformatic analysis The nap island was identified by MaGe [17] by comparing synteny maps between the reference N. gonorrhoeae FA1090 and N. gonorrhoeae NCCP11945, N. gonorrhoeae FA6140, N. gonorrhoeae 35/02, N. gonorrhoeae PID24-1, N. meningitidis 053442, N. meningitidis FAM18, N. meningitidis MC58, Neisseria lactamica ATCC 23970, N. cinerea ATCC 14685, Neisseria flavescens NRL30031/H210, N. flavescens SK114, Neisseria subflava NJ9703, Neisseria sicca ATCC 29256 and Neisseria mucosa ATCC 25996. 
Using PubMLST [20], the ORF loci and alleles associated to the nap island were manually curated, as described in https://bigsdb.readthedocs.io/en/latest/curator_guide.html and https://www.youtube.com/watch?v=09g5YdrCtDc. Briefly, using the Neisseria isolates database curator’s interface, the “sequence tags scan” tool allowed us to retrieve alleles in given batches of isolates (see below for isolates selection criteria). New alleles were then validated through sequence alignment and added to the database using Neisseria typing database curator’s interface, “sequences (batch) add” tool. This process was repeated until at least 93.5% of the selected isolates had an allele number attributed to each nap gene. To determine gene conservation (Fig 1C), manual strain selection from the isolate collection of each species on PubMLST was done using the following criteria: a complete rMLST to confirm the species, and the number of contigs <500. Gene presence was assessed using “Analysis/Gene Presence” for the different NEIS loci, with pre-set parameters (Min % identity: 70, Min % alignment: 50, BLASTN word size: 20). Gene conservation data are detailed in S2 Table. Reverse transcription followed by quantitative PCR Bacteria were grown in 50 ml of protein- and spermidine-free GW medium [55] in vented flasks (Corning, 431144), by inoculation at OD600nm 0.025 from bacteria grown on GCB plates (T0). RNA was purified from 2 ml of culture pelleted for 4 min at 4,500 rpm and resuspended in TRIzol for 5 min then frozen at −80°C. Tubes were then thawed on ice and mixed with 200 μl of chloroform by vigorous shaking of tubes. After 2 to 3 min incubation at room temperature, tubes were centrifuged at 4°C for 15 min at 12,000 x g and the aqueous phase transferred to 500 μl of ice-cold isopropanol. 
Tubes were incubated overnight at −20°C, then centrifuged at 4°C for 30 min at 20,000 x g, and pellets were washed with 75% ice cold ethanol, before being resuspended in 80 μl diethylpyrocarbonate-treated H2O. Samples were treated with ezDNase (Invitrogen) and first strand cDNA synthesis was performed with SuperScript IV reverse transcriptase (RT) (Invitrogen) according to the manufacturer’s instructions. Note that to obtain specific cDNA, reverse primers (available in S3 Table) were used (alongside the one for the housekeeping gene recA), and a no-RT control was done in parallel. Samples were treated with RNase H for 20 min at 37°C before qPCR was performed using SYBR Green PCR master mix (Applied Biosystems). Primer efficiency was evaluated using serial dilutions to generate a standard curve from PCR products (pre-screen) and mixes of cDNA (to reflect real qPCR conditions). The slopes of standard curves and efficiency values for each primer pair were calculated. StepOnePlus real-time PCR software was used to collect RT-qPCR data and the ΔΔCt method [56] was used to analyse the data, using recA Ct values as the reference gene and T0 Ct values as the reference condition for Fig 2A. Experiments were done with at least 3 biological replicates, each time with duplicate or triplicate technical repeats. mNap synthesis mNaps were synthesised and subjected to HPLC/mass spectrometry by Isca Biochemicals (UK). Note that mNapA was acetylated on its N-terminal side to increase stability. Peptides were diluted in deionised water supplemented with 0.001% trifluoroacetic acid at 10 mg/ml, aliquoted in 10 μl fractions in Protein Lo-bind tubes (Eppendorf), snap frozen with liquid nitrogen, and stored at −80°C. Each aliquot was only used once and immediately upon thawing. Batches of peptides were not stored for longer than 6 months to avoid loss of activity. 
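The RT-qPCR analysis described above (primer efficiency from standard-curve slopes, then ΔΔCt with recA as reference gene and T0 as reference condition) can be illustrated with a minimal sketch. This is not the authors' analysis script: the function names and any Ct values passed to them are hypothetical, and the fold-change formula assumes the usual 2^(−ΔΔCt) form, which presumes amplification efficiencies close to 100%.

```python
def primer_efficiency(slope: float) -> float:
    """Amplification efficiency from a standard-curve slope.

    The slope is that of Ct plotted against log10(template amount);
    a slope of about -3.32 corresponds to an efficiency of 1.0 (100%).
    """
    return 10 ** (-1.0 / slope) - 1.0


def fold_change_ddct(ct_target: float, ct_ref: float,
                     ct_target_t0: float, ct_ref_t0: float) -> float:
    """Relative expression by the delta-delta-Ct method.

    ct_ref / ct_ref_t0 are Ct values of the reference gene (recA in the text);
    the *_t0 values come from the reference condition (T0 in the text).
    Assumes ~100% primer efficiency (hypothetical helper, not from the source).
    """
    delta_sample = ct_target - ct_ref            # normalise to reference gene
    delta_calibrator = ct_target_t0 - ct_ref_t0  # same for the T0 condition
    return 2 ** -(delta_sample - delta_calibrator)
```

A target gene whose Ct drops by two cycles relative to recA between T0 and the sample would be reported as a 4-fold increase under these assumptions.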
Disk diffusion and MBC assays Bacteria were freshly harvested from plates and resuspended in GW medium [55] at 10⁸ bacteria/ml then spread on agar plates. L. crispatus was plated on MRS agar plates, while all other bacteria were plated on GW agar plates, prepared by mixing (50:50) 2×-concentrated liquid GW medium (filtered) with warm autoclaved 2% agar in water. Plates were allowed to dry for 10 to 15 min, then 6 mm Whatman paper disks soaked with 10 μl of peptide (50 μM) were added. Polymyxin B (50 μM) and water were used as controls. Plates were incubated for 24 to 48 h at 37°C and 5% CO2 as required. The efficacy of compounds was assessed by the presence of a zone of growth inhibition around disks. For MBC assays, all bacteria were harvested from plates and resuspended in FB medium [57]. Peptides were prepared in FB medium as 2×-concentrated solutions at 512 μg/ml and 2-fold dilutions (to 10 μg/ml) were prepared in untreated U-bottom polypropylene 96-well plates (Corning, 3879) to 50 μl per well. Penicillin G and polymyxin B were used as controls. Bacteria (50 μl of 10⁵ CFU/ml) were added to each well, then incubated at 37°C with 5% CO2 with shaking at 180 rpm for 24 h, before spotting 10 μl of each well onto plates. After 24 h of incubation, MBC values were attributed to the lowest concentration which gave no growth. Co-culture assays Bacteria were harvested from plates, resuspended in FB medium [57], and diluted to 10⁵ CFU (colony-forming unit)/ml. Aliquots of 250 μl of prey bacteria were mixed with 250 μl of either wild-type N. gonorrhoeae FA1090, or the ΔnapRABC mutant, or FB medium alone. Cultures were incubated at 37°C with 5% CO2 with shaking at 180 rpm for 3 and 24 h, before plating to selective agar plates; for N. cinerea, E. coli, and L. crispatus, media were GCB plates supplemented with 0.01% Congo Red [58], LB agar plates, and MRS agar plates, respectively. The number of prey bacteria was normalised to control wells without added gonococci. 
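The MBC read-out described above can be sketched in a few lines. This is an illustrative sketch only: the function names and the growth calls in the usage note are hypothetical, and the rule is implemented conservatively, as the lowest concentration for which that well and every more concentrated well gave no growth (an assumption that guards against a single skipped well). Mixing 50 μl of a 2×-concentrated peptide solution with 50 μl of bacteria halves the peptide concentration, so a 512 μg/ml stock corresponds to a top final concentration of 256 μg/ml.

```python
from typing import Dict, List, Optional


def twofold_series(top: float, n_wells: int) -> List[float]:
    """Final concentrations of a 2-fold dilution series, highest first."""
    return [top / 2 ** i for i in range(n_wells)]


def minimum_bactericidal_concentration(growth: Dict[float, bool]) -> Optional[float]:
    """Return the MBC from spot-plating results.

    `growth` maps each final concentration to True if colonies grew on the
    spot plate. Walks down from the highest concentration and stops at the
    first well showing growth; returns None if even the top well grew.
    """
    mbc = None
    for conc in sorted(growth, reverse=True):
        if growth[conc]:
            break
        mbc = conc
    return mbc
```

For example, with final concentrations of 256, 128, 64, 32, 16 and 8 μg/ml and growth only in the three most dilute wells, the function returns 64.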
Growth curves Bacteria were resuspended in liquid medium at OD600 0.025, and 100 μl added to wells of a flat-bottom 96-well plate (Greiner, 655161), and the OD600 monitored with a plate reader (BMG LabTech) at 37°C in 5% CO2 with shaking at 200 rpm. Alternatively, bacteria were resuspended in protein-free spermidine-free GW medium at an OD600 of 0.025, then 90 μl added to wells of untreated U-bottom polypropylene plates (Corning, 3879). After 3 h, 10 μl of peptide was added. Bacterial survival was measured by plating on chocolate agar plates with 5% defibrinated horse blood (E&O labs, PP0100). Autolysis assays Bacteria were resuspended in 1.2 ml liquid GCB at OD540 of ~0.3, then left in cuvettes for 24 h at room temperature (21°C), gently resuspended by pipetting before the OD540 was measured again. Additionally, bacteria were grown in liquid GCB (OD600 of ~0.025) for ~4 h to reach mid-exponential growth, pelleted at 4,500 x g for 5 min and resuspended in 700 μl HEPES buffer (50 mM, pH 8.5) to an OD450 of ~0.3 per ml. Aliquots were added to cuvettes prefilled with 300 μl of HEPES buffer (50 mM, pH 8.5). Just after mixing in pre-filled cuvettes, the OD450 was taken (T0, 100%), then again at regular intervals after gentle resuspending. Values are given in percent of initial turbidity. Transmission electron microscopy Samples were prepared as for the autolysis assay in buffer. After 75 min, bacteria were recovered from cuvettes and pelleted at 4,500 x g for 5 min before being resuspended in fixation buffer (2.5% glutaraldehyde, 2% formaldehyde, 0.1 M PIPES buffer (pH 7.2)) and left at room temperature for 1 h then stored at 4°C. After fixation, samples were extensively washed in buffer then pelleted, embedded in agarose and dissected into ≤1 mm³ cubes. Samples were treated with 50 mM glycine in buffer for 15 min then washed ahead of secondary fixation for 1 h at 4°C in 1% osmium tetroxide and 1.5% potassium ferrocyanide in buffer. 
Samples were washed extensively in water and stained overnight in 0.5% uranyl acetate (aq.) at 4°C. The following day, samples were washed in water then dehydrated step-wise in a series of 30%, 50%, 70%, 80%, 90%, 95%, and 100% ethanol. The dehydrated samples were incubated in a 25% solution of low viscosity resin (Agar) diluted in ethanol, then in a 50% solution overnight. Samples were further infiltrated in 75% and then extensively in 100% resin ahead of embedding and polymerisation. Sections of 90 nm were cut from polymerised sample blocks using a Leica UC7 ultramicrotome, post-stained with lead citrate and imaged on a Thermo Fisher Tecnai T12 TEM at 120 keV (with Gatan OneView CMOS camera). Haemolysis assay Human blood (1 ml; K2EDTA; Cambridge Bioscience, United Kingdom) was washed 3 times with 4 ml PBS prior to centrifugation at 700 × g for 8 min as described previously [59]; erythrocytes were pelleted at 1,000 × g for 10 min, and diluted to 0.5% v/v. Peptides were added to 96-well V-bottomed polypropylene plates at a starting concentration of 100 μM. Melittin (2.5 μM) and PBS were added as positive and negative controls. Erythrocytes were incubated at 37°C for 1 h, pelleted at 1,000 × g for 10 min; the OD415 nm of supernatants (60 μl) was read, and results normalised to the controls. LDH release assay THP-1 Dual cells (InvivoGen) were maintained in RPMI1640 medium supplemented with 10% heat inactivated FBS, 25 mM HEPES buffer, 100 U/ml penicillin, and 100 μg/ml streptomycin at 37°C, 5% CO2. Cells were differentiated to monocyte-derived macrophages (MDMs) in 96-well plates (10⁵ cells per well in 200 μl media) using 50 nM of phorbol 12-myristate 13-acetate (PMA, MP Biomedicals) for 3 h, followed by 3 days in complete media. When required, cells were activated with LPS from E. coli O111:B4 (100 ng/ml, Sigma) for 1 h before being exposed to mNaps. CytoTox 96 Non-Radioactive Cytotoxicity Assay (Promega) was used to quantify cell death according to the manufacturer’s protocol. 
After 6 h, 50 μl of culture supernatant was added to an equal volume of CytoTox 96 reagent. Samples were incubated at room temperature protected from light for 30 min. Then, the OD490 was read using a microplate reader (BMG Labtech PHERAstar FS) and results were normalised to controls. Statistical analysis Statistical analyses were performed on GraphPad Prism version 10.0.0, using tests described. Supporting information S1 Fig. Supplementary information regarding genes of the nap island in Neisseria gonorrhoeae FA1090. DNA and amino acid sequences are given for each nap gene, as well as predicted AlphaFold structures and extra information, such as C39 peptidase cleaving sites, phase variable sequences based on available genomic data (details available in S1 Table) or gene alignments. Note that the alignment between napI and the Lactococcin-G immunity protein lagC was performed after the serendipitous observation that NapP was annotated as a homolog of the “Lactococcin-G-processing and transport ATP-binding protein” LagD from Lactococcus lactis both in the genome of N. gonorrhoeae FA19 (GenBank accession no. CP012026.1, locus tag: VT05_00181) and N. gonorrhoeae 35/02 (GenBank accession no. CP012028.1, locus tag: WX61_01768). https://doi.org/10.1371/journal.pbio.3003001.s001 (PDF) S2 Fig. Nap island organisation in Neisseria meningitidis (Nm) MC58, Nm FAM18, Neisseria lactamica (Nl) 020–06, and Neisseria cinerea (Nc) NCTC10294. In both Nm and Nl strains, NapF sequences correspond to the long NEIS0907 allele, which is found at another locus in N. gonorrhoeae (Ng) (under the gene name NGO_0166 in Ng FA1090). In Nm, chromosomal rearrangements led to the fusion between the 2 NEIS0907-containing loci. Genes in purple correspond to those homologous to the NGO_0166-containing locus in Ng FA1090. In Nl, the presence of the long NEIS0907 allele correlates with the loss of napA. 
Regarding napB, due to the presence of an earlier start codon, it is present as a longer allelic form than in Nm or Ng. As for napI, it is slightly shorter in Nl than in Ng. Note that an extra gene is represented in black in the locus of Nl (NLA_13780, function unknown); a homologous sequence exists in Ng FA1090 but the start codon is not present. As shown in Fig 1C, the locus is absent in Nc. Instead, 3 genes of unknown function (light grey) are present between the flanking NEIS0794 and NEIS0808. ⁑, C39 peptidase predicted cleavage site. Note that for NapB in Nl, ⁑ was not positioned as alternative GG-sites might be at play. Sequences were manually annotated on SnapGene Viewer. https://doi.org/10.1371/journal.pbio.3003001.s002 (PDF) S3 Fig. Construction and characterisation of deletion mutants in N. gonorrhoeae. (A) Growth curves in GCB medium. Markerless deletion strains (plain lines, ML) are compared to resistance marker strains (dotted lines, K for kanamycin resistance marker and E for erythromycin resistance marker). Standard deviations are shown in lighter colours (n = 9). The data underlying this figure can be found in S6 Data. (B) Schematic representation of the classical method to generate deletion mutants in N. gonorrhoeae. Briefly, an overlap PCR product is generated from upstream region (PCR 1), downstream region (PCR 2), and resistance cassette marker (PCR 3) amplifications. The overlap PCR product is then used for natural transformation into N. gonorrhoeae and the mutant strain is selected based on the acquired antibiotic resistance. (C) Schematic representation of the markerless method. Briefly, 2 overlap PCR products are generated from upstream region (PCR 1) and downstream region (PCR 2) as well as, for the first product only, an endogenous promoter (popaB) followed by selection/counter-selection markers (PCR 3) amplifications. Note that the pheS* sequence used here was amplified from the genome of N. 
gonorrhoeae FA1090 itself and point mutations (* = T275S and A318G) were then introduced by overlap PCR. The markerless method consists of first selecting a recombinant colony through kanamycin resistance, and then transforming this colony with the markerless overlap PCR product to allow counter-selection of the pheS* marker. https://doi.org/10.1371/journal.pbio.3003001.s003 (PDF) S4 Fig. Co-culture of prey bacteria with N. gonorrhoeae. Prey bacteria were incubated alongside either wild-type N. gonorrhoeae (WT FA1090, black bars) or the ΔnapRABC strain (blue bars) for 3 or 24 h in FB medium at a 1:1 ratio. Prey bacteria were then recovered selectively on agar plates and CFU/ml were counted. Data were normalised against the recovery of prey bacteria grown without the gonococcus (100%). The data underlying this figure can be found in S7 Data. Multiple paired t tests were performed between each pair (WT vs. ΔnapRABC) with no significant difference in their survival (n = 3, error bars, SD). https://doi.org/10.1371/journal.pbio.3003001.s004 (PDF) S5 Fig. Gene expression of the nap island in N. gonorrhoeae. (A) RT-qPCR on all genes of the nap locus of strain FA1090. Samples were recovered from GW liquid cultures in flasks at different time points, and gene expression was normalised based on samples recovered from plates and resuspended in GW (T0). Error bars represent standard deviation from the mean (n = 3–4). One-way ANOVA was performed on data with minimum 2-fold change (dotted lines) compared to T0 (p < 0.033, *; p < 0.002, **; p < 0.001, ***). Growth curves were performed in parallel (right bottom panel). The data underlying this figure can be found in S8 Data. (B) Predicted ribosome binding sites (RBS) and promoter regions (grey arrows) in the nap locus of strain FA1090. RBS were manually annotated based on the presence of a GGA enriched sequence −3 to −10 of an ATG start codon; promoter regions predicted by BPROM (http://softberry.com). 
(C, D) Pink arrows represent repeats of Correia elements (CE), while blue boxes show integration host factor (IHF) binding sites, as defined previously (10.1016/s0378-1119(01)00725-9 and 10.1016/s0014-5793(02)02882-x), respectively. Annotations were performed with SnapGene. https://doi.org/10.1371/journal.pbio.3003001.s005 (PDF) S6 Fig. Survival assays in the presence of Naps. (A) Synthetic mature peptides (mNap) were tested against N. cinerea CCUG 346T. (B) Synthetic scrambled versions of the mature peptides (mNapSCR) were tested on N. gonorrhoeae FA1090. Bacteria were cultured in 96-well plates in protein-free spermidine-free GW medium for 3 h before peptides were added at various concentrations (arrow). At specific time points (0, 3, 6, 9, 24, 30, 36 h), one well per condition was emptied and serially diluted before plating on chocolate agar plates and incubation overnight. Colony-forming units (CFUs) were then counted. Dotted lines represent the limit of detection. (C) In order to check whether mNaps had a synergistic or competitive effect, they were tested against N. gonorrhoeae FA1090 with a fixed concentration of 2.5 μM each, alone or mixed. Data shown here represent the differences (in log scale) between t 0 and 9 h (exponential phase of growth) and t 9 and 36 h (autolytic phase) (n = 3). No synergistic or competitive effect was observed for any of the peptides. Note that the standard deviation at t 36 h was larger than usual due to recovery on GCB plates instead of chocolate agar (see Materials and methods). The data underlying this figure can be found in S9 Data. https://doi.org/10.1371/journal.pbio.3003001.s006 (PDF) S7 Fig. Autolysis in 50 mM HEPES buffer (pH 8.5). (A) Against N. gonorrhoeae in plain buffer. The data shown here are the mean values of 8 biological replicates, except for ΔnapRABC napI::kanR (n = 4). Standard deviations are shown in transparent corresponding colours. 
One-way ANOVA was performed on data from T = 75 min, with Dunnett’s multiple comparison against the WT values (p < 0.033, *; p < 0.002, **; p < 0.001, ***). (B) Against N. gonorrhoeae in buffer supplemented with 10 mM MgCl2 (n = 3). (C) Against N. cinerea in plain buffer (n = 3). (D) Against N. cinerea in buffer supplemented with 10 mM MgCl2 (n = 3). The data underlying this figure can be found in S10 Data. https://doi.org/10.1371/journal.pbio.3003001.s007 (PDF) S8 Fig. Autolysis in buffer. Transmission electron microscopy images of samples prior to performing autolysis in buffer experiment (T = 0 min). https://doi.org/10.1371/journal.pbio.3003001.s008 (PDF) S9 Fig. Cytotoxicity of Naps on THP-1 cells. LDH release assays were performed in the presence of 1 and 5 μM of each mNap with non-activated or activated THP-1 derived macrophages. No significant cytotoxic activity was detectable for any mNap (one-sample t and Wilcoxon tests); error bars, SD of assays performed in triplicate. The data underlying this figure can be found in S11 Data. https://doi.org/10.1371/journal.pbio.3003001.s009 (PDF) S1 Table. Predicted sequences of NapC and NapR. https://doi.org/10.1371/journal.pbio.3003001.s010 (XLSX) S2 Table. Nap genes in Neisseria spp. https://doi.org/10.1371/journal.pbio.3003001.s011 (XLSX) S3 Table. Strains and primers used in this study. https://doi.org/10.1371/journal.pbio.3003001.s012 (XLSX) S1 Data. Disc diffusion assays of antimicrobials and Naps against bacteria as indicated. https://doi.org/10.1371/journal.pbio.3003001.s013 (XLSX) S2 Data. Results of MICs in μg/ml. Strains are indicated. https://doi.org/10.1371/journal.pbio.3003001.s014 (XLSX) S3 Data. Effect of mNaps on survival of wild-type N. gonorrhoeae. https://doi.org/10.1371/journal.pbio.3003001.s015 (XLSX) S4 Data. Effect of mNaps on survival of N. gonorrhoeae. Strains are indicated. https://doi.org/10.1371/journal.pbio.3003001.s016 (XLSX) S5 Data. Effect of mNaps on lysis of RBCs. 
https://doi.org/10.1371/journal.pbio.3003001.s017 (XLSX) S6 Data. Growth curves in GCB medium. Strains and values of OD600 are indicated. https://doi.org/10.1371/journal.pbio.3003001.s018 (XLSX) S7 Data. Results of co-culture of prey bacteria with N. gonorrhoeae. Strains are indicated, and survival is shown as CFU/ml. https://doi.org/10.1371/journal.pbio.3003001.s019 (XLSX) S8 Data. Gene expression of the nap island in N. gonorrhoeae. RT-qPCR of expression of genes (indicated) in wild-type N. gonorrhoeae FA1090. https://doi.org/10.1371/journal.pbio.3003001.s020 (XLSX) S9 Data. Survival assays in the presence of individual and combinations of Naps. Survival is shown as CFU/ml, with the folders corresponding to the panels in S6 Fig. https://doi.org/10.1371/journal.pbio.3003001.s021 (XLSX) S10 Data. Autolysis in 50 mM HEPES buffer. Results show the OD540 of bacteria in buffers as indicated. The folders correspond to the panels in S7 Fig. https://doi.org/10.1371/journal.pbio.3003001.s022 (XLSX) S11 Data. Cytotoxicity of Naps on THP-1 cells. Results of LDH release (measured by the OD490 of assays on cell supernatants) from THP-1 cells in the presence of individual mNaps. Lysis buffer shows the LDH release from 100% of cells. https://doi.org/10.1371/journal.pbio.3003001.s023 (XLSX)

journal article

Open Access Collection

Neural development goes retro: Gags as essential modulators of synapse formation

Chang, Yung-Heng;Dubnau, Josh

2025 PLoS Biology

doi: 10.1371/journal.pbio.3003032 pmid: 39970178

The fly larval neuromuscular junction (NMJ) is a powerful model to investigate conserved underpinnings of synapse formation and synaptic transmission, as well as mechanisms of functional and structural plasticity that maintain homeostasis as the size of the NMJ increases during development. In this issue of PLOS Biology, M’Angale and colleagues describe a surprising and novel role for retrotransposable elements (RTEs) in synaptic development and plasticity at the larval NMJ [1]. Their findings identify a mechanistic push and pull in which the Copia RTE and the RTE-derived Arc protein act antagonistically to balance the rate of synaptic bouton formation. RTEs are a subtype of transposable element that replicate through an RNA intermediate. RTEs encode a reverse transcriptase, which synthesizes cDNA from the RTE encoded RNA, as well as protein machinery to re-insert the cDNA copies at de novo chromosome locations. The long-terminal repeat (LTR) subtype of RTEs are evolutionary ancestors of retroviruses [2,3] and share many of the features of retroviral replication, including the fact that they encode a gag protein that assembles a viral-like capsid, bound to its own RNA. Sequences derived from RTEs make up a vast fraction of the genomes of most plants and animals. Because expression of RTEs has the potential to cause mutations, genomic instability, and activation of inflammatory signaling, organisms have invested heavily in mechanisms to suppress their expression. There is accumulating evidence that when these silencing systems fail, RTEs contribute to diseases such as cancer and neurodegeneration, as well as to the normal aging process [4,5]. Most genomic copies of RTEs have degenerated over time, and are no longer capable of retrotransposition, but many of the RTE proteins and regulatory regions have been exapted to play critical cellular roles [6]. 
An example of this is the Activity Regulated Cytoskeleton associated protein (Arc), a well-known immediate early gene that plays fundamental roles in synaptic plasticity [7]. It was previously demonstrated that Arc genes of Drosophila and tetrapods are derived from exapted gag genes from an ancient and no longer functional Ty3/gypsy family of LTR-RTEs. Importantly, this appears to be an example of convergent evolution: the fly and tetrapod Arc genes were independently exapted from ancient RTE encoded gags. This suggests that in both cases, evolutionarily ancient properties of gag were independently coopted in distant animal lineages [8,9]. Formation of RNA-bound capsids is a fundamental property of RTE encoded gag, which in the case of Arc appears to provide a means for intercellular (across the synapse) transfer of RNA. Indeed, both the fly and mammalian Arc genes retain structural and functional characteristics of RTE and retroviral encoded gag, including the ability to form virus-like capsids, bound to the Arc encoded mRNA. In both mouse and fly, Arc-bound mRNA is released in extracellular vesicles (EVs) which cross the synapse. After transfer across the synapse, the Arc mRNA undergoes activity-regulated translation [8,9]. In flies, Arc has also been shown to promote NMJ structural plasticity, which converges with prior findings on Arc in mammalian synaptic plasticity [9]. The discovery that PNMA2, another Ty3 gag derived gene, has also been coopted for intercellular communication [10], and that membrane targeting of the zebrafish Bik-2 gag is required for neural crest cell migration [11], adds credence to the idea RTE-encoded gags have repeatedly been adapted for intercellular communication. In this issue of PLOS Biology, M’Angale and colleagues report the surprising finding that Copiagag, one of the proteins encoded by the fully active Copia LTR-RTE, plays an essential role in NMJ structural plasticity. 
Copiagag is essential for Copia replication, and the authors found that purified Copiagag can self-assemble to form virus-like capsids. Using cryo-electron microscopy (cryo-EM), they demonstrate that the nucleocapsid (NC) region, which interacts with RNA, is located deep within the interior of the Copia capsid. The N- and C-terminal domains of the Gag-CA region form the outer shield of the capsid, showing structural similarity to capsids from retroviruses. The authors demonstrate that presence of RNA is crucial for Copia capsid assembly, consistent with its known functional role in Copia retrotransposition. Like Arc, Copiagag is also expressed abundantly in the larval central nervous system and body wall muscles. At the NMJ, Copiagag is detected in both presynaptic neurons and postsynaptic muscle cells. To investigate the source of Copiagag, the authors knock down Copia in either pre- or postsynaptic cells, combined with immunostaining to assess the levels of Copiagag present at NMJ. These experiments reveal that at least some of the postsynaptic Copiagag derives from Copia mRNA that is expressed in presynaptic neurons, indicating transfer across the synapse. The Thomson lab previously reported that Copia RNA is loaded into EVs derived from Drosophila S2 cells [8]. They now are able to detect Copia transcripts associated with Copiagag capsids in EVs extracted from larval CNS and body wall muscle as well. Together, the findings are consistent with the interpretation that as with Arc, Copiagag bound Copia RNA may be transferred across the NMJ synapse via EVs, leading to translation of the RNA cargo postsynaptically. Investigation of the functional roles of Arc and Copiagag indicates that they antagonistically modulate the number of synaptic boutons (Fig 1). Fig 1. Copia and Arc interactions at the NMJ. 
In the presynaptic neurons, Copia and Arc RNAs antagonize each other’s expression at the transcriptional level. Presynaptic Copia and Arc RNAs are encapsulated by their own RNA-encoded Gag proteins to form virus-like capsids, which are subsequently released by the presynaptic neurons into the EVs to cross the NMJ. The capsids of Copia and Arc in EVs are then engulfed by the postsynaptic cells, where the virus-like capsids are disassembled to release the embedded RNAs and Gag proteins. The balance of released Copia and Arc RNA and Gag proteins impacts the rate of synapse formation. https://doi.org/10.1371/journal.pbio.3003032.g001 The Thomson group previously demonstrated that knockout of the fly Arc1 gene leads to a decrease in bouton number [8]. Here, they found that Copia knockdown results in the opposite effect, an increase in the number of synaptic boutons formed. Copia knockdown is also sufficient to sensitize the NMJ to neural activity so that greater structural plasticity is seen from less stimulation. Copia knockdown is also sufficient to ameliorate the loss of boutons seen in Arc1 mutant animals, indicating that Copia’s effects on plasticity are manifested in the absence of Arc protein. These findings indicate that Copia and Arc1 provide a brake and an accelerator on synapse formation, respectively. And yet unlike the brake and accelerator of a car, the opposing effects of these 2 gag proteins are not entirely independent. Rather, they are mutually antagonistic because Copia knockdown leads to increased levels of Arc1 and knocking out Arc1 causes increased Copia expression. Capsid immunoprecipitation experiments demonstrate that although Copia RNA is only detected within Copia capsids, the Arc RNA can be loaded into either Copia or Arc capsids, suggesting a more complex interplay whose impacts are yet to be elucidated. 
The discovery of an essential role in neural development for the gag of Copia raises obvious questions about how the functional requirements for Copiagag at the NMJ are balanced against the genotoxic potential from expressing an active and functional LTR-retrotransposon. As more cases of mutualism between active RTEs and the host emerge, this question will need to be investigated. But the repeated examples where gag proteins from LTR retrotransposons have been adapted for use in intercellular trafficking of RNAs is unexpected for another reason [1,8–10]. Phylogenetic analyses of reverse transcriptase sequences indicate that all retroviruses have descended from mammalian LTR-RTEs [2,3]. The emergence of retroviruses from LTR-RTEs involves a change from a strictly intracellular replication cycle of the RTE, in which targeting of capsid assembly is geared towards nuclear entry, to an obligate intercellular replication cycle of a virus. The emergence of retroviruses from their ancestral LTR-RTE cousins required a change in targeting of gag to assemble a virus particle at the plasma membrane for release. Perhaps the repeated adaptation of RTE gag for intercellular trafficking points to a shared evolutionary path with the emergence of retroviruses. Has the host coopted a viral property of gag? Or have retroviruses coopted a property of host gag genes? Acknowledgments We thank Jonathan Nelson for helpful comments on the manuscript.

journal article

Open Access Collection

GitHub enables collaborative and reproducible laboratory research

Chen, Katharine Y.;Toro-Moreno, Maria;Subramaniam, Arvind Rasi

2025 PLoS Biology

doi: 10.1371/journal.pbio.3003029pmid: 39951468

Introduction Laboratory research is a complex, collaborative process that involves several stages, including hypothesis formulation, experimental design, data generation and analysis, and manuscript writing. Although reproducibility and data sharing are increasingly prioritized at the publication stage, integrating these principles at earlier stages of laboratory research has been hampered by the lack of broadly applicable solutions. Lab notebooks are the most common media used to document research, but they are typically only used for recording methods and data. Electronic lab notebooks, despite their popularity, are stored in proprietary formats, incur a recurrent cost, tend to become defunct over time, and have poor interoperability with each other [1]. Cloud-based tools like Google Docs and Dropbox allow sharing of data and documents, but do not provide a structured way to track changes over time or record project-related communication. Email and messaging tools such as Slack and Microsoft Teams facilitate informal discussion of ideas and data, but these are poorly suited for organizing data and discussion in a reproducible manner. Consequently, research information often becomes fragmented across multiple platforms. Here, we introduce GitHub as a software platform that can overcome these limitations, and be used across all stages of laboratory research. GitHub for laboratory research The process of software development bears several similarities to activities in laboratory research; it involves iterative problem-solving, where hypotheses are broken into smaller, testable components, implemented through code, analyzed, and refined as needed. The need to document and share all stages of software development has led to tools and workflows that ensure reproducibility and enable seamless collaboration. 
Many of these tools and common workflows associated with software development are implemented in GitHub, an online platform where people can store, organize, and share their projects. In the scientific community, GitHub is used to share data analysis workflows after publication [2,3], develop and share computational tools [4], perform individual record keeping [5,6], and conduct open science and collaborative projects [7–11]. However, how the standard workflows and features of GitHub can be adapted to improve reproducibility and collaboration within a traditional laboratory research group has not been explored. We outline below a three-part approach for incorporating the GitHub ecosystem into laboratory research workflows (Fig 1). For a more detailed guide on implementing this approach in a molecular biology laboratory, see the full preprint [12]. In addition, an example GitHub repository based on this approach is available at https://github.com/rasilab/github_demo and a template repository that can be copied is available at https://github.com/rasilab/github_template. Fig 1. Features of GitHub that are useful for laboratory research and how they compare to software development. https://doi.org/10.1371/journal.pbio.3003029.g001 Designing and organizing experiments using “issues” Software developers use “issues” on GitHub to track tasks, problems, or ideas related to their project. Each issue serves as a to-do item where team members can describe the problem, propose solutions, and discuss progress in one place. In our research group, we use this feature as an interface to organize and collaborate on all aspects of a laboratory experiment (see preprint [12] for example screenshots). Each experiment begins with the creation of a new issue in the corresponding project repository by any of the project members. 
The issue is initially used to outline the rationale and background of the experiment and the strategy for performing the experiment. Project members can discuss aspects of experimental design, provide clarification in the comments section, and update the issue description as needed. During the experiment, we use the comments section to discuss troubleshooting steps, intermediate data and figures, and interpretation of results. Once the experiment concludes, we update the issue with key results, figures, and conclusions, turning it into a concise summary of the experiment. Thus, each issue functions as a “gist” of the experiment, easily accessible to all collaborators. The issue number provides a convenient way to reference the experiment across physical samples, work logs, computer file names, and discussions in other issues. GitHub provides a number of useful features such as labels, assignees, milestones, and project boards to organize and prioritize issues within a project and across projects.

Documenting experiments and data analyses with version control

Git is a version control system that records the history of file additions and modifications in a folder, and is used by programmers to track changes to their code. In our research group, we store all files relevant to a project within a single folder on our local computers. We use Git to track changes in that folder, and synchronize it with a cloud-based GitHub repository. We write documents in plain text to enable interoperability across different software and platforms, and to facilitate version control with Git. Within each repository, we use standardized subfolder names for lab notebook entries, code, data, manuscripts, grants, and presentations (Fig 2). We record all work pertinent to an issue in lab notebook files, similar to traditional lab notebook entries.
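As an illustration of how issue numbers and standardized subfolders can tie files back to an experiment, the following Python sketch scaffolds a project folder and builds an issue-tagged notebook file name. The subfolder names, the file-name pattern, and the function names are hypothetical choices made for this example; the text specifies only the categories of subfolders and that file names carry the issue number.

```python
from pathlib import Path
import tempfile

# Hypothetical subfolder names: the text lists the categories, not the exact names.
SUBFOLDERS = ["notebook", "code", "data", "manuscripts", "grants", "presentations"]


def scaffold_project(root):
    """Create the standardized subfolders inside a project folder."""
    created = []
    for name in SUBFOLDERS:
        folder = Path(root) / name
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


def notebook_filename(issue_number, description):
    """Build a plain-text notebook file name that carries the issue number.

    The 'issueNNN_description.md' pattern is an assumption for illustration.
    """
    slug = "-".join(description.lower().split())
    return f"issue{issue_number:03d}_{slug}.md"


def notebook_header(repo_url, issue_number, title):
    """First lines of a notebook entry, linking back to the GitHub issue."""
    return f"# {title}\n\nIssue: {repo_url}/issues/{issue_number}\n"


# Usage with a temporary folder and a hypothetical issue number.
with tempfile.TemporaryDirectory() as tmp:
    folders = scaffold_project(tmp)
    entry = Path(tmp) / "notebook" / notebook_filename(12, "Western blot optimization")
    entry.write_text(
        notebook_header("https://github.com/rasilab/github_demo", 12, "Western blot optimization")
    )
    print(entry.name)
```

Keeping the issue number at the start of the file name means that a plain directory listing groups all files belonging to one experiment.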
Each lab notebook file includes the corresponding issue number in its name and a link to the issue in its contents to enable easy cross-referencing.

Fig 2. Examples of folder structures and container-related commands and files for reproducible research. A GitHub repository is a cloud-based folder where you can store your code, files, and each file’s version history. A Dockerfile is a simple set of instructions that describe how to set up a software environment on a specific operating system. Docker commands are used to manage Docker containers and images on the local computer. https://doi.org/10.1371/journal.pbio.3003029.g002

Ensuring reproducible software environments with containerized packages

Replicating data analysis workflows and software environments is a common challenge in both software development and laboratory research. Software containers are portable environments that package all the necessary software, libraries, and dependencies for an analysis, ensuring it runs consistently across different computers [13]. In our research group, we use software containers to perform all data analyses and writing tasks in reproducible software environments. Public container registries, such as Docker Hub and BioContainers [14], offer ready-made containers that can be used without installing software, simplifying data analysis. For custom containers, we take advantage of the Packages feature of GitHub to host our containers in a centralized location that is free to use and publicly accessible. Each container in our group’s GitHub Packages collection is linked to a dedicated GitHub repository to store the recipe for creating that container. Our group uses containers in several ways for interactive data analyses, writing tasks, and complex bioinformatic workflows.
Containers in our group’s GitHub Packages can also be used by external collaborators and readers of our published manuscripts to reproduce data analyses.

Benefits of GitHub for “wet” lab research

We recognize that adopting the approach outlined here may involve a steep learning curve, particularly for laboratory research groups with limited computational experience. However, we have provided example repositories, tutorials, and templates to assist with this, and we believe the following benefits make the transition more manageable and outweigh the initial effort, particularly for young labs that are still establishing their workflows: (1) Git and GitHub have comprehensive and user-friendly documentation (see resources above). (2) The workflow and features described here are highly modular and can therefore be incrementally adopted. For example, wet lab teams can start with using GitHub Issues to discuss ideas and experimental design in a structured manner. (3) Researchers can then learn to use Git and GitHub to record their work and results. Postdocs transitioning into faculty jobs can start by building their lab website using GitHub. (4) The features described here are part of the free GitHub tier, and can be used by any research group regardless of their size, funding level, or institutional affiliation.
(5) Git and GitHub are widely used in both academia and industry, and thus the organization and documentation practices we describe are highly transferrable skills for trainees.

Limitations of GitHub

Our approach does not directly address data storage since GitHub is not suitable for storing large data sets. While we provide some solutions in our preprint [12] (see “Use Git to store and track your work”), data storage solutions are ultimately lab- and data type-specific and beyond the scope of this article. Further, GitHub is not suitable for storing sensitive data, as it might breach institutional guidelines. Platforms similar to GitHub such as GitLab and Bitbucket might be more suitable for certain labs to meet their privacy or hosting requirements. GitHub private repositories allow fine-grained access control, but researchers should be aware that information stored on GitHub might be used for training large machine-learning models. Despite these limitations, we find that GitHub can serve as an effective platform for improving reproducibility and collaboration in many wet lab research scenarios.

Conclusions

Here, we have introduced GitHub and highlighted how this platform can be effectively used to support laboratory research. We have adopted widely used features from software development workflows, such as issues, version control, and containers, and adapted them to the specific needs of a molecular biology laboratory. The versatility, scalability, and affordability of this approach make it suitable for various scenarios, ranging from small research groups to large, cross-institutional collaborations. Adopting this framework from a project’s outset can increase the efficiency and fidelity of knowledge transfer within and across research laboratories.

journal article

Open Access Collection

HCNetlas: A reference database of human cell type-specific gene networks to aid disease genetic analyses

Yu, Jiwon;Cha, Junha;Koh, Geon;Lee, Insuk

2025 PLoS Biology

doi: 10.1371/journal.pbio.3002702 pmid: 39908239

Introduction

Human tissues comprise a mosaic of cell types, each with distinctive functional roles that affect how genes associated with diseases contribute to their onset and progression. Grasping how specific cell types influence the action of disease-related genes is a complex and as-yet unresolved aspect of human genetics [1,2]. The Human Cell Atlas (HCA) project [3], which compiles extensive single-cell RNA-sequencing (scRNA-seq) data from healthy tissues, seeks to illuminate the relationship between cell types and disease through gene expression profiles [4]. However, the role of a disease gene within a specific cell type often extends beyond expression levels to its position and influence within a gene network, known as network centrality. To address this, we need a network-focused approach for mapping disease genes to their functional landscapes within specific cell types. In response, we developed scHumanNet [5], which uses HumanNet [6] as the foundational interactome and refines its connections based on cell-to-cell variation in gene expression [7]. This framework allows us to discern the network topologies of disease genes by contrasting cell type-specific gene networks (CGNs) [8] from healthy versus diseased tissues, leading to the identification of the cell types wherein disease genes are influential. Typically, this involves creating CGNs for both healthy and diseased tissue samples. Identifying altered cellular states in diseases typically requires comparisons with matched control samples, which entail additional cost and effort and are sometimes unavailable. This challenge can be mitigated if there is access to a comprehensive collection of reference cells from healthy individuals, like a cell atlas. Such a resource could potentially eliminate the need to generate control samples.
In a similar vein, a network atlas that offers reference CGNs for a diverse array of cell and tissue types from healthy individuals could significantly streamline the investigation of disease-state network alterations. Moreover, while individual gene expression values are prone to aggregate according to technical origin across batches, confounding biological variation, the inference of gene associations from an adequate number of cells is generally less affected by such variation, especially when a large number of samples (e.g., cells) is used [9]. This is because co-expression signals are intrinsically normalized within each batch, making them more reliable for network comparison. Therefore, we propose that an integrated collection of CGNs, derived from a cell atlas, would constitute a robust framework for cell type-specific analysis of disease genes. This approach would circumvent the need for matched control samples, offering a more efficient route to understanding disease mechanisms. Here, we introduce HCNetlas (human cell network atlas), a collection of reference CGNs from a wide array of healthy tissue cells, which parallels the HCA in its potential impact on disease research. By utilizing these reference CGNs, it becomes feasible to uncover associations between disease genes and specific cell types, relying solely on the availability of disease sample data. This approach also bypasses the necessity to infer CGNs from matched control samples, streamlining the process of identifying disease-specific gene interactions within a specific cell type. HCNetlas currently includes 198 CGNs covering 61 cell types across 25 tissues (Table 1). We clustered the CGNs based on their disease profiles and observed the formation of groups comprising similar cell types. This clustering pattern indicates the potential of these CGNs to effectively resolve the cell type specificity of disease gene functions.
Additionally, we implemented 3 network-based methods for assessing the cell type-specific functions of disease genes. Utilizing this analytical framework on both reference CGNs and those derived from diseased tissues enabled us to pinpoint cell types implicated in various diseases, thereby validating the effectiveness of our approach in cell type-resolved disease genetics. Consequently, HCNetlas holds great promise in expediting the discovery of biomarkers and therapeutic targets that are specifically tailored to the cellular context of disease genes, offering a refined perspective on the intricate web of cell type-specific gene actions in human diseases.

Table 1. Tissue abbreviations used in this study. https://doi.org/10.1371/journal.pbio.3002702.t001

Methods

Single-cell and single-nucleus transcriptomic data used for HCNetlas construction

We employed both single-cell and single-nucleus RNA sequencing (scRNA-seq and snRNA-seq) data from multiple sources to reconstruct HCNetlas CGNs. We acquired scRNA-seq data for 329,762 immune cells spanning 16 tissues from 12 deceased donors in a cross-tissue immune cell atlas [10]. We used the author-provided cell type labels, which were initially annotated by CellTypist and subsequently underwent manual curation. Additionally, we integrated data from Tabula Sapiens [11], which included the human transcriptome reference for 249,961 immune cells across 24 tissues from 15 donors. For constructing brain CGNs, we utilized snRNA-seq data from the Allen Brain Atlas (http://www.brain-map.org) [12], derived from 76,533 nuclei in the primary motor cortex (M1) and 166,868 nuclei in the middle temporal gyrus (MTG) using the 10× Genomics Chromium platform. All data were processed through alignment, quantification by Cell Ranger, and cell type annotation. We harmonized pre-annotated cell types across all data sets and applied scHumanNet (v. 1.0.0) [5] to the single-cell transcriptomic data to generate CGNs for various tissues and cell types. Briefly, scHumanNet takes a single-cell transcriptome count matrix for a specific cell type, followed by data imputation, transformation, and normalization. It then produces a transformed gene activity score matrix and filters the reference interactome, HumanNet, for gene pairs in the matrix that pass a minimum activity score, creating a CGN. The edge weight scores are inherited from the reference interactome. scHumanNet has been shown to be a superior network inference model compared to other single-cell based methods. To construct CGNs for major cell types, we aggregated sub-cell types into broader categories, such as T cells, B cells, and myeloid cells. In total, 198 CGNs were generated, encompassing 25 tissues and 61 cell types.

Evaluating the cell type-specificity of CGNs

We performed dimension reduction analysis with uniform manifold approximation and projection (UMAP) to visualize interrelationships among HCNetlas CGNs. Utilizing scHumanNet for CGN construction, which references the HumanNet [6] comprising 18,593 human genes, we generated binary profile vectors indicating the presence (1) or absence (0) of each gene in the network for every CGN. For visualizations, we employed the UMAP package in R (v. 0.2.10), setting min_dist to 0.5 to balance the trade-off between local and global structure in the data. To determine the cell type-specific functionality of the HCNetlas CGNs, we explored the enrichment of cell type-associated genes, particularly for B cells and T cells. We collated cell type-associated genes from 3 authoritative databases: Gene Ontology biological process (GOBP), CellMarker, and the Azimuth cell type database [13–15].
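The binary gene-presence profiles described above, which serve as the input to the UMAP embedding, can be sketched in Python as follows. This is a minimal illustration rather than the authors' R code: the function name, the representation of a network as a list of gene pairs, and the toy edges are assumptions made for the example, and the UMAP step itself is not shown.

```python
def binary_profiles(reference_genes, networks):
    """Return one 0/1 vector per network, ordered like reference_genes.

    reference_genes: list of gene symbols in the reference interactome.
    networks: dict mapping a network name to a list of (gene_a, gene_b) edges.
    A gene is 'present' in a network if it appears in at least one edge.
    """
    profiles = {}
    for name, edges in networks.items():
        present = {gene for edge in edges for gene in edge}
        profiles[name] = [1 if gene in present else 0 for gene in reference_genes]
    return profiles


# Toy example: the edges below are made up for illustration only.
reference = ["CD19", "CD79A", "CD3E", "CD3D", "LYZ"]
toy_networks = {
    "B_cell": [("CD19", "CD79A")],
    "T_cell": [("CD3E", "CD3D")],
}
profiles = binary_profiles(reference, toy_networks)
print(profiles["B_cell"])  # [1, 1, 0, 0, 0]
print(profiles["T_cell"])  # [0, 0, 1, 1, 0]
```

Each vector has one entry per reference gene, so networks of very different sizes become directly comparable rows of a single matrix.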
We postulated that genes functionally connected within a CGN reflect the properties of their respective cell type; we therefore considered the interconnectivity among the genes associated with each cell type as a measure of cell type specificity. Additionally, we identified hub genes within each CGN, which are likely to play pivotal roles in the function of their corresponding cell type. We profiled the top 15 hub genes, ranked by degree centrality, for each major cell type across various tissues using the FindAllHub() function of scHumanNet. The FindAllHub() function randomly shuffles the constructed network and creates a null distribution of centrality values. This enables the calculation of statistical significance for the hubness of a gene of interest. This profiling helped to ascertain the relative importance of these hub genes within the network.

Overall assessment of the association between CGNs and diseases

We assessed whether HCNetlas CGNs can discern connections between diseases and cell types by profiling CGNs with disease-association scores. Within each CGN, we ranked the 18,593 genes from the HumanNet reference interactome by degree centrality, utilizing the GetCentrality() function of scHumanNet. Disease gene sets, totaling 5,763, were compiled from DisGeNET [16] and GWAS Catalog [17] for the analysis. Assessment of disease gene set association with each CGN was performed with ssGSEA (v. 2.0) [18] and GSVA [19], the latter via the gsva() function from the GSVA package (v. 1.38.2). Hierarchical clustering of disease gene sets and CGNs was performed using the R package ComplexHeatmap, with the default parameters, where cluster.rows and cluster.cols were both set to TRUE. Furthermore, we conducted differential compactness analysis on the HCNetlas CGNs using the same disease gene sets. For each gene set, we calculated the within-group connectivity across all 198 networks to gauge network compactness.
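As a rough illustration of the compactness measure, the Python sketch below counts the edges whose two endpoints both belong to a gene set and rescales that count by network size (division by the number of nodes, multiplication by 10,000). Treating within-group connectivity as a simple edge count is an assumption made here, and the function names are hypothetical; this is not the implementation of the scHumanNet Compactness() function.

```python
def within_group_connectivity(edges, gene_set):
    """Count edges that connect two members of the gene set (assumed definition)."""
    members = set(gene_set)
    return sum(1 for a, b in edges if a in members and b in members)


def normalized_compactness(edges, gene_set, scale=10_000):
    """Within-group connectivity divided by the number of network nodes, times scale."""
    nodes = {gene for edge in edges for gene in edge}
    if not nodes:
        return 0.0
    return within_group_connectivity(edges, gene_set) / len(nodes) * scale


# Toy network with 4 nodes; 3 of its 4 edges lie within the gene set {A, B, C}.
toy_edges = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "C")]
print(normalized_compactness(toy_edges, {"A", "B", "C"}))  # 7500.0
```

Dividing by the node count keeps a large network from appearing more compact simply because it contains more genes.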
To accommodate variations in network size, we normalized the within-group connectivity by the number of nodes in each network and then scaled these normalized values by multiplying them by 10,000 to ensure a consistent basis for comparison across all networks.

Cell type-resolved genetic analysis of systemic lupus erythematosus (SLE) with HCNetlas

We acquired scRNA-seq data for peripheral blood mononuclear cells (PBMCs) from 41 patients with SLE and 15 healthy controls as reported by Nehar-Belaid and colleagues [20]. To ensure a consistent analysis, we excluded 2 SLE patient samples lacking SLEDAI scores. Following quality control measures, including the removal of doublets using the DoubletFinder (v. 2.0.0) [21] package and the exclusion of cells with fewer than 400 transcripts or over 5% mitochondrial gene content, the data set was narrowed to approximately 276,000 cells. After normalization and scaling with the Seurat package (v. 4.1.1), we identified 3,000 variable genes using the vst() and FindVariableFeatures() functions. Batch effects were mitigated by applying principal component analysis (PCA) and Harmony (v. 1.0) [22] (dims = 40), and cellular clustering was performed using the Louvain method (resolution = 1.5), followed by UMAP visualization with 40 dimensions. Cell types were manually annotated using canonical markers after optimizing the number of principal components and clustering resolution. For constructing SLE-specific CGNs, we focused on the curated data from SLE patients. We built networks for B cells, CD4+ T cells, CD8+ T cells, myeloid cells, and NK cells. Using the Compactness() function, we performed differential compactness analysis. We referenced 184 SLE-associated genes from KEGG pathway (I05322) and KEGG disease (H00080) databases, comparing connectivity within disease CGNs and HCNetlas CGNs, and visualized the networks in Cytoscape (v. 3.9.1) [23].
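The cell-level quality-control thresholds given above (exclusion of cells with fewer than 400 transcripts or over 5% mitochondrial gene content) can be sketched in Python as follows. The dictionary field names and function names are hypothetical, whether "transcripts" refers to transcript counts or detected genes is left as stated in the text, and doublet removal is not shown.

```python
def passes_qc(n_transcripts, mito_fraction, min_transcripts=400, max_mito_fraction=0.05):
    """Keep a cell unless it has fewer than 400 transcripts or over 5% mitochondrial content."""
    return n_transcripts >= min_transcripts and mito_fraction <= max_mito_fraction


def filter_cells(cells):
    """Return only the cells that pass both thresholds."""
    return [c for c in cells if passes_qc(c["n_transcripts"], c["mito_fraction"])]


# Toy cells; values are made up to exercise each threshold and its boundary.
toy_cells = [
    {"id": "c1", "n_transcripts": 1200, "mito_fraction": 0.02},  # kept
    {"id": "c2", "n_transcripts": 350, "mito_fraction": 0.01},   # too few transcripts
    {"id": "c3", "n_transcripts": 900, "mito_fraction": 0.08},   # too much mitochondrial content
    {"id": "c4", "n_transcripts": 400, "mito_fraction": 0.05},   # exactly at both limits, kept
]
kept_ids = [c["id"] for c in filter_cells(toy_cells)]
print(kept_ids)  # ['c1', 'c4']
```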
Node centrality within these networks was computed using the GetCentrality() function from the scHumanNet package. We compared the percentile ranks of centrality for disease CGNs against the reference CGNs of HCNetlas using the DiffPR.HCNetlas() function. Genes showing differential hubness were pinpointed with FindDiffHub.HCNetlas(), with significance defined by a q-value < 0.05 after Benjamini–Hochberg correction to control false discovery rate (FDR). We compiled interferon-stimulated genes (ISGs) from hallmark gene sets of the Molecular Signatures Database (MSigDB) and the Immunological Genome Project (ImmGen) [24], resulting in a total of 423 ISGs. The efficacy of prediction for ISGs by hubness within SLE-associated CGNs was assessed using receiver operating characteristic (ROC) curve analysis. The ROC curve was generated using the roc() function from the pROC package (v. 1.18.0). To explore the diagnostic potential of gain-of-hubness genes, we computed an expression score for the genes in myeloid cells via AddModuleScore() of the Seurat package and evaluated the differences in distribution of expression values between patients and healthy controls using the Wilcoxon signed-rank test. Differentially expressed gene (DEG) analysis was performed by merging Seurat objects containing HCNetlas healthy tissue data with disease scRNA-seq data. After normalization and scaling by a factor of 10,000, we identified 2,000 variable genes. DEGs were pinpointed for key cell types using Seurat’s FindMarkers() function, considering genes with an adjusted p-value < 0.05 and an absolute log2-fold change > 0.5, focusing solely on coding genes.

Cell type-resolved genetic analysis of Alzheimer’s disease (AD) with HCNetlas

In our study of AD, we used snRNA-seq data from 12 patients with annotated cell types from Morabito and colleagues [25].
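The Benjamini–Hochberg correction used above to define significant differential hubness (q-value < 0.05) can be sketched in Python as follows. This is a generic implementation of the standard step-up procedure, not code taken from scHumanNet, and the p-values in the example are made up.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


# Toy p-values for four genes (illustrative only).
toy_p = [0.01, 0.04, 0.03, 0.20]
toy_q = benjamini_hochberg(toy_p)
significant = [i for i, q in enumerate(toy_q) if q < 0.05]
print(significant)  # [0]
```

In this toy example only the first gene stays below the 0.05 threshold after correction, although three of the four raw p-values were below 0.05.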
Since the data were derived from the prefrontal cortex of brain tissues, the generated CGNs for AD were compared with reference CGNs for the primary motor cortex (M1) from the HCNetlas. We grouped the cell type annotations into 4 main categories: astrocytes, inhibitory neurons, excitatory neurons, and oligodendrocytes. The identification of differential hubness genes and DEGs within these cell types was carried out using the same methodology applied in the analyses of SLE. To ascertain the relevance of AD-associated genes predicted by our differential hubness analysis, we referenced genes linked to AD in the KEGG pathway (M16024), MSigDB (M35868), and Wightman and colleagues [26].

Evaluation of differentially associated pathways between reference and disease CGNs

Considering the association of gain-of-hubness and loss-of-hubness genes with AD in inhibitory and excitatory neurons, we constructed network-ranked signatures for both reference and AD-specific CGNs for the cell types. The signature genes were based on the top 10 hub genes by degree centrality within each CGN. The networks of these top-tier hub genes were visualized using the Cytoscape software [23]. Furthermore, we conducted gene set enrichment analysis (GSEA) on these network-ranked signatures using the enrichR package [27]. To evaluate the pathways differentially associated between disease CGNs and reference CGNs, we introduced a metric called diffQ, calculated as follows: In this formula, a positive diffQ value signifies that a pathway is more strongly associated with the cell type in its diseased state than in its healthy state (gain-of-pathway). Conversely, a negative diffQ value indicates greater association with the cell type in its healthy state as compared to its diseased state (loss-of-pathway). To emphasize the most significantly altered pathways, we focused on the top 10 KEGG pathways with the highest absolute diffQ values.
This approach effectively pinpoints the key molecular pathways involved in the pathogenesis of AD.

Cell type-resolved genetic analysis of lung cancer using HCNetlas

To create lung cancer-specific CGNs, we used scRNA-seq data from 29 tumor tissues provided by Qian and colleagues [28]. We retained the pre-annotated cell type identifications from the data sets. For comparison with reference CGNs derived from paired normal tissues, we constructed networks from both the lung cancer and healthy control data. Following the scHumanNet protocol, we generated networks and defined differential hubness genes using FindDiffHub(). The process of identifying differential hubness genes within each cell type was conducted using the same methodology employed in the SLE analyses. Similarly, the identification of DEGs followed the methodology used in the SLE studies, with the exception that genes exhibiting a log2-fold change < −1.5 were categorized as down-regulated DEGs. To validate the lung cancer relevance of the identified genes, we referenced the Cancer Gene Census, CancerMine, and IntOGen databases [29–31]. We assessed the proportion of lung cancer-associated genes detected uniquely through differential hubness, uniquely through DEGs, and by the intersection of both methods. Furthermore, we analyzed 42 immune checkpoint molecules listed by Auslander and colleagues [32] to determine if cell type-specific genes vital for cancer immunity are discernible through both expression-based and network-based analyses. We investigated the prognostic potential of genes identified by cell type-specific differential hubness and differential expression using survival analysis on TCGA lung cancer data sets (TCGA-LUSC, TCGA-LUAD). Initially, we identified a total of 379 gain-of-hubness genes and 211 up-regulated DEGs from 3 major cell types: B cells, T cells, and myeloid cells. Subsequently, genomic and clinical data for 1,017 lung cancer samples were acquired from the GDC portal [33].
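The comparison of genes detected uniquely through differential hubness, uniquely through DEGs, and by both methods can be sketched in Python with set operations. The choice of denominator (the fraction of each group that appears in a set of known disease-associated genes) is an assumption made for this example, and the function names and gene identifiers are hypothetical placeholders.

```python
def split_by_method(hub_genes, deg_genes):
    """Partition genes into hubness-only, DEG-only, and shared groups."""
    hub, deg = set(hub_genes), set(deg_genes)
    return {
        "hubness_only": hub - deg,
        "deg_only": deg - hub,
        "both": hub & deg,
    }


def known_fraction(groups, known_genes):
    """Fraction of each group that is found in the known disease gene set."""
    known = set(known_genes)
    return {
        name: (len(genes & known) / len(genes) if genes else 0.0)
        for name, genes in groups.items()
    }


# Placeholder identifiers, not real results.
groups = split_by_method(["G1", "G2", "G3", "G4"], ["G3", "G4", "G5"])
fractions = known_fraction(groups, ["G1", "G3", "G5", "G9"])
print(fractions)
```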
The STAR-Counts data underwent preprocessing, log-normalization, and variance stabilization using the vst() function in the DESeq2 R package (v. 1.30.1). With the application of GSVA [19], we evaluated the association of both gain-of-hubness genes and up-regulated DEGs with each tumor expression profile of the patients. Patients were then classified into upper and lower quartile groups based on their GSVA scores. These groups were further examined through Kaplan–Meier survival curves. To ensure the reliability of our findings, we adjusted all p-values obtained from the survival analysis using the Benjamini–Hochberg method to control the FDR. Single-cell and single-nucleus transcriptomic data used for HCNetlas construction We employed both single-cell and single-nucleus RNA sequencing (scRNA-seq and snRNA-seq) data from multiple sources to reconstruct HCNetlas CGNs. We acquired scRNA-seq data for 329,762 immune cells spanning 16 tissues from 12 deceased donors in a cross-tissue immune cell atlas [10]. We used the author-provided cell type labels, which were initially annotated by CellTypist and subsequently underwent manual curation. Additionally, we integrated data from Tabula Sapiens [11], which included the human transcriptome reference for 249,961 immune cells across 24 tissues from 15 donors. For constructing brain CGNs, we utilized snRNA-seq data of Allen Brain Atlas (http://www.brain-map.org) [12], derived from 76,533 nuclei in the primary motor cortex (M1) and 166,868 nuclei in the middle temporal gyrus (MTG) using 10× Genomics Chromium platform. All data were processed through alignment, quantification by Cell Ranger, and cell type annotation. We harmonized pre-annotated cell types across all data sets and applied scHumanNet (v. 1.0.0) [5] to the single-cell transcriptomic data to generate CGNs for various tissues and cell types. 
Briefly, scHumanNet takes a single-cell transcriptome count matrix for a specific cell type, followed by data imputation, transformation, and normalization. It then produces a transformed gene activity score matrix and filters the reference interactome, HumanNet, for gene pairs in the matrix that pass a minimum activity score, creating a CGN. The edge weight scores are inherited from the reference interactome. scHumanNet has been shown to be a superior network inference model compared to other single-cell based methods. To construct CGNs for major cell types, we aggregated sub-cell types into broader categories, such as T cells, B cells, and myeloid cells. In total, 198 CGNs were generated, encompassing 25 tissues and 61 cell types. Evaluating the cell type-specificity of CGNs We performed dimension reduction analysis with uniform manifold approximation and projection (UMAP) to visualize interrelationships among HCNetlas CGNs. Utilizing scHumanNet for CGN construction, which references the HumanNet [6] comprising 18,593 human genes, we generated binary profile vectors indicating the presence (1) or absence (0) of each gene in the network for every CGN. For visualizations, we employed the UMAP package in R (v. 0.2.10), setting min_dist to 0.5 to balance the trade-off between local and global structure in the data. To determine the cell type-specific functionality of the HCNetlas CGNs, we explored the enrichment of cell type-associated genes, particularly for B cells and T cells. We collated cell type-associated genes from 3 authoritative databases: Gene Ontology biological process (GOBP), CellMarker, and the Azimuth cell type database [13–15]. We postulated that genes functionally connected within a CGN reflect the properties of their respective cell type, thus we considered the interconnectivity within genes for each cell type as a measure of the cell type specificity. 
Additionally, we identified hub genes within each CGN, which are likely to play pivotal roles in the function of their corresponding cell type. We profiled the top 15 hub genes, ranked by degree centrality, for each major cell type across various tissues using the FindAllHub() function of scHumanNet. The FindAllHub() function randomly shuffles the constructed network to create a null distribution of centrality values, which enables the calculation of statistical significance for the hubness of a gene of interest. This profiling helped to ascertain the relative importance of these hub genes within the network.

Overall assessment of the association between CGNs and diseases

We assessed whether HCNetlas CGNs can discern connections between diseases and cell types by profiling CGNs with disease-association scores. Within each CGN, we ranked the 18,593 genes from the HumanNet reference interactome by degree centrality, using the GetCentrality() function of scHumanNet. A total of 5,763 disease gene sets were compiled from DisGeNET [16] and GWAS Catalog [17] for the analysis. Assessment of disease gene set association with each CGN was performed with ssGSEA (v. 2.0) [18] and GSVA [19], the latter via the gsva() function from the GSVA package (v. 1.38.2). Hierarchical clustering of disease gene sets and CGNs was performed using the R package ComplexHeatmap, with the default parameters, where cluster.rows and cluster.cols were both set to TRUE. Furthermore, we conducted differential compactness analysis on the HCNetlas CGNs using the same disease gene sets. For each gene set, we calculated the within-group connectivity across all 198 networks to gauge network compactness. To accommodate variations in network size, we normalized the within-group connectivity by the number of nodes in each network and then scaled these normalized values by multiplying them by 10,000 to ensure a consistent basis for comparison across all networks.
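The normalization described above can be sketched in a few lines. The example assumes that within-group connectivity means the number of network edges whose two endpoints both belong to the gene set, and that the node count is the number of genes with at least one edge in the network; the function names and toy networks are hypothetical and are not taken from the scHumanNet code.

```python
def within_group_connectivity(edges, gene_set):
    """Count edges whose two endpoints both belong to gene_set."""
    members = set(gene_set)
    return sum(1 for a, b in edges if a in members and b in members)


def normalized_compactness(edges, gene_set, scale=10_000):
    """Within-group connectivity divided by node count, multiplied by scale."""
    nodes = {gene for edge in edges for gene in edge}
    if not nodes:
        return 0.0  # an empty network has no connectivity to normalize
    return within_group_connectivity(edges, gene_set) / len(nodes) * scale


# Two toy networks of different size sharing the same three-gene set.
gene_set = ["A", "B", "C"]
small_net = [("A", "B"), ("B", "C"), ("C", "D")]
large_net = small_net + [("D", "E"), ("E", "F"), ("F", "G"), ("G", "H")]

small_score = normalized_compactness(small_net, gene_set)
large_score = normalized_compactness(large_net, gene_set)
print(small_score, large_score)
```

Both toy networks contain the same two within-group edges, but the larger network receives half the score because it has twice as many nodes, which is the size correction the normalization is meant to provide.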
Cell type-resolved genetic analysis of systemic lupus erythematosus (SLE) with HCNetlas

We acquired scRNA-seq data for peripheral blood mononuclear cells (PBMCs) from 41 patients with SLE and 15 healthy controls as reported by Nehar-Belaid and colleagues [20]. To ensure a consistent analysis, we excluded 2 SLE patient samples lacking SLEDAI scores. Following quality control measures, including the removal of doublets using the DoubletFinder (v. 2.0.0) [21] package and the exclusion of cells with fewer than 400 transcripts or over 5% mitochondrial gene content, the data set was narrowed to approximately 276,000 cells. After normalization and scaling with the Seurat package (v. 4.1.1), we identified 3,000 variable genes using the vst() and FindVariableFeatures() functions. Batch effects were mitigated by applying principal component analysis (PCA) and Harmony (v. 1.0) [22] (dims = 40), and cellular clustering was performed using the Louvain method (resolution = 1.5), followed by UMAP visualization with 40 dimensions. Cell types were manually annotated using canonical markers after optimizing the number of principal components and clustering resolution. For constructing SLE-specific CGNs, we focused on the curated data from SLE patients. We built networks for B cells, CD4+ T cells, CD8+ T cells, myeloid cells, and NK cells. Using the Compactness() function, we performed differential compactness analysis. We referenced 184 SLE-associated genes from KEGG pathway (I05322) and KEGG disease (H00080) databases, comparing connectivity within disease CGNs and HCNetlas CGNs, and visualized the networks in Cytoscape (v. 3.9.1) [23]. Node centrality within these networks was computed using the GetCentrality() function from the scHumanNet package. We compared the percentile ranks of centrality for disease CGNs against the reference CGNs of HCNetlas using the DiffPR.HCNetlas() function.
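To make the percentile-rank comparison concrete, the following sketch computes degree centrality for a disease network and a reference network, converts each to a percentile rank, and takes the difference. It is not the DiffPR.HCNetlas() implementation. The percentile-rank definition used here (the fraction of genes in the network with degree less than or equal to the gene's degree), the treatment of genes absent from a network as rank 0, and all function names are assumptions for illustration.

```python
from bisect import bisect_right
from collections import Counter


def degree_centrality(edges):
    """Number of edges touching each gene in an undirected edge list."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return degree


def percentile_rank(edges):
    """Fraction of genes whose degree is <= the degree of each gene."""
    degree = degree_centrality(edges)
    ordered = sorted(degree.values())
    total = len(ordered)
    return {gene: bisect_right(ordered, d) / total for gene, d in degree.items()}


def diff_percentile_rank(disease_edges, reference_edges):
    """Disease minus reference percentile rank; absent genes count as 0."""
    disease = percentile_rank(disease_edges)
    reference = percentile_rank(reference_edges)
    genes = set(disease) | set(reference)
    return {g: disease.get(g, 0.0) - reference.get(g, 0.0) for g in genes}


# Toy networks: gene A is the hub in the reference, gene B in the disease state.
reference_net = [("A", "B"), ("A", "C"), ("A", "D")]
disease_net = [("B", "A"), ("B", "C"), ("B", "D")]

diff = diff_percentile_rank(disease_net, reference_net)
print(sorted(diff.items()))
```

In this toy case gene B gains hubness (positive difference) and gene A loses it (negative difference). Deciding which differences are statistically significant is a separate step that this sketch does not attempt.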
Genes showing differential hubness were pinpointed with FindDiffHub.HCNetlas(), with significance defined by a q-value < 0.05 after Benjamini–Hochberg correction to control false discovery rate (FDR). We compiled interferon-stimulated genes (ISGs) from hallmark gene sets of the molecular signature database (MSigDB) and the Immunological Genome Project (ImmGen) [24], resulting in a total of 423 ISGs. The efficacy of prediction for ISGs by hubness within SLE-associated CGNs was assessed using receiver operating characteristic (ROC) curve analysis. The ROC curve was generated using the roc() function from the pROC package (v. 1.18.0). To explore the diagnostic potential of gain-of-hubness genes, we computed an expression score for the genes in myeloid cells via AddModuleScore() of the Seurat package and evaluated the differences in distribution of expression values between patients and healthy controls using the Wilcoxon signed-rank test. Differentially expressed gene (DEG) analysis was performed by merging Seurat objects containing HCNetlas healthy tissue data with disease scRNA-seq data. After normalization and scaling by a factor of 10,000, we identified 2,000 variable genes. DEGs were pinpointed for key cell types using Seurat’s FindMarkers() function, considering genes with an adjusted p-value < 0.05 and an absolute log2-fold change > 0.5, focusing solely on coding genes. Cell type-resolved genetic analysis of Alzheimer’s disease (AD) with HCNetlas In our study of AD, we used snRNA-seq data from 12 patients with annotated cell types from Morabito and colleagues [25]. Since the data were derived from the prefrontal cortex of brain tissues, the generated CGNs for AD were compared with reference CGNs for the primary motor cortex (M1) from the HCNetlas. We grouped the cell type annotations into 4 main categories: astrocytes, inhibitory neurons, excitatory neurons, and oligodendrocytes. 
The identification of differential hubness genes and DEGs within these cell types was carried out using the same methodology applied in the analyses of SLE. To ascertain the relevance of AD-associated genes predicted by our differential hubness analysis, we referenced genes linked to AD in the KEGG pathway (M16024), MSigDB (M35868), and Wightman and colleagues [26].

Evaluation of differentially associated pathways between reference and disease CGNs

Considering the association of gain-of-hubness and loss-of-hubness genes with AD in inhibitory and excitatory neurons, we constructed network-ranked signatures for both reference and AD-specific CGNs for these cell types. The signature genes were the top 10 hub genes by degree centrality within each CGN. The networks of these top-tier hub genes were visualized using the Cytoscape software [23]. Furthermore, we conducted gene set enrichment analysis (GSEA) on these network-ranked signatures using the enrichR package [27]. To evaluate the pathways differentially associated between disease CGNs and reference CGNs, we introduced a metric called diffQ, calculated as follows: [equation not recovered]. In this formula, a positive diffQ value signifies that a pathway is more strongly associated with the cell type in its diseased state than in its healthy state (gain-of-pathway). Conversely, a negative diffQ value indicates greater association with the cell type in its healthy state than in its diseased state (loss-of-pathway). To emphasize the most significantly altered pathways, we focused on the top 10 KEGG pathways with the highest absolute diffQ values. This approach effectively pinpoints the key molecular pathways involved in the pathogenesis of AD.

Cell type-resolved genetic analysis of lung cancer using HCNetlas

To create lung cancer-specific CGNs, we used scRNA-seq data from 29 tumor tissues provided by Qian and colleagues [28]. We retained the pre-annotated cell type identifications from the data sets.
For comparison with reference CGNs derived from paired normal tissues, we constructed networks from both the lung cancer and healthy control data. Following the scHumanNet protocol, we generated networks and defined differential hubness genes using FindDiffHub(). The process of identifying differential hubness genes within each cell type was conducted using the same methodology employed in the SLE analyses. Similarly, the identification of DEGs followed the methodology used in the SLE studies, with the exception that genes exhibiting a log2-fold change < −1.5 were categorized as down-regulated DEGs. To validate the lung cancer relevance of the identified genes, we referenced the Cancer Gene Census, CancerMine, and IntOGen databases [29–31]. We assessed the proportion of lung cancer-associated genes detected uniquely through differential hubness, uniquely through DEGs, and by the intersection of both methods. Furthermore, we analyzed 42 immune checkpoint molecules listed by Auslander and colleagues [32] to determine if cell type-specific genes vital for cancer immunity are discernible through both expression-based and network-based analyses. We investigated the prognostic potential of genes identified by cell type-specific differential hubness and differential expression using survival analysis on TCGA lung cancer data sets (TCGA-LUSC, TCGA-LUAD). Initially, we identified a total of 379 gain-of-hubness genes and 211 up-regulated DEGs from 3 major cell types: B cells, T cells, and myeloid cells. Subsequently, genomic and clinical data for 1,017 lung cancer samples were acquired from the GDC portal [33]. The STAR-Counts data underwent preprocessing, log-normalization, and variance stabilization using the vst() function in the DESeq2 R package (v. 1.30.1). With the application of GSVA [19], we evaluated the association of both gain-of-hubness genes and up-regulated DEGs with each tumor expression profile of the patients.
Patients were then classified into upper and lower quartile groups based on their GSVA scores. These groups were further examined through Kaplan–Meier survival curves. To ensure the reliability of our findings, we adjusted all p-values obtained from the survival analysis using the Benjamini–Hochberg method to control the FDR.

Results

HCNetlas: A catalog of reference CGNs for various healthy human tissues

To build reference CGNs, we utilized scRNA-seq data from the cross-tissue immune cell atlas [10] and Tabula Sapiens [11], as well as single-nucleus RNA-sequencing (snRNA-seq) data from the Allen Brain Atlas [12]. Our single-cell transcriptomic data set comprised 763,559 cells from 28 donors. We generated gene networks for each predefined cell type using the scHumanNet framework [5] (Fig 1A), providing a comprehensive baseline for identifying disease-associated genes and cell types.

Fig 1. Overview of HCNetlas. (A) Schematic representation of the workflow from single-cell transcriptomic data collection to the construction of the HCNetlas. Single-cell RNA sequencing data preannotated for cell type were used to build CGNs using the scHumanNet framework. HCNetlas comprises a comprehensive collection of these gene networks, representing various human tissues and cell types. (B) UMAP visualization of CGNs based on gene profiles, highlighting the major cell lineages, with node size representing the number of genes in each network. Major cell type “Other” is shown in gray (major cell type abbreviations: B, B cells; Br, brain cells; My, myeloid cells; T, T cells). (C) UMAP plot displaying the interrelationship among the CGNs based on network gene profiles for major organs or tissue types. Each point represents a gene network associated with a specific organ or tissue type, colored distinctly. The data underlying this figure can be found in https://doi.org/10.5281/zenodo.14522296.
CGN, cell type-specific gene network; HCNetlas, human cell network atlas; UMAP, uniform manifold approximation and projection. https://doi.org/10.1371/journal.pbio.3002702.g001 When constructing CGNs, the number of cells used can influence the efficacy of network inference. To investigate this aspect, we conducted an analysis of CGNs derived from various cell atlas data sets, specifically examining how the number of cells used for network inference correlates with the overall network size. Our findings indicated a clear trend: as the number of cells increases, there is a corresponding rise in both the node and edge counts within the inferred CGNs. However, this growth in network complexity tends to plateau once the cell count reaches approximately 1,000 (S1A Fig). The observed saturation point suggests that the inference of CGNs becomes substantially robust to the effects of sample size when the number of cells exceeds 1,000. Based on this insight, we focused our study on networks inferred from data sets comprising a minimum of 1,000 cells. This led to the generation of 198 CGNs, covering 25 tissues and including 61 distinct cell types (S1 Table). These networks form our newly established resource, HCNetlas, a catalog of human CGNs for healthy tissues. To examine the interrelationships among the CGNs in our HCNetlas, we analyzed each CGN based on network gene profiles, subsequently visualizing these profiles in a reduced dimensional space. This analysis demonstrated a clear trend where CGNs corresponding to the same cell types exhibited a tendency to cluster together (Fig 1B), reinforcing the concept that these networks accurately capture and reflect the specificity inherent to each cell type. Notably, CGNs within the myeloid and B cell lineages showed remarkable coherence, in contrast to the T cell lineage CGNs, which exhibited greater heterogeneity. 
An interesting observation was the close proximity of innate lymphoid cell (ILC) and natural killer (NK) CGNs (S1B Fig), underscoring their lineage correlations [34]. However, CGNs related to the same tissue types generally did not demonstrate strong clustering, with the exception of brain tissue networks, which displayed high similarity (Fig 1C), suggesting that cell type identity is a stronger determinant of network structure than tissue environment. This was further evidenced in the T cell lineage, including ILCs, NK cells, CD4+ T cells, and CD8+ T cells, where subsets exhibited coherence within cell types but not necessarily within tissue types (S1B and S1C Fig). This aligns with recent studies that emphasize tissue or sub-cell type-dependent variability in T cells [35–37]. These findings highlight the utility of HCNetlas as a potentially powerful tool for investigating cell type-specific gene functions.

Assessing the cell type-specific functionality of HCNetlas CGNs

To evaluate whether the reference CGNs of HCNetlas accurately reflect cell type-specific functions, we conducted tests using 2 distinct immune cell types from different lineages: B cells and T cells. The premise of this test was that if the HCNetlas CGNs are effective in representing cellular functions unique to each cell type, then genes for maintaining the identity and function of each cell type should demonstrate interconnectedness within their respective networks. As anticipated, our analysis showed that genes specifically annotated for either B cells or T cells by the GOBP [38] exhibited the highest within-group connectivity in their respective CGNs across various tissues (Fig 2A and 2B). This pattern of connectivity was further validated by comparing it with cell type marker genes as identified in the Azimuth database [13] and the CellMarker database [14] (S2A Fig).
These findings underscore the ability of HCNetlas CGNs to capture and represent the unique functional characteristics inherent to specific cell types.

Fig 2. Cell type-specific functionality of HCNetlas CGNs. (A, B) Bar graphs illustrating the within-group connectivity of B cell-related (A) or T cell-related (B) GOBP genes in the respective CGNs. Connectivity is normalized by each CGN’s total node number. All tissues with a normalized connectivity above 0 for both B and T cells are included. (C) Heatmap displaying the percentile rank of the top 15 hub genes in spleen CGNs, with values scaled per row, with color intensity indicating the expression level from low (blue) to high (red). Genes uniquely found in each cell type are highlighted in bold. (D, E) UpSet plots for 2 major CGNs, B cell CGN (D), and T cell CGN (E), representing the intersection of network genes across different tissues. The data underlying this figure can be found in https://doi.org/10.5281/zenodo.14522296. https://doi.org/10.1371/journal.pbio.3002702.g002

Moreover, we investigated the network hub genes within each CGN, identified based on degree centrality (Fig 2C). For instance, in spleen CGNs, CD86, which is pivotal in B cell activation [39], emerged as a top hub gene in the B cell CGN. Similarly, genes essential for T cell identity, such as CD2, CD4, and CD28, were among the top 15 hub genes in the T cell CGN [40]. Additionally, S100A8, S100A9, and CD14, which are markers for monocytic myeloid-derived suppressor cells, were prominent hubs in the monocyte CGNs [41]. These patterns of hub genes, significant due to their high degree centrality, were consistent across various tissues (S2B Fig), underlining the functional interpretability of these hub genes in the context of their respective cell types. Lastly, to assess the tissue dependency of the HCNetlas CGNs, we compared CGN genes for each major cell type across different tissues.
We observed limited overlap in CGN genes among different tissues within major cell types (Figs 2D, 2E, and S3), suggesting a convergence of networks across tissues within major cell lineages, aligning with findings from previous studies [10,35]. These observations indicate that while there are core gene networks characteristic of each cell type, tissue adaptation of CGNs is also evident, underscoring the complexity and diversity of cellular functions across different biological contexts. HCNetlas as a tool for unraveling cell type specificity of disease genes The HCNetlas, with its collection of reference CGNs, presents a promising resource for dissecting the cellular specificity of disease genes. The majority of disease-associated genes identified to date have been derived from bulk tissue data, which often fails to specify the exact cell types involved in disease onset and progression. In this scenario, HCNetlas CGNs become instrumental in pinpointing the critical cell types at play. To ascertain the effectiveness of HCNetlas CGNs in disease-oriented research, we embarked on an investigation to determine if these CGNs capture and reflect the cell type specificity of various diseases. This involved conducting enrichment analyses on the CGNs using disease-associated genes sourced from 2 distinct databases: DisGeNET [16] and GWAS Catalog [17]. We began by ranking genes within each CGN based on network degree centrality, and then applied single-sample gene set enrichment analysis (ssGSEA) [18] and gene set variation analysis (GSVA) [19] to profile degree of association with each set of disease genes. Our analysis revealed a distinct pattern of congregation among CGNs corresponding to shared cell types, as determined by disease-association profiles (Figs 3A and S4A). 
This finding was particularly notable within cell types, whereas the convergence of networks corresponding to the same tissue types was less pronounced, indicating the specificity of cell types in the context of disease genetics. We next evaluated the connectivity among genes associated with the same disease within CGNs across different tissues. Our hypothesis was that genes would exhibit more interconnectedness in the relevant cell types and tissues primarily responsible for diseases. This analysis aimed to elucidate the relationships between specific diseases and their associated cell types or tissues. While not all diseases we considered manifest cell type specificity, we noticed that CGNs predicted similar disease gene enrichment patterns in tissues such as the intestine and the liver (Fig 3A). A case in point is hepatitis-related terms, where genes associated with this condition showed the most significant within-group connectivity in liver CGNs of most major immune cell types (Fig 3B). Noteworthy was the pronounced within-group connectivity observed in both myeloid cell and T cell CGNs, highlighting the integral role of T cells in viral infectious diseases and the contribution of Kupffer cells (resident liver macrophages) to hepatitis [42]. This finding indicates that HCNetlas effectively identifies relevant cell types and tissues implicated in hepatitis. Furthermore, genes related to schizophrenia showed increased within-group connectivity across brain tissues, particularly in the primary motor cortex (M1) and MTG CGNs (S4B Fig). Additionally, AD-associated genes from the GWAS Catalog were found to be highly connected within the macrophage and myeloid CGNs (S4C Fig), consistent with the association between AD risk and macrophage transcriptional network [43].
These observations underscore the potential of HCNetlas CGNs as a valuable resource for uncovering intricate relationships between diseases and specific cell types or tissues, thereby enhancing our understanding of disease pathology at a cellular level.

Fig 3. Overview of cell type-resolved disease genetics using HCNetlas. (A) Heatmap displaying the disease profiles of various cell types across different tissues, computed with ssGSEA. Each column represents a CGN of HCNetlas, while each row corresponds to a disease gene set sourced from either DisGeNET or GWAS Catalog, ordered by hierarchical clustering. Color intensity indicates scaled ssGSEA enrichment values for the disease gene sets (rows), reflecting the degree of association of the CGN signature genes with each disease term. (B) Bar graphs showing the within-group connectivity of genes associated with toxic hepatitis across different cell lineages, in various tissues. The bars are color-coded to represent different cell lineages; 15 terms and their combined genes were assessed from DisGeNET terms based on the keyword search “hepatitis.” (C) Schematic representation and summary of the analytical framework used for comparing disease CGNs with reference CGNs from HCNetlas. The workflow illustrates the process of CGN inference from single-cell transcriptomes of disease samples and contrasts disease CGNs for specific immune cells against reference CGNs. The analysis includes (i) differential compactness, highlighting the difference in interconnectivity within disease-associated genes; (ii) differential hubness, showing the changes in hubness; and (iii) differential pathways, contrasting pathway associations between disease and healthy states based on enrichment for CGN signature genes. The data underlying this figure can be found in https://doi.org/10.5281/zenodo.14522296.
CGN, cell type-specific gene network; HCNetlas, human cell network atlas; ssGSEA, single-sample gene set enrichment analysis. https://doi.org/10.1371/journal.pbio.3002702.g003 HCNetlas, having proven its functional and biological relevance, is posited to be an effective reference for network analyses in disease studies. To enhance their utility, we have developed a suite of network analysis methodologies (Fig 3C) and applied them to investigate various diseases, showcasing the adaptability of HCNetlas CGNs. Firstly, if we have a set of disease genes, determining the specific cell type where these genes predominantly influence disease progression is crucial. To evaluate the functional role of these disease genes in a targeted cell type, we have developed an approach known as differential compactness analysis. This method compares the degree of interconnectivity among disease genes between the reference CGNs in HCNetlas and their corresponding disease CGNs derived from disease samples. In this framework, “gain-of-compactness” denotes an enhanced interconnectivity of disease genes within disease CGNs, suggesting an increased functional role in the disease context. Conversely, “loss-of-compactness” implies a reduced interconnectivity. Through this analysis, we can gain insights into which cell type the disease genes are actively involved and determine whether their impact on the disease state is characterized by a gain or loss of function. Secondly, to identify disease genes and ascertain the cell type implicated in the disease, focusing on genes exhibiting significant differences in network centrality between diseased and healthy states can be insightful. Therefore, we prioritize genes based on differential hubness between disease CGNs and reference CGNs. 
This methodology involves categorizing genes into 2 distinct groups: “gain-of-hubness” and “loss-of-hubness.” Genes in the “gain-of-hubness” category show increased centrality in disease CGNs compared to reference CGNs, indicating a heightened role in the disease state. Conversely, genes in the “loss-of-hubness” category demonstrate decreased centrality in disease CGNs, suggesting a reduced or altered function in the disease context. This approach effectively distinguishes genes that are central to disease mechanisms in specific cell types. Lastly, examining pathways that show differential associations between diseased and healthy states in cell types associated with the disease can provide insights into the molecular mechanisms underlying pathogenesis. To conduct the differential pathway analysis, we initially select signature genes representative of both diseased and healthy states for each cell type. This selection is based on identifying the top-ranked hub genes (for example, the top 10 hub genes) within both the disease CGN and the corresponding reference CGN. Subsequently, through gene set enrichment analysis, we aim to identify and prioritize pathways that are differentially associated between the disease and healthy states. In this context, “gain-of-pathways” refers to those pathways that show an increased association with the disease state in comparison to the healthy control. Conversely, “loss-of-pathways” denotes pathways that have a reduced association in the disease state compared to the healthy state. Identifying these differentially associated pathways enables us to formulate hypotheses that delve deeper into the molecular basis of pathogenesis in disease-associated cell types. Cell type-resolved genetic analysis of an autoimmune disease using HCNetlas Given that the majority of the CGNs provided by HCNetlas are derived from immune cells, this resource would be particularly valuable for studying immune disorders such as autoimmune diseases. 
To evaluate the capability of HCNetlas to identify the specific immune cell types where disease genes have an impact on pathogenesis, we focused our research on SLE, which is a chronic autoimmune disorder characterized by elusive pathogenesis, genetic susceptibility, and clinical heterogeneity [44]. For constructing disease CGNs for SLE, we manually annotated scRNA-seq data from 38 SLE patients [20] using canonical markers (Fig 4A). These disease CGNs were then compared with the blood cell CGNs from HCNetlas, providing insights into the cell type specificity underlying SLE pathogenesis.

Fig 4. Cell type-resolved disease genetics for SLE. (A) UMAP plot representing the interrelationship among immune cells. (B) Bar chart showing normalized interconnectivity among SLE-susceptible genes in both reference and disease CGNs for various cell types. (C) ROC curves for retrieval of ISGs by network hubness in disease myeloid CGN and reference myeloid CGN. (D) Comparison of AUROC values with CGNs for various cell types, contrasting the reference and disease CGNs to assess prediction capability. (E) Left panel: Distribution of expression levels for 131 gain-of-hubness genes in myeloid cells. Right panels: Correlation analysis of expression levels of the 131 gain-of-hubness genes with the SLEDAI. (F) Same as for (E) except using up-regulated DEGs. The data underlying this figure can be found in https://doi.org/10.5281/zenodo.14522296. AUROC, area under the ROC curve; CGN, cell type-specific gene network; DEG, differentially expressed gene; ISG, interferon-stimulated gene; ROC, receiver operating characteristic; SLE, systemic lupus erythematosus; SLEDAI, Systemic Lupus Erythematosus Disease Activity Index; UMAP, uniform manifold approximation and projection.
https://doi.org/10.1371/journal.pbio.3002702.g004

Leveraging the principle that increased network compactness among disease-associated genes within a CGN indicates their significant role in pathogenesis in that cell type, we assessed the involvement of major immune cell types in SLE. We applied a set of SLE-susceptible genes (S2 Table), gathered from the KEGG pathway database, to both disease CGNs and reference CGNs. Our analysis revealed that network compactness in myeloid cells and B cells is significantly greater in the disease CGN compared to the reference CGN (Fig 4B). This suggests that SLE-susceptible genes are critically involved in the disease progression primarily through myeloid cells and B cells. Considering that genes associated with SLE predominantly exert their effects through myeloid cells, we prioritized genes for SLE based on network centrality within both disease and reference CGNs specifically pertaining to myeloid cells. Aligning with previous studies that emphasize the increased expression of type 1 interferon (IFN) and ISGs in SLE patients [20,45,46], we evaluated the prediction of SLE-associated genes based on the retrieval rate of ISGs (S3 Table) using the ROC curve. Consistent with the greater network compactness of SLE-susceptible genes in the disease myeloid CGN relative to the reference myeloid CGN, our results showed a significantly improved prediction of ISGs in the disease myeloid CGN when compared to the reference myeloid CGN (Fig 4C). Likewise, for other cell types, disease CGNs demonstrated improved predictions of ISGs compared to the reference CGNs (Fig 4D and S4 Table). Taken together, these findings underscore the critical role of myeloid cells in the initiation and progression of SLE, corroborating previous research that highlights the link between SLE and myeloid cells [47–49].
Next, we hypothesized that gain-of-hubness genes for myeloid cells could effectively differentiate diseased myeloid cells from their healthy counterparts. To test this hypothesis, we initially identified a set of 131 gain-of-hubness genes with statistical significance (S5A Table). We then examined the distribution of their expression level between disease-state myeloid cells and their corresponding healthy controls. Our observations revealed a significant disparity between these 2 distributions, affirming the potential of these 131 gain-of-hubness genes to distinguish diseased myeloid cells (Fig 4E, left panel). Furthermore, we observed a positive correlation between the expression levels of these gain-of-hubness genes and the Systemic Lupus Erythematosus Disease Activity Index (SLEDAI) scores, albeit with limited statistical power due to the small sample size (Fig 4E, right panel). This correlation indicates that the expression patterns of these 131 gain-of-hubness genes are not only distinctive of diseased states but may also reflect the severity of SLE in patients. In contrast, the up-regulated DEGs in disease-state myeloid cells (S5B Table) did not demonstrate the capability to either differentiate diseased myeloid cells (Fig 4F, left panel) or correlate with SLEDAI scores (Fig 4F, right panel). These outcomes imply that collections of CGNs for healthy tissues, such as those provided by HCNetlas, are apt references for identifying disease states. Cell type-resolved genetic analysis of a brain disease using HCNetlas HCNetlas offers an extensive collection of CGNs for brain tissue, making it a valuable resource for investigating neurological disorders. Alzheimer’s disease (AD), a widespread neurodegenerative condition known for its progressive impact on behavior and cognitive functions, is one such area where HCNetlas can be particularly useful. 
To study AD more closely, we have developed disease CGNs using single-nucleus RNA sequencing (snRNA-seq) data from the prefrontal cortex of AD patients [25]. These disease CGNs were compared with HCNetlas CGNs, which were derived from the primary motor cortex (M1). This comparison enables a detailed analysis of alterations in the gene network that is associated with AD, facilitating a deeper understanding of the disease progression and its impact on brain function. To identify the primary cell types impacted by AD-associated genes compiled from various sources (Methods, S6 Table), we employed differential compactness analysis. This analysis revealed that AD-associated genes exhibit a high degree of interconnectivity within the reference CGNs for both inhibitory and excitatory neurons (Fig 5A), suggesting that these genes predominantly function in these neuron types. Interestingly, we observed that the disease CGNs for inhibitory and excitatory neurons displayed notably lower network compactness scores compared to their reference counterparts (i.e., loss-of-compactness). This significant decrease in network compactness within the diseased neurons points to a loss of connections among AD-associated genes. Such a loss in the diseased state of inhibitory and excitatory neurons could be a critical factor in the pathogenesis of AD, indicating a disruption in the intricate gene networks that underlie normal neuronal function.

Fig 5. Cell type-resolved disease genetics for AD. (A) Bar graph depicting the normalized edge count among AD-associated genes in reference and disease CGNs for various neurological cell types. (B) Bar graph showing the number of validated AD genes predicted by differential expression or differential hubness across the 4 neurological cell types. Statistical significance of overlap is shown for each gene set (*P < 0.05, ** P < 0.01, *** P < 0.001 by one-sided hypergeometric test). (C–F) Ten most differentially associated KEGG pathways with CGN signature genes: Differentially associated pathways for inhibitory neuron CGN signature (C), inhibitory neuron expanded CGN signature (D), excitatory neuron CGN signature (E), and excitatory neuron expanded CGN signature (F). The data underlying this figure can be found in https://doi.org/10.5281/zenodo.14522296. AD, Alzheimer’s disease; CGN, cell type-specific gene network. https://doi.org/10.1371/journal.pbio.3002702.g005

We then focused on prioritizing genes for AD by either differential expression or differential hubness between healthy and diseased states across each cell type (S7 Table). In alignment with the identified cell type specificity for AD, both inhibitory and excitatory neurons demonstrated a more accurate prediction of AD-related genes when analyzed for differential hubness rather than differential expression (Fig 5B, hypergeometric test P-value < 0.001). Interestingly, the predictive capacity using differential expression in these neuron types was lower than that achieved through differential hubness analysis. Additionally, this capacity was akin to what was observed in other cell types. This finding suggests that a network-based approach is more effective for predicting AD genes than methods solely based on expression, which tend to be less specific to AD-associated cell types. Notably, the overlap between gene predictions made using differential hubness and differential expression was minimal (S5A Fig), indicating that these 2 approaches are complementary to each other in identifying key genes associated with AD. To delve into the molecular mechanisms implicated in AD pathogenesis within inhibitory and excitatory neurons, we carried out a differential pathway analysis. This analysis was based on CGN signatures of these neurons, focusing on the top 10 genes ranked by hubness (S5B Fig).
As anticipated, our analysis revealed that pathways associated with AD and other related neurodegenerative diseases, such as Parkinson’s disease and Huntington’s disease, were among those most prominently exhibiting a reduced association, or “loss-of-pathway,” in inhibitory neurons (Fig 5C). In addition to these, we identified several other pathways that exhibited loss-of-pathway in inhibitory neurons, and these findings were validated through a literature survey. The pathways that were validated to be associated with AD included oxidative phosphorylation [50], thermogenesis [51], non-alcoholic fatty liver disease [52,53], diabetic cardiomyopathy [54,55], amyotrophic lateral sclerosis [56], and prion disease [57]. Significantly, our analysis identified that the pathway related to the cholinergic synapse was the most notably increased in diseased inhibitory neurons. This finding is also relevant given the known involvement of the cholinergic signaling in AD [58]. We also performed our analysis using an expanded CGN signature that includes their network neighbors, which confirmed the initial list of top loss-of-pathways (Fig 5D). This reaffirms the importance of these pathways in the pathogenesis of AD, highlighting their potential roles in the disease’s progression and impact within inhibitory neurons. In our differential pathway analysis using CGN signatures for excitatory neurons, we predominantly observed gain-of-pathways. These are pathways showing increased activity in AD compared to the healthy state, findings which are substantiated by literature evidence (Fig 5E). For instance, the ErbB signaling pathway is known to mediate amyloid-β (Aβ)-induced neurotoxicity [59], and HIF-1 (hypoxia-inducible factor-1) signaling has been found to increase Aβ generation [60]. Additionally, a similar analysis with expanded CGN signatures for excitatory neurons revealed loss-of-pathways akin to those identified in inhibitory neurons (Fig 5F). 
Among these findings, the MAPK signaling pathway emerged as the most prominent gain-of-pathway. This is in alignment with previous research demonstrating that MKP-1, a crucial negative regulator of MAPKs, can reduce Aβ generation and alleviate cognitive impairments in AD models [61], thereby validating our observation.

Investigating cell type-resolved lung cancer genetics using HCNetlas

The tumor immune microenvironment has become increasingly recognized as a key hallmark of cancer. Considering this, we hypothesized that HCNetlas CGNs for immune cells could be instrumental in identifying cancer-associated genes that primarily function within the immune microenvironment. Focusing on lung cancer, we constructed disease CGNs for major immune cell types using single-cell transcriptome data derived from tumor tissues of lung cancer patients [28]. Through differential hubness analysis, compared to reference CGNs of corresponding cell types, we pinpointed gain-of-hubness genes predominantly in T cells and myeloid cells, many of which are known to be associated with lung cancer (Fig 6A and S8A Table). Interestingly, only a few gain-of-hubness genes were common across multiple immune cell types, suggesting a specific functional role of cancer-associated genes in T cells and myeloid cells. This analysis also revealed that differential hubness was more effective in identifying lung cancer-associated genes than the traditional differentially expressed genes (DEGs) analysis (Fig 6B and S8B Table). Notably, many up-regulated DEGs shared among all immune cell types included very few validated lung cancer genes. When assessing loss-of-hubness genes, a similar trend was observed: fewer candidates but with more specificity to cell types compared to down-regulated expression in disease (Fig 6C and 6D). T cell-specific loss-of-hubness particularly retrieved a significant number of known lung cancer genes.
Additionally, we found supportive literature evidence for the proposed cell type of action for these validated cancer-associated genes identified through differential hubness analysis (S8 Table). This suggests that HCNetlas is effective in predicting genes associated with cancer specifically within immune cell types.

Fig 6. Identifying genes that contribute to lung cancer through immune cells using HCNetlas. (A–D) UpSet plots for predicted genes for lung cancer by gain-of-hubness (A), up-regulated DEGs (B), loss-of-hubness (C), or down-regulated DEGs (D), representing the intersection of predictions across different cell types. Orange bars indicate the number of lung cancer genes validated by various databases such as CancerMine, IntOGen, and cancer gene consensus. (E) Bar graph showing the number of ICMs retrieved by both gain-of-hubness genes and up-regulated DEGs across different immune cell types. Statistical significance of overlap is shown for each gene set (*P < 0.05, ** P < 0.01, *** P < 0.001 by one-sided hypergeometric test). (F, G) Kaplan–Meier survival curve analysis for cancer patients from TCGA cohort (TCGA-LUSC and TCGA-LUAD), stratified by the enrichment score of gain-of-hubness genes (F) or up-regulated DEGs (G). The graph shows the overall survival probability over time, with patients categorized into high and low quantiles based on the GSVA score. Significance of the survival rate difference between the upper and lower quantile expression groups was evaluated using the log-rank test. The data underlying this figure can be found in https://doi.org/10.5281/zenodo.14522296. DEG, differentially expressed gene; HCNetlas, human cell network atlas; GSVA, gene set variation analysis; ICM, immune checkpoint molecule. https://doi.org/10.1371/journal.pbio.3002702.g006

Further evaluation focused on immune checkpoint molecules (ICMs), which are pivotal in antitumor immunity [62,63].
We anticipated an increase in network centrality and expression of ICMs in tumor-derived immune cells. Confirming our hypothesis, genes identified through differential hubness analysis were more effective in detecting ICMs, particularly within T cells and myeloid cells, compared to differential expression analysis (Fig 6E). This finding underscores the advantage of network-based analyses in pinpointing crucial genes in cancer immunology. Additionally, using The Cancer Genome Atlas (TCGA) lung cancer data, we explored the prognostic value of these genes. We found that the gene expression profile association score, calculated using GSVA [19], for the set of gain-of-hubness genes in each tumor sample was predictive of clinical outcomes (Fig 6F), unlike up-regulated DEGs (Fig 6G).

HCNetlas: A catalog of reference CGNs for various healthy human tissues

To build reference CGNs, we utilized scRNA-seq data from the immune cell atlas project [10] and The Tabula Sapiens [11], as well as single-nucleus RNA-sequencing (snRNA-seq) data from the Allen Brain Atlas [12]. Our single-cell transcriptomic data set comprised 763,559 cells from 28 donors. We generated gene networks for each predefined cell type using the scHumanNet framework [5] (Fig 1A), providing a comprehensive baseline for identifying disease-associated genes and cell types.

Fig 1. Overview of HCNetlas. (A) Schematic representation of the workflow from single-cell transcriptomic data collection to the construction of the HCNetlas. Single-cell RNA sequencing data preannotated for cell type were used to build CGNs using the scHumanNet framework. HCNetlas comprises a comprehensive collection of these gene networks, representing various human tissues and cell types. (B) UMAP visualization of CGNs based on gene profiles, highlighting the major cell lineages, with node size representing the number of genes in each network. Major cell type “Other” is shown in gray (major cell type abbreviations: B, B cells; Br, brain cells; My, myeloid cells; T, T cells). (C) UMAP plot displaying the interrelationship among the CGNs based on network gene profiles for major organs or tissue types. Each point represents a gene network associated with a specific organ or tissue type, each colored distinctly. The data underlying this figure can be found in https://doi.org/10.5281/zenodo.14522296. CGN, cell type-specific gene network; HCNetlas, human cell network atlas; UMAP, uniform manifold approximation and projection. https://doi.org/10.1371/journal.pbio.3002702.g001

When constructing CGNs, the number of cells used can influence the efficacy of network inference. To investigate this aspect, we conducted an analysis of CGNs derived from various cell atlas data sets, specifically examining how the number of cells used for network inference correlates with the overall network size. Our findings indicated a clear trend: as the number of cells increases, there is a corresponding rise in both the node and edge counts within the inferred CGNs. However, this growth in network complexity tends to plateau once the cell count reaches approximately 1,000 (S1A Fig). The observed saturation point suggests that the inference of CGNs becomes substantially robust to the effects of sample size when the number of cells exceeds 1,000. Based on this insight, we focused our study on networks inferred from data sets comprising a minimum of 1,000 cells. This led to the generation of 198 CGNs, covering 25 tissues and including 61 distinct cell types (S1 Table). These networks form our newly established resource, HCNetlas, a catalog of human CGNs for healthy tissues. To examine the interrelationships among the CGNs in our HCNetlas, we analyzed each CGN based on network gene profiles, subsequently visualizing these profiles in a reduced dimensional space.
This analysis demonstrated a clear trend where CGNs corresponding to the same cell types exhibited a tendency to cluster together (Fig 1B), reinforcing the concept that these networks accurately capture and reflect the specificity inherent to each cell type. Notably, CGNs within the myeloid and B cell lineages showed remarkable coherence, in contrast to the T cell lineage CGNs, which exhibited greater heterogeneity. An interesting observation was the close proximity of innate lymphoid cell (ILC) and natural killer (NK) CGNs (S1B Fig), underscoring their lineage correlations [34]. However, CGNs related to the same tissue types generally did not demonstrate strong clustering, with the exception of brain tissue network nodes, which displayed high similarity (Fig 1C), suggesting that cell type identity is a stronger determinant of network structure than tissue environment. This was further evidenced in the T cell lineage, including ILCs, NK cells, CD4+ T cells, and CD8+ T cells, where subsets exhibited coherence within cell types but not necessarily within tissue types (S1B and S1C Fig). This aligns with recent studies that emphasize tissue or sub-cell type-dependent variability in T cells [35–37]. These findings highlight the utility of HCNetlas as a potentially powerful tool for investigating cell type-specific gene functions.

Assessing the cell type-specific functionality of HCNetlas CGNs

To evaluate whether the reference CGNs of HCNetlas accurately reflect cell type-specific functions, we conducted tests using 2 distinct immune cell types from different lineages: B cells and T cells. The premise of this test was that if the HCNetlas CGNs are effective in representing cellular functions unique to each cell type, then genes for maintaining the identity and function of each cell type should demonstrate interconnectedness within their respective networks.
As anticipated, our analysis showed that genes specifically annotated for either B cells or T cells by the GOBP [38] exhibited the highest within-group connectivity in their respective CGNs across various tissues (Fig 2A and 2B). This pattern of connectivity was further validated by comparing it with cell type marker genes as identified in the Azimuth database [13] and the CellMarker database [14] (S2A Fig). These findings underscore the ability of HCNetlas CGNs to capture and represent the unique functional characteristics inherent to specific cell types.

Fig 2. Cell type-specific functionality of HCNetlas CGNs. (A, B) Bar graph illustrating the within-group connectivity of B cell-related (A) or T cell-related (B) GOBP genes in the respective CGNs. Connectivity is normalized by each CGN’s total node number. All tissues with a normalized connectivity value above 0 for both B and T cells are included. (C) Heatmap displaying the percentile rank of the top 15 hub genes in spleen CGNs, with values scaled per row, with color intensity indicating the expression level from low (blue) to high (red). Genes uniquely found in each cell type are highlighted in bold. (D, E) UpSet plots for 2 major CGNs, B cell CGN (D), and T cell CGN (E), representing the intersection of network genes across different tissues. The data underlying this figure can be found in https://doi.org/10.5281/zenodo.14522296. https://doi.org/10.1371/journal.pbio.3002702.g002

Moreover, we investigated the network hub genes within each CGN, identified based on degree centrality (Fig 2C). For instance, in spleen CGNs, CD86, which is pivotal in B cell activation [39], emerged as a top hub gene in the B cell CGN. Similarly, genes essential for T cell identity like CD2, CD4, and CD28 were among the top 15 hub genes in the T cell CGN [40].
Additionally, S100A8, S100A9, and CD14, markers for monocytic myeloid-derived suppressor cells, were prominent hubs in the monocyte CGNs [41]. These patterns of hub genes, significant due to their high degree centrality, were consistent across various tissues (S2B Fig), underlining the functional interpretability of these hub genes in the context of their respective cell types. Lastly, to assess the tissue dependency of the HCNetlas CGNs, we compared CGN genes for each major cell type across different tissues. We observed limited overlap in CGN genes among different tissues within major cell types (Figs 2D, 2E, and S3), suggesting a convergence of networks across tissues within major cell lineages, aligning with findings from previous studies [10,35]. These observations indicate that while there are core gene networks characteristic of each cell type, tissue adaptation of CGNs is also evident, underscoring the complexity and diversity of cellular functions across different biological contexts.

HCNetlas as a tool for unraveling cell type specificity of disease genes

The HCNetlas, with its collection of reference CGNs, presents a promising resource for dissecting the cellular specificity of disease genes. The majority of disease-associated genes identified to date have been derived from bulk tissue data, which often fails to specify the exact cell types involved in disease onset and progression. In this scenario, HCNetlas CGNs become instrumental in pinpointing the critical cell types at play. To ascertain the effectiveness of HCNetlas CGNs in disease-oriented research, we embarked on an investigation to determine if these CGNs capture and reflect the cell type specificity of various diseases. This involved conducting enrichment analyses on the CGNs using disease-associated genes sourced from 2 distinct databases: DisGeNET [16] and GWAS Catalog [17].
We began by ranking genes within each CGN based on network degree centrality, and then applied single-sample gene set enrichment analysis (ssGSEA) [18] and gene set variation analysis (GSVA) [19] to profile the degree of association with each set of disease genes. Our analysis revealed a distinct pattern of congregation among CGNs corresponding to shared cell types, as determined by disease-association profiles (Figs 3A and S4A). This finding was particularly notable within cell types, whereas the convergence of networks corresponding to the same tissue types was less pronounced, indicating the specificity of cell types in the context of disease genetics. We next evaluated the connectivity among genes associated with the same disease within CGNs across different tissues. Our hypothesis was that genes would exhibit more interconnectedness in the relevant cell types and tissues primarily responsible for diseases. This analysis aimed to elucidate the relationships between specific diseases and their associated cell types or tissues. While not all diseases we considered manifest cell type specificity, we noticed that CGNs predicted similar disease gene enrichment patterns in tissues such as the intestine and the liver (Fig 3A). A case in point is hepatitis-related terms, where genes associated with this condition showed the most significant within-group connectivity in liver CGNs of most major immune cell types (Fig 3B). Noteworthy was the pronounced within-group connectivity observed in both myeloid cell and T cell CGNs, highlighting the integral role of T cells in viral infectious diseases and the contribution of Kupffer cells (resident liver macrophages) to hepatitis [42]. This finding indicates that HCNetlas effectively identifies relevant cell types and tissues implicated in hepatitis. Furthermore, genes related to schizophrenia showed increased within-group connectivity across brain tissues, particularly in the primary motor cortex (M1) and MTG CGNs (S4B Fig).
Additionally, AD-associated genes from the GWAS were found to be highly connected within the macrophage and myeloid CGNs (S4C Fig), consistent with the association between AD risk and macrophage transcriptional network [43]. These observations underscore the potential of HCNetlas CGNs as a valuable resource for uncovering intricate relationships between diseases and specific cell types or tissues, thereby enhancing our understanding of disease pathology at a cellular level.

Fig 3. Overview of cell type-resolved disease genetics using HCNetlas. (A) Heatmap displaying the disease profiles of various cell types across different tissues, conducted with ssGSEA. Each column represents a CGN of HCNetlas, while each row corresponds to a disease gene set sourced from either DisGeNET or GWAS Catalog, ordered by hierarchical clustering. Color intensity indicates scaled ssGSEA enrichment values for the disease gene sets (rows), reflecting the degree of association of the CGN signature genes with each disease term. (B) Bar graphs showing the within-group connectivity of genes associated with toxic hepatitis across different cell lineages, in various tissues. The bars are color-coded to represent different cell lineages; 15 terms and their combined genes were assessed from DisGeNET terms based on the key word search “hepatitis.” (C) Schematic representation and summary of the analytical framework used for comparing disease CGNs with reference CGNs from HCNetlas. The workflow illustrates the process of CGN inference from single-cell transcriptomes of disease samples and contrasts disease CGNs for specific immune cells against reference CGNs. The analysis includes (i) differential compactness, highlighting the difference in interconnectivity within disease-associated genes; (ii) differential hubness, showing the changes in hubness; and (iii) differential pathways, contrasting pathway associations between disease and healthy states based on enrichment for CGN signature genes. The data underlying this figure can be found in https://doi.org/10.5281/zenodo.14522296. CGN, cell type-specific gene network; HCNetlas, human cell network atlas; ssGSEA, single-sample gene set enrichment analysis. https://doi.org/10.1371/journal.pbio.3002702.g003

HCNetlas, having proven its functional and biological relevance, is posited to be an effective reference for network analyses in disease studies. To enhance its utility, we have developed a suite of network analysis methodologies (Fig 3C) and applied them to investigate various diseases, showcasing the adaptability of HCNetlas CGNs. Firstly, if we have a set of disease genes, determining the specific cell type where these genes predominantly influence disease progression is crucial. To evaluate the functional role of these disease genes in a targeted cell type, we have developed an approach known as differential compactness analysis. This method compares the degree of interconnectivity among disease genes between the reference CGNs in HCNetlas and their corresponding disease CGNs derived from disease samples. In this framework, “gain-of-compactness” denotes an enhanced interconnectivity of disease genes within disease CGNs, suggesting an increased functional role in the disease context. Conversely, “loss-of-compactness” implies a reduced interconnectivity. Through this analysis, we can gain insights into the cell type in which the disease genes are actively involved and determine whether their impact on the disease state is characterized by a gain or loss of function.
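As an illustration of differential compactness, the following minimal sketch counts the edges that join two disease genes and divides by the number of nodes in the network. This normalization is an assumption modeled on the figure legends (normalized edge count; connectivity normalized by total node number); the function names and toy data are hypothetical, and no significance test is included.

```python
def compactness(edges, gene_set):
    """Number of distinct edges joining two genes of gene_set,
    divided by the number of nodes in the network (assumed normalization)."""
    nodes = set()
    within = set()
    for a, b in edges:
        nodes.update((a, b))
        if a != b and a in gene_set and b in gene_set:
            within.add(frozenset((a, b)))  # undirected, de-duplicated
    return len(within) / len(nodes) if nodes else 0.0


def differential_compactness(reference_edges, disease_edges, gene_set):
    """Compare compactness of gene_set between reference and disease networks."""
    ref = compactness(reference_edges, gene_set)
    dis = compactness(disease_edges, gene_set)
    if dis > ref:
        label = "gain-of-compactness"
    elif dis < ref:
        label = "loss-of-compactness"
    else:
        label = "no change"
    return ref, dis, label


# Toy example (hypothetical genes A-D; A, B, C play the role of disease genes).
toy_reference = [("A", "B"), ("B", "C"), ("C", "D")]
toy_disease = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
print(differential_compactness(toy_reference, toy_disease, {"A", "B", "C"}))
```

In practice the difference would need to be assessed against a null distribution (for example, random gene sets of the same size), which this sketch leaves out.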
Secondly, to identify disease genes and ascertain the cell type implicated in the disease, focusing on genes exhibiting significant differences in network centrality between diseased and healthy states can be insightful. Therefore, we prioritize genes based on differential hubness between disease CGNs and reference CGNs. This methodology involves categorizing genes into 2 distinct groups: “gain-of-hubness” and “loss-of-hubness.” Genes in the “gain-of-hubness” category show increased centrality in disease CGNs compared to reference CGNs, indicating a heightened role in the disease state. Conversely, genes in the “loss-of-hubness” category demonstrate decreased centrality in disease CGNs, suggesting a reduced or altered function in the disease context. This approach effectively distinguishes genes that are central to disease mechanisms in specific cell types. Lastly, examining pathways that show differential associations between diseased and healthy states in cell types associated with the disease can provide insights into the molecular mechanisms underlying pathogenesis. To conduct the differential pathway analysis, we initially select signature genes representative of both diseased and healthy states for each cell type. This selection is based on identifying the top-ranked hub genes (for example, the top 10 hub genes) within both the disease CGN and the corresponding reference CGN. Subsequently, through gene set enrichment analysis, we aim to identify and prioritize pathways that are differentially associated between the disease and healthy states. In this context, “gain-of-pathways” refers to those pathways that show an increased association with the disease state in comparison to the healthy control. Conversely, “loss-of-pathways” denotes pathways that have a reduced association in the disease state compared to the healthy state. 
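The pathway comparison can be sketched as follows, assuming the signature genes have already been selected as the top-ranked hubs of each network and that enrichment is measured with a one-sided hypergeometric test. The scoring rule (difference of log10 P-values), the function names, and the toy pathways are illustrative assumptions rather than the procedure used in the study.

```python
from math import comb, log10


def hypergeom_pvalue(overlap, signature_size, pathway_size, background_size):
    """One-sided P(X >= overlap) when signature_size genes are drawn without
    replacement from a background containing pathway_size pathway genes."""
    total = comb(background_size, signature_size)
    upper = min(signature_size, pathway_size)
    tail = sum(
        comb(pathway_size, i)
        * comb(background_size - pathway_size, signature_size - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


def differential_pathways(reference_signature, disease_signature,
                          pathways, background_size):
    """Label each pathway by whether its enrichment is stronger in the disease
    signature (gain) or in the reference signature (loss).
    Assumes signatures and pathways are subsets of the same background."""
    results = {}
    for name, genes in pathways.items():
        p_ref = hypergeom_pvalue(len(reference_signature & genes),
                                 len(reference_signature), len(genes),
                                 background_size)
        p_dis = hypergeom_pvalue(len(disease_signature & genes),
                                 len(disease_signature), len(genes),
                                 background_size)
        score = log10(p_ref) - log10(p_dis)  # > 0: stronger in disease
        if score > 0:
            label = "gain-of-pathway"
        elif score < 0:
            label = "loss-of-pathway"
        else:
            label = "unchanged"
        results[name] = (score, label)
    return results


# Toy example (hypothetical gene and pathway names, background of 10 genes).
toy_pathways = {"P1": {"A", "B"}, "P2": {"X", "Y"}}
toy_result = differential_pathways({"X", "Y"}, {"A", "B"}, toy_pathways, 10)
print(toy_result)
```

Here P1 is enriched only in the disease signature and P2 only in the reference signature, so they are labeled as gain-of-pathway and loss-of-pathway, respectively.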
Identifying these differentially associated pathways enables us to formulate hypotheses that delve deeper into the molecular basis of pathogenesis in disease-associated cell types.

Cell type-resolved genetic analysis of an autoimmune disease using HCNetlas

Given that the majority of the CGNs provided by HCNetlas are derived from immune cells, this resource would be particularly valuable for studying immune disorders such as autoimmune diseases. To evaluate the capability of HCNetlas to identify the specific immune cell types where disease genes have an impact on pathogenesis, we focused our research on SLE, which is a chronic autoimmune disorder characterized by elusive pathogenesis, genetic susceptibility, and clinical heterogeneity [44]. For constructing disease CGNs for SLE, we manually annotated scRNA-seq data from 38 SLE patients [20] using canonical markers (Fig 4A). These disease CGNs were then compared with the blood cell CGNs from HCNetlas, providing insights into the cell type specificity underlying SLE pathogenesis.

Fig 4. Cell type-resolved disease genetics for SLE. (A) UMAP plot representing the interrelationship among immune cells. (B) Bar chart showing normalized interconnectivity among SLE-susceptible genes in both reference and disease CGNs for various cell types. (C) ROC curves for retrieval of ISGs by network hubness in disease myeloid CGN and reference myeloid CGN. (D) Comparison of AUROC values with CGNs for various cell types, contrasting the reference and disease CGNs to assess prediction capability. (E) Left panel: Distribution of expression levels for 131 gain-of-hubness genes in myeloid cells. Right panel: Correlation analysis of expression levels of the 131 gain-of-hubness genes with the SLEDAI. (F) Same as for (E) except using up-regulated DEGs. The data underlying this figure can be found in https://doi.org/10.5281/zenodo.14522296.
AUROC, area under the ROC curve; CGN, cell type-specific gene network; DEG, differentially expressed gene; ISG, interferon-stimulated gene; ROC, receiver operating characteristic; SLE, systemic lupus erythematosus; SLEDAI, Systemic Lupus Erythematosus Disease Activity Index; UMAP, uniform manifold approximation and projection. https://doi.org/10.1371/journal.pbio.3002702.g004 Leveraging the principle that increased network compactness among disease-associated genes within a CGN indicates their significant role in the pathogenesis for that cell type, we assessed the involvement of major immune cell types in SLE. We applied a set of SLE-susceptible genes (S2 Table), gathered from the KEGG pathway database, to both disease CGNs and reference CGNs. Our analysis revealed that network compactness in myeloid cells and B cells is significantly greater in the disease CGN compared to the reference CGN (Fig 4B). This suggests that SLE-susceptible genes are critically involved in the disease progression primarily through myeloid cells and B cells. Considering that genes associated with SLE predominantly exert their effects through myeloid cells, we prioritized genes for SLE based on network centrality within both disease and reference CGNs specifically pertaining to myeloid cells. Aligning with previous studies that emphasize the increased expression of type 1 interferon (IFN) and ISGs in SLE patients [20,45,46], we evaluate the prediction of SLE-associated genes based on retrieval rate of ISGs (S3 Table) using the ROC curve. Consistent with the greater network compactness of SLE-susceptible genes in the disease myeloid CGN relative to the reference myeloid CGN, our results showed a significantly improved prediction of ISGs in the disease myeloid CGN when compared to the reference myeloid CGN (Fig 4C). Likewise, for other cell types, disease CGNs demonstrated improved predictions of ISGs compared to the reference CGNs (Fig 4D and S4 Table). 
Taken together, these findings underscore the critical role of myeloid cells in the initiation and progression of SLE, corroborating previous research that highlights the link between SLE and myeloid cells [47–49]. Next, we hypothesized that gain-of-hubness genes for myeloid cells could effectively differentiate diseased myeloid cells from their healthy counterparts. To test this hypothesis, we initially identified a set of 131 gain-of-hubness genes with statistical significance (S5A Table). We then examined the distribution of their expression level between disease-state myeloid cells and their corresponding healthy controls. Our observations revealed a significant disparity between these 2 distributions, affirming the potential of these 131 gain-of-hubness genes to distinguish diseased myeloid cells (Fig 4E, left panel). Furthermore, we observed a positive correlation between the expression levels of these gain-of-hubness genes and the Systemic Lupus Erythematosus Disease Activity Index (SLEDAI) scores, albeit with limited statistical power due to the small sample size (Fig 4E, right panel). This correlation indicates that the expression patterns of these 131 gain-of-hubness genes are not only distinctive of diseased states but may also reflect the severity of SLE in patients. In contrast, the up-regulated DEGs in disease-state myeloid cells (S5B Table) did not demonstrate the capability to either differentiate diseased myeloid cells (Fig 4F, left panel) or correlate with SLEDAI scores (Fig 4F, right panel). These outcomes imply that collections of CGNs for healthy tissues, such as those provided by HCNetlas, are apt references for identifying disease states. Cell type-resolved genetic analysis of a brain disease using HCNetlas HCNetlas offers an extensive collection of CGNs for brain tissue, making it a valuable resource for investigating neurological disorders. 
Alzheimer’s disease (AD), a widespread neurodegenerative condition known for its progressive impact on behavior and cognitive functions, is one such area where HCNetlas can be particularly useful. To study AD more closely, we have developed disease CGNs using single-nucleus RNA sequencing (snRNA-seq) data from the prefrontal cortex of AD patients [25]. These disease CGNs were compared with HCNetlas CGNs, which were derived from the primary motor cortex (M1). This comparison enables a detailed analysis of alterations in the gene network that is associated with AD, facilitating a deeper understanding of the disease progression and its impact on brain function. To identify the primary cell types impacted by AD-associated genes compiled from various sources (Methods, S6 Table), we employed differential compactness analysis. This analysis revealed that AD-associated genes exhibit a high degree of interconnectivity within the reference CGNs for both inhibitory and excitatory neurons (Fig 5A), suggesting that these genes predominantly function in these neuron types. Interestingly, we observed that the disease CGNs for inhibitory and excitatory neurons displayed notably lower network compactness scores compared to their reference counterparts (i.e., loss-of-compactness). This significant decrease in network compactness within the diseased neurons points to a loss of connections among AD-associated genes. Such a loss in the diseased state of inhibitory and excitatory neurons could be a critical factor in the pathogenesis of AD, indicating a disruption in the intricate gene networks that underlie normal neuronal function. Fig 5. Cell type-resolved disease genetics for AD. (A) Bar graph depicting the normalized edge count among AD-associated genes in reference and disease CGNs for various neurological cell types.
(B) Bar graph showing the number of validated AD genes predicted by differential expression or differential hubness across the 4 neurological cell types. Statistical significance of overlap is shown for each gene set (*P < 0.05, ** P < 0.01, *** P < 0.001 by one-sided hypergeometric test). (C–F) Ten most differentially associated KEGG pathways with CGN signature genes: Differentially associated pathways for inhibitory neuron CGN signature (C), inhibitory neuron expanded CGN signature (D), excitatory neuron CGN signature (E), and excitatory neuron expanded CGN signature (F). The data underlying this figure can be found in https://doi.org/10.5281/zenodo.14522296. AD, Alzheimer’s disease; CGN, cell type-specific gene network. https://doi.org/10.1371/journal.pbio.3002702.g005 We then focused on prioritizing genes for AD by either differential expression or differential hubness between healthy and diseased states across each cell type (S7 Table). In alignment with the identified cell type specificity for AD, both inhibitory and excitatory neurons demonstrated a more accurate prediction of AD-related genes when analyzed for differential hubness rather than differential expression (Fig 5B, hypergeometric test P-value < 0.001). Interestingly, the predictive capacity of differential expression in these neuron types was lower than that achieved through differential hubness analysis and was similar to the capacity observed in other cell types. This finding suggests that a network-based approach is more effective for predicting AD genes than methods solely based on expression, which tend to be less specific to AD-associated cell types. Notably, the overlap between gene predictions made using differential hubness and differential expression was minimal (S5A Fig), indicating that these 2 approaches are complementary to each other in identifying key genes associated with AD.
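The one-sided hypergeometric test used for the overlap between predicted genes and validated AD genes can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `hypergeom_upper_tail` and all numbers in the example are hypothetical, and the choice of background gene universe (`universe_size`) is an assumption, since the text above does not state which background was used.

```python
from math import comb


def hypergeom_upper_tail(overlap, universe_size, n_validated, n_predicted):
    """P(X >= overlap) when n_predicted genes are drawn without replacement
    from a universe containing n_validated validated disease genes."""
    denominator = comb(universe_size, n_predicted)
    max_overlap = min(n_validated, n_predicted)
    numerator = sum(
        comb(n_validated, i) * comb(universe_size - n_validated, n_predicted - i)
        for i in range(overlap, max_overlap + 1)
    )
    return numerator / denominator


# Hypothetical numbers: 10 background genes, 3 validated, 3 predicted, all 3 overlap.
p_value = hypergeom_upper_tail(overlap=3, universe_size=10, n_validated=3, n_predicted=3)
```

Because the test is one-sided, only enrichment (an overlap at least as large as the one observed) contributes to the P-value; a small value indicates that the predicted set retrieves more validated genes than expected by chance.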
To delve into the molecular mechanisms implicated in AD pathogenesis within inhibitory and excitatory neurons, we carried out a differential pathway analysis. This analysis was based on CGN signatures of these neurons, focusing on the top 10 genes ranked by hubness (S5B Fig). As anticipated, our analysis revealed that pathways associated with AD and other related neurodegenerative diseases, such as Parkinson’s disease and Huntington’s disease, were among those most prominently exhibiting a reduced association, or “loss-of-pathway,” in inhibitory neurons (Fig 5C). In addition to these, we identified several other pathways that exhibited loss-of-pathway in inhibitory neurons, and these findings were validated through a literature survey. The pathways that were validated to be associated with AD included oxidative phosphorylation [50], thermogenesis [51], non-alcoholic fatty liver disease [52,53], diabetic cardiomyopathy [54,55], amyotrophic lateral sclerosis [56], and prion disease [57]. Significantly, our analysis identified that the pathway related to the cholinergic synapse was the most notably increased in diseased inhibitory neurons. This finding is also relevant given the known involvement of the cholinergic signaling in AD [58]. We also performed our analysis using an expanded CGN signature that includes their network neighbors, which confirmed the initial list of top loss-of-pathways (Fig 5D). This reaffirms the importance of these pathways in the pathogenesis of AD, highlighting their potential roles in the disease’s progression and impact within inhibitory neurons. In our differential pathway analysis using CGN signatures for excitatory neurons, we predominantly observed gain-of-pathways. These are pathways showing increased activity in AD compared to the healthy state, findings which are substantiated by literature evidence (Fig 5E). 
For instance, the ErbB signaling pathway is known to mediate amyloid-β (Aβ)-induced neurotoxicity [59], and HIF-1 (hypoxia-inducible factor-1) signaling has been found to increase Aβ generation [60]. Additionally, a similar analysis with expanded CGN signatures for excitatory neurons revealed loss-of-pathways akin to those identified in inhibitory neurons (Fig 5F). Among these findings, the MAPK signaling pathway emerged as the most prominent gain-of-pathway. This is in alignment with previous research demonstrating that MKP-1, a crucial negative regulator of MAPKs, can reduce Aβ generation and alleviate cognitive impairments in AD models [61], thereby validating our observation. Investigating cell type-resolved lung cancer genetics using HCNetlas The tumor immune microenvironment has become increasingly recognized as a key hallmark of cancer. Considering this, we hypothesized that HCNetlas CGNs for immune cells could be instrumental in identifying cancer-associated genes that primarily function within the immune microenvironment. Focusing on lung cancer, we constructed disease CGNs for major immune cell types using single-cell transcriptome data derived from tumor tissues of lung cancer patients [28]. Through differential hubness analysis, compared to reference CGNs of corresponding cell types, we pinpointed gain-of-hubness genes predominantly in T cells and myeloid cells, many of which are known to be associated with lung cancer (Fig 6A and S8A Table). Interestingly, only a few gain-of-hubness genes were common across multiple immune cell types, suggesting a specific functional role of cancer-associated genes in T cells and myeloid cells. This analysis also revealed that differential hubness was more effective in identifying lung cancer-associated genes than the traditional differentially expressed genes (DEGs) analysis (Fig 6B and S8B Table). Notably, many up-regulated DEGs shared among all immune cell types included very few validated lung cancer genes. 
When assessing loss-of-hubness genes, a similar trend was observed: fewer candidates but with more specificity to cell types compared to down-regulated expression in disease (Fig 6C and 6D). T cell-specific loss-of-hubness particularly retrieved a significant number of known lung cancer genes. Additionally, we found supportive literature evidence for the proposed cell type of action for these validated cancer-associated genes identified through differential hubness analysis (S8 Table). This suggests that HCNetlas is effective in predicting genes associated with cancer specifically within immune cell types. Fig 6. Identifying genes that contribute to lung cancer through immune cells using HCNetlas. (A–D) UpSet plots for predicted genes for lung cancer by gain-of-hubness (A), up-regulated DEGs (B), loss-of-hubness (C), or down-regulated DEGs (D), representing the intersection of predictions across different cell types. Orange bar indicates the number of lung cancer genes validated by various databases such as CancerMine, IntOGen, and cancer gene consensus. (E) Bar graph showing the number of ICMs retrieved by both gain-of-hubness genes and up-regulated DEGs across different immune cell types. Statistical significance of overlap is shown for each gene set (*P < 0.05, ** P < 0.01, *** P < 0.001 by one-sided hypergeometric test). (F, G) Kaplan–Meier survival curve analysis for cancer patients from TCGA cohort (TCGA-LUSC and TCGA-LUAD), stratified by the enrichment score of gain-of-hubness genes (F) or up-regulated DEGs (G). The graph shows the overall survival probability over time, with patients categorized into high and low quantiles based on the GSVA score. Significance of the survival rate difference between the upper and lower quantile expression groups was evaluated using the log-rank test. The data underlying this figure can be found in https://doi.org/10.5281/zenodo.14522296.
DEG, differentially expressed gene; HCNetlas, human cell network atlas; GSVA, gene set variation analysis; ICM, immune checkpoint molecule. https://doi.org/10.1371/journal.pbio.3002702.g006 Further evaluation focused on immune checkpoint molecules (ICMs), which are pivotal in antitumor immunity [62,63]. We anticipated an increase in network centrality and expression of ICMs in tumor-derived immune cells. Confirming our hypothesis, genes identified through differential hubness analysis were more effective in detecting ICMs, particularly within T cells and myeloid cells, compared to differential expression analysis (Fig 6E). This finding underscores the advantage of network-based analyses in pinpointing crucial genes in cancer immunology. Additionally, using The Cancer Genome Atlas (TCGA) lung cancer data, we explored the prognostic value of these genes. We found that the gene expression profile association score, calculated using GSVA [19], for the set of gain-of-hubness genes in each tumor sample was predictive of clinical outcomes (Fig 6F), unlike up-regulated DEGs (Fig 6G). Discussion In this study, we have demonstrated the efficacy of a network biology approach for delineating the genetics of disease at the cellular level, making use of HCNetlas, an analytical framework for leveraging a compendium of reference CGNs derived from a single-cell expression atlas of healthy individuals for disease research. By comparing these reference CGNs against their diseased counterparts, which are constructed from single-cell transcriptomic data of the same cell types in a disease context, we could measure the alterations in network topology that distinguish healthy from diseased states. We incorporated 3 analytical methods within HCNetlas: differential compactness, differential hubness, and differential pathway analysis.
These methods were applied in 3 distinct case studies addressing diseases of the immune system, neurological disorders, and cancer, thereby confirming the extensive applicability of HCNetlas for investigating disease genes in relation to cell type specificity. Our differential compactness analysis pinpointed cell types associated with diseases. We showed that identifying differential hub genes between reference and disease CGNs for a disease-associated cell type is an effective method to predict cell type-specific disease genes. Moreover, by examining differential pathways associated with top hub genes between reference and disease CGNs, we gained insights into the molecular mechanisms potentially driving pathogenesis in the disease-relevant cell types. Consequently, HCNetlas proves to be a robust framework for identifying the specific cell types, genes, and molecular pathways involved in diseases, thus significantly advancing our understanding of how diseases manifest in a cell type-specific manner. Notably, while our work was under review, CellNetdb [64], a database of CGNs across 44 tumor types, was published. Although its CGN inference strategy is similar, CellNetdb is specifically designed to investigate cell type-specific cellular and molecular mechanisms focused on cancer. In contrast, HCNetlas serves as a database of reference CGNs for studying a wide variety of human diseases. Our study underscores the effectiveness of network-based analysis over conventional expression-based methods in discerning the cell type specificity of disease genes. In our lung cancer case study, for example, the finding that only a few gain-of-hubness genes were shared across multiple immune cell types underscores that it is the alterations in network configuration, rather than just changes in gene expression, that more accurately reflect the cell type-specific functions of genes. 
Further, our findings reveal that differential hubness offers greater predictive capacity for cancer-associated genes compared to differential expression analysis. A noteworthy observation was that while numerous genes were differentially expressed across various immune cell types, only a limited subset of these genes were validated to be involved in lung cancer. This highlights that gene properties unique to a cell type, such as differential hubness, can significantly enhance the accuracy of disease gene prediction. Additionally, even though they were not identified through expression-level prioritization, the association of gain-of-hubness genes with expression profiles of tumor samples was found to have prognostic value, unlike the up-regulated DEGs. This implies that the expression levels of genes that influence disease through interactions with other genes in specific cell types are more relevant and indicative of the disease context. Thus, our approach not only identifies disease-relevant genes but also provides insights into the functional significance of these genes within specific cellular environments. Despite the promising findings, HCNetlas has some limitations. A significant limitation is the current scarcity of “control” single-cell gene expression data for a broad spectrum of cell types and tissues. This lack of data limits the scope and applicability of HCNetlas, as comprehensive mapping of CGNs is contingent on the availability of extensive transcriptomic data. Consequently, our endeavor to create a compendium of true reference CGNs was limited by the availability of atlas-level resources with varying health conditions (not necessarily diseased). However, this limitation is expected to diminish as the field of single-cell transcriptomics continues to grow. As more data are generated, particularly for healthy tissues, it will become feasible to construct a more comprehensive array of CGNs, covering a wider variety of cell types.
Moreover, healthy control samples may vary in properties such as gender, age, body mass index, smoking status, diet, and other lifestyle factors. The effects of these variables will be mitigated by including cells from a large pool of healthy donors, leaving only the core gene network that represents the given cell type within the range of a healthy state. Consequently, the progression of the HCA project is likely to significantly enhance the utility of HCNetlas, extending its applicability to a broader range of diseases and deepening our understanding of cellular behaviors in various pathological states. Another challenge with HCNetlas stems from the inherent limitations of our network inference methodology, which is dependent on a reference interactome. The reference interactome is mapped predominantly using data from a control state, rather than from a disease state. Consequently, this approach may overlook interactions that are unique to the disease state, as these might be underrepresented or entirely absent in the reference interactome. Such omissions can limit the analytical capacity of HCNetlas, particularly in accurately portraying disease-specific network dynamics. Moreover, a low number of cells (below approximately 1,000) often yields an incomplete network structure and may thus hinder disease analysis, depending on the input data. To avoid this issue, we constructed CGNs only from data sets containing at least 1,000 cells. However, this approach may exclude some important cell types from the database. For example, this has prevented us from observing microglia, an important cell type known to be associated with the disease, with our input AD scRNA-seq data. To address this issue, future developments of HCNetlas may need to include the de novo inference of gene networks directly from disease sample data.
Integrating these disease-specific networks into HCNetlas would provide a more comprehensive view of the gene interactions occurring in various diseases. This enhancement would not only overcome the current limitations but also enrich the platform capability to provide more nuanced and accurate insights into disease mechanisms at the molecular level. Supporting information S1 Fig. Overview of cell-type-specific networks (CGNs) of the HCNetlas. (A) Scatter plot depicting the relationship between the number of cells used for network inference (cell count) and network size by node or edge count of CGNs. (B) Uniform Manifold Approximation and Projection (UMAP) visualization of T cell CGNs based on the network node profiles, where each circle’s size represents the number of network genes and different colors correspond to various T cell subtypes. (C) UMAP plot displaying T cell CGNs across different organs and tissues. The shapes of the points distinguish between different organs, while the color denotes the specific tissue types. The size of each point corresponds to the size of CGN, as determined by the number of genes. The data underlying this figure can be found in https://doi.org/10.5281/zenodo.14522296. https://doi.org/10.1371/journal.pbio.3002702.s001 (PDF) S2 Fig. Evaluation of cell type-specific functionality of HCNetlas. (A) Box plot depicting the interconnectivity distribution of B cell and T cell marker genes within CGNs of the HCNetlas. Marker genes for each cell type were derived from gene sets in the Azimuth and CellMarker databases. (B) Heat map displaying the relative percentile ranks of the top 15 hub genes across multiple cell types and tissues, with values scaled per row and color intensity indicating the expression level from low (blue) to high (red). The data underlying this figure can be found in https://doi.org/10.5281/zenodo.14522296. https://doi.org/10.1371/journal.pbio.3002702.s002 (PDF) S3 Fig. 
Overlap of cell-type-specific network (CGN) nodes across tissues. Upset plots illustrate the intersection of network genes across CGNs for various tissues within each cell type. The data underlying this figure can be found in https://doi.org/10.5281/zenodo.14522296. https://doi.org/10.1371/journal.pbio.3002702.s003 (PDF) S4 Fig. Evaluation of HCNetlas for cell type-resolved disease genetics. (A) Heatmap displaying the disease profiles of various cell types across different tissues, conducted with gene set variation analysis (GSVA). Each column represents a CGN of HCNetlas, while each row corresponds to a disease gene set sourced from either DisGeNET or GWAS Catalog. Color intensity indicates the degree of association of the CGN signature genes with each disease gene set. (B, C) Bar graphs showing the within-group connectivity of genes associated with Schizophrenia (B) and Alzheimer’s Disease (C) across different cell lineages or tissues. These disease-associated genes were collected from GWAS catalog. The data underlying this figure can be found in https://doi.org/10.5281/zenodo.14522296. https://doi.org/10.1371/journal.pbio.3002702.s004 (PDF) S5 Fig. Evaluation of HCNetlas for cell-type resolved disease genetics for Alzheimer’s disease (AD). (A) Venn diagram displaying overlap between AD-associated genes predicted by differential hubness and differential expression. (B) Networks of signature genes by top 10 hub genes for reference cell type-specific gene network (CGN) from HCNetlas and disease CGNs from disease samples for inhibitory neuron or excitatory neuron. The data underlying this figure can be found in https://doi.org/10.5281/zenodo.14522296. https://doi.org/10.1371/journal.pbio.3002702.s005 (PDF) S1 Table. HCNetlas cell type, abbreviation, and cell count. https://doi.org/10.1371/journal.pbio.3002702.s006 (XLSX) S2 Table. SLE susceptible gene list. The 184 SLE-associated genes are obtained from KEGG pathway (hsa05322) and KEGG disease (H00080) databases. 
https://doi.org/10.1371/journal.pbio.3002702.s007 (XLSX) S3 Table. The list of interferon-stimulated genes. We compiled interferon-stimulated genes (ISGs) from hallmark gene sets of the molecular signature database (MSigDB) and the Immunological Genome Project (ImmGen) [24], resulting in a total of 423 ISGs. https://doi.org/10.1371/journal.pbio.3002702.s008 (XLSX) S4 Table. Area under ROC for prediction of interferon-stimulated genes (ISGs) by network centrality. AUROC values were computed by applying the 423 ISGs from S3 Table to genes sorted by degree centrality. https://doi.org/10.1371/journal.pbio.3002702.s009 (XLSX) S5 Table. Gain-of-hubness genes and up-regulated genes in SLE myeloid network. Gain-of-hubness genes were defined by differential percentile rank > 0.5 and q-value < 0.05. DEGs were genes with an adjusted p-value < 0.05 and a log2-fold change > 0.5, focusing solely on coding genes. https://doi.org/10.1371/journal.pbio.3002702.s010 (XLSX) S6 Table. Genes associated with Alzheimer’s disease. AD-associated genes were obtained from the KEGG pathway (M16024), MSigDB (M35868), and Wightman and colleagues. https://doi.org/10.1371/journal.pbio.3002702.s011 (XLSX) S7 Table. Gain-of-hubness genes, loss-of-hubness genes, up-regulated DEGs, and down-regulated DEGs in AD CGNs. Gain/loss-of-hubness genes were defined by absolute differential percentile rank > 0.5 and q-value < 0.05. DEGs were genes with an adjusted p-value < 0.05 and an absolute log2-fold change > 0.5, focusing solely on coding genes. https://doi.org/10.1371/journal.pbio.3002702.s012 (XLSX) S8 Table. Gain-of-hubness genes, loss-of-hubness genes, up-regulated DEGs, and down-regulated DEGs in AD CGNs. Gain/loss-of-hubness genes were defined by absolute differential percentile rank > 0.5 and q-value < 0.05. Up-regulated DEGs were genes with an adjusted p-value < 0.05 and log2-fold change > 0.5, and down-regulated DEGs were genes with adjusted p-value < 0.01 and log2-fold change < -1.5.
https://doi.org/10.1371/journal.pbio.3002702.s013 (XLSX)
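The thresholds given in the S5, S7, and S8 Table legends (absolute differential percentile rank > 0.5 and q-value < 0.05) can be illustrated with a short Python sketch. Several details are assumptions rather than the authors' method: the percentile rank is taken here as the fraction of network genes with centrality less than or equal to a gene's own, genes absent from one network are given a rank of 0, and q-values are supplied as input because the legends do not say how they were computed. The function names and the genes `G1` to `G4` are hypothetical.

```python
from bisect import bisect_right


def percentile_ranks(centrality):
    """Fraction of genes whose centrality is <= each gene's own (assumed definition)."""
    values = sorted(centrality.values())
    n = len(values)
    return {gene: bisect_right(values, c) / n for gene, c in centrality.items()}


def differential_hubness(reference, disease, q_values, rank_cutoff=0.5, q_cutoff=0.05):
    """Split genes into gain- and loss-of-hubness lists using the stated cutoffs."""
    ref_rank = percentile_ranks(reference)
    dis_rank = percentile_ranks(disease)
    gain, loss = [], []
    for gene in sorted(set(ref_rank) | set(dis_rank)):
        # Assumption: a gene missing from a network has percentile rank 0 there.
        diff = dis_rank.get(gene, 0.0) - ref_rank.get(gene, 0.0)
        # Assumption: genes without a supplied q-value are treated as not significant.
        if q_values.get(gene, 1.0) >= q_cutoff:
            continue
        if diff > rank_cutoff:
            gain.append(gene)
        elif diff < -rank_cutoff:
            loss.append(gene)
    return gain, loss


# Hypothetical degree centralities in a reference and a disease network.
reference_degrees = {"G1": 1, "G2": 5, "G3": 10, "G4": 2}
disease_degrees = {"G1": 10, "G2": 5, "G3": 1, "G4": 2}
toy_q_values = {"G1": 0.01, "G3": 0.2}
gain_genes, loss_genes = differential_hubness(reference_degrees, disease_degrees, toy_q_values)
```

In the toy data, `G1` moves from the lowest to the highest percentile rank and passes the q-value cutoff, so it is reported as gain-of-hubness. `G3` shows the opposite shift but is filtered out by its q-value.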

journal article

Open Access Collection

Caspase 3 and caspase 7 promote cytoprotective autophagy and the DNA damage response during non-lethal stress conditions in human breast cancer cells

Samarasekera, Gayathri;Go, Nancy E.;Choutka, Courtney;Xu, Jing;Takemon, Yuka;Chan, Jennifer;Chan, Michelle;Perera, Shivani;Aparicio, Samuel;Morin, Gregg B.;Marra, Marco A.;Chittaranjan, Suganthi;Gorski, Sharon M.

2025 PLoS Biology

doi: 10.1371/journal.pbio.3003034 pmid: 39982959

Introduction Caspases are cysteine-dependent aspartic proteases and are traditionally known for their role in proteolysis during the final stages of apoptosis. It is increasingly appreciated that caspases also play important roles in non-apoptotic cellular processes in normal development and in disease conditions [1–4]. Given the evolutionarily conserved non-apoptotic caspase roles across phyla, it has been postulated that the primordial function of caspases was to regulate cellular stress adaptations [1,5,6]. Further supporting this notion are the distinct caspase 3 (CASP3)- and caspase 7 (CASP7)-dependent proteolytic landscapes in cells exposed to non-lethal stress conditions compared to lethal stress conditions [7]. Such adaptive roles under non-apoptotic stress may explain unexpected observations, such as the association of high caspase expression with enhanced tumor progression or the lack of correlation between caspase expression and apoptosis in several cancer types, including breast cancer [8–15]. Caspases are ubiquitously expressed in most cells and several studies have linked caspases to stress response pathways [16–19]. Interestingly, stimuli that activate cell stress response pathways can also activate caspases [20,21]. However, the processing, regulation, and activity of caspases during non-lethal cellular stress are understudied. Macroautophagy, hereafter referred to as autophagy, is an evolutionarily conserved intracellular lysosome-mediated degradation and recycling process that plays significant roles in normal development, aging and diseases, including cancer [22–24]. While autophagy occurs at basal levels, it is also a major pathway upregulated in response to several stressors, including nutrient deprivation, hypoxia, reactive oxygen species, DNA damage, and pathogens [25–27]. 
Autophagy supports stress adaptation and cell survival through degrading and recycling damaged organelles and macromolecules to facilitate the production of energy and/or essential cellular components [27,28]. The mechanistic interactions between cytoprotective autophagy and apoptosis are not well-understood. It is widely accepted that these two processes are antagonistic, and hence the final cell fate is determined by a tug-of-war between pathways [29,30]. Consistent with this model, caspases were shown to suppress autophagy by direct cleavage of autophagy regulators or core autophagy proteins [29,31]. However, apoptotic pathway components, including caspases, have also been implicated in promoting autophagy in some contexts [31–33]. In Drosophila, the apoptotic effector caspase Dcp-1 was shown to positively regulate stress-induced cytoprotective autophagy [34–36]. In human cells, a role in the positive regulation of stress-induced cytoprotective autophagy was demonstrated for the initiator caspase 9 [32]. However, it is unknown whether human effector caspases have an evolutionarily conserved function in stress-induced cytoprotective autophagy. In addition, the contexts, molecular mechanisms and pathways involved in non-lethal stress adaptation by both initiator and effector caspases have not been thoroughly investigated. All caspases exist as inactive zymogens (pro-caspases) consisting of a pro-domain, a large ~ p20 subunit containing the catalytic site, a small ~ p10/p12 subunit, and may contain a linker sequence between subunits. Upon activation in apoptosis, the pro-domain and the linker region are removed by proteolytic cleavage in a temporal order unique to each caspase, ultimately resulting in ~ p20 and p10/p12 cleaved caspase fragments that assemble to form the active tetrameric complex [37]. In effector caspases, these cleavage events are typically carried out by initiator caspases. Granzyme B and calpain have also been implicated [38–40]. 
Once activated, effector caspases cleave many substrate proteins [41]. Therefore, caspase activity must be tightly regulated, presumably in both apoptotic and non-apoptotic settings. One major outstanding question is how caspases participate in non-apoptotic functions without killing cells. Identified modes of caspase regulation include post-translational modifications (cleavage, phosphorylation, ubiquitylation), subcellular compartmentalization (mitochondria, cytosol, nucleus, stress granules), and modulatory protein interactions, including inhibitor of apoptosis protein family members [5,6,19,42]. Allosteric sites in CASP3 and CASP7 and an exosite (non-active site interaction) in CASP7 can also contribute to regulation and activity [43]. However, caspase processing, activation, and regulation in non-apoptotic functions and/or in non-lethal cellular stress conditions have not been well characterized. Here, we demonstrate that human effector CASP3 and CASP7 have an evolutionarily conserved role in promoting cytoprotective autophagy during non-lethal stress. We found that the underlying mechanism involves non-canonical cleavage of CASP7 and alteration of Poly(ADP ribose) polymerase 1 (PARP1) activity. The combined loss of CASP3 and CASP7 phenocopies PARP1-inhibition, including synthetic lethality with BRCA1 loss. Results Dual loss of CASP3 and CASP7 suppresses starvation-induced autophagy We used the autophagy-activating role of the Drosophila effector caspase Dcp-1 in response to nutrient deprivation [34–36] as the basis for investigating the potential functional conservation between two closely related human caspases, effector CASP3 and CASP7. First, we employed standard MAP1LC3B/LC3B-based autophagy assays, as these have been used successfully to monitor autophagy in breast cancer cell lines, including SKBR3 lines, under stress conditions similar to those we employed [44–47].
To determine the optimum time point for detecting a consistent upregulation of autophagy without signs of cell death, SKBR3 cells were subjected to amino acid starvation (Fig 1A and 1B). We observed a significant increase in autophagic flux at 8 and 24 h of starvation, as measured by the increased LC3BII accumulation in the presence of BafA1 (Fig 1B). Across all time points, low levels of cleaved-PARP (at ~89 kDa) were observed in both fed (non-stressed) control cells and starved (stressed) cells, a level reported previously to be normal even in non-apoptotic viable cells [7,48]. Furthermore, there were no signs of cell death, as indicated by the absence of a marked increase in cleaved-PARP1 in the starvation conditions (Fig 1A), and thus we selected 8-h starvation as the optimal time point for detecting non-lethal stress-related starvation-induced autophagic flux in SKBR3 cells in this study. Next, to determine whether amino acid starvation-induced autophagy was CASP3- and/or CASP7-dependent, single knockdown (KD) and double knockdown (DKD) experiments were performed with siRNAs. The control scramble-siRNA-treated cells showed an increase in LC3BII levels in the presence of BafA1, indicating an increase in autophagic flux when cells were starved (Fig 1C). In single CASP3 or CASP7 siRNA-treated cells, the starvation-induced autophagic flux was comparable to that of the control. However, a significant suppression of autophagic flux was observed in CASP3 and CASP7 (CASP3 + 7) DKD cells (Fig 1C and 1D). Fig 1. Dual loss of CASP3 and CASP7 suppresses starvation-induced autophagy. (A) Representative western blots of indicated proteins from SKBR3 cells in fed (F) or in amino acid starvation (S) for various time periods, in the absence or presence of 50 nM Bafilomycin A1 (BafA1) in the final 2 h. (B) Quantification of LC3B-based autophagy flux in starved cells relative to the fed control, shown in (A).
All with BafA1. (C) Representative western blots of indicated proteins from SKBR3 cells transfected with scramble, CASP3 and/or CASP7 siRNAs (48 h) and incubated in fed conditions or starved for 8 h with BafA1 (50 nM) for the final 2 h. (D) Quantification of LC3BII-based autophagy flux in starved cells relative to the starved scramble-siRNA control, shown in (C). (E) Representative western blots of indicated proteins from CASP3, CASP7 single (KO), or double knockout (DKO) SKBR3 cells in fed or starved (8 h) conditions, with BafA1 (50 nM) in the final 2 h. (F) Quantification of LC3BII-based autophagy flux in starved cells relative to the parental control, shown in (E). (G) Representative western blots of indicated proteins from SKBR3 parental, DKO or CASP3 + 7-WT re-expression in DKO cells in fed or starved (8 h) conditions, with BafA1 (50 nM) in the final 2 h. (H) Quantification of LC3BII-based autophagy flux in starved cells relative to starved parental cells, shown in (G). (I) Representative immunofluorescence images of SKBR3 parental, DKO or CASP3 + 7-WT re-expression in DKO cells treated with 0.25 µM DALGreen in fed or starved (8 h) conditions. BafA1 (50 nM for 8 h) in both fed and starved conditions serve as controls. Scale bars, 20 µm. (J) Quantification of DALGreen immunofluorescence in starved cells shown in (I). Graph shows number of punctae per cell. n = 2, each with 10 random images covering total of 500–700 cells. In graphs, data are shown as mean ± SEM. n = at least 3 independent experiments except in (J). *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001, NS, not significant. In B, D, F, and H, one-way ANOVA with Dunnett’s post-test. In J, one-way ANOVA with Tukey’s post-test. See also S1 Fig. The numerical data presented in this figure can be found in S1 Data. 
https://doi.org/10.1371/journal.pbio.3003034.g001

We hypothesized that the lack of a significant effect on starvation-induced autophagy upregulation in single CASP3 or CASP7 KD experiments might be due to the presence of residual caspase levels (S1A and S1B Fig). To address this, we generated CASP3 and CASP7 single knockout (KO) and double knockout (DKO) SKBR3 cell lines (S1C Fig). In single caspase KO cells, we found negligible or undetectable effects on starvation-induced autophagy compared to the control parental cell lines (Fig 1E and 1F). This suggests that CASP3 and CASP7 have partially overlapping functions and/or that single KO cell lines may have activated compensatory mechanisms to overcome the other caspase’s loss. Consistent with this notion and with what we observed in DKD cells, there was a significant impairment of autophagic flux in CASP3 + 7 DKO cell lines (Fig 1E and 1F). Similarly, following 24 h of starvation, there was no induction of autophagic flux in CASP3 + 7 DKO cells (S1D and S1E Fig). To confirm the specificity of our findings, we generated multiple isogenic caspase DKO SKBR3 cell lines. In all five CASP3 + 7 DKO lines we tested, we observed a reduction in LC3BII levels (in the presence of BafA1), indicating that amino acid starvation-induced autophagy was compromised (S1F Fig). To exclude the possibility of cell line-specific effects, we repeated our experiments using another breast cancer cell line, MDA-MB-231, and similarly found that autophagic flux induction was significantly compromised only upon CASP3 + 7 DKD and DKO (S1G–S1L Fig). Lastly, when wild-type constructs of both CASP3 and CASP7 were stably reintroduced into DKO cells, the starvation-induced upregulation of autophagy was partially rescued (Fig 1G and 1H). The processing of LC3B to form LC3BII is a hallmark of autophagy [49]. 
However, since LC3B is also involved in autophagy-independent processes [50], we orthogonally measured autophagy by employing the DALGreen autolysosome fluorescent marker [51]. In accordance with the LC3B-based autophagy assay results, the levels of DALGreen-positive puncta indicate that amino acid deprivation-induced autophagy is significantly compromised in CASP3 + 7 DKO cells, and the dual re-expression of CASP3 and CASP7 fully rescued autophagy (Fig 1I and 1J). The reduction in the number of autolysosomes in CASP3 + 7 DKO cells was not due to a difference in cell size (S1M and S1N Fig). These observations together indicate that CASP3 and CASP7 play a partially redundant positive regulatory role in non-lethal amino acid deprivation-induced autophagy.

Dual loss of CASP3 and CASP7 suppresses proteasome inhibition-induced compensatory autophagy and sensitizes cells to proteasome inhibitors

It is well documented that autophagy is upregulated as a compensatory mechanism to maintain protein homeostasis when proteasome function is compromised [52,53]. Next, we investigated whether this compensatory autophagy was also dependent on CASP3 and CASP7. A proteasome inhibitor (PI), MG132, at 0.5 μM, was chosen as the non-lethal stress condition for these experiments (Figs 2A–2C and S2A–S2C). The reduction of proteasome activity was confirmed directly by proteasome activity assays (Figs 2C and S2C) and indirectly through the accumulation of ubiquitinated proteins (Figs 2A and S2A). In line with what we observed in amino acid starvation conditions, the PI-induced autophagic flux was significantly inhibited in CASP3 + 7 DKD or DKO cells (Fig 2D–2G). Although the effect was modest, single CASP7 KO also resulted in a significant reduction in PI-induced autophagy (Fig 2F and 2G). Similar results were observed in the MDA-MB-231 cell line (S2D and S2E Fig), and with the clinically approved proteasome inhibitor Bortezomib [54,55] (S2F and S2G Fig). 
PI-induced LC3BII upregulation was partially rescued by dual re-expression of CASP3 + 7, supporting the requirement for these effector caspases in PI-induced autophagy (Fig 2H and 2I). Collectively, these results indicate that CASP3 and CASP7 have positive regulatory roles in autophagy induction in response to starvation or proteasome inhibition-induced stress.

Fig 2. Dual loss of CASP3 and CASP7 suppresses proteasome inhibition-induced compensatory autophagy and sensitizes cells to proteasome inhibitors. (A) Representative western blots of indicated proteins from SKBR3 cells treated with MG132 at increasing dosage for 24 h, with BafA1 (50 nM) in the final 2 h. (B) Quantification of LC3B-based autophagy flux in MG132-treated cells relative to untreated (0 μM MG132; DMSO vehicle only; NT) SKBR3 shown in (A). The levels of LC3BII were normalized to loading control and shown relative to the untreated control (NT). (C) Graph showing proteasome activity in response to increasing concentrations of MG132 as depicted by chymotrypsin-, caspase-, and trypsin-like activities. (D) Representative western blots of indicated proteins from SKBR3 cells transfected with scramble, CASP3 and/or CASP7 siRNAs (48 h) and treated with vehicle DMSO only or MG132 (0.5 μM) for 24 h, with BafA1 (50 nM) in the final 2 h. (E) Quantification of LC3B-based autophagy flux in MG132-treated cells relative to the MG132-treated scramble-siRNA control, shown in (D). (F) Representative western blots of indicated proteins from CASP3, CASP7 single knockout, or DKO SKBR3 cells treated with vehicle DMSO or MG132 (0.5 μM) for 24 h, with BafA1 (50 nM) in the final 2 h. (G) Quantification of LC3B-based autophagy flux in MG132-treated cells relative to the MG132-treated parental control shown in (F). 
(H) Representative western blots of indicated proteins from SKBR3 parental, DKO or CASP3 + 7-WT reintroduced into DKO cells treated with vehicle DMSO or MG132 (0.5 or 1.0 μM) for 24 h with BafA1 (50 nM) in the final 2 h. (I) Quantification of LC3BII-based autophagy flux in MG132-treated cells relative to DMSO-treated parental control, shown in (H). (J) Representative images of crystal violet assay plates of SKBR3 parental and CASP3 + 7 DKO cells treated with indicated concentrations of MG132 for 24 h and continued to grow in drug-free media for another 3 days. (K) Quantification of cell viability shown in (J). Percentage of stained (viable) cells at each concentration was normalized to respective untreated cells. In graphs, data are shown as mean ± SEM. n = 3 independent experiments. *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001, NS, not significant. In B, C, E, and G, one-way ANOVA with Dunnett’s post-test; in I, one-way ANOVA with Tukey’s post-test; in K, two-way ANOVA with Sidak’s post-test. See also S2 Fig. The numerical data presented in this figure can be found in S1 Data. https://doi.org/10.1371/journal.pbio.3003034.g002

Since autophagy has been reported to promote cell death in some contexts [56–58], we investigated whether the CASP3- and CASP7-dependent stress-induced autophagy contributes to cell death or survival. Crystal violet cell viability assays were carried out following PI treatment (MG132 or Bortezomib) or starvation of parental and CASP3 + 7 DKO or DKD SKBR3 or MDA-MB-231 cells (Figs 2J, 2K and S2H–S2O). We observed a modest but significant reduction in cell viability in response to PI treatment or starvation in CASP3 + 7 DKD or DKO cells compared to control cells. Contrary to traditional caspase roles, these results suggest that CASP3 and CASP7 promote cell survival in non-lethal PI or starvation stress conditions. 
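As an illustration only, the normalization described in the Fig 2K legend (stained cells at each concentration expressed as a percentage of the matching untreated cells, then summarized as mean ± SEM) can be sketched in Python as follows. The readings, variable names, and helper functions are hypothetical and are not taken from the study's data or analysis software.

```python
from math import sqrt
from statistics import mean, stdev


def percent_viability(stained, untreated):
    """Express a treated-well reading as a percentage of its untreated well."""
    if untreated <= 0:
        raise ValueError("untreated reading must be positive")
    return 100.0 * stained / untreated


def mean_sem(values):
    """Return (mean, standard error of the mean) across independent experiments."""
    n = len(values)
    if n < 2:
        raise ValueError("SEM needs at least two replicates")
    return mean(values), stdev(values) / sqrt(n)


# Hypothetical crystal violet readings, one pair per independent experiment.
untreated_wells = [0.80, 1.00, 1.20]
treated_wells = [0.40, 0.60, 0.54]

viability = [percent_viability(t, u) for t, u in zip(treated_wells, untreated_wells)]
average, sem = mean_sem(viability)
print(f"viability: {average:.1f}% +/- {sem:.1f}% (mean +/- SEM, n={len(viability)})")
```

Normalizing each treated well to the untreated well from the same experiment, before averaging, keeps plate-to-plate differences in staining intensity out of the comparison between cell lines.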
CASP7 is cleaved non-canonically generating stable p29/p30 fragments under non-lethal stress conditions

To elucidate the molecular mechanisms by which CASP3 and CASP7 positively regulate cytoprotective autophagy, we analyzed their expression and processing pattern following amino acid starvation or PI treatments. CASP3 and CASP7 processing during apoptosis each results in large ~ p20 and small ~ p10/p12 cleaved-caspase fragments [37,59,60]. Intriguingly, our western blot analyses revealed a stable non-canonical CASP7 cleavage fragment(s) at ~ 30 kDa (p30) and/or 29 kDa (p29), and a CASP3 cleavage fragment at ~ 27 kDa (p27) in non-lethal starvation (Fig 3A and 3B) or PI conditions (Fig 3C and 3D). In fed or untreated (non-stressed) cells, these non-canonical fragments were either absent or less apparent. The CASP3 fragment at ~ 27 kDa was inconsistent and/or not readily detectable (e.g., Fig 3C) across all experiments, so we focused on the CASP7-p29/p30 fragments.

Fig 3. CASP7 is cleaved non-canonically generating stable p29/p30 fragments under non-lethal stress conditions. (A–D) Representative western blots showing changes in CASP7 and CASP3 in SKBR3 or MDA-MB-231 cells in amino acid starvation (A, B) or treated with MG132 (C, D). Non-canonical CASP7 bands (p29/p30) at 30 kDa are indicated by blue arrows. A non-canonical CASP3 band (p27) at 27 kDa is also detected in some conditions. (E) Representative western blots comparing CASP7, CASP3, and PARP1 processing patterns in SKBR3 cells following 4-h incubation in fed conditions (F), non-lethal stress (starvation) or lethal stress (apoptosis induction by staurosporine; STS, 1 or 2 µM). 
(F) Representative western blots comparing CASP7 and PARP1 processing in SKBR3 parental cells stably transfected with vector control or BCL2 expression construct, following 4-h incubation in fed conditions (F), non-lethal stress (starvation) or lethal stress (apoptosis induction by staurosporine; STS, 1 or 2 µM). (G) Representative western blots showing the effect of rapamycin on the formation of CASP7-p29/p30 bands in SKBR3 cells. Cells in fed, starved (4 h), and/or treated with 10 nM rapamycin (Rap; 24 h). Blot was immunolabeled with mTOR activity reporters p70S6K and p70S6K-PO4. (H) Representative western blot showing CASP7 immunolabeling in tissues from a female CD-1 mouse. Ponceau S labeling serves as the loading control. n = 3 mice. (I) Representative western blot showing CASP7 immunolabeling in human breast cancer PDX specimens from indicated subtypes. Starved SKBR3 serves as the positive control for p29/p30 fragments. (J) Representative western blot showing CASP7 immunolabeling in parental MDA-MB-231 cells and derivative epirubicin-resistant (R8a, R8b, R3-1, R3-2) and 5Fu-resistant (R6-1, R6-2) cells in fed conditions. In each, at least n = 2 independent experiments. See also S3 Fig. https://doi.org/10.1371/journal.pbio.3003034.g003

To confirm and further explore the observed non-canonical processing, we analyzed CASP7 processing in different contexts. We compared the CASP7 processing profiles resulting from starvation stress to apoptosis-induced profiles (staurosporine-treated cells) (Fig 3E). A marked increase in cleaved-PARP and canonical cleaved-CASP3 (p20) was present only in staurosporine-treated cells, confirming that the amino acid starvation conditions employed were non-lethal. The CASP7-p29/30 fragment(s) were produced predominantly in the non-lethal starvation conditions, whereas the canonical apoptosis-associated CASP7-p20 fragments were primarily observed in staurosporine-treated cells. 
Altogether, these results confirm that the CASP7 processing in response to non-lethal stress conditions is distinct from CASP7 processing during apoptosis. Mitochondrial outer membrane permeabilization (MOMP) activates caspases in apoptotic and non-apoptotic conditions [61,62]. BCL2 overexpression inhibits MOMP and subsequent CASP3 and CASP7 processing in apoptosis [63,64]. To investigate whether MOMP plays a role in CASP7 non-canonical processing in non-lethal stress, we overexpressed BCL2 in SKBR3 parental cells. As expected, BCL2 overexpression inhibited canonical CASP7 processing in the control staurosporine-treated cells. In contrast, BCL2 overexpression did not inhibit the formation of CASP7-p29/30 in the non-lethal starvation stress condition (Fig 3F). These results indicate that CASP7 non-canonical processing is independent of MOMP. We next asked whether other known autophagy inducers would also generate non-canonical CASP7 cleavage. Rapamycin inhibits mTOR and induces autophagy [65,66]. In SKBR3 and MDA-MB-231 cell lines (S3A and S3B Fig), autophagy was induced upon treatment with rapamycin. However, rapamycin did not result in non-canonical CASP7 processing (Figs 3G and S3C). Further, CASP3 + 7 DKO had no effect on rapamycin-induced autophagy (S3A and S3B Fig). Collectively, these observations suggest that caspases act upstream or in parallel to mTOR in autophagy regulation. Next, we investigated the prevalence and the context dependency of non-canonical CASP7 processing. Using fed (unstressed) mice, we confirmed the existence of in vivo non-canonical processing of CASP7 in several tissues, including the brain, heart, spleen, and kidney (Fig 3H). Multiple breast cancer patient-derived xenograft (PDX) specimens also clearly showed the existence of CASP7-p29/30 fragments (Fig 3I). Further, a panel of anthracycline-resistant MDA-MB-231 cell lines was analyzed, with the majority showing a clear accumulation of CASP7-p29/30 fragments (Fig 3J). 
Altogether, these data confirm the existence of stable non-canonical CASP7 fragments in in vitro and in vivo models and in both normal and cancer tissues.

CASP7-p29 and p30 are generated via processing at calpain cleavage sites

Next, we investigated the identity of the CASP7-p29/30 fragments observed in non-lethal stress conditions. In apoptosis, CASP7 activation is initiated by removing the pro-domain [37], and no stable accumulation of CASP7 lacking the pro-domain has been reported. To determine whether the non-canonical CASP7-p29/30 fragment(s) could be the result of pro-domain removal, we stably transfected CASP3 + 7 DKO cells with a full length CASP7 wild-type construct (CASP7-WT), a pro-domain deletion construct (CASP7-ΔPro) (Fig 4A and 4B), or a pro-domain cleavage site-mutant construct (S4A Fig). In western blot analyses, the CASP7-p29/30 fragments formed in starvation did not align with the CASP7-ΔPro construct, which appeared at ~ 33.5 kDa (Figs 4A and S4A). Additionally, when starved, the CASP7-ΔPro-expressing cells also produced CASP7-p29/30 fragment(s) (Figs 4A and S4A). Collectively, these data show that the non-lethal cellular stress-associated CASP7 cleavage occurs at a non-canonical site(s) downstream of the pro-domain cleavage site.

Fig 4. CASP7-p29 and p30 are generated via processing at calpain cleavage sites. (A) Representative western blot showing CASP7 immunolabeling in CASP3 + 7 DKO SKBR3 cells stably transfected with CASP7-WT or CASP7-delta-Pro (CASP7-ΔPro) constructs in fed or starvation conditions. Arrowhead indicates CASP7-ΔPro band and arrow indicates CASP7-p29/p30 bands. n = 2 independent experiments. (B) Schematic of subunits, canonical cleavage sites and non-canonical cleavage sites in CASP7 wild-type or in various CASP7 fragments used or observed in this study. Black scissors denote the canonical cleavage sites and the order of processing in apoptosis. 
Purple scissors, bars, and black arrows denote the putative calpain cleavage sites (non-canonical). (C) The sequences on the left show the putative calpain cleavage sites in CASP7-WT and the amino acid replacements made in the CASP7 construct with putative calpain-cleavage sites mutated (CASP7-CCM). Representative western blot of CASP7 from CASP3 + 7 DKO SKBR3 cells transfected with CASP7-WT or CASP7-CCM constructs grown in fed (F) or starved (S) conditions. n = 2 independent experiments. (D, E) Representative western blots of indicated proteins from SKBR3 cells transfected with scramble, calpain 1 or calpain 2 siRNAs (48 h) and then continued to be cultured in fed (F) conditions or starved (S) for 8 h (D) or treated with MG132 for 24 h (E), with BafA1 (50 nM) in the final 2 h. (F–I) Quantification of LC3B-based autophagy flux and CASP7 bands shown in (D) and (E) relative to scramble-siRNA control. In graphs, data are shown as mean ± SEM. n = 3 independent experiments. *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001, NS, not significant. See also S4 Fig. In F–I, one-way ANOVA with Dunnett’s post-test. The numerical data presented in this figure can be found in S1 Data. https://doi.org/10.1371/journal.pbio.3003034.g004

To elucidate the precise identity of the non-canonical cleavage sites and fragments, we enriched for CASP7-p29/30 fragments by performing CASP7-immunoprecipitation (IP) under starvation conditions (S4B Fig). Protein in the 29/30 kDa region was extracted and subjected to Edman sequencing, which revealed the location of a non-lethal stress-associated cleavage site 10 amino acids downstream of the CASP7 pro-domain cleavage site (Fig 4B and 4C). This non-canonical cleavage site and a second site nearby (Fig 4B and 4C), corresponding to p30 and p29, respectively, were shown previously to be cleaved in vitro by calpains, but were associated with further processing of CASP7 to yield p17 and p18 [39,40]. 
Using the DeepCalpain algorithm [67], we identified predicted calpain cleavage sites conserved in murine CASP7 (S4C Fig) and confirmed that they corresponded to the size of the CASP7-p30/p29 bands detected in murine samples (Fig 3H). To verify the cleavage sites, we created a CASP7-calpain cleavage mutant (CASP7-CCM) construct by substituting both calpain cleavage sites with alanine residues (see Materials and methods; Fig 4C). Upon starvation, CASP3 + 7 DKO SKBR3 cells stably transfected with CASP7-CCM failed to form abundant CASP7-p29/p30 compared to CASP7-WT cells (Fig 4C). Together, these results validate the non-canonical cleavage of CASP7 and suggest a potential involvement of calpains in regulating CASP7 to promote starvation or PI-induced autophagy and/or cell survival in non-lethal stress conditions. To determine whether calpains have any direct involvement in CASP7 non-canonical processing, we performed single KD of the most abundant calpain family members, calpain 1 and calpain 2, in SKBR3 cell lines and subjected these cells to non-lethal amino acid starvation or PI stress conditions (Fig 4D–4I). Single KD of calpain 1 or calpain 2 led to a significant increase in overall CASP7 levels compared to control (Fig 4D–4I), suggesting a role in CASP7 turnover. Next, the ratio of CASP7-p29/30 fragment(s) relative to full-length CASP7 was quantified (Fig 4G and 4I). In both stress conditions, calpain 1 KD cells showed the highest level of autophagy induction (Fig 4F and 4H) and the highest CASP7-p29/30 fragment ratios (Fig 4G and 4I). In contrast, in both stress conditions, calpain 2 KD showed a reduction of stress-induced autophagy (p < 0.05 for PI-induced stress) and the lowest CASP7-p29/30 ratios. Collectively, these data suggest that calpain 2, but not calpain 1, may mediate the CASP7 non-canonical cleavage at the two calpain cleavage sites to positively regulate autophagy. 
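The fragment-to-full-length quantification described above can be illustrated with a short Python sketch. It assumes that each densitometry band has already been reduced to a single background-corrected intensity; the condition names, the intensities, and both helper functions are hypothetical and do not reproduce the study's measurements.

```python
def fragment_ratio(p29_p30, full_length):
    """Ratio of the p29/p30 fragment signal to the full-length CASP7 signal."""
    if full_length <= 0:
        raise ValueError("full-length intensity must be positive")
    return p29_p30 / full_length


def relative_to_control(ratios, control="scramble"):
    """Rescale every ratio so that the control condition equals 1.0."""
    reference = ratios[control]
    return {name: value / reference for name, value in ratios.items()}


# Hypothetical band intensities (arbitrary units) from one blot.
bands = {
    "scramble": {"p29_p30": 200.0, "full_length": 1000.0},
    "calpain1_kd": {"p29_p30": 450.0, "full_length": 1500.0},
    "calpain2_kd": {"p29_p30": 150.0, "full_length": 1500.0},
}

ratios = {name: fragment_ratio(**signal) for name, signal in bands.items()}
relative = relative_to_control(ratios)
for name, value in relative.items():
    print(f"{name}: {value:.2f}")
```

Dividing by the full-length band matters here because both knockdowns raise total CASP7 levels: in this made-up example the two knockdowns have the same full-length signal, yet their fragment ratios move in opposite directions relative to the scramble control.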
If CASP3 is processed by a mechanism other than calpain 2, then CASP3 KO in combination with calpain 2 KD would be expected to result in an enhanced suppression of autophagy. However, we found that knockdown of calpain 2 in the CASP3 KO background did not further enhance autophagy inhibition relative to the scramble-siRNA control (S4D and S4E Fig), consistent with the possibility that CASP3 could also be processed by calpain 2 in this context. CASP3 was shown previously to be processed by calpains [68,69], further supporting the investigation of CASP3 processing in non-lethal stress conditions as an interesting avenue for future studies.

PARylation and ATG gene expression are reduced in CASP3 and CASP7 DKO cells

The well-known PARP1 binding site [43] in CASP7 is located between the two identified calpain cleavage sites (S4F Fig). PARP1 and associated PARylation activities were shown to play key roles in stress sensing, stress adaptation, and autophagy induction in multiple cell lines and in vivo mouse models [70–75]. Further, PARP1 is a CASP3 and CASP7 substrate, with the latter showing a higher affinity for PARP1 binding [76]. These factors prompted us to investigate PARP1 levels and PARP1 activity in the CASP3 + 7 DKO background. We predicted that cleaved-PARP1 levels would be reduced, and thus PARP1 activity would be increased in the CASP3 + 7 DKO cells. Unexpectedly, we detected a higher level of cleaved-PARP1 in CASP3 + 7 DKO cells compared to the SKBR3 parental cells (Fig 5A–5D). There was no evidence of cell death in the DKO cells, consistent with previous observations that cleaved-PARP1 (89 kDa) is not always associated with cell death [77,78]. Cleaved-PARP1 levels were further increased when cells were stressed with a non-lethal dosage of MG132 (Fig 5B and 5D). PARylation is a PARP catalytic activity-dependent process [79]. Next, PAR immunolabeling was performed as a readout for PARylation activity of PARP1 [80]. 
We observed a significant reduction in PAR labeling in CASP3 + 7 DKO cells compared to parental cells in both SKBR3 and MDA-MB-231 cell lines (Fig 5E–5H). These results indicate that in the absence of CASP3 and CASP7, PARP1 cleavage is enhanced and PARP1 activity (as measured by PAR level) is impaired.

Fig 5. PARylation and ATG gene expression are reduced in CASP3 + 7 DKO cells. (A, B) Representative western blots of indicated proteins from parental and CASP3 + 7 DKO SKBR3 cells cultured in fed conditions (F) or starved (S) for 8 h (A) or treated with MG132 (0.5 μM) for 24 h (B). (C) Quantification of cleaved-PARP1 shown in (A). Values were normalized to parental fed conditions. (D) Quantification of cleaved-PARP1 shown in (B). Values were normalized to parental NT. NT = No Treatment (vehicle DMSO). (E, F) Representative western blots of PAR immunolabeling from parental and CASP3 + 7 DKO SKBR3 (E) or MDA-MB-231 (F) cells treated with vehicle (DMSO) or MG132 (0.5 μM) for 24 h. (G, H) Quantification of PAR levels shown in (E) and (F). Values were normalized to the parental NT condition. (I) RT-qPCR analyses of LC3B in SKBR3 parental, CASP3 + 7 DKO or CASP3 + 7-WT re-expression in CASP3 + 7 DKO SKBR3 cells. Cells were treated with vehicle (DMSO) or 0.5 μM of MG132 for 24 h. (J) Representative western blots of indicated proteins from SKBR3 parental and CASP3 + 7 DKO cells transfected with scramble or CASP2 siRNAs (72 h). (K) Quantification of cleaved-PARP1 levels (shown in J) relative to scramble-siRNA control. (L) Quantification of CASP2 levels relative to scramble-siRNA-treated parental control. (M) Quantification of PAR levels shown in (J) relative to scramble-siRNA-treated parental control. 
(N) Representative western blots of indicated proteins from SKBR3 parental and CASP3 + 7 DKO cells transfected with scramble, CASP2 or CASP8 siRNAs (42 h) and treated with vehicle DMSO or MG132 (0.5 μM) for 24 h, with BafA1 (50 nM) in the final 2 h. (O) Quantification of LC3B-based autophagy flux in MG132-treated cells, shown in (N). In graphs, data are shown as mean ± SEM. n = 3–6 independent experiments. *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001, NS, not significant. In C, D, G, and H, one-way ANOVA with Tukey’s post-test. In I, K–M, and O, one-way ANOVA with Dunnett’s post-test. See also S5 Fig. The numerical data presented in this figure can be found in S1 Data. https://doi.org/10.1371/journal.pbio.3003034.g005

In cellular stress conditions, PARP1 and PARylation were shown previously to positively regulate cytoprotective autophagy through multiple mechanisms [73–75,81], including the transcriptional upregulation of autophagy pathway components [82]. Since CASP3 + 7 DKO led to reduced PARylation, we predicted that CASP3 + 7 DKO would result in transcriptional downregulation of autophagy pathway components. Among the components evaluated by quantitative reverse transcription polymerase chain reaction (RT-qPCR), we found that LC3B and ATG7 transcript levels were reduced in CASP3 + 7 DKO SKBR3 cells (S5A–S5E Fig). There was an increase in both transcripts in response to MG132-induced stress, with DKO cells showing weaker upregulation compared to parental cells (Figs 5I and S5F). Dual re-expression of CASP3 + 7-WT constructs partially rescued transcript levels of these genes in DKO cells (Figs 5I and S5F). These results are consistent with the possibility that CASP3 and CASP7 regulate transcription of key autophagy genes, potentially through changes to PARP1 and PARylation. 
To test whether CASP3 and CASP7 regulate autophagy through PARP1, we investigated whether overexpression of PARP1 could rescue autophagy in the CASP3 + 7 DKO background. Moreover, we compared rescue ability of wild-type PARP1 (PARP1-WT), catalytically inactive PARP1 (PARP1-CI), and PARP1 with the DEVD cleavage site mutated (PARP1-DEVA, PARP1-AEVA). Consistent with endogenous PARP1, the PARP1-WT transfected CASP3 + 7 DKO cells resulted in a significantly increased accumulation of cleaved-PARP1 compared to the PARP1-WT transfected parental cells (S5G and S5H Fig). As expected, levels of cleaved PARP1 were significantly reduced in the PARP1-DEVA and PARP1-AEVA expressing cells (S5G and S5H Fig). While the cleavage-resistant PARP1-DEVA and PARP1-AEVA cells also showed a trend of increased LC3B transcription relative to PARP1-WT (S5J Fig), the PARylation and autophagy flux levels showed no differences (S5G, S5I, S5K, and S5L Fig). The PARP1-CI transfected cells did show significantly reduced PARylation compared to the vector control, consistent with the reported dominant negative effect of catalytically inactive PARP1 (S5G and S5I Fig) [83], but induction of cell death and very low expression levels confounded interpretation of results. Overall, while the association between PARP1 cleavage and LC3B transcription remains consistent, it is not clear whether the reduced autophagy flux in the CASP3 + 7 DKO background is due to the enhanced PARP1 cleavage. Next, to identify the protease(s) responsible for PARP1 cleavage in the absence of CASP3 and 7, we evaluated multiple candidate proteases using a siRNA approach. The knockdown of CASP6, calpain 1, calpain 2, cathepsin B, and cathepsin D resulted in a further increase in PARP1 cleavage (S5M–S5O Fig). We next investigated caspase 2 (CASP2) and caspase 8 (CASP8), both shown to localize to the nucleus in some contexts [84,85]. 
We discovered that knockdown of CASP2, but not CASP8, consistently reduced PARP1 cleavage in CASP3 + 7 DKO cells (Figs 5J, 5K, and S5P). Interestingly, the relative expression level of CASP2 was significantly higher in CASP3 + 7 DKO cells compared to the parental cells (Fig 5J and 5L), further supporting a role for CASP2 in this context. Consistent with the lack of rescue by the DEVD cleavage site mutants, PARylation and autophagic flux were not rescued in the CASP2 knockdown CASP3 + 7 DKO cells (Fig 5M–5O). Collectively, our data suggest that CASP2 is at least partially responsible for the increased PARP1 cleavage in the CASP3 + 7 DKO cells. Further investigation of the contexts and role of CASP2 in PARP1 regulation is warranted.

CASP7-p29/30 promote phosphorylation of DDR pathway protein H2AX

In addition to transcriptional remodeling, PARP1 plays a central role in DNA repair and DNA damage response (DDR) pathways in response to cell stress [71,86]. Both starvation and proteasome inhibition were reported to induce DNA damage and activate autophagy as a part of stress-induced DDR [87,88]. Western blot analyses revealed that MG132 stress-induced phospho-H2AX (γ-H2AX) accumulation was significantly reduced in CASP3 + 7 DKO cells compared to parental cells without reduction in total H2AX levels (Fig 6A and 6B).

Fig 6. CASP7-p29/30 promote phosphorylation of DDR pathway protein H2AX. (A) Representative western blots of indicated proteins from SKBR3 parental and CASP3 + 7 DKO cells treated with vehicle DMSO or MG132 (0.5 µM) for 24 h. Ponceau S staining was used as the loading control. (B) Quantification of γ-H2AX/total H2AX shown in (A). (C) Representative images from alkaline comet assay from SKBR3 parental and CASP3 + 7 DKO cells treated with vehicle DMSO or MG132 (0.5 µM) for 24 h. Cells treated with H2O2 (100 μM for 4 h) serve as the positive control. 
(D) Quantification of tail moments of comets shown in (C). Tukey’s box-and-whisker plots are based on tail moments determined by CometScore from 200 cells in each of two independent assays. (E) Representative western blots of indicated proteins from SKBR3 parental, CASP3 + 7 DKO or single KO cells treated with vehicle DMSO or MG132 (0.5 µM) for 24 h. Ponceau S staining was used as the loading control. (F) Quantification of γ-H2AX/total H2AX shown in (E). (G) Representative western blots of indicated proteins from CASP3 + 7 DKO SKBR3 cells stably transfected with indicated CASP7 constructs treated with vehicle DMSO or MG132 (0.5 µM) for 24 h. Ponceau S staining was used as the loading control. (H) Quantification of γ-H2AX/total H2AX shown in (G). (I) Representative images from alkaline comet assay from CASP3 + 7 DKO SKBR3 cells stably transfected with indicated CASP7 constructs and treated with vehicle DMSO or MG132 (0.5 µM) for 24 h. Cells treated with H2O2 (100 μM for 4 h) serve as the positive control. (J) Quantification of tail moments of comets shown in (I). Tukey’s box-and-whisker plots are based on tail moments obtained from 200 cells in each of two independent assays. (K) Representative images of crystal violet assay plate. CASP7-CCM, CASP7-p30, and CASP7-p29 SKBR3 cells treated with indicated concentrations of proteasome inhibitor MG132 for 24 h and continued to grow in drug-free media for another 3 days. (L) Quantification of cell viability data presented in (K). The percentage of stained (viable) cells at each concentration was normalized to respective untreated cells. (M) Representative western blots of indicated proteins from CASP3 + 7 DKO SKBR3 cells re-expressing vector, CASP7-WT (catalytically active) or CASP7-inactive (catalytically inactive) constructs, treated with vehicle DMSO or MG132 (0.5 µM) for 24 h. H3 serves as the loading control. (N) Quantification of γ-H2AX/total H2AX shown in (M). 
(O) Representative western blots of LC3B-based autophagy flux in CASP3 + 7 DKO SKBR3 cells re-expressing the indicated CASP7 constructs, treated with vehicle DMSO or MG132 (0.5 µM) for 24 h, with BafA1 (50 nM) in the final 2 h. (P) Quantification of autophagy flux in MG132-treated cells relative to MG132-treated CCM-expressing cells, shown in (O). In graphs, data are shown as mean ± SEM. n = 3 independent experiments except for comet assays n = 2. *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001, NS, not significant. In B, F, L, N, and P, one-way ANOVA with Dunnett’s post-test. In D and H, one-way ANOVA with Tukey’s post-test. In J, two-way ANOVA with Sidak’s post-test. See also S6 Fig. The numerical data presented in this figure can be found in S1 Data. https://doi.org/10.1371/journal.pbio.3003034.g006

Traditionally, the level of γ-H2AX was measured to determine the extent of DNA damage [89]; however, γ-H2AX levels were also shown to represent the strength of DDR signaling activity [12,90,91]. To distinguish between these two possibilities, we used comet assays to directly measure DNA damage [92]. The level of DNA damage/fragmentation in parental versus CASP3 + 7 DKO SKBR3 cells was not significantly different when treated with the vehicle (DMSO) control (Fig 6C and 6D). In response to non-lethal MG132 stress conditions, there was an increase in DNA damage, but it was comparable in both parental and CASP3 + 7 DKO cells. In contrast, the positive control (lethal H2O2 stress conditions) induced an increase in DNA damage only in the parental cell line (Fig 6C and 6D). These data indicate that the γ-H2AX reduction observed in CASP3 + 7 DKO cells in the MG132 non-lethal stress conditions is not due to a reduction in DNA damage, but possibly instead due to impaired DDR signaling. Next, we investigated the γ-H2AX levels in CASP3 + 7 double and single KO cells (Fig 6E and 6F). 
The γ-H2AX levels were reduced in CASP3 + 7 DKO cells, but relatively increased in both CASP3 single KO and CASP7 single KO cells. Notably, the γ-H2AX level was highest in CASP3 KO cells, suggesting that CASP7 plays a prominent role, relative to CASP3, in the H2AX phosphorylation. To further investigate this possibility and given the proximity of the PARP1 binding site to the CASP7 calpain cleavage sites, we generated CASP3 + 7 DKO cell lines stably expressing CASP7-CCM, CASP7-p30, or CASP7-p29 (S6A Fig). We used CASP7-CCM as the full-length control since it does not undergo non-canonical cleavage. We compared γ-H2AX levels in the absence and presence of non-lethal MG132, and observed that the stress-induced γ-H2AX accumulation was increased in CASP7-p30 and CASP7-p29 expressing cells compared to CASP7-CCM expressing cells (Fig 6G and 6H). Among the three constructs, the greatest level of γ-H2AX was observed with CASP7-p29. Furthermore, comet assays revealed that the degree of DNA damage was not significantly different among the three cell lines in either vehicle (DMSO) control, non-lethal MG132-treated or lethal H2O2-treated conditions (Fig 6I and 6J). In fact, the degree of DNA damage appeared slightly reduced in CASP7-p29 expressing cells, where we detected the greatest level of γ-H2AX (Fig 6H and 6J). Crystal violet cell viability assays, following treatment with increasing concentrations of MG132 or Bortezomib (Figs 6K, 6L, S6B and S6C), showed no significant differences in cell viability despite the different γ-H2AX levels observed in the three cell lines. Overall, these results support that CASP7 non-canonical processing does not affect the degree of DNA damage, but rather plays a role in inducing different levels of γ-H2AX signaling (S6D Fig). Consistent with several previous studies showing that γ-H2AX is not an unambiguous marker for DNA double-strand breaks [12,90,91], our findings emphasize the need for careful interpretation of γ-H2AX levels. 
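The comet-assay panels in Fig 6 summarize tail moments with Tukey’s box-and-whisker plots. As a rough illustration of what such a summary computes, the sketch below derives quartiles, 1.5 × IQR fences, whiskers, and outliers from a list of tail moments. The function name `tukey_box`, the sample values, and the quartile interpolation method are assumptions made for illustration; they are not taken from CometScore or from the authors' analysis.

```python
from statistics import quantiles


def tukey_box(values):
    """Summarize values the way a Tukey box-and-whisker plot does.

    Quartiles use the 'inclusive' interpolation of statistics.quantiles;
    other software may use a slightly different quartile convention.
    """
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers reach the most extreme data points that lie inside the fences.
    inside = [v for v in values if low_fence <= v <= high_fence]
    outliers = sorted(v for v in values if v < low_fence or v > high_fence)
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }


# Hypothetical tail moments (arbitrary units) for a single condition.
tail_moments = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]
summary = tukey_box(tail_moments)
print(summary)
```

With these made-up values, the single large tail moment falls outside the upper fence and is reported as an outlier instead of stretching the whisker, which is the property that makes this plot type suitable for skewed per-cell measurements.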
To further explore CASP7 in this role, we employed catalytically active wild-type (CASP7-WT) and catalytically inactive (CASP7-inactive, C > A) constructs stably expressed in CASP3 + 7 DKO cells. Expression of the CASP7-WT construct, but not the catalytically inactive construct, led to a significant increase in γ-H2AX signaling in non-lethal MG132-treated cells (Fig 6M and 6N), revealing that CASP7 and its catalytic activity are required for the phosphorylation of H2AX.
Dual inhibition of CASP3 and CASP7 mimics effects of PARP1 inhibitors
Our data show that inhibition of CASP3 and CASP7 leads to increased PARP1 cleavage and reduced PARP activity (PARylation), DDR signaling, and autophagy response. Since these effects phenocopy the genetic or pharmacological inactivation of PARP1 [70,71], we hypothesized that dual CASP3 + 7 inhibition would also phenocopy the synthetic lethal effects of PARP1 inhibition in cells that are defective in homologous recombination (HR) [93,94]. To test this, we generated CASP3 + 7 DKD in the BRCA1-deficient SUM149PT breast cancer cells and assessed their viability by the crystal violet assay. In striking contrast to the BRCA1/2-proficient SKBR3, MDA-MB-231, and JIMT-1 breast cancer cells, the dual inhibition of CASP3 + 7 was lethal in SUM149PT cells (Fig 7A–7C). In the western blot analyses, a distinct band of cleaved-PARP1 was visible in CASP3 + 7 DKD cells compared to the scramble-siRNA control (Fig 7C). Western blot analyses also confirmed that autophagic flux was reduced in the CASP3 + 7 DKD SUM149PT cells compared to the scramble-siRNA control or compared to the single KD of CASP3 or CASP7 (Fig 7D and 7E). To determine whether the lethal effects of CASP3 + 7 DKD were redundant or potentially additive with PARP1 inhibition, we treated the SUM149PT cells with the PARP1 inhibitor olaparib. The combination of CASP3 + 7 DKD and olaparib treatment resulted in a further reduction of SUM149PT cell viability compared to olaparib treatment alone (Fig 7F and 7G). 
These results suggest that combined inhibition of CASP3 and CASP7 induces synthetic lethality with loss of BRCA1.
Fig 7. Dual inhibition of CASP3 and 7 mimics effects of PARP1 inhibitors. (A) Representative images of crystal violet assays. SUM149PT, MDA-MB-231, SKBR3, or JIMT-1 cells transfected with scramble-siRNA or CASP3 and CASP7 siRNAs (10 nM) for 2 days and then cultured in siRNA-free media for another 3 days. (B) Quantification of cell viability from data represented in (A). Percentage of stained (viable) cells at each concentration was normalized to respective untreated cells. (C) Representative western blots showing CASP3 and CASP7 levels following siRNA treatment conditions used in crystal violet assays shown in (A). Immunolabelling of cleaved-PARP1 is also shown. n = 2 independent experiments. (D) Representative western blots of indicated proteins from SUM149PT cells transfected with scramble, CASP3 and/or CASP7 siRNAs (48 h) and treated with vehicle DMSO or MG132 (0.5 μM) for 24 h, with BafA1 (50 nM) in the final 2 h. (E) Quantification of LC3B-based autophagy flux in proteasome inhibitor (MG132) treated cells shown in (D). The levels of LC3BII in MG132-treated cells were normalized to loading control and shown relative to the MG132-treated scramble-siRNA control. (F, G) Representative images of crystal violet assay plates (F) and quantification of percentage of cell viability (G). Indicated siRNA-transfected cells were treated with indicated concentrations of Olaparib for 24 h and then cultured in drug-free media for another 3 days. Graph (G) shows the percentage of stained (viable) cells at each concentration normalized to scramble-siRNA-treated and Olaparib-untreated cells. (H) Dot plot showing DepMap cancer cell lines with bottom 10% of CASP3 + 7 protein expression (n = 16) and top 10% CASP3 + 7 protein expression (n = 10). 
Normalized CASP3/7 protein expression (scaled and mean centered) was extracted from Nusinow and colleagues using GRETTA. (I) Ranked CASP3 + 7 genetic interaction scores generated using GRETTA. The cell lines at the bottom 10% of CASP3 and CASP7 protein expression are compared against cell lines at the top 10%. Rank is based on lethal GI scores. The red points indicate true candidate CASP3 + 7 lethal genetic interactions (GIs). The top most lethal genetic interactors are labeled. In graphs, unless otherwise noted, data are shown as mean ± SEM. n = 3 independent experiments. *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001, NS, not significant. In B and E, one-way ANOVA with Sidak’s and Dunnett’s post-test, respectively. In G, two-way ANOVA with Sidak’s post-test. In I, Mann–Whitney U-test p-value < 0.05. See also S7 Fig. The numerical data presented in this figure can be found in S1 Data. The code related to Fig 7H and 7I is publicly available in a GitHub repository (https://github.com/MarraLab/Caspase_GRETTA_analysis) and archived on Zenodo (https://doi.org/10.5281/zenodo.14722298). https://doi.org/10.1371/journal.pbio.3003034.g007
To further assess genetic interactions between BRCA1 and CASP3 and CASP7, we utilized Genetic inteRaction and EssenTiality neTwork mApper (GRETTA) [95], a tool that leverages the publicly available data from the Cancer Dependency Map (DepMap) [96–98], to perform a pan-cancer in silico genetic interaction screen. Using GRETTA, we identified 16 DepMap cancer cell lines harboring low protein expression of both CASP3 and CASP7 and 10 cancer cell lines with high expression of both CASP3 and CASP7 across 13 cancer types (Figs 7H and S7A). 
Comparisons between these two groups resulted in the identification of 142 candidate CASP3 + 7 synthetic lethal interactors, where perturbation of these genes resulted in significantly higher lethal probabilities in the CASP3 + 7 low expressor lines compared to the CASP3 + 7 high expressor lines (S7B Fig). The candidate synthetic lethal interactors included HR repair pathway genes (S7B Fig), with BRCA1 as the top synthetic lethal hit (Fig 7I). Collectively, our results show that the dual inhibition of CASP3 and CASP7 mimics the cellular effects of PARP1 inhibitors, revealing a surprising new avenue for potential therapeutic exploitation. Counter-intuitive to traditional caspase roles, our findings demonstrate that combined inhibition of CASP3 and CASP7 induces synthetic lethality in BRCA1-defective cells. Dual loss of CASP3 and CASP7 suppresses starvation-induced autophagy We used the autophagy-activating role in response to nutrient deprivation of the Drosophila effector caspase Dcp-1 [34–36] as the basis for investigating the potential functional conservation between two closely related human caspases, effector CASP3 and CASP7. First, we employed standard MAP1LC3B/LC3B-based autophagy assays as these have been used successfully to monitor autophagy in breast cancer cell lines, including in SKBR3 lines, and under similar stress conditions we employed [44–47]. To determine the optimum time point for detecting a consistent upregulation of autophagy without signs of cell death, SKBR3 cells were subjected to amino acid starvation (Fig 1A and 1B). We observed a significant increase in autophagic flux at 8- and 24-h starvation, as measured by the increased LC3BII accumulation in the presence of BafA1 (Fig 1B). Across all time points, low levels of cleaved-PARP (at ~ 89 kDa) were observed in both fed (non-stressed) control cells and starved (stressed) cells as reported previously as being normal even in non-apoptotic viable cells [7,48]. 
Furthermore, there were no signs of cell death, as indicated by the absence of a marked increase in cleaved-PARP1 in the starvation conditions (Fig 1A), and thus we selected 8-h starvation as the optimal time point for detecting non-lethal stress-related starvation-induced autophagic flux in SKBR3 cells in this study. Next, to determine whether amino acid starvation-induced autophagy was CASP3- and/or CASP7-dependent, single knockdown (KD) and double knockdown (DKD) experiments were performed with siRNAs. The control scramble-siRNA-treated cells showed an increase in LC3BII levels in the presence of BafA1, indicating an increase in autophagic flux when cells were starved (Fig 1C). In single CASP3 or CASP7 siRNA-treated cells, the starvation-induced autophagic flux was comparable to that of the control. However, a significant suppression of autophagic flux was observed in CASP3 and CASP7 (CASP3 + 7) DKD cells (Fig 1C and 1D).
Fig 1. Dual loss of CASP3 and CASP7 suppresses starvation-induced autophagy. (A) Representative western blots of indicated proteins from SKBR3 cells in fed (F) or in amino acid starvation (S) for various time periods, in the absence or presence of 50 nM Bafilomycin A1 (BafA1) in the final 2 h. (B) Quantification of LC3B-based autophagy flux in starved cells relative to the fed control, shown in (A). All with BafA1. (C) Representative western blots of indicated proteins from SKBR3 cells transfected with scramble, CASP3 and/or CASP7 siRNAs (48 h) and incubated in fed conditions or starved for 8 h with BafA1 (50 nM) for the final 2 h. (D) Quantification of LC3BII-based autophagy flux in starved cells relative to the starved scramble-siRNA control, shown in (C). (E) Representative western blots of indicated proteins from CASP3, CASP7 single (KO), or double knockout (DKO) SKBR3 cells in fed or starved (8 h) conditions, with BafA1 (50 nM) in the final 2 h. 
(F) Quantification of LC3BII-based autophagy flux in starved cells relative to the parental control, shown in (E). (G) Representative western blots of indicated proteins from SKBR3 parental, DKO or CASP3 + 7-WT re-expression in DKO cells in fed or starved (8 h) conditions, with BafA1 (50 nM) in the final 2 h. (H) Quantification of LC3BII-based autophagy flux in starved cells relative to starved parental cells, shown in (G). (I) Representative immunofluorescence images of SKBR3 parental, DKO or CASP3 + 7-WT re-expression in DKO cells treated with 0.25 µM DALGreen in fed or starved (8 h) conditions. BafA1 (50 nM for 8 h) in both fed and starved conditions serve as controls. Scale bars, 20 µm. (J) Quantification of DALGreen immunofluorescence in starved cells shown in (I). Graph shows number of punctae per cell. n = 2, each with 10 random images covering total of 500–700 cells. In graphs, data are shown as mean ± SEM. n = at least 3 independent experiments except in (J). *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001, NS, not significant. In B, D, F, and H, one-way ANOVA with Dunnett’s post-test. In J, one-way ANOVA with Tukey’s post-test. See also S1 Fig. The numerical data presented in this figure can be found in S1 Data. https://doi.org/10.1371/journal.pbio.3003034.g001 We hypothesized that the lack of a significant effect on starvation-induced autophagy upregulation in single CASP3 or CASP7 KD experiments might be due to the presence of residual caspase levels (S1A and S1B Fig). To address this, we generated CASP3 and CASP7 single knockout (KO) and double knockout (DKO) SKBR3 cell lines (S1C Fig). In single caspase KO cells, we found negligible or undetectable effects on starvation-induced autophagy compared to the control parental cell lines (Fig 1E and 1F). This suggests that CASP3 and CASP7 have partially overlapping functions and/or single KO cell lines may have activated compensatory mechanisms to overcome the other caspase’s loss. 
Consistent with this notion and with what we observed in DKD cells, there was a significant impairment of autophagic flux in CASP3 + 7 DKO cell lines (Fig 1E and 1F). Similarly, following 24 hours starvation, there was no induction of autophagic flux in CASP3 + 7 DKO cells (S1D and S1E Fig). To confirm the specificity of our findings, we generated multiple isogenic caspase DKO SKBR3 cell lines. In all five CASP3 + 7 DKO lines we tested, we observed a reduction in LC3B-II levels (in the presence of BafA1) indicating that amino acid starvation-induced autophagy was compromised (S1F Fig). To exclude the possibility of cell line-specific effects, we repeated our experiments using another breast cancer cell line, MDA-MB-231, and similarly found that autophagic flux induction was significantly compromised only upon CASP3 + 7 DKD and DKO (S1G–S1L Fig). Lastly, when wild-type constructs of both CASP3 and CASP7 were stably reintroduced into DKO cells, the starvation-induced upregulation of autophagy was partially rescued (Fig 1G and 1H). The processing of LC3B to form LC3BII is a hallmark of autophagy [49]. However, since LC3B is also involved in autophagy-independent processes [50], we orthogonally measured autophagy by employing the DALGreen autolysosome fluorescent marker [51]. In accordance with the LC3B-based autophagy assay results, the levels of DALGreen positive puncta indicate that amino-acid deprivation-induced autophagy is significantly compromised in CASP3 + 7 DKO cells, and the dual re-expression of CASP3 and CASP7 fully rescued autophagy (Fig 1I and 1J). The reduction in number of autolysosomes in CASP3 + 7 DKO cells was not due to a difference in cell size (S1M and S1N Fig). These observations together indicate that CASP3 and CASP7 play a partially redundant positive regulatory role in non-lethal amino acid deprivation-induced autophagy. 
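The LC3B-based flux values reported in this section are described in the figure legends as LC3BII levels (in the presence of BafA1) normalized to a loading control and expressed relative to a reference condition, then summarized as mean ± SEM across independent experiments. A minimal sketch of that arithmetic follows. The function names, the condition labels, and all densitometry numbers are hypothetical and are not the authors' data or code.

```python
from math import sqrt
from statistics import mean, stdev


def relative_flux(lc3b2, loading, control):
    """Return LC3B-II / loading-control per condition, scaled so control == 1."""
    normalized = {cond: lc3b2[cond] / loading[cond] for cond in lc3b2}
    return {cond: value / normalized[control] for cond, value in normalized.items()}


def mean_sem(values):
    """Mean and standard error of the mean across independent experiments."""
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical band intensities (arbitrary units) from three experiments.
# Each tuple is (LC3B-II intensities, loading-control intensities),
# all measured in starved cells treated with BafA1.
experiments = [
    ({"parental": 200.0, "DKO": 90.0}, {"parental": 100.0, "DKO": 100.0}),
    ({"parental": 180.0, "DKO": 99.0}, {"parental": 90.0, "DKO": 110.0}),
    ({"parental": 220.0, "DKO": 110.0}, {"parental": 110.0, "DKO": 100.0}),
]

dko_flux = [relative_flux(lc3, load, "parental")["DKO"] for lc3, load in experiments]
dko_mean, dko_sem = mean_sem(dko_flux)
print(f"DKO flux relative to parental: {dko_mean:.2f} ± {dko_sem:.2f} (mean ± SEM)")
```

Normalizing within each experiment before averaging keeps blot-to-blot differences in exposure from inflating the spread, which is why the reference condition is fixed at 1 in every replicate.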
Dual loss of CASP3 and CASP7 suppresses proteasome inhibition-induced compensatory autophagy and sensitizes cells to proteasome inhibitors
It is well-documented that autophagy is upregulated as a compensatory mechanism to maintain protein homeostasis when proteasome function is compromised [52,53]. Next, we investigated whether this compensatory autophagy was also dependent on CASP3 and CASP7. The proteasome inhibitor (PI) MG132, at 0.5 μM, was chosen as the non-lethal stress condition for these experiments (Figs 2A–2C and S2A–S2C). The reduction of proteasome activity was confirmed directly by proteasome activity assays (Figs 2C and S2C) and indirectly through the accumulation of ubiquitinated proteins (Figs 2A and S2A). In line with what we observed in amino acid starvation conditions, the PI-induced autophagic flux was significantly inhibited in CASP3 + 7 DKD or DKO cells (Fig 2D–2G). Although the effect was modest, single CASP7 KO also resulted in a significant reduction in PI-induced autophagy (Fig 2F and 2G). Similar results were observed in the MDA-MB-231 cell line (S2D and S2E Fig), and with the clinically approved proteasome inhibitor Bortezomib [54,55] (S2F and S2G Fig). PI-induced LC3BII upregulation was partially rescued by dual re-expression of CASP3 + 7, supporting the requirement for these effector caspases in PI-induced autophagy (Fig 2H and 2I). Collectively, these results indicate that CASP3 and CASP7 have positive regulatory roles in autophagy induction in response to starvation or proteasome inhibition-induced stress.
Fig 2. Dual loss of CASP3 and CASP7 suppresses proteasome inhibition-induced compensatory autophagy and sensitizes cells to proteasome inhibitors. (A) Representative western blots of indicated proteins from SKBR3 cells treated with MG132 at increasing dosage for 24 h, with BafA1 (50 nM) in the final 2 h. 
(B) Quantification of LC3B-based autophagy flux in MG132-treated cells relative to untreated (0 μM MG132; DMSO vehicle only; NT) SKBR3 shown in (A). The levels of LC3BII were normalized to loading control and shown relative to the untreated control (NT). (C) Graph showing proteasome activity in response to increasing concentrations of MG132 as depicted by chymotrypsin, caspase and trypsin-like activities. (D) Representative western blots of indicated proteins from SKBR3 cells transfected with scramble, CASP3 and/or CASP7 siRNAs (48 h) and treated with vehicle DMSO only or MG132 (0.5 μM) for 24 h, with BafA1 (50 nM) in the final 2 h. (E) Quantification of LC3B-based autophagy flux in MG132-treated cells relative to the MG132-treated scramble-siRNA control, shown in (D). (F) Representative western blots of indicated proteins from CASP3, CASP7 single knockout, or DKO SKBR3 cells treated with vehicle DMSO or MG132 (0.5 μM) for 24 h, with BafA1 (50 nM) in the final 2 h. (G) Quantification of LC3B-based autophagy flux in MG132-treated cells relative to the MG132-treated parental control shown in (F). (H) Representative western blots of indicated proteins from SKBR3 parental, DKO or CASP3 + 7-WT reintroduced into DKO cells treated with vehicle DMSO or MG132 (0.5 or 1.0 μM) for 24 h with BafA1 (50 nM) in the final 2 h. (I) Quantification of LC3BII-based autophagy flux in MG132-treated cells relative to DMSO-treated parental control, shown in (H). (J) Representative images of crystal violet assay plates of SKBR3 parental and CASP3 + 7 DKO cells treated with indicated concentrations of MG132 for 24 h and continued to grow in drug free media for another 3 days. (K) Quantification of cell viability shown in (J). Percentage of stained (viable) cells at each concentration was normalized to respective untreated cells. In graphs, data are shown as mean ± SEM. n = 3 independent experiments. *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001, NS, not significant. 
In B, C, E, and G, one-way ANOVA with Dunnett’s post-test; in I, one-way ANOVA with Tukey’s post-test. In K, two-way ANOVA with Sidak’s post-test. See also S2 Fig. The numerical data presented in this figure can be found in S1 Data. https://doi.org/10.1371/journal.pbio.3003034.g002
Since autophagy has been reported to promote cell death in some contexts [56–58], we investigated whether the CASP3- and CASP7-dependent stress-induced autophagy contributes to cell death or survival. Crystal violet cell viability assays were carried out following PI treatment (MG132 or Bortezomib) or starvation of parental and CASP3 + 7 DKO or DKD of SKBR3 or MDA-MB-231 cells (Figs 2J, 2K and S2H–S2O). We observed a modest, but significant reduction in cell viability in response to PI treatment or starvation in CASP3 + 7 DKD or DKO cells compared to control cells. Contrary to traditional caspase roles, these results suggest that CASP3 and 7 promote cell survival in non-lethal PI or starvation stress conditions.
CASP7 is cleaved non-canonically generating stable p29/p30 fragments under non-lethal stress conditions
To elucidate the molecular mechanisms by which CASP3 and CASP7 positively regulate cytoprotective autophagy, we analyzed their expression and processing pattern following amino acid starvation or PI treatments. CASP3 and CASP7 processing during apoptosis each results in large ~ p20 and small ~ p10/p12 cleaved-caspase fragments [37,59,60]. Intriguingly, our western blot analyses revealed a stable non-canonical CASP7 cleavage fragment(s) at ~ 30 kDa (p30) and/or 29 kDa (p29), and a CASP3 cleavage fragment at ~ 27 kDa (p27) in non-lethal starvation (Fig 3A and 3B) or PI conditions (Fig 3C and 3D). In fed or untreated (non-stressed) cells, these non-canonical fragments were either absent or less apparent. The CASP3 fragment at ~ 27 kDa was inconsistent and/or not readily detectable (e.g., Fig 3C) across all experiments, so we focused on the CASP7-p29/p30 fragments. 
Fig 3. CASP7 is cleaved non-canonically generating stable p29/p30 fragments under non-lethal stress conditions. (A–D) Representative western blots showing changes in CASP7 and CASP3 in SKBR3 or MDA-MB-231 cells in amino acid starvation (A, B) or treated with MG132 (C, D). Non-canonical CASP7 bands (p29/p30) at 30 kDa are indicated by blue arrows. A non-canonical CASP3 band (p27) at 27 kDa is also detected in some conditions. (E) Representative western blots comparing CASP7, CASP3, and PARP1 processing patterns in SKBR3 cells following 4-h incubation in fed conditions (F), non-lethal stress (starvation) or lethal stress (apoptosis induction by staurosporine; STS, 1 or 2 µM). (F) Representative western blots comparing CASP7 and PARP1 processing in SKBR3 parental cells stably transfected with vector control or BCL2 expression construct, following 4-h incubation in fed conditions (F), non-lethal stress (starvation) or lethal stress (apoptosis induction by staurosporine; STS, 1 or 2 µM). (G) Representative western blots showing the effect of rapamycin on the formation of CASP7-p29/p30 bands in SKBR3 cells. Cells in fed, starved (4 h), and/or treated with 10 nM rapamycin (Rap; 24 h). Blot was immunolabeled with mTOR activity reporters p70S6K and p70S6K-PO4. (H) Representative western blot showing CASP7 immunolabeling in tissues from a female CD-1 mouse. Ponceau S labeling serves as the loading control. n = 3 mice. (I) Representative western blot showing CASP7 immunolabeling in human breast cancer PDX specimens from indicated subtypes. Starved SKBR3 serves as the positive control for p29/p30 fragments. (J) Representative western blot showing CASP7 immunolabeling in parental MDA-MB-231 cells and derivative epirubicin-resistant (R8a, R8b, R3-1, R3-2) and 5Fu-resistant (R6-1, R6-2) cells in fed conditions. In each, at least n = 2 independent experiments. See also S3 Fig. 
https://doi.org/10.1371/journal.pbio.3003034.g003
To confirm and further explore the observed non-canonical processing, we analyzed CASP7 processing in different contexts. We compared the CASP7 processing profiles resulting from starvation stress to apoptosis-induced profiles (staurosporine-treated cells) (Fig 3E). A marked increase in cleaved-PARP and canonical cleaved-CASP3 (p20) was present only in staurosporine-treated cells, confirming that the amino acid starvation conditions employed were non-lethal. The CASP7-p29/30 fragment(s) were produced predominantly in the non-lethal starvation conditions, whereas the canonical apoptosis-associated CASP7-p20 fragments were primarily observed in staurosporine-treated cells. Altogether, these results confirm that the CASP7 processing in response to non-lethal stress conditions is distinct from CASP7 processing during apoptosis. Mitochondrial outer membrane permeabilization (MOMP) activates caspases in apoptotic and non-apoptotic conditions [61,62]. BCL2 overexpression inhibits MOMP and subsequent CASP3 and CASP7 processing in apoptosis [63,64]. To investigate whether MOMP plays a role in CASP7 non-canonical processing in non-lethal stress, we overexpressed BCL2 in SKBR3 parental cells. As expected, BCL2 overexpression inhibited canonical CASP7 processing in the control staurosporine-treated cells. In contrast, BCL2 overexpression did not inhibit the formation of CASP7-p29/30 in the non-lethal starvation stress condition (Fig 3F). These results indicate that CASP7 non-canonical processing is independent of MOMP. We next asked whether other known autophagy inducers would also generate non-canonical CASP7 cleavage. Rapamycin inhibits mTOR and induces autophagy [65,66]. In SKBR3 and MDA-MB-231 cell lines (S3A and S3B Fig), autophagy was induced upon treatment with rapamycin. However, rapamycin did not result in non-canonical CASP7 processing (Figs 3G and S3C). 
Further, CASP3 + 7 DKO had no effect on rapamycin-induced autophagy (S3A and S3B Fig). Collectively, these observations suggest that caspases act upstream or in parallel to mTOR in autophagy regulation. Next, we investigated the prevalence and the context dependency of non-canonical CASP7 processing. Using fed (unstressed) mice, we confirmed the existence of in vivo non-canonical processing of CASP7 in several tissues, including in the brain, heart, spleen, and kidney (Fig 3H). Multiple breast cancer patient-derived xenograft (PDX) specimens also clearly showed the existence of CASP7-p29/30 fragments (Fig 3I). Further, a panel of anthracycline-resistant MDA-MB-231 cell lines was analyzed, with the majority showing a clear accumulation of CASP7-p29/30 fragments (Fig 3J). Altogether, these data confirm the existence of stable non-canonical CASP7 fragments in in vitro and in vivo models and in both normal and cancer tissues.
CASP7-p29 and p30 are generated via processing at calpain cleavage sites
Next, we investigated the identity of the CASP7-p29/30 fragments observed in non-lethal stress conditions. In apoptosis, CASP7 activation is initiated by removing the pro-domain [37], and no stable accumulation of CASP7 lacking the pro-domain has been reported. To determine whether the non-canonical CASP7-p29/30 fragment(s) could be the result of pro-domain removal, we stably transfected CASP3 + 7 DKO cells with a full length CASP7 wild-type construct (CASP7-WT), a pro-domain deletion construct (CASP7-ΔPro) (Fig 4A and 4B), or a pro-domain cleavage site-mutant construct (S4A Fig). In western blot analyses, the CASP7-p29/30 fragments formed in starvation did not align with the CASP7-ΔPro construct, which appeared at ~ 33.5 kDa (Figs 4A and S4A). Additionally, when starved, the CASP7-ΔPro-expressing cells also resulted in CASP7-p29/30 fragment(s) (Figs 4A and S4A). 
Collectively, these data show that the non-lethal cellular stress-associated CASP7 cleavage occurs at a non-canonical site(s) downstream of the pro-domain cleavage site.
Fig 4. CASP7-p29 and p30 are generated via processing at calpain cleavage sites. (A) Representative western blot showing CASP7 immunolabeling in CASP3 + 7 DKO SKBR3 cells stably transfected with CASP7-WT or CASP7-delta-Pro (CASP7-Δpro) constructs in fed or starvation conditions. Arrowhead indicates CASP7-Δpro band and arrow indicates CASP7-p29/p30 bands. n = 2 independent experiments. (B) Schematic of subunits, canonical cleavage sites and non-canonical cleavage sites in CASP7 wild-type or in various CASP7 fragments used or observed in this study. Black scissors denote the canonical cleavage sites and the order of processing in apoptosis. Purple scissors, bars, and black arrows denote the putative calpain cleavage sites (non-canonical). (C) The sequences on the left show the putative calpain cleavage sites in CASP7-WT and the amino acid replacements made in the CASP7 construct with putative calpain-cleavage sites mutated (CASP7-CCM). Representative western blot of CASP7 from CASP3 + 7 DKO SKBR3 cells transfected with CASP7-WT or CASP7-CCM constructs grown in fed (F) or starved (S) conditions. n = 2 independent experiments. (D, E) Representative western blots of indicated proteins from SKBR3 cells transfected with scramble, calpain 1 or calpain 2 siRNAs (48 h) and then continued to be cultured in fed (F) conditions or starved (S) for 8 h (D) or treated with MG132 for 24 h (E), with BafA1 (50 nM) in the final 2 h. (F–I) Quantification of LC3B-based autophagy flux and CASP7 bands shown in (D) and (E) relative to scramble-siRNA control. In graphs, data are shown as mean ± SEM. n = 3 independent experiments. *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001, NS, not significant. See also S4 Fig. 
In F–I, one-way ANOVA with Dunnett’s post-test. The numerical data presented in this figure can be found in S1 Data. https://doi.org/10.1371/journal.pbio.3003034.g004
To elucidate the precise identity of the non-canonical cleavage sites and fragments, we enriched for CASP7-p29/30 fragments by performing CASP7-immunoprecipitation (IP) under starvation conditions (S4B Fig). Protein in the 29/30 kDa region was extracted and subjected to Edman sequencing, which revealed the location of a non-lethal stress-associated cleavage site 10 amino acids downstream of the CASP7 pro-domain cleavage site (Fig 4B and 4C). This non-canonical cleavage site and a second site nearby (Fig 4B and 4C), corresponding to p30 and p29, respectively, were shown previously to be cleaved in vitro by calpains, but were associated with further processing of CASP7 to yield p17 and p18 [39,40]. Using the DeepCalpain algorithm [67], we identified predicted calpain cleavage sites conserved in murine CASP7 (S4C Fig) and confirmed that they corresponded to the size of the CASP7-p30/p29 bands detected in murine samples (Fig 3H). To verify the cleavage sites, we created a CASP7-calpain cleavage mutant (CASP7-CCM) construct by substituting both calpain cleavage sites with alanine residues (see Materials and methods; Fig 4C). Upon starvation, CASP3 + 7 DKO SKBR3 cells stably transfected with CASP7-CCM failed to form abundant CASP7-p29/p30 compared to CASP7-WT cells (Fig 4C). Together, these results validate the non-canonical cleavage of CASP7 and suggest a potential involvement of calpains in regulating CASP7 to promote starvation or PI-induced autophagy and/or cell survival in non-lethal stress conditions. To determine whether calpains have any direct involvement in CASP7 non-canonical processing, we performed single KD of the most abundant calpain family members calpain 1 and calpain 2 in SKBR3 cell lines and subjected these cells to non-lethal amino acid starvation or PI stress conditions (Fig 4D–4I). 
Single KD of calpain 1 or calpain 2 led to a significant increase in overall CASP7 levels compared to control (Fig 4D–4I), suggesting a role in CASP7 turnover. Next, the ratio of CASP7 p29/30 fragment(s) relative to its full length was quantified (Fig 4G and 4I). In both stress conditions, calpain 1 KD cells showed the highest level of autophagy induction (Fig 4F and 4H) and CASP7-p29/30 fragments ratios (Fig 4F and 4H). In contrast, in both stress conditions, calpain 2 KD showed a reduction of stress-induced autophagy (p < 0.05 for PI-induced stress) and the lowest CASP7-p29/30 ratios. Collectively, these data suggest that calpain 2, but not calpain 1, may mediate the CASP7 non-canonical cleavage at the two calpain cleavage site(s) to positively regulate autophagy. If CASP3 is processed by a mechanism other than calpain 2, then CASP3 KO in combination with calpain 2 KD would be expected to result in an enhanced suppression of autophagy. However, we found that knockdown of calpain 2 in the CASP3 KO background did not further enhance autophagy inhibition relative to the scramble-siRNA control (S4D and S4E Fig), consistent with the possibility that CASP3 could also be processed by calpain 2 in this context. CASP3 was shown previously to be processed by calpains [68,69], further supporting the investigation of CASP3 processing in non-lethal stress conditions as an interesting avenue for future studies. PARylation and ATG gene expression are reduced in CASP3 and CASP7 DKO cells The well-known PARP1 binding site [43] in CASP7 is located between the two identified calpain cleavage sites (S4F Fig). PARP1 and associated PARylation activities were shown to play key roles in stress sensing, stress adaptation, and autophagy induction in multiple cell lines and in vivo mouse models [70–75]. Further, PARP1 is a CASP3 and CASP7 substrate, with the latter showing a higher affinity for PARP1 binding [76]. 
These factors prompted us to investigate PARP1 levels and PARP1 activity in the CASP3 + 7 DKO background. We predicted that cleaved-PARP1 levels would be reduced, and thus PARP1 activity would be increased in the CASP3 + 7 DKO cells. Unexpectedly, we detected a higher level of cleaved-PARP1 in CASP3 + 7 DKO cells compared to the SKBR3 parental cells (Fig 5A–5D). There was no evidence of cell death in the DKO cells, consistent with previous observations that cleaved-PARP1 (89 kDa) is not always associated with cell death [77,78]. Cleaved-PARP1 levels were further increased when cells were stressed with a non-lethal dosage of MG132 (Fig 5B and 5D). PARylation is a PARP catalytic activity-dependent process [79]. Next, PAR immunolabeling was performed as a readout for PARylation activity of PARP1 [80]. We observed a significant reduction in PAR labeling in CASP3 + 7 DKO cells compared to parental cells in both SKBR3 and MDA-MB-231 cell lines (Fig 5E–5H). These results indicate that in the absence of CASP3 and CASP7, PARP1 cleavage is enhanced and PARP1 activity (as measured by PAR level) is impaired.

Fig 5. PARylation and ATG gene expression are reduced in CASP3 + 7 DKO cells. (A, B) Representative western blots of indicated proteins from parental and CASP3 + 7 DKO SKBR3 cells cultured in fed conditions (F) or starved (S) for 8 h (A) or treated with MG132 (0.5 μM) for 24 h (B). (C) Quantification of cleaved-PARP1 shown in (A). Values were normalized to parental fed conditions. (D) Quantification of cleaved-PARP1 shown in (B). Values were normalized to parental NT. NT = No Treatment (vehicle DMSO). (E, F) Representative western blots of PAR immunolabeling from parental and CASP3 + 7 DKO SKBR3 (E) or MDA-MB-231 (F) cells treated with vehicle (DMSO) or MG132 (0.5 μM) for 24 h. (G, H) Quantification of PAR levels shown in (E) and (F). Values were normalized to the parental NT condition. 
(I) RT-qPCR analyses of LC3B in SKBR3 parental, CASP3 + 7 DKO or CASP3 + 7-WT re-expression in CASP3 + 7 DKO SKBR3 cells. Cells were treated with vehicle (DMSO) or 0.5 μM of MG132 for 24 h. (J) Representative western blots of indicated proteins from SKBR3 parental and CASP3 + 7 DKO cells transfected with scramble or CASP2 siRNAs (72 h). (K) Quantification of cleaved-PARP1 levels (shown in J) relative to scramble-siRNA control. (L) Quantification of CASP2 levels relative to scramble-siRNA-treated parental control. (M) Quantification of PAR levels shown in (J) relative to scramble-siRNA-treated parental control. (N) Representative western blots of indicated proteins from SKBR3 parental and CASP3 + 7 DKO cells transfected with scramble, CASP2 or CASP8 siRNAs (42 h) and treated with vehicle DMSO or MG132 (0.5 μM) for 24 h, with BafA1 (50 nM) in the final 2 h. (O) Quantification of LC3B-based autophagy flux in MG132-treated cells, shown in (N). In graphs, data are shown as mean ± SEM. n = 3–6 independent experiments. *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001, NS, not significant. In C, D, G, and H, one-way ANOVA with Tukey’s post-test. In I, K–M, and O, one-way ANOVA with Dunnett’s post-test. See also S5 Fig. The numerical data presented in this figure can be found in S1 Data. https://doi.org/10.1371/journal.pbio.3003034.g005

In cellular stress conditions, PARP1 and PARylation were shown previously to positively regulate cytoprotective autophagy through multiple mechanisms [73–75,81], including the transcriptional upregulation of autophagy pathway components [82]. Since CASP3 + 7 DKO led to reduced PARylation, we predicted that CASP3 + 7 DKO would result in transcriptional downregulation of autophagy pathway components. Among the components evaluated by reverse transcription polymerase chain reaction (RT-qPCR), we found that LC3B and ATG7 transcript levels were reduced in CASP3 + 7 DKO SKBR3 cells (S5A–S5E Fig). 
There was an increase in both transcripts in response to MG132-induced stress (Figs 5I and S5F), with DKO cells showing weaker upregulation compared to parental (Figs 5I and S5F). Dual re-expression of CASP3 + 7-WT constructs partially rescued transcript levels of these genes in DKO cells (Figs 5I and S5F). These results are consistent with the possibility that CASP3 and CASP7 regulate transcription of key autophagy genes, potentially through changes to PARP1 and PARylation. To test whether CASP3 and CASP7 regulate autophagy through PARP1, we investigated whether overexpression of PARP1 could rescue autophagy in the CASP3 + 7 DKO background. Moreover, we compared rescue ability of wild-type PARP1 (PARP1-WT), catalytically inactive PARP1 (PARP1-CI), and PARP1 with the DEVD cleavage site mutated (PARP1-DEVA, PARP1-AEVA). Consistent with endogenous PARP1, the PARP1-WT transfected CASP3 + 7 DKO cells resulted in a significantly increased accumulation of cleaved-PARP1 compared to the PARP1-WT transfected parental cells (S5G and S5H Fig). As expected, levels of cleaved PARP1 were significantly reduced in the PARP1-DEVA and PARP1-AEVA expressing cells (S5G and S5H Fig). While the cleavage-resistant PARP1-DEVA and PARP1-AEVA cells also showed a trend of increased LC3B transcription relative to PARP1-WT (S5J Fig), the PARylation and autophagy flux levels showed no differences (S5G, S5I, S5K, and S5L Fig). The PARP1-CI transfected cells did show significantly reduced PARylation compared to the vector control, consistent with the reported dominant negative effect of catalytically inactive PARP1 (S5G and S5I Fig) [83], but induction of cell death and very low expression levels confounded interpretation of results. Overall, while the association between PARP1 cleavage and LC3B transcription remains consistent, it is not clear whether the reduced autophagy flux in the CASP3 + 7 DKO background is due to the enhanced PARP1 cleavage. 
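The quantifications in Fig 5 are described as values normalized to a control condition and reported as mean ± SEM across independent experiments. A minimal sketch of that arithmetic is given here for orientation. The band intensities and the function names are hypothetical and serve only to illustrate the calculation; this is not the authors' analysis pipeline.

```python
from math import sqrt
from statistics import mean, stdev


def normalize_to_control(values, control):
    """Divide each replicate by its matched control replicate."""
    if len(values) != len(control):
        raise ValueError("each replicate needs a matched control value")
    return [v / c for v, c in zip(values, control)]


def mean_sem(values):
    """Return (mean, SEM), where SEM = sample SD / sqrt(n)."""
    n = len(values)
    if n < 2:
        raise ValueError("SEM needs at least two replicates")
    return mean(values), stdev(values) / sqrt(n)


# Hypothetical band intensities from three independent experiments.
parental_nt = [1.00, 1.20, 0.80]
dko_nt = [2.00, 2.40, 1.20]

fold = normalize_to_control(dko_nt, parental_nt)
avg, sem = mean_sem(fold)
print(f"fold change = {avg:.2f} ± {sem:.2f} (n = {len(fold)})")
```

Normalizing within each experiment before averaging, as sketched, removes blot-to-blot intensity differences; whether the authors normalized per experiment or to a pooled control is not stated in this section.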
Next, to identify the protease(s) responsible for PARP1 cleavage in the absence of CASP3 and 7, we evaluated multiple candidate proteases using a siRNA approach. The knockdown of CASP6, calpain 1, calpain 2, cathepsin B, and cathepsin D resulted in a further increase in PARP1 cleavage (S5M–S5O Fig). We next investigated caspase 2 (CASP2) and caspase 8 (CASP8), both shown to localize to the nucleus in some contexts [84,85]. We discovered that knockdown of CASP2, but not CASP8, consistently reduced PARP1 cleavage in CASP3 + 7 DKO cells (Figs 5J, 5K, and S5P). Interestingly, the relative expression level of CASP2 was significantly higher in CASP3 + 7 DKO cells compared to the parental cells (Fig 5J and 5L), further supporting a role for CASP2 in this context. Consistent with the lack of rescue by the DEVD cleavage site mutants, PARylation and autophagic flux were not rescued in the CASP2 knockdown CASP3 + 7 DKO cells (Fig 5M–5O). Collectively, our data suggest that CASP2 is at least partially responsible for the increased PARP1 cleavage in the CASP3 + 7 DKO cells. Further investigation of the contexts and role of CASP2 in PARP1 regulation is warranted.

CASP7-p29/30 promote phosphorylation of DDR pathway protein H2AX

In addition to transcriptional remodeling, PARP1 plays a central role in DNA repair and DNA damage response (DDR) pathways in response to cell stress [71,86]. Both starvation and proteasome inhibition were reported to induce DNA damage and activate autophagy as a part of stress-induced DDR [87,88]. Western blot analyses revealed that MG132 stress-induced phospho-H2AX (γ-H2AX) accumulation was significantly reduced in CASP3 + 7 DKO cells compared to parental cells without reduction in total H2AX levels (Fig 6A and 6B).

Fig 6. CASP7-p29/30 promote phosphorylation of DDR pathway protein H2AX. 
(A) Representative western blots of indicated proteins from SKBR3 parental and CASP3 + 7 DKO cells treated with vehicle DMSO or MG132 (0.5 µM) for 24 h. Ponceau S staining was used as the loading control. (B) Quantification of γ-H2AX/total H2AX shown in (A). (C) Representative images from alkaline comet assay from SKBR3 parental and CASP3 + 7 DKO cells treated with vehicle DMSO or MG132 (0.5 µM) for 24 h. Cells treated with H2O2 (100 μM for 4 h) serve as the positive control. (D) Quantification of tail moments of comets shown in (C). Tukey’s box-and-whisker plots are based on tail moments determined by CometScore from 200 cells in each of two independent assays. (E) Representative western blots of indicated proteins from SKBR3 parental, CASP3 + 7 DKO or single KO cells treated with vehicle DMSO or MG132 (0.5 µM) for 24 h. Ponceau S staining was used as the loading control. (F) Quantification of γ-H2AX/total H2AX shown in (E). (G) Representative western blots of indicated proteins from CASP3 + 7 DKO SKBR3 cells stably transfected with indicated CASP7 constructs treated with vehicle DMSO or MG132 (0.5 µM) for 24 h. Ponceau S staining was used as the loading control. (H) Quantification of γ-H2AX/total H2AX shown in (G). (I) Representative images from alkaline comet assay from CASP3 + 7 DKO SKBR3 cells stably transfected with indicated CASP7 constructs and treated with vehicle DMSO or MG132 (0.5 µM) for 24 h. Cells treated with H2O2 (100 μM for 4 h) serve as the positive control. (J) Quantification of tail moments of comets shown in (I). Tukey’s box-and-whisker plots are based on tail moments obtained from 200 cells in each of two independent assays. (K) Representative images of crystal violet assay plate. CASP7-CCM, CASP7-p30, and CASP7-p29 SKBR3 cells were treated with indicated concentrations of proteasome inhibitor MG132 for 24 h and continued to grow in drug-free media for another 3 days. (L) Quantification of cell viability data presented in (K). 
The percentage of stained (viable) cells at each concentration was normalized to respective untreated cells. (M) Representative western blots of indicated proteins from CASP3 + 7 DKO SKBR3 cells re-expressing vector, CASP7-WT (catalytically active) or CASP7-inactive (catalytically inactive) constructs, treated with vehicle DMSO or MG132 (0.5 µM) for 24 h. H3 serves as the loading control. (N) Quantification of γ-H2AX/total H2AX shown in (M). (O) Representative western blots of LC3B-based autophagy flux in CASP3 + 7 DKO SKBR3 cells re-expressing the indicated CASP7 constructs, treated with vehicle DMSO or MG132 (0.5 µM) for 24 h, with BafA1 (50 nM) in the final 2 h. (P) Quantification of autophagy flux in MG132 treated cells relative to MG132 treated CCM expressing cells, shown in (O). In graphs, data are shown as mean ± SEM. n = 3 independent experiments except for comet assays n = 2. *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001, NS, not significant. In B, F, L, N, and P, one-way ANOVA with Dunnett’s post-test. In D and H, one-way ANOVA with Tukey’s post-test. In J, two-way ANOVA with Sidak’s post-test. See also S6 Fig. The numerical data presented in this figure can be found in S1 Data. https://doi.org/10.1371/journal.pbio.3003034.g006

Traditionally, the level of γ-H2AX was measured to determine the extent of DNA damage [89]; however, γ-H2AX levels were also shown to represent the strength of DDR signaling activity [12,90,91]. To distinguish between these two possibilities, we used comet assays to directly measure DNA damage [92]. The level of DNA damage/fragmentation in parental versus CASP3 + 7 DKO SKBR3 cells was not significantly different when treated with the vehicle (DMSO) control (Fig 6C and 6D). In response to non-lethal MG132 stress conditions, there was an increase in DNA damage, but it was comparable in both parental and CASP3 + 7 DKO cells. 
In contrast, the lethal stress conditions of the positive control, H2O2, induced an increase in DNA damage only in the parental cell line (Fig 6C and 6D). These data indicate that the γ-H2AX reduction observed in CASP3 + 7 DKO cells in the MG132 non-lethal stress conditions is not due to a reduction in DNA damage, but possibly instead due to impaired DDR signaling. Next, we investigated the γ-H2AX levels in CASP3 + 7 double and single KO cells (Fig 6E and 6F). The γ-H2AX levels were reduced in CASP3 + 7 DKO cells, but relatively increased in both CASP3 single KO and CASP7 single KO cells. Notably, the γ-H2AX level was highest in CASP3 KO cells, suggesting that CASP7 plays a prominent role, relative to CASP3, in the H2AX phosphorylation. To further investigate this possibility and given the proximity of the PARP1 binding site to the CASP7 calpain cleavage sites, we generated CASP3 + 7 DKO cell lines stably expressing CASP7-CCM, CASP7-p30, or CASP7-p29 (S6A Fig). We used CASP7-CCM as the full-length control since it does not undergo non-canonical cleavage. We compared γ-H2AX levels in the absence and presence of non-lethal MG132, and observed that the stress-induced γ-H2AX accumulation was increased in CASP7-p30 and CASP7-p29 expressing cells compared to CASP7-CCM expressing cells (Fig 6G and 6H). Among the three constructs, the greatest level of γ-H2AX was observed with CASP7-p29. Furthermore, comet assays revealed that the degree of DNA damage was not significantly different among the three cell lines in either vehicle (DMSO) control, non-lethal MG132-treated or lethal H2O2-treated conditions (Fig 6I and 6J). In fact, the degree of DNA damage appeared slightly reduced in CASP7-p29 expressing cells, where we detected the greatest level of γ-H2AX (Fig 6H and 6J). 
Crystal violet cell viability assays, following treatment with increasing concentrations of MG132 or Bortezomib (Figs 6K, 6L, S6B and S6C), showed no significant differences in cell viability despite the different γ-H2AX levels observed in the three cell lines. Overall, these results indicate that CASP7 non-canonical processing does not affect the degree of DNA damage, but rather plays a role in inducing different levels of γ-H2AX signaling (S6D Fig). Consistent with several previous studies showing that γ-H2AX is not an unambiguous marker for DNA double-strand breaks [12,90,91], our findings emphasize the need for careful interpretation of γ-H2AX levels. To further explore CASP7 in this role, we employed catalytically active wild type (CASP7-WT) and catalytically inactive (CASP7-inactive, C > A) constructs stably expressed in CASP3 + 7 DKO cells. Expression of the CASP7-WT construct, but not the catalytically inactive construct, led to a significant increase in the γ-H2AX signaling in non-lethal MG132-treated cells (Fig 6M and 6N), revealing that CASP7 and its catalytic activity are required for the phosphorylation of H2AX.

Dual inhibition of CASP3 and CASP7 mimics effects of PARP1 inhibitors

Our data show that inhibition of CASP3 and CASP7 leads to increased PARP1 cleavage and reduced PARP activity (PARylation), DDR signaling, and autophagy response. Since these effects phenocopy the genetic or pharmacological inactivation of PARP1 [70,71], we hypothesized that dual CASP3 + 7 inhibition would also phenocopy the synthetic lethal effects of PARP1 inhibition in cells that are defective in homologous recombination (HR) [93,94]. To test this, we generated CASP3 + 7 DKD in the BRCA1-deficient SUM149PT breast cancer cells and assessed their viability by the crystal violet assay. In striking contrast to the BRCA1/2-proficient SKBR3, MDA-MB-231, and JIMT-1 breast cancer cells, the dual inhibition of CASP3 + 7 was lethal in SUM149PT cells (Fig 7A–7C). 
In the western blot analyses, a distinct band of cleaved-PARP1 was visible in CASP3 + 7 DKD cells compared to the scramble-siRNA control (Fig 7C). Western blot analyses also confirmed that autophagic flux was reduced in the CASP3 + 7 DKD SUM149PT cells compared to the scramble-siRNA control or compared to the single KD of CASP3 or CASP7 (Fig 7D and 7E). To determine whether the lethal effects of CASP3 + 7 DKD were redundant or potentially additive with PARP1 inhibition, we treated the SUM149PT cells with the PARP1 inhibitor olaparib. The combination of CASP3 + 7 DKD and olaparib treatment resulted in a further reduction of SUM149PT cell viability compared to olaparib treatment alone (Fig 7F and 7G). These results suggest that combined inhibition of CASP3 and CASP7 induces synthetic lethality with loss of BRCA1.

Fig 7. Dual inhibition of CASP3 and 7 mimics effects of PARP1 inhibitors. (A) Representative images of crystal violet assays. SUM149PT, MDA-MB-231, SKBR3, or JIMT-1 cells were transfected with scramble-siRNA or CASP3 and CASP7 siRNAs (10 nM) for 2 days and then cultured in siRNA-free media for another 3 days. (B) Quantification of cell viability from data represented in (A). Percentage of stained (viable) cells at each concentration was normalized to respective untreated cells. (C) Representative western blots showing CASP3 and CASP7 levels following siRNA treatment conditions used in crystal violet assays shown in (A). Immunolabelling of cleaved-PARP1 is also shown. n = 2 independent experiments. (D) Representative western blots of indicated proteins from SUM149PT cells transfected with scramble, CASP3 and/or CASP7 siRNAs (48 h) and treated with vehicle DMSO or MG132 (0.5 μM) for 24 h, with BafA1 (50 nM) in the final 2 h. (E) Quantification of LC3B-based autophagy flux in proteasome inhibitor (MG132) treated cells shown in (D). 
The levels of LC3BII in MG132 treated cells were normalized to loading control and shown relative to the MG132 treated scramble-siRNA control. (F, G) Representative images of crystal violet assay plates (F) and quantification of percentage of cell viability (G). Indicated siRNA transfected cells were treated with indicated concentrations of Olaparib for 24 h and continued to be cultured in drug-free media for another 3 days. Graph (G) shows the percentage of stained (viable) cells at each concentration normalized to scramble-siRNA-treated and Olaparib untreated cells. (H) Dot plot showing DepMap cancer cell lines with bottom 10% of CASP3 + 7 protein expression (n = 16) and top 10% CASP3 + 7 protein expression (n = 10). Normalized CASP3/7 protein expression values (scaled and mean centered) were extracted from Nusinow and colleagues using GRETTA. (I) Ranked CASP3 + 7 genetic interaction scores generated using GRETTA. The cell lines at the bottom 10% of CASP3 and CASP7 protein expression are compared against cell lines at the top 10%. Rank is based on lethal GI scores. The red points indicate true candidate CASP3 + 7 lethal genetic interactions (GIs). The top most lethal genetic interactors are labeled. In graphs, unless otherwise noted, data are shown as mean ± SEM. n = 3 independent experiments. *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001, NS, not significant. In B and E, one-way ANOVA with Sidak’s and Dunnett’s post-test, respectively. In G, two-way ANOVA with Sidak’s post-test. In I, Mann–Whitney U-test p-value < 0.05. See also S7 Fig. The numerical data presented in this figure can be found in S1 Data. The code related to Fig 7H and 7I is publicly available in a GitHub repository (https://github.com/MarraLab/Caspase_GRETTA_analysis) and archived on Zenodo (https://doi.org/10.5281/zenodo.14722298). 
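The comparison summarized in panels (H) and (I) has two steps: pick the cell lines at the bottom and top 10% of expression, then compare a per-gene lethality measure between the two groups with a Mann–Whitney U-test. The sketch below illustrates only that logic in plain Python. It is not GRETTA's implementation: the single combined expression score, the cell line names, the lethality values, and both function names are hypothetical, and only the U statistic is computed, not the p-value.

```python
def decile_groups(scores):
    """Return (bottom 10%, top 10%) of cell line names ranked by score."""
    ranked = sorted(scores, key=scores.get)
    k = max(1, len(ranked) // 10)
    return ranked[:k], ranked[-k:]


def mann_whitney_u(x, y):
    """U statistic for sample x versus y, using average ranks for ties."""
    pooled = sorted((v, i) for i, v in enumerate(list(x) + list(y)))
    ranks = [0.0] * len(pooled)
    start = 0
    while start < len(pooled):
        end = start
        while end + 1 < len(pooled) and pooled[end + 1][0] == pooled[start][0]:
            end += 1
        avg_rank = (start + end) / 2 + 1  # 1-based average rank of a tie run
        for j in range(start, end + 1):
            ranks[pooled[j][1]] = avg_rank
        start = end + 1
    rank_sum_x = sum(ranks[: len(x)])
    return rank_sum_x - len(x) * (len(x) + 1) / 2


# Hypothetical combined CASP3 + 7 expression score for 20 cell lines.
expression = {f"line_{i:02d}": float(i) for i in range(20)}
low, high = decile_groups(expression)

# Hypothetical lethality probabilities for one candidate gene.
lethality = {"line_00": 0.9, "line_01": 0.8, "line_18": 0.1, "line_19": 0.2}
u = mann_whitney_u([lethality[c] for c in low], [lethality[c] for c in high])
print(low, high, u)
```

The published groups differ in size (n = 16 and n = 10), so the actual selection evidently uses criteria beyond a single ranked score; the repository cited in the caption holds the analysis code itself.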
https://doi.org/10.1371/journal.pbio.3003034.g007

To further assess genetic interactions between BRCA1 and CASP3 and CASP7, we utilized Genetic inteRaction and EssenTiality neTwork mApper (GRETTA) [95], a tool that leverages the publicly available data from the Cancer Dependency Map (DepMap) [96–98], to perform a pan-cancer in silico genetic interaction screen. Using GRETTA, we identified 16 DepMap cancer cell lines harboring low protein expression of both CASP3 and CASP7 and 10 cancer cell lines with high expression of both CASP3 and CASP7 across 13 cancer types (Figs 7H and S7A). Comparisons between these two groups resulted in the identification of 142 candidate CASP3 + 7 synthetic lethal interactors, where perturbation of these genes resulted in significantly higher lethal probabilities in the CASP3 + 7 low expressor lines compared to the CASP3 + 7 high expressor lines (S7B Fig). The candidate synthetic lethal interactors included HR repair pathway genes (S7B Fig), with BRCA1 as the top synthetic lethal hit (Fig 7I). Collectively, our results show that the dual inhibition of CASP3 and CASP7 mimics the cellular effects of PARP1 inhibitors, revealing a surprising new avenue for potential therapeutic exploitation. Counter-intuitive to traditional caspase roles, our findings demonstrate that combined inhibition of CASP3 and CASP7 induces synthetic lethality in BRCA1-defective cells.

Discussion

Caspases remain categorized primarily as components of the cell death machinery despite studies demonstrating that they have multiple non-apoptotic functions. We discovered an autophagy-promoting and cell stress adaptation role for the human effector caspases, CASP3 and CASP7, in human breast cancer cell lines in non-lethal starvation or proteasome inhibition stress conditions. 
Using non-lethal stress conditions, we found that CASP7 is cleaved, resulting in stable fragments (CASP7-p29/30) that are distinct from canonical cleaved-CASP7 fragments (CASP7-p20/p12) reported in apoptosis. The non-canonical cleavage sites in CASP7 flank a PARP1 binding site, and we found that PARP activity and DNA damage-induced phosphorylation of H2AX are significantly reduced in CASP3 + 7 DKO cells. Strikingly, BRCA1-deficient cells that are known to be highly dependent on PARP activity for DNA repair, exhibited synthetic lethality with combined knockdown of CASP3 + 7. Overall, this study shifts the current paradigm of apoptotic caspase-PARP1 relationships to one that involves non-canonical caspase cleavage and the promotion of cell adaptation and survival pathways at the onset of cellular stress or in non-lethal cellular stress conditions. Unexpected key findings of our study include the increased PARP1 cleavage and decreased PARylation (PARP activity) in CASP3 + 7 DKO cells. There is extensive evidence that CASP3 and CASP7 function as negative regulators of PARP1, cleaving and inactivating it during apoptosis [99–101]. However, our data suggest that CASP3 and CASP7 alternatively contribute to PARP1 stability and function in non-lethal conditions. Our findings, along with the Conde-Rubio and colleagues (2021) [7] report of differential cleavage of PARP1 by CASP3 and CASP7 in non-lethal stress as compared to lethal stress, suggest that the primary outcome of the CASP-PARP1 relationship during non-lethal stress is not to inactivate PARP1 activity, but rather is to modulate PARP1 activity. We identified a role for CASP2 in PARP1 cleavage in the absence of CASP3 and CASP7, but the upstream signals remain to be identified. Given the location of the PARP binding site on CASP7 (Fig 4C), it is likely that PARP cleavage, stability and/or activity are modulated by non-canonical cleavage. 
This N-terminus ‘KKKK’ exosite located downstream of the first calpain cleavage site on CASP7 has been shown to promote efficient PARP1 binding and cleavage [43]. The exosite is directly exposed for PARP1 binding in CASP7-p30. It has been shown that removal of the pro-domain enhances the PARP1 binding potential of CASP7 [43]. However, it is not known whether the direct exposure of the exosite would further enhance or instead abrogate the binding or cleavage of PARP1. Since CASP7-p29 does not have an intact exosite, we initially expected it to differently modulate PARP1 and downstream stress response pathways. Despite this key difference, expression of either exogenous p29 or p30 alone was able to enhance the autophagy response or the DDR (γ-H2AX), revealing a potential functional similarity, which is likely independent of PARP1 binding, between these non-canonical CASP7 fragments. However, the p29 fragment lacking the PARP1 binding exosite shows the significantly highest level of LC3B-II flux or γ-H2AX. This result is somewhat consistent with an in vitro cell-free analysis that demonstrated higher enzymatic activity when CASP7 cleavage occurs at the second calpain cleavage site (equivalent to the p29 site), compared to the p30 or canonical cleavage sites [40]. In that study, however, the initial processing was followed by a second cleavage event to generate the enzymatically active tetrameric complex composed of p17 and p12 fragments [40]. In non-lethal contexts, we predict that the retention of CASP7-p30 and/or p29 forms serves to limit apoptotic activity. A model to be investigated in the future is whether p30 initiates the stress response signaling, but then switches to p29 to maximize the ability to respond to increased levels of stress. Despite the differences in autophagy or γ-H2AX levels, we did not observe differential effects on cell viability in lines expressing the different CASP7 fragments under normal or PI stress conditions. 
Several studies have reported non-lethal functions of caspases in DDR as well as in non-lethal DNA damage in chromatin modulation leading to changes in gene expression [3,102,103]. An increase in γ-H2AX levels was also reported to be caspase-dependent and associated with increased carcinogenesis or cell invasion [12,104]. It is possible that the non-canonical processing of CASP7 may have physiological effects on adaptations and/or survival that cannot be captured within our experimental window. Notably, since we did not detect further processing of p29/p30, it appears that these fragments function, at least in part, without formation of the traditional tetrameric complex. Future studies are needed to understand the structural and functional properties of CASP7-p29 and p30. PARP1 and PARylation have been well recognized as cellular hubs that modulate major stress response pathways [74,105]. It is possible that the impaired PARP1 activity in CASP3 + 7 DKO cells is responsible for the observed phenotypes, including the reduction in autophagy response. PARP1 and PARylation were shown previously to play direct and indirect roles in the upregulation of cytoprotective autophagy through multiple mechanisms, including modulation of AMPK and ATP levels [74,106], regulation of DDR signaling [87,88] or transcriptional upregulation of autophagy pathway components [75,81,82]. For example, LC3, GABARAPL1, ATG5, and ATG12 transcripts and proteins were shown to be upregulated by PARP1 in multiple cell types [71,82]. Somewhat consistent with these studies, we found that both ATG7 transcript and LC3B transcript and protein levels were significantly reduced in CASP3 + 7 DKO cells, which exhibited reduced PARylation. Cleavage-resistant PARP1-DEVA and PARP1-AEVA expressing cells exhibited a trend of increased LC3B transcript levels relative to PARP1-WT cells, but there was no apparent rescue of autophagy flux or PARylation phenotypes. 
However, expression level differences between constructs and the induction of cell death by the PARP expression constructs hindered direct comparisons of these readouts. Interestingly, CASP2 protein levels were increased in the CASP3 + 7 DKO cells and CASP2 KD in this background resulted in a modest, but consistent reduction in cleaved PARP levels. We did not detect a significant rescue of PARylation or autophagy flux following CASP2 KD, but further studies using alternate approaches to enhance KD or KO CASP2 are warranted. Additional investigations are required to define the precise relationship between PARP1 and autophagy in the presence and absence of CASP3 and CASP7. A limitation of our study relates to the potential non-canonical processing of CASP3 and the relative contributions of CASP3 and CASP7 in autophagy and PARP1 modulation. We did detect a non-canonical CASP3 fragment of 27 kDa in non-lethal stress conditions (Fig 3D). Although this observation was not consistent, it would be valuable to further investigate the nature and stability of this CASP3 fragment. In accordance with our findings that CASP3 plays a role in autophagy upregulation, the pro-domain of CASP3 has been shown to be involved in the clearance of intracellular protein aggregates [107–109]. Liu and colleagues (2015) [12] compared the relative contribution of CASP3 and CASP7 in γ-H2AX foci formation in WT, CASP3KO, CASP7KO, and CASP3 + 7 DKO MCF-10A cells and concluded that CASP3 and CASP7 have largely overlapping functions in terms of γ-H2AX foci formation, but with CASP3 playing a more dominant role. Our study is consistent with an overlapping role for both CASP3 and CASP7 in increasing the level of γ-H2AX, but induction of H2AX phosphorylation with CASP7-p29/30 suggests that non-canonical forms of CASP7 may also have a prominent role in this process. 
Further, the catalytically inactive form of CASP7 could not induce the phosphorylation of H2AX, showing that the catalytic activity of CASP7 is required. Our study uncovers a novel mechanism of stress adaptation used by cancer cells, revealing new vulnerabilities that can potentially be exploited as therapeutic targets or biomarkers. Counter-intuitively, our data support the idea that caspase inhibition, rather than caspase induction, may reduce cell fitness in certain cells and stress contexts. Strikingly, the synthetic lethal phenotype we observed in BRCA1-deficient cells points to a unique context where CASP3 + 7 inhibition may be most effective. Based on our in silico findings, we expect that this synthetic lethal relationship will be relevant to multiple cancer cell types, and that CASP3 + 7 inhibition has synthetic lethal potential with other genes involved in HR repair or other processes (S7B Fig). Currently, PARP1 inhibitors are used clinically for the treatment of BRCA-mutated breast, ovarian, pancreatic, and prostate cancers [110,111]. CASP3 + 7 inhibition could be further explored as a potential alternative to PARP1 inhibitors or as a second-line treatment in cases of intrinsic or acquired resistance to PARP1 inhibitors. In addition, combining CASP3 + 7 inhibition with chemotherapeutic drugs that induce DNA damage (e.g., cisplatin, doxorubicin) might increase the drug sensitivity. Our study also suggests CASP3 + 7 inhibition as an alternate way to inhibit the autophagy pathway, which may be beneficial due to clinical challenges associated with existing autophagy inhibitors like hydroxychloroquine. While the safety and efficacy of CASP3 + 7 inhibition strategies [112,113] remain to be determined in these contexts, our study has identified multiple possibilities worthy of investigation. An evolutionarily conserved role for caspases in stress adaptation was postulated previously [1,6]. 
In support of such a role for human CASP3 and CASP7, we have identified a molecular mechanism that involves modulation of PARP activity, phosphorylation of H2AX for DNA damage signaling, and autophagy. Stress-induced cytoprotective autophagy induction by a Drosophila effector caspase was previously linked to a mechanism involving regulation of ATP levels [35]. Notably, like caspases or the yeast metacaspase, phylogenetic studies suggest that the ancestor of all extant eukaryotes expressed ancestral PARP proteins, including one similar to human PARP1 [114]. It will be interesting to determine if stress-adaptive caspase modulation of PARP also occurs in Drosophila or other organisms. We predict that this will be the case and, given the observation of CASP7-p29/p30 fragments in multiple tissues from healthy mice, we also predict the results reported herein to have broad biological relevance. The non-lethal stress adaptation role for caspases identified in this study also provides a plausible mechanistic explanation for previous reports of associations between increased caspase expression and worse cancer patient outcomes. We propose this increased caspase expression is used to support tumor stress adaptation, through enhanced PARP1 activity, autophagy, and DDR signaling, and may be a targetable vulnerability, particularly in the context of BRCA1 or other potential synthetic lethal mutations identified here. In this regard, it will be important to determine whether blocking the CASP7 calpain cleavage site(s) or exosite instead of inhibiting entire CASP7 function is more beneficial in a therapeutic context. A detailed understanding of caspase function, mechanisms, and pathways in non-lethal stress will be crucial for the successful clinical translation of caspase-based inhibitors for the treatment of cancer. Our study provides unexpected molecular mechanisms and many avenues to further explore in this regard. 
Materials and methods

Ethics statement
All experiments involving animals were conducted in accordance with the standards and guidelines of the Canadian Council on Animal Care and approval was granted by the University of British Columbia Animal Care Committee (A22-0274). Ethical approval (H18-02121, H23-01014) for research using human breast tumor tissues was granted by the Research Ethics Board of BC Cancer and Simon Fraser University.

Cell lines and cell culture conditions
SKBR3, MDA-MB-231, and JIMT-1 breast adenocarcinoma cell lines (Parental, CASP KO, KD, or expressing CASP constructs) were maintained in Dulbecco’s Modified Eagle Medium (DMEM) (Gibco, 11995-065) supplemented with 10% heat-inactivated fetal bovine serum (FBS) (Invitrogen, 12483020), 10 mM HEPES (Invitrogen, 15630-080), and 1× non-essential amino acids (Gibco, 11140-050). BRCA1-deficient SUM149PT breast cancer cells (parental or CASP KD) were maintained in Ham’s F-12 medium (Gibco, 11765-054) supplemented with 5% heat-inactivated FBS (Invitrogen, 12483020), insulin (1 µg/ml; Gibco, 12585-014), and hydrocortisone (1 µg/ml; Sigma, H4001). Cell lines stably transfected with plasmids expressing CASP constructs were maintained using Geneticin (G418) (Invitrogen, 10131-035) at 1 mg/ml or Hygromycin (Invitrogen, 10687010) at 500 μg/ml for selection. The selection media was replaced with standard media two passages before cells were used in experiments. All cells were maintained at 37 °C with 5% CO2 and 95% humidity. Testing for mycoplasma contamination of cell lines was conducted on a regular basis using the e-Myco Mycoplasma Detection Kit (Boca Scientific, 25235).

Mice and PDX
To obtain cell lysate from murine organs, tissues from three 10-week-old female CD-1 mice were flash-frozen and gifted by Dr. Nancy dos Santos and Nicole Wretham.
All experiments were conducted in accordance with the standards and guidelines of the Canadian Council on Animal Care and were approved by the University of British Columbia Animal Care Committee (A22-0274). To obtain cell lysate from PDX, breast tumor tissues originally derived from human patients were grown as xenografts in mice as detailed in Eirew and colleagues (2022) [115], harvested, frozen in DMEM/FCS with 6%–10% DMSO, and provided by Dr. Sam Aparicio and Dr. Peter Eirew. Ethical approval for research using these human breast tumor tissues was obtained from the Research Ethics Board of BC Cancer and Simon Fraser University.

Starvation, pharmacological, or Bafilomycin (BafA1) treatment
After growing for 3–4 days, cells were subjected to starvation or treated with proteasome inhibitors (or other drug treatments used in this study: rapamycin, staurosporine, olaparib). Cells were washed with 1× phosphate-buffered saline (PBS) (Gibco, 10010023) and subjected to amino acid starvation for 2–24 h for autophagy flux assays or for 8–72 h for viability assays (as indicated) in Earle’s Balanced Salt Solution (EBSS) (Sigma, E3024). For well-fed cells, the media was replaced with fresh standard growth media. For the autophagic flux assay with Bafilomycin A1 (BafA1) (Sigma, B1793), cells were incubated in 50 nM of BafA1 to inhibit lysosomal fusion for 2 h prior to harvesting cells for subsequent analyses. Stock solutions of Staurosporine (Sigma, S6942), Rapamycin (Invitrogen, PHZ1235), the PARP inhibitor Olaparib (Selleckchem, S1060), and the proteasome inhibitors MG132 (Apexbio, A2585) and Bortezomib (Apexbio, A2614) were prepared in dimethylsulfoxide (DMSO) (Fisher BioReagents, BP231-100) and stored in aliquots at −20 °C. Drugs were freshly diluted in media to minimize DMSO concentration and cells were treated at indicated concentrations and durations. Control cells (non-treated or DMSO) were treated with a comparable concentration of DMSO.
Cell lysate extraction for western blotting
Cells were plated in 6-well plates at a density of 2 × 10⁵ cells per well for all cell lines except for MDA-MB-231, which was plated at 1.5 × 10⁵ cells per well. For cell harvesting after indicated treatments (drugs or fed/starved conditions), cells were washed twice in PBS and then trypsinized for up to 5 min while in a CO2 incubator. Cells were collected by adding PBS and spinning for 30 s; the pellets were rinsed with PBS and either stored at −80 °C or processed directly in the next step. All cell pellets were resuspended and lysed using 3× packed cell volume of ice-cold RIPA lysis buffer (Santa-Cruz, SC24948) supplemented with sodium orthovanadate (Santa-Cruz, SC24948), protease inhibitor cocktail (Santa-Cruz, SC24948), and phenylmethylsulfonyl fluoride (Santa-Cruz, SC24948) following manufacturer’s instructions. Cells were incubated in RIPA for 1 h at 4 °C with agitation before being centrifuged at 13,000 rpm for 10 min at 4 °C to obtain a lysate free of cell debris. To prepare lysate from frozen mouse organs or PDXs, samples were cut into small pieces with a single-edge blade on a clean petri dish, transferred to lysis Matrix M tubes containing a single zirconium ceramic bead and RIPA lysis buffer (300–500 μl) supplemented with 1× complete-protease inhibitor (Roche) and 1× PhosSTOP (Roche, 4906845001), and incubated on ice for 5 min. Samples were homogenized using MP Fast Prep Tissue Homogenizer (Biomedicals 116004500) and spun down at 13,000 rpm for 2 min to pellet debris; the supernatants were then transferred to Eppendorf tubes and centrifuged at 13,000 rpm for 20 min at 4 °C. Lysates were transferred to new tubes, avoiding the pellets and the floating lipid/fat layer.

Western blot analysis
Proteins in cell/tissue lysates were quantitated using the Pierce BCA Protein Assay Kit (Invitrogen, 23225).
For gel electrophoresis, 10–20 µg protein was prepared by adjusting the total volume using dH2O and 4× SDS Loading buffer. Proteins were denatured by boiling at 95 °C for 10 min and then separated by gel electrophoresis (SDS-PAGE) on a 4%–12% or 10% Bolt Bis–Tris Plus gel (Invitrogen, NW04125BOX or NW00105BOX) with 1× NuPAGE MES SDS Running Buffer (Invitrogen, NP0002) at 180 V for 40 min. Separated proteins were transferred to methanol-activated polyvinylidene difluoride (PVDF) membranes (BioRad, 1620177) at 100 V for 90 min in 1× NuPAGE Transfer Buffer (Invitrogen, NP0006) with 20% (v/v) methanol. Membranes were blocked in 2% (w/v) skim milk in 1× PBS-T (PBS, 0.1% Tween-20, pH 7.4) for 1 h at room temperature (RT), and incubated with primary antibodies overnight at 4 °C at 1:500 or 1:1,000 dilution (see S3 Table for antibodies used) in Odyssey (LI-COR, 927-60003) plus 1× PBS-T (0.1%) solution mixed at 1:4 ratio. The following day, membranes were washed three times, for 15 min each, with 1× PBS-T, blocked in 2% skim milk for 30 min at RT, and incubated with the corresponding HRP-conjugated secondary antibodies (goat anti-mouse or goat anti-rabbit) at 1:5,000 dilution in 1× PBS-T (0.1%) for 1 h at RT. After washing membranes with 1× PBS-T (0.1%) three times (5 min each), protein signals were detected with the Bio-Rad Clarity Western ECL Enhanced Western Blotting substrate (BioRad, 170-5061) or with SuperSignal West Femto Maximum Sensitivity Substrate (Thermo Fisher Scientific, 34094) using the Bio-Rad ChemiDoc XRS (or MP) System. Densitometry was performed using Image Lab 5.1 software (BioRad). Protein levels were normalized to the loading control, either β-actin or vinculin.

Cell fractionation
Nuclear and cytoplasmic extracts for western blot analyses of H2AX and γ-H2AX were prepared using the NE-PER Nuclear Cytoplasmic Extraction Reagent kit (Pierce) following the manufacturer’s instructions.
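To make the loading-control normalization used for western blot densitometry concrete, the following is a minimal Python sketch. It assumes band intensities have already been exported from the imaging software; the function names, lane labels, and intensity values are hypothetical and are not taken from the study.

```python
def normalize_to_loading_control(target, loading):
    """Divide each lane's target-band intensity by its loading-control intensity.

    target, loading: dicts mapping lane label -> band intensity (arbitrary units).
    """
    normalized = {}
    for lane, signal in target.items():
        control_signal = loading[lane]
        if control_signal <= 0:
            raise ValueError(f"loading control for lane {lane!r} must be positive")
        normalized[lane] = signal / control_signal
    return normalized


def fold_over_reference(normalized, reference_lane):
    """Express normalized values relative to one reference lane (e.g., the control)."""
    reference = normalized[reference_lane]
    return {lane: value / reference for lane, value in normalized.items()}


if __name__ == "__main__":
    # Hypothetical intensities for a target protein and beta-actin in two lanes.
    target_band = {"fed": 1000.0, "starved": 3000.0}
    actin_band = {"fed": 2000.0, "starved": 3000.0}
    normalized = normalize_to_loading_control(target_band, actin_band)
    print(fold_over_reference(normalized, "fed"))
```

Dividing by the loading control first, and only then by the reference lane, keeps differences in total protein loaded from being read as changes in the target protein.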
Briefly, harvested cells were washed twice with PBS and were centrifuged at 500 × g for 5 min. The cell pellet (packed volume of 20 µl) was suspended in 200 μl of cytoplasmic extraction reagent I, vortexed for 15 s, and incubated on ice for 10 min. Next, 11 μl of ice-cold cytoplasmic extraction reagent II was added, and the sample was vortexed for 5 s, incubated on ice for 1 min, and centrifuged for 5 min at 16,000 × g. The supernatant (cytoplasmic extract) was transferred to a pre-chilled tube. The insoluble pellet fraction (nuclei) was resuspended in 100 μl of nuclear extraction reagent, vortexed for 15 s, and incubated on ice for 10 min, before being centrifuged for 10 min at 16,000 × g. The supernatant (nuclear extract) was transferred to a pre-chilled tube.

Proteasome activity assay
Two thousand cells were seeded in a 96-well, white cell culture clear-bottom microplate (Greiner bio-one, 655983) and cultured in appropriate media (as indicated above) for up to 4 days. After treating cells with proteasome inhibitors for 24 h at indicated concentrations, proteasome activity was measured using Proteasome-Glo Cell-Based Reagents as instructed (Promega, G862112). Chymotrypsin-like, caspase-like, and trypsin-like activities were individually measured after incubating the corresponding luminogenic substrates with cells for 10 min. The degree of proteasomal activity (luminescence) was measured on a Synergy H4 Hybrid (BioTek) plate reader.

siRNA transfection
For siRNA transfections, SKBR3 and SUM149PT cells were plated at 2 × 10⁵ cells per well and MDA-MB-231 cells were plated at 1.5 × 10⁵ cells per well, in 6-well plates in a total of 2 ml standard media.
Twenty-four hours after plating, cells were transfected with 10 nM of CASP siRNA, calpain siRNA, cathepsin siRNA, or corresponding Scramble control siRNA (Integrated DNA Technologies for CASP-siRNAs and cathepsin-siRNAs and Santa-Cruz for calpain-siRNAs) using 4 μl of Lipofectamine RNAiMAX (Invitrogen, 13778075) as per manufacturer’s recommendations. siRNAs are listed in S1 Table. Forty-eight hours after transfection, cells were subjected to well-fed/starved or untreated/treated conditions, with or without 50 nM BafA1 treatment for the final 2 h, and harvested for western blot analysis as described above. SUM149PT cell lines were transfected with 5 nM of siRNA in experiments used to detect the combined effect with Olaparib. In experiments with CASP2 or CASP8 siRNA, cells were harvested at 72 h post-transfection.

CRISPR-Cas9-based knockout line creation
Single guide RNA sequences targeting the human CASP3 and CASP7 genes were designed using Crispor.org [116] and cloned into the PX458 (Cas9-GFP vector with AMP resistance, Addgene, plasmid # 48138) or PX459 (Cas9-Puromycin vector with AMP resistance, Addgene, plasmid # 62988) backbone vector following the protocol provided by Addgene. For gRNA sequences, see S3 Table. Plasmids were then transformed into One Shot Stbl3 chemically competent Escherichia coli cells. The resulting CRISPR plasmids were isolated, sequenced, and transfected into the SKBR3 or MDA-MB-231 cell line using Lipofectamine 3000 (Invitrogen, L3000-008). After 48 h, GFP-positive individual cells (cells with PX458 vector) were selected by fluorescence-activated cell sorting (FACS) into 96-well plates and cultured to generate monoclonal cell lines. For cells containing CRISPR plasmids in the PX459 vector backbone, puromycin-resistant cells were selected by treating with puromycin (Sigma, P9620) at 10 µg/ml, and individual clones were obtained by serial dilution of the surviving cells.
Isogenic knockout clones were validated by western blotting and sequencing.

CASP, BCL2 and PARP1 construct generation and transfection
For CASP3 and BCL2, a clone from Sino Biological (HG10050-CH) or Addgene (N-FLAG-BCL2, 18003) was obtained, respectively, and used directly for transfection. A PARP1 clone (N-Myc-PARP1, HG11040-NM) was obtained from Sino Biological. CASP7 plasmids containing CASP7-WT, CASP7-inactive (CASP7-C186A), CASP7-ΔPro or non-cleavable at the prodomain were kindly gifted by Dr. Salvesen [117]. Site-directed mutagenesis was employed to modify nucleotides in CASP7 or PARP1 as needed. CASP7-p30, CASP7-p29, and CASP7 calpain cleavage sites mutated (CASP7-CCM) constructs were generated by PCR using CASP7-WT as the template. The primers and restriction enzymes used are listed below. Since calpains are known to recognize the three-dimensional structure rather than the sequence of the targeted site of the substrate protein [67,118], and a single amino acid mutation does not prevent the recognition by calpains [40,119], we sequentially altered amino acid residues to generate CASP7-CCM. Multiple amino acid residues at the 1st and 2nd calpain cleavage sites were changed into ‘A’ until a significant reduction in non-canonical cleavage was obtained (Fig 4C). For the catalytically inactive PARP1 construct (PARP1-CI), three amino acid residues (H862, Y896, and E988) in the catalytic (CAT) domain of PARP1 were sequentially mutated [79]. DNA sequencing was performed to verify the sequence integrity of all the constructs. To obtain stable cell lines, single or double CASP KO cells (CASP7 KO, CASP3 + 7 DKO) were plated at 2 × 10⁵ cells/well and transfected with empty vectors (pcDNA3 for CASP7 constructs and pCMV3-C-His for CASP3 constructs) or plasmids containing the CASP constructs, using Lipofectamine 3000 (Invitrogen, L3000-008) as per manufacturer’s instructions.
CASP3 and CASP7 construct expressing cells were selected with 500 μg/ml Hygromycin and/or 1 mg/ml G418, respectively. Antibodies were used to evaluate the expression of CASP constructs in cells that survived drug treatment. The selection agent(s) was removed for two passages prior to experiments.

The following primers and restriction sites/enzymes were used in making CASP constructs
For cloning CASP7-p30 fragment (via Acc65I and XhoI sites)
Forward – 5′-TGACCTAGGGTACCATGAGTAAGAAGAAGAAAAATGTC-3′
Reverse – 5′-TGACCTAGCTCGAGCTACTTGTCATCGTCGTCCTTGTA-3′
For cloning CASP7-p29 fragment (via Acc65I and XhoI sites)
Forward – 5′-TGACCTAGGGTACCATGCGATCCATCAAGACCACCCGG-3′
Reverse – 5′-TGACCTAGCTCGAGCTACTTGTCATCGTCGTCCTTGTA-3′
For site-directed mutagenesis of 1st calpain cleavage site for CASP7-CCM construct (FVPSLFSKKKK to FVPSAAAAKKK)
5′-/5Phos/CGGTCCTCGTTTGTACCGTCCGCAGCTGCCGCAAAGAAGAAAAAT-3′
For site-directed mutagenesis of 2nd calpain cleavage site for CASP7-CCM construct (KNVTMRSIKT to KNVTAAAIKT)
5′-/5Phos/AAGAAGAAAAATGTCACCGCTGCCGCAATCAAGACCACCCGGGAC-3′

Primers used in PARP constructs generation
For PARP1-DEVD to DEVA
Forward – 5′-GGAAAGAGAAAAGGCGATGAGGTGGCGGGAGTGGATGAAGTG-3′
Reverse – 5′-CACTTCATCCACTCCCGCCACCTCATCGCCTTTTCTCTTTCC-3′
For PARP1-DEVD to AEVA
Forward – 5′-GGAAAGAGAAAAGGCGCAGAGGTGGCGGGAGTGGATGAAGTG-3′
Reverse – 5′-CACTTCATCCACTCCCGCCACCTCTGCGCCTTTTCTCTTTCC-3′
For PARP1-CI (catalytically inactive)
H862A_Forward – 5′-CGAAGATTGCTGTGGGCAGGGTCCAGGACCACC-3′
H862A_Reverse – 5′-GGTGGTCCTGGACCCTGCCCACAGCAATCTTCG-3′
Y896A_Forward – 5′-TTTGGTAAAGGGATCGCATTCGCTGACATGGTC-3′
Y896A_Reverse – 5′-GACCATGTCAGCGAATGCGATCCCTTTACCAAA-3′
E988A_Forward – 5′-TCTCTACTATATAACGCATACATTGTCTATGAT-3′
E988A_Reverse – 5′-ATCATAGACAATGTATGCGTTATATAGTAGAGA-3′

Crystal violet assay
Cells (parental, CASP3 + 7 DKO, or CASP construct expressing) were plated at 1 × 10⁴ cells per well in 24-well plates.
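The forward and reverse site-directed mutagenesis primers listed for the PARP1 constructs appear to be exact reverse complements of one another, which is a common design for this kind of mutagenesis. As a quick sanity check when transcribing such primer pairs, a minimal Python sketch could look like the following. The helper names are hypothetical, and the two pairs shown are copied from the primer list with spaces removed.

```python
# Watson-Crick complement for unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    cleaned = seq.replace(" ", "").upper()
    if set(cleaned) - set("ACGT"):
        raise ValueError("sequence contains characters other than A, C, G, T")
    return cleaned.translate(_COMPLEMENT)[::-1]


def is_complementary_pair(forward, reverse):
    """True if the reverse primer is the exact reverse complement of the forward primer."""
    return reverse_complement(forward) == reverse.replace(" ", "").upper()


# Primer pairs copied from the list above (5' to 3').
PRIMER_PAIRS = {
    "PARP1 DEVD to DEVA": (
        "GGAAAGAGAAAAGGCGATGAGGTGGCGGGAGTGGATGAAGTG",
        "CACTTCATCCACTCCCGCCACCTCATCGCCTTTTCTCTTTCC",
    ),
    "PARP1 H862A": (
        "CGAAGATTGCTGTGGGCAGGGTCCAGGACCACC",
        "GGTGGTCCTGGACCCTGCCCACAGCAATCTTCG",
    ),
}

if __name__ == "__main__":
    for name, (fwd, rev) in PRIMER_PAIRS.items():
        print(name, is_complementary_pair(fwd, rev))
```

A pair that fails this check is not necessarily wrong (cloning primers for the CASP7 fragments, for example, are not designed as complementary pairs), but for mutagenesis pairs a mismatch usually signals a transcription error.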
For CASP3 + 7 knockdown, parental cells were transfected with 10 nM CASP siRNA or Scramble-siRNA at 24 h after seeding. For Olaparib treatment of SUM149PT, cells were transfected with 5 nM of CASP or Scramble-siRNAs. At 48 h after transfection (72 h after seeding), cells were treated with the drug (proteasome inhibitor or Olaparib) and allowed to grow in the media containing the drug for 24 h. Then, the cells were washed twice in PBS and grown in drug-free fresh media for another 3 days. For the post-starvation viability assays, at 24 h post-transfection of siRNA, cells were washed in PBS and the media was replaced with fresh media. Cells were then starved in EBSS for 0, 8, 24, 48, and 72 h. After removing the media and washing with PBS, cells were stained with 0.1% crystal violet (Sigma, C6158) for 15 min, washed with distilled water, and air dried before images were taken using the ChemiDoc MP System (BioRad).

Immunoprecipitation of CASP7-p30/29 fragments for Edman sequencing
CASP7 or mouse IgG (control) antibody-bound beads were prepared by incubating anti-CASP7 antibody (LSBio C179785) or mouse-IgG (Santa Cruz, Sc2025) with Dynabeads protein G (Thermo Fisher Scientific, 10003D) in IP buffer (Thermo Fisher Scientific, 8159600147) (25 mM Tris–HCl pH 7.5, 150 mM NaCl, 1× PhosSTOP, 1% NP-40, 5% glycerol, and 1 mM EDTA with 1× complete mini protease inhibitor (Roche, 4693124001)) at 4 °C overnight. CASP7 KO SKBR3 cells or CASP7 KO SKBR3 cells expressing the CASP7-WT construct were grown in T75 flasks for 3 days, fed with fresh growth media or starved for 4 h in EBSS, trypsinized (Gibco, 25300-062), collected, and lysed by incubation in cell lysis buffer. Cell lysates were first pre-cleared at 4 °C for 1 h, then added to the prepared protein-G beads and nutated at 4 °C for 4 h. The beads with captured proteins were washed 3× with IP wash buffer and elution buffer was added.
Captured proteins were released by boiling the beads in 2× SDS sample buffer (117 mM Tris–Cl pH 6.8, 4% SDS, 8% glycerol, 0.01% bromophenol blue, 200 mM DTT) at 98 °C for 10 min. The samples were subsequently analyzed by western blotting. After Ponceau S staining, the band corresponding to CASP7-p30/29, containing ~1–2 µg of protein on PVDF, was cut out and sent for Edman degradation sequencing (for the first 10 amino acids at the N-terminus) at the SPARC BioCentre, Toronto, Canada.

Fluorescence and phase-contrast microscopy
For staining with DALGreen (Dojindo, NC1879567), seeded cells were preincubated with the dye in the growth media for 30 min at 37 °C, then washed twice with PBS and subjected to autophagy induction by amino acid starvation. Live cells were imaged for the fluorescence marker using constant settings (i.e., same laser power and gains). Microscopy images were obtained from a Zeiss Axio Observer (Z1/7) inverted fluorescence microscope equipped with an Apotome.2 and an AxioCam MRm R3 camera (Zeiss). Images were obtained at 20× magnification using the Zen software (version 2.5, blue edition; Zeiss). The number of DALGreen-positive puncta in 10 randomly taken images from each cell line was determined using ImageJ64 software and normalized to the number of cells in each image. A total of 500–700 cells were analyzed, and the experiment was replicated twice. To analyze cell size, phase-contrast images were taken at 20× magnification from 10 randomly selected locations for each cell line (SKBR3 parental and CASP3 + 7 DKO). Each cell outline was manually marked in phase-contrast images, and the area was determined using the ‘measure’ tool in ImageJ64. The area measurements were performed in two separate experiments.

RNA isolation and RT-qPCR analysis
Total RNA was isolated and purified using the RNeasy Plus Mini Kit (QIAGEN, 74104) as per manufacturer’s instructions.
To remove any contaminating genomic DNA, RNA was treated with DNase I (Invitrogen, 18068-015) as per manufacturer’s instructions. The quantification of RNA was performed using a Nanodrop Spectrophotometer. RT-qPCR was performed to quantitate selected ATG transcript expression levels using the One-Step Plus SYBR Green reagent kit (Applied Biosystems, 4385617) in 15 µl reactions containing 100 ng of total RNA and 0.1 µM of each primer on the 7900 sequence detection system (Applied Biosystems). Manufacturer’s recommended cycling conditions were used. The ATG proteins and the sequences of forward and reverse primers designed using Primer–BLAST [120] are listed in S2 Table. The mRNA expression levels were determined by the Comparative Cycle Threshold method (2^(−ΔΔCt)), normalizing to an internal reference gene. The expression is presented as the fold change relative to the control. RT-qPCR analyses were done in two or three technical replicates and experiments were repeated at least thrice.

DNA fragmentation/Comet assay
The level of DNA damage (single and double strand breaks) was evaluated by the comet assay (single cell-gel electrophoresis) as previously described [121]. Briefly, cells were washed in PBS (2–3×), trypsinized, and harvested in cold PBS, and the supernatant was removed after spinning for 5 min at 1,200 rpm at 4 °C. Each cell pellet was resuspended in cold PBS and gently mixed with 1% low melting agarose (Invitrogen, 16500-100) solution at a ratio of 1:10 (v/v). A drop from each cell-agarose mixture was immediately placed on a 1% agarose pre-coated glass microscope slide, covered with a coverslip, and placed flat for 30 min at 4 °C. Next, the coverslips were removed gently, and the slide was immediately placed in 4 °C lysis solution (2.5 M NaCl, 100 mM disodium EDTA, 10 mM Tris base, and 200 mM NaOH) overnight at 4 °C.
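The Comparative Cycle Threshold calculation used for the RT-qPCR analysis above can be written out in a few lines. The following Python sketch assumes roughly 100% amplification efficiency for both target and reference transcripts, which is the standard assumption behind the 2^(−ΔΔCt) formula; the function name and the Ct values in the example are hypothetical.

```python
def fold_change_ddct(ct_target_sample, ct_reference_sample,
                     ct_target_control, ct_reference_control):
    """Relative expression by the comparative Ct method, 2 ** (-ddCt).

    dCt  = Ct(target) - Ct(reference gene), computed per condition
    ddCt = dCt(sample) - dCt(control)
    """
    dct_sample = ct_target_sample - ct_reference_sample
    dct_control = ct_target_control - ct_reference_control
    ddct = dct_sample - dct_control
    return 2.0 ** (-ddct)


if __name__ == "__main__":
    # Hypothetical mean Ct values: the target crosses threshold two cycles
    # earlier in the treated sample, with an unchanged reference gene.
    print(fold_change_ddct(24.0, 18.0, 26.0, 18.0))
```

In practice, Ct values from technical replicates would be averaged before being passed to such a function, and the fold change would be computed separately for each independent experiment.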
Next, the slide was incubated in alkaline unwinding buffer (pH > 13) (200 mM NaOH, 1 mM EDTA) for 20 min at RT in the dark before being subjected to gel electrophoresis in alkaline buffer (200 mM NaOH, 1 mM EDTA) at ~20 V (300 mA) for 30 min at 4 °C. Slides were rinsed by immersing in dH2O for 5 min (2×), DNA was fixed by immersing in 70% ethanol for 5 min, and slides were dried at 37 °C for 10 min before being stained with GelRed (Biotium, 41003) (1/3,000 dilution) for 30 min at RT in darkness. The slide was rinsed in water, dried, and mounted with slow-fade mounting media. Images were taken using a fluorescence microscope with a 10× objective and DNA damage was quantified as the comet tail moment (product of the tail length × the fraction of total DNA in the tail) using CometScore 2.0 software [122]. Approximately 200 randomly selected nucleoids were scored per treatment. Experiments were repeated twice.

In silico co-essentiality and genetic interaction network mapping
In silico genetic interaction screening was performed using GRETTA (v0.99.2) [95] on the R statistical software (v4.2.2) [123]. We computed on the cancer DepMap public release version 22Q2 [124] data from FigShare (https://figshare.com/articles/dataset/DepMap_22Q2_Public/19700056/2; accessed on 11 August 2022) according to Takemon and Marra (2023). GRETTA was used to identify cell lines with combined low or high CASP3 and CASP7 protein expression. We identified a group of pan-cancer cell lines below the 10th percentile of CASP3 and CASP7 protein expression (16 cell lines; low expressors) and those above the 90th percentile of both CASP3 and CASP7 protein expression (10 cell lines; high expressors) as the control comparator. The normalized (scaled and mean centered) gene and protein expression values for the selected cell lines were extracted from Nusinow and colleagues (2020) [98] using GRETTA for comparison between the low and high expressor groups.
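The selection of low and high expressors described above amounts to a joint percentile filter across two proteins. The following Python sketch illustrates the idea under stated assumptions: it is not GRETTA's implementation, the percentile interpolation method is an arbitrary choice, and the function name, cell line labels, and expression values are hypothetical.

```python
from statistics import quantiles


def split_expressors(casp3, casp7, low_pct=10, high_pct=90):
    """Return (low, high) lists of cell lines.

    casp3, casp7: dicts mapping cell line -> normalized protein expression.
    A line is 'low' if it is below the low_pct percentile for BOTH proteins,
    and 'high' if it is above the high_pct percentile for BOTH proteins.
    """
    shared = sorted(set(casp3) & set(casp7))

    def cutoffs(values):
        # 99 cut points; index k - 1 holds the k-th percentile.
        cuts = quantiles(values, n=100, method="inclusive")
        return cuts[low_pct - 1], cuts[high_pct - 1]

    low3, high3 = cutoffs([casp3[c] for c in shared])
    low7, high7 = cutoffs([casp7[c] for c in shared])
    low = [c for c in shared if casp3[c] < low3 and casp7[c] < low7]
    high = [c for c in shared if casp3[c] > high3 and casp7[c] > high7]
    return low, high


if __name__ == "__main__":
    # Hypothetical expression values for 21 cell lines.
    names = [f"line{i:02d}" for i in range(21)]
    casp3 = {name: float(i) for i, name in enumerate(names)}
    casp7 = dict(casp3)
    # Make one line discordant: low CASP3 but mid-range CASP7, and vice versa.
    casp7["line01"], casp7["line10"] = casp7["line10"], casp7["line01"]
    print(split_expressors(casp3, casp7))
```

Requiring both proteins to pass the cutoff is what makes the groups "combined" low or high expressors; a cell line that is low for only one of the two is excluded from both groups.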
Genetic interactors of combined CASP3 and CASP7 low expression were predicted using GRETTA, which performed pairwise Mann–Whitney U-tests between the high expressor and low expressor cancer cell line groups for all 17,386 genes to obtain p-values. P-values were adjusted for multiple testing using a permutation approach (10,000 randomized resamplings). Candidate genetic interactors of combined low CASP3 and CASP7 expression were called using a threshold of adjusted Mann–Whitney U-test p-value < 0.05 and a median lethality probability > 0.5 in at least one group as previously defined [95]. To highlight candidate lethal interactors, genetic interaction scores computed by GRETTA were used to generate lethal genetic interaction scores by collapsing genetic interaction scores below zero to zero. A higher interaction score indicates an increased likelihood that a gene knockout is lethal in the test group. Rank is based on lethal GI scores.

Enrichment analysis and gene function annotation
We used clusterProfiler (v4.6.0) [125] to annotate GO biological processes of genes that were candidate co-essential or genetic interactors (unadjusted p-value < 0.05). Because many related GO terms are associated with similar sets of genes, we calculated the degree of gene overlap between GO terms using Jaccard indices and performed hierarchical clustering to summarize their roles, as performed by Takemon and colleagues [95]. The number of distinct clusters was determined using gap statistics, which calculated the optimal number of clusters (up to 20 clusters) by iteratively bootstrapping 10,000 times.

Statistical analysis
In each graph, error bars represent standard error of at least n = 3 independent experiments.
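Two of the simpler computations mentioned in the in silico and enrichment analyses above, collapsing negative genetic interaction scores to zero and measuring gene overlap between GO terms with the Jaccard index, can be sketched in Python as follows. This is an illustration only and not the GRETTA or clusterProfiler code; the function names and gene labels are hypothetical placeholders.

```python
def lethal_gi_score(gi_score):
    """Collapse genetic interaction scores below zero to zero."""
    return max(gi_score, 0.0)


def jaccard_index(genes_a, genes_b):
    """Size of the intersection divided by size of the union of two gene sets."""
    a, b = set(genes_a), set(genes_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def jaccard_distance_matrix(term_to_genes):
    """Pairwise 1 - Jaccard distances between GO terms, as nested dicts."""
    terms = sorted(term_to_genes)
    return {
        t1: {t2: 1.0 - jaccard_index(term_to_genes[t1], term_to_genes[t2])
             for t2 in terms}
        for t1 in terms
    }


if __name__ == "__main__":
    # Placeholder GO terms and gene labels, for illustration only.
    go_terms = {
        "term_1": {"GENE_A", "GENE_B", "GENE_C"},
        "term_2": {"GENE_B", "GENE_C", "GENE_D"},
        "term_3": {"GENE_E"},
    }
    print(jaccard_distance_matrix(go_terms))
    print([lethal_gi_score(s) for s in (-1.2, 0.0, 0.8)])
```

A distance matrix of this form (one minus the Jaccard index) is the usual input to hierarchical clustering, so GO terms that share most of their genes end up in the same cluster.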
As indicated in the legends, statistical significance was calculated by Student’s t-tests (for comparisons between two samples) or analysis of variance (ANOVA) with appropriate post-tests for multiple comparisons (Tukey’s, Dunnett’s, or Sidak’s) using GraphPad Prism version 8.4.1 for Windows (GraphPad Software, San Diego, California, USA, www.graphpad.com). Significance (p-values) indicated are relative to the control unless otherwise indicated. P-values less than 0.05 were determined as significant. Significance is indicated as follows unless otherwise noted: *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001, NS, not significant.
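The error bars and significance labels described in the statistical analysis can be illustrated with a short Python sketch. It assumes that "standard error" refers to the standard error of the mean computed from the sample standard deviation, and it only maps an already computed p-value to the asterisk notation; the tests themselves were run in GraphPad Prism. Function names and example values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def mean_and_sem(values):
    """Return (mean, standard error of the mean) for independent replicates."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two replicates are needed to estimate the SEM")
    return mean(values), stdev(values) / sqrt(n)


def significance_label(p_value):
    """Map a p-value to the asterisk notation used in the figure legends."""
    if p_value < 0.0001:
        return "****"
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return "NS"


if __name__ == "__main__":
    # Hypothetical measurements from three independent experiments.
    print(mean_and_sem([1.0, 2.0, 3.0]))
    print([significance_label(p) for p in (0.2, 0.03, 0.00005)])
```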
The selection media was replaced with standard media two passages before cells were used in experiments. All cells were maintained at 37 °C with 5% CO2 and 95% humidity. Testing for mycoplasma contamination of cell lines was conducted on a regular basis using e-Myco Mycoplasma Detection Kit (Boca Scientific, 25,235). Mice and PDX To obtain cell lysate from murine organs, tissues from three female CD-1 mice, 10-weeks old, were flash-frozen and gifted by Dr. Nancy dos Santos and Nicole Wretham. All experiments were conducted in accordance with the standards and guidelines of the Canadian Council on Animal Care and were approved by the University of British Columbia Animal Care Committee (A22-0274). To obtain cell lysate from PDX, breast tumor tissues originally derived from human were grown as xenografts in mice as detailed in Eirew and colleagues (2022) [115] harvested, frozen in DMEM/FCS with 6%–10% DMSO and provided by Dr.Sam Aparicio and Dr. Peter Eirew. Ethical approval for research using these human breast tumor tissues was obtained from the Research Ethics Board of BC Cancer and Simon Fraser University. Starvation, pharmacological, or Bafilomycin (BafA1) treatment After growing for 3–4 days, cells were subjected to starvation or treated with proteasome inhibitors (or other drug treatments used in this study; rapamycin, staurosporine, olaparib). Cells were washed with 1× phosphate-buffered saline (PBS) (Glibco,10010023) and subjected to amino acid starvation for 2–24 h for autophagy flux assays or for 8–72 h for viability assays (as indicated) in Earle’s Balanced Salt Solution (EBSS) (Sigma, E3024). For well-fed cells, the media was replaced with fresh standard growth media. For the autophagic flux assay with Bafilomycin A1 (BafA1) (Sigma, B1793), cells were incubated in 50 nM of BafA1 to inhibit lysosomal fusion for 2 h prior to harvesting cells for subsequent analyses. 
Staurosporine (Sigma, S6942), Rapamycin (Invitogen, PHZ1235), PARP inhibitor, Olaparib (Selleckchem, S1060), and proteasome inhibitors MG132 (Apexbio, A2585) and Bortezomib (Apexbio, A2614) stock solutions were prepared in dimethylsulfoxide (DMSO) (Fisher BioReagents, BP231-100) and stored in aliquots at − 20 °C. Drugs were freshly diluted in media to minimize DMSO concentration and cells were treated at indicated concentrations and durations. Control cells (non-treated or DMSO) were treated with a comparable concentration of DMSO. Cell lysate extraction for western blotting Cells were plated in 6-well plates at a density of 2 × 105 cells per well for all cell lines except for MDA-MB-231 which was plated at 1.5 × 105 cells per well. For cell harvesting after indicated treatments (drugs or fed/starve conditions), cells were washed twice in PBS and then trypsinized for up to 5 min while in a CO2 incubator. Cell pellets were collected by adding PBS and then spinning for 30 s, followed by rinsing with PBS and stored at −80 °C or processed to the next step. All cell pellets were resuspended and lysed using 3× packed cell volume of ice-cold RIPA lysis buffer (Santa-Cruz, SC24948) supplemented with sodium orthovanadate (Santa-Cruz, SC24948), protease inhibitor cocktail (Santa-Cruz, SC24948), and phenylmethylsulfonyl fluoride (Santa-Cruz, SC24948) following manufacturer’s instructions. Cells were incubated in RIPA for 1 h at 4 00B0C with agitation before being centrifuged at 13,000 rpm for 10 min at 4 °C to obtain a lysate free of cell debris. To prepare lysate from frozen mouse organs or PDXs, samples were cut into small pieces with a single-edge blade on a clean petri dish, transferred to lysis Matrix M tubes containing a single zirconium ceramic bead and RIPA lysis buffer (300–500 μl) supplemented with 1× complete-protease inhibitor (Roche) and 1× PhosSTOP (Roche, 4906845001) and incubated on ice for 5 min. 
Samples were homogenized using MP Fast Prep Tissue Homogenizer (Biomedicals 116004500), spun down at 13,000 rpm for 2 min for debris to form pellets and then the supernatants were transferred to eppendorf tubes before being centrifuged at 13,000 rpm for 20 min at 4 °C. Lysates were transferred to new tubes avoiding pellets and floating lipid/fat layer. Western blot analysis Proteins in cell/tissue lysates were quantitated using the Pierce BCA Protein Assay Kit (Invitrogen, 23225). For gel electrophoresis, 10–20 µg protein was prepared by adjusting the total volume using dH2O and 4× SDS Loading buffer. Proteins were denatured by boiling at 95 °C for 10 min and then separated by gel electrophoresis (SDS PAGE) on a 4%–12% or 10% Bolt Bis–Tris Plus gel (Invitrogen NW04125BOX or NW00105BOX) with 1× NuPAGE MES SDS Running Buffer (Invitrogen, NP0002 at 180V for 40 min). Separated proteins were transferred to methanol-activated polyvinylidene difluoride (PVDF) membranes (BioRad, 1620177) at 100 V for 90 min in 1× Nupage Transfer Buffer (Invitrogen NP0006) with 20% (v/v) methanol. Membranes were blocked in 2% (w/v) skim milk in 1× PBS-T (PBS, 0.1% Tween-20, pH7.4) for 1 h at room temperature (RT), and incubated with primary antibodies overnight at 4 °C at 1:500 or 1:1000 dilution (see S3 Table for antibodies used) in Odyssey (Li-COR , 927-60003) plus 1× PBS-T (0.1%) solution mixed at 1:4 ratio. The following day, membranes were washed three times, for 15 min each, with 1× PBS-T, blocked in 2% skim milk for 30 min at RT and incubated with HRP-conjugated corresponding (goat anti-mouse or goat anti-rabbit) secondary antibodies at 1:5,000 dilution in 1× PBS-T (0.1%) for 1 h at RT. 
After washing membranes with 1× PBS-T (0.1%) three times (5 min each), protein signals were detected with the Bio-Rad Clarity Western ECL Enhanced Western Blotting substrate (BioRad,170-5061) or with SuperSignal West Femto Maximum Sensitivity Substrate (Thermo Fisher Scientific, 34094) using Bio-Rad ChemiDoc XRS (or MP) System. Densitometry was performed using Image Lab 5.1 software (BioRad). Protein levels were normalized to the loading control, either β-actin or vinculin. Cell fractionation Nuclear and cytoplasmic extracts for western blot analyses for H2AX and ɤ-H2AX were prepared using NE-PER Nuclear Cytoplasmic Extraction Reagent kit (Pierce) following the manufacturer’s instructions. Briefly, harvested cells were washed twice with PBS and were centrifuged at 500 g for 5 min. The cell pellet (packed volume of 20 µl) was suspended in 200 μl of cytoplasmic extraction reagent I, vortexed for 15 s, and incubated on ice for 10 min followed by the addition of 11 μl of ice-cold cytoplasmic extraction reagent II, vortexed for 5 s, incubated on ice for 1 min, and centrifuged for 5 min at 16,000 g. The supernatant (cytoplasmic extract) was transferred to a pre-chilled tube. The insoluble pellet fraction (nuclei), was resuspended in 100 μl of nuclear extraction reagent, vortexed for 15 s and incubated on ice for 10 min, before being centrifuged for 10 min at 16,000 g. The supernatant (nuclear extract) was transferred to a pre-chilled tube. Proteasome activity assay Two thousand cells were seeded in a 96-well, white cell culture clear-bottom microplate (Greiner bio-one, 655983) and cultured in appropriate media (as indicated above) for up to 4 days. After treating cells with proteasome inhibitors for 24 h at indicated concentrations, proteasome activity was measured using Proteasome-Glo Cell-Based Reagents as instructed (Promega, G862112). 
Chymotrypsin-like, caspase-like, and trypsin-like activities were individually measured after incubating the corresponding luminogenic substrates with cells for 10 min. The degree of proteasomal activity (luminescence) was measured on a Synergy H4 Hybrid (BioTek) plate reader. siRNA transfection For siRNA transfections, SKBR3 and SUM149PT cells were plated at 2 × 10⁵ cells per well and MDA-MB-231 cells were plated at 1.5 × 10⁵ cells per well, in 6-well plates in a total of 2 ml standard media. Twenty-four hours after plating, cells were transfected with 10 nM of CASP siRNA, calpain siRNA, cathepsin siRNA, or corresponding Scramble control siRNA (Integrated DNA Technologies for CASP-siRNAs and cathepsin-siRNAs and Santa-Cruz for calpain-siRNAs) using 4 μl of Lipofectamine RNAiMAX (Invitrogen, 13778075) as per manufacturer’s recommendations. siRNAs are listed in S1 Table. Forty-eight hours after transfection, cells were subjected to well-fed/starved or untreated/treated conditions with or without 50 nM BafA1 treatment for the final 2 h and harvested for western blot analysis as described above. SUM149PT cells were transfected with 5 nM of siRNA in experiments used to detect the combined effect with Olaparib. In experiments with CASP2 or CASP8 siRNA, cells were harvested at 72 h post-transfection. CRISPR-Cas9-based knockout line creation Single guide RNA sequences targeting the human CASP3 and CASP7 genes were designed using Crispor.org [116] and cloned into the PX458 (Cas9-GFP vector with AMP resistance, Addgene, plasmid # 48138) or PX459 (Cas9-Puromycin vector with AMP resistance, Addgene, plasmid # 62988) backbone vector following the protocol provided by Addgene. For gRNA sequences, see S3 Table. Plasmids were then transformed into One Shot Stbl3 chemically competent Escherichia coli cells. The resulting CRISPR plasmids were isolated, sequenced and transfected into the SKBR3 or MDA-MB-231 cell line using Lipofectamine 3000 (Invitrogen, L3000-008). 
After 48 h, GFP-positive individual cells (cells with PX458 vector) were selected by fluorescence-activated cell sorting (FACS) into 96-well plates and cultured to generate monoclonal cell lines. For cells containing CRISPR plasmids in the PX459 vector backbone, puromycin-resistant cells were selected by treating with puromycin (Sigma, P9620) at 10 µg/ml concentration and individual clones were obtained by performing serial dilution on the surviving cells. Isogenic knockout clones were validated by western blotting and sequencing. CASP, BCL2 and PARP1 construct generation and transfection For CASP3 and BCL2, a clone from Sino Biological (HG10050-CH) or Addgene (N-FLAG-BCL2, 18003) was obtained, respectively, and used directly for transfection. A PARP1 clone (N-Myc-PARP1, HG11040-NM) was obtained from SinoBiological. CASP7 plasmids containing CASP7-WT, CASP7-inactive (CASP7-C186A), CASP7-ΔPro or non-cleavable at the prodomain were kindly gifted by Dr. Salvesen [117]. Site-directed mutagenesis was used to modify nucleotides in CASP7 or PARP1 as needed. The CASP7-p30, CASP7-p29, and calpain cleavage site-mutated CASP7 (CASP7-CCM) constructs were generated by PCR using CASP7-WT as the template. The primers and restriction enzymes used are listed below. Since calpains are known to recognize the three-dimensional structure rather than the sequence of the targeted site of the substrate protein [67,118], and a single amino acid mutation does not prevent the recognition by calpains [40,119], we sequentially altered amino acid residues to generate CASP7-CCM. Multiple amino acid residues at the 1st and 2nd calpain cleavage sites were changed into ‘A’ until a significant reduction in non-canonical cleavage was obtained (Fig 4C). For the catalytically inactive PARP1 construct (PARP1-CI), three amino acid residues (H862, Y896, and E988) in the catalytic (CAT) domain of PARP1 were sequentially mutated [79]. 
DNA sequencing was performed to verify the sequence integrity of all the constructs. To obtain stable cell lines, single or double CASP KO cells (CASP7 KO, CASP3 + 7 DKO) were plated at 2 × 10⁵ cells/well and transfected with empty vectors (pcDNA3 for CASP7 constructs and pCMV3-C-His for CASP3 constructs) or plasmids containing the CASP constructs, using Lipofectamine 3000 (Invitrogen, L3000-008) as per manufacturer’s instructions. CASP3 and CASP7 construct expressing cells were selected with 500 μg/ml Hygromycin and/or 1 mg/ml G418, respectively. Antibodies were used to evaluate the expression of CASP constructs in cells that survived drug treatment. The selection agent(s) was removed for two passages prior to experiments.

The following primers and restriction sites/enzymes were used in making CASP constructs:

For cloning CASP7-p30 fragment (via Acc65I and XhoI sites)
Forward – 5′-TGACCTAGGGTACCATGAGTAAGAAGAAGAAAAATGTC-3′
Reverse – 5′-TGACCTAGCTCGAGCTACTTGTCATCGTCGTCCTTGTA-3′

For cloning CASP7-p29 fragment (via Acc65I and XhoI sites)
Forward – 5′-TGACCTAGGGTACCATGCGATCCATCAAGACCACCCGG-3′
Reverse – 5′-TGACCTAGCTCGAGCTACTTGTCATCGTCGTCCTTGTA-3′

For site-directed mutagenesis of 1st calpain cleavage site for CASP7-CCM construct (FVPSLFSKKKK to FVPSAAAAKKK)
5′-/5Phos/CGGTCCTCGTTTGTACCGTCCGCAGCTGCCGCAAAGAAGAAAAAT-3′

For site-directed mutagenesis of 2nd calpain cleavage site for CASP7-CCM construct (KNVTMRSIKT to KNVTAAAIKT)
5′-/5Phos/AAGAAGAAAAATGTCACCGCTGCCGCAATCAAGACCACCCGGGAC-3′

Primers used in PARP constructs generation:

For PARP1-DEVD to DEVA
Forward – 5′-GGAAAGAGAAAAGGCGATGAGGTGGCGGGAGTGGATGAAGTG-3′
Reverse – 5′-CACTTCATCCACTCCCGCCACCTCATCGCCTTTTCTCTTTCC-3′

For PARP1-DEVD to AEVA
Forward – 5′-GGAAAGAGAAAAGGCGCAGAGGTGGCGGGAGTGGATGAAGTG-3′
Reverse – 5′-CACTTCATCCACTCCCGCCACCTCTGCGCCTTTTCTCTTTCC-3′

For PARP1-CI (catalytically inactive)
H862A_Forward – 5′-CGAAGATTGCTGTGGGCAGGGTCCAGGACCACC-3′
H862A_Reverse – 5′-GGTGGTCCTGGACCCTGCCCACAGCAATCTTCG-3′
Y896A_Forward – 5′-TTTGGTAAAGGGATCGCATTCGCTGACATGGTC-3′
Y896A_Reverse – 5′-GACCATGTCAGCGAATGCGATCCCTTTACCAAA-3′
E988A_Forward – 5′-TCTCTACTATATAACGCATACATTGTCTATGAT-3′
E988A_Reverse – 5′-ATCATAGACAATGTATGCGTTATATAGTAGAGA-3′

Crystal violet assay Cells (parental, CASP3 + 7 DKO, or CASP construct expressing) were plated at 1 × 10⁴ cells per well in 24-well plates. For CASP3 + 7 knockdown, parental cells were transfected with 10 nM CASP siRNA or Scramble-siRNA at 24 h after seeding. For Olaparib treatment of SUM149PT, cells were transfected with 5 nM of CASP or Scramble-siRNAs. At 48 h after transfection (72 h after seeding), cells were treated with the drug (PI or Olaparib) and allowed to grow in the media containing the drug for 24 h. Then, the cells were washed twice in PBS and continued to grow in drug-free fresh media for another 3 days. For the post-starvation viability assays, at 24 h post-transfection of siRNA, cells were washed in PBS and the media was replaced with fresh media. Cells were then starved in EBSS for 0, 8, 24, 48, and 72 h. After removing the media and washing with PBS, cells were stained with 0.1% crystal violet (Sigma, C6158) for 15 min, washed with distilled water and air dried before taking images using the ChemiDoc MP System (BioRad). Immunoprecipitation of CASP7-p30/29 fragments for Edman sequencing CASP7 or mouse IgG (control) antibody-bound beads were prepared by incubating anti-CASP7 antibody (LSBio C179785) or mouse-IgG (Santa Cruz, Sc2025) with Dynabeads protein G (Thermo Fisher Scientific, 10003D) in IP buffer (Thermo Fisher Scientific, 8159600147) (25 mM Tris–HCl pH 7.5, 150 mM NaCl, 1× PhosSTOP, 1% NP-40, 5% glycerol and 1 mM EDTA with 1× complete mini protease inhibitor (Roche, 4693124001)) at 4 °C overnight. 
CASP7 KO SKBR3 cells or CASP7 KO SKBR3 cells expressing the CASP7-WT construct were grown in T75 flasks for 3 days, fed with fresh growth media or starved for 4 h in EBSS, trypsinized (Gibco, 25300-062), collected and lysed by incubation in cell lysis buffer. Cell lysates were first pre-cleared at 4 °C for 1 h, then added to the prepared protein-G beads and rotated at 4 °C for 4 h. The beads with captured proteins were washed 3× with IP wash buffer and elution buffer was added. Captured proteins were released by boiling the beads in 2× SDS sample buffer (117 mM Tris–Cl pH 6.8, 4% SDS, 8% glycerol, 0.01% bromophenol blue, 200 mM DTT) at 98 °C for 10 min. The samples were subsequently analyzed by western blotting. After Ponceau S staining, the band corresponding to CASP7-p30/29, containing ~1–2 µg of protein on PVDF, was cut out and sent for Edman degradation sequencing (for the first 10 amino acids at the N-terminus) at the SPARC BioCentre, Toronto, Canada. Fluorescence and phase-contrast microscopy For staining with DALGreen (Dojindo, NC1879567), seeded cells were preincubated with the dye in the growth media for 30 min at 37 °C, then washed twice with PBS and subjected to autophagy induction by amino acid starvation. Live cells were imaged for the fluorescence marker using constant settings (i.e., same laser power and gains). Microscopy images were obtained from a Zeiss Axio Observer (Z1/7) inverted fluorescence microscope equipped with an Apotome.2 and an AxioCam MRm R3 camera (Zeiss). Images were obtained at 20× magnification using the Zen software (version 2.5, blue edition; Zeiss). The number of DALGreen-positive puncta in 10 randomly taken images from each cell line was determined using ImageJ64 software and normalized to the number of cells in each image. A total of 500–700 cells were counted, and the experiment was replicated twice. 
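The normalization of puncta counts to cell number described above can be sketched as follows. This is only an illustrative sketch: it assumes that puncta and cell counts have already been obtained for each image, the function name and example counts are hypothetical, and averaging per-image ratios (rather than pooling counts across images) is an assumption, since the text does not specify which was done.

```python
def mean_puncta_per_cell(image_counts):
    """Average puncta-per-cell over a set of images.

    image_counts: list of (puncta_count, cell_count) tuples, one per image.
    Each image's puncta count is divided by its own cell count, and the
    per-image ratios are then averaged (an assumed choice, see lead-in).
    """
    if not image_counts:
        raise ValueError("at least one image is required")
    ratios = []
    for puncta, cells in image_counts:
        if cells <= 0:
            raise ValueError("each image must contain at least one cell")
        ratios.append(puncta / cells)
    return sum(ratios) / len(ratios)


# Hypothetical counts for three images of one cell line (illustration only).
example_counts = [(120, 60), (90, 45), (150, 50)]
example_mean = mean_puncta_per_cell(example_counts)
```

Pooling all puncta and all cells before dividing would weight images with more cells more heavily, so the two approaches can give slightly different values.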
To analyze cell size, phase-contrast images were taken at 20× magnification from 10 randomly selected locations for each cell line (SKBR3 parental and CASP3 + 7 DKO). Each cell outline was manually marked in phase-contrast images, and the area was determined using the ‘measure’ tool in ImageJ64. The area measurements were performed in two separate experiments. RNA isolation and RT-qPCR analysis Total RNA was isolated and purified using the RNeasy Plus Mini Kit (QIAGEN, 74104) as per manufacturer’s instructions. To remove any contaminating genomic DNA, RNA was treated with DNase I (Invitrogen, 18068-015) as per manufacturer’s instructions. The quantification of RNA was performed using a Nanodrop Spectrophotometer. RT-qPCR was performed to quantitate selected ATG transcript expression levels using the One-Step Plus SYBR Green reagent kit (Applied Biosystems, 4385617) in 15 µl reactions containing 100 ng of total RNA and 0.1 µM of each primer on the 7900 sequence detection system (Applied Biosystems). Manufacturer’s recommended cycling conditions were used. The ATG proteins and the sequences of forward and reverse primers designed using Primer–BLAST [120] are listed in S2 Table. The mRNA expression levels were determined by the Comparative Cycle Threshold method (2^−ΔΔCt) normalizing to internal reference gene (. The expression is presented as the fold change relative to the control. RT-qPCR analyses were done in two or three technical replicates and experiments were repeated at least thrice. DNA fragmentation/Comet assay The level of DNA damage (single and double strand breaks) was evaluated by the comet assay (single cell-gel electrophoresis) as previously described [121]. Briefly, cells were washed in PBS (2–3×), trypsinised, harvested in cold PBS and the supernatant was removed after spinning for 5 min at 1,200 rpm at 4 °C. 
Each cell pellet was resuspended in cold PBS and gently mixed with 1% low melting agarose (Invitrogen, 16500-100) solution at a ratio of 1:10 (v/v). A drop from each cell-agarose mixture was immediately placed on a 1% agarose pre-coated glass microscope slide, covered with a coverslip, and placed flat for 30 min at 4 °C. Next, the coverslips were removed gently, and the slide was immediately placed in 4 °C lysis solution (2.5 M NaCl, 100 mM disodium EDTA, 10 mM Tris base, and 200 mM NaOH) overnight at 4 °C. Next, the slide was incubated in alkaline unwinding buffer (pH > 13) (200 mM NaOH, 1 mM EDTA) for 20 min at RT in the dark before being subjected to gel electrophoresis in alkaline buffer (200 mM NaOH, 1 mM EDTA) at ~20 V (300 mA) for 30 min at 4 °C. Slides were rinsed by immersing in dH2O for 5 min (2×), DNA was fixed by immersing in 70% ethanol for 5 min, and slides were dried at 37 °C for 10 min before being stained with GelRed (Biotium, 41003) (1/3,000 dilution) for 30 min at RT in darkness. The slide was rinsed in water, dried, and mounted with slow-fade mounting media. Images were taken using a fluorescence microscope with a 10× objective, and DNA damage was quantified as the comet tail moment (product of the tail length × the fraction of total DNA in the tail) using CometScore 2.0 software [122]. Approximately 200 randomly selected nucleoids were scored per treatment. Experiments were repeated twice. In silico co-essentiality and genetic interaction network mapping In silico genetic interaction screening was performed using GRETTA (v0.99.2) [95] on the R statistical software (v4.2.2) [123]. We computed on the cancer DepMap public release version 22Q2 [124] data from FigShare (https://figshare.com/articles/dataset/DepMap_22Q2_Public/19700056/2; accessed on 11 August 2022) according to Takemon and Marra (2023). GRETTA was used to identify cell lines with combined low or high CASP3 and CASP7 protein expression. 
We identified a group of pan-cancer cell lines below the 10th percentile of CASP3 and CASP7 protein expression (16 cell lines; low expressors) and those above the 90th percentile of both CASP3 and CASP7 protein expression (10 cell lines; high expressors) as the control comparator. The normalized (scaled and mean centered) gene and protein expression values for the selected cell lines were extracted from Nusinow and colleagues (2020) [98] using GRETTA for comparison between the low and high expressor groups. Genetic interactors of combined CASP3 and CASP7 low expression were predicted using GRETTA, which performed pairwise Mann–Whitney U-tests between the high expressors and low expressors cancer cell line groups for all 17,386 genes to obtain p-values. P-values were adjusted for multiple testing using a permutation (10,000 randomized resampling). Candidate genetic interactors of combined low CASP3 and CASP7 expression were called using a threshold of adjusted Mann–Whitney U-test p-value < 0.05 and a median lethality probability > 0.5 in at least one group as previously defined [95]. To highlight candidate lethal interactors, genetic interaction scores computed by GRETTA were used to generate lethal genetic interaction scores by collapsing genetic interaction scores below zero to zero. A higher interaction score indicates an increased likelihood that a gene knockout is lethal in the test group. Rank is based on lethal GI scores. Enrichment analysis and gene function annotation We used clusterProfiler (v4.6.0) [125] to annotate GO biological processes of genes that were candidate co-essential or genetic interactors (unadjusted p-value < 0.05). Given many related GO terms with similar sets of genes associated with them, we calculated the degree of overlapping genes associated between GO terms using the Jaccard indices and performed hierarchical clustering to summarize their roles, as performed by Takemon and colleagues [95]. 
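The overlap measure named above can be illustrated with a short sketch. The Jaccard index of two gene sets A and B is |A ∩ B| / |A ∪ B|, and the Python below computes it for every pair of GO terms. The term names and gene sets are hypothetical placeholders, and this is not the clusterProfiler or GRETTA code used in the study, which was run in R.

```python
from itertools import combinations


def jaccard_index(genes_a, genes_b):
    """Jaccard index |A ∩ B| / |A ∪ B| of two gene sets (0.0 if both are empty)."""
    a, b = set(genes_a), set(genes_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def pairwise_jaccard(term_to_genes):
    """Return {(term_i, term_j): Jaccard index} for every unordered pair of terms."""
    return {
        (t1, t2): jaccard_index(term_to_genes[t1], term_to_genes[t2])
        for t1, t2 in combinations(sorted(term_to_genes), 2)
    }


# Hypothetical GO terms and associated genes (illustration only).
example_terms = {
    "GO:term_A": {"GENE1", "GENE2", "GENE3"},
    "GO:term_B": {"GENE2", "GENE3", "GENE4"},
    "GO:term_C": {"GENE5"},
}
similarities = pairwise_jaccard(example_terms)
```

In this toy example, terms A and B share two of four distinct genes and so score 0.5, while term C shares none and scores 0. A matrix of 1 − Jaccard index is one common choice of distance for hierarchical clustering; whether that exact distance was used here is not stated.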
The number of distinct clusters was determined using gap statistics, which calculated the optimal number of clusters (up to 20 clusters) by iteratively bootstrapping 10,000 times. Statistical analysis In each graph, error bars represent standard error of at least n = 3 independent experiments. As indicated in the legends, statistical significance was calculated by Student’s t-tests (for comparisons between two samples) or analysis of variance (ANOVA) with appropriate post-tests for multiple comparisons (Tukey’s, Dunnett’s, or Sidak’s) using GraphPad Prism version 8.4.1 for Windows (GraphPad Software, San Diego, California, USA, www.graphpad.com). Significance (p-values) indicated are relative to the control unless otherwise indicated. P-values less than 0.05 were considered significant. Significance is indicated as follows unless otherwise noted: *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001, NS, not significant. Supplementary information S1 Fig. Loss of CASP3 and CASP7 suppresses starvation-induced autophagy in MDA-MB-231 cells; related to Fig 1. (A, B) Representative western blots showing levels of CASP3 or CASP7 following treatment with scramble-siRNA, CASP3 siRNA or CASP7 siRNAs used in experiments. (C) Representative western blots showing levels of CASP3 and CASP7 in knockout lines of CASP3 and/or CASP7 in SKBR3 cells, cultured in fed conditions (F; fresh DMEM) or subjected to amino acid starvation (S) in EBSS for 8 h. (D) Representative western blots of indicated proteins from CASP3 + 7 DKO SKBR3 cells cultured in fed conditions (fresh DMEM) or subjected to amino acid starvation in EBSS for 24 h, in the absence or presence of BafA1 (50 nM) for the final 2 h. (E) Quantification of LC3B-based autophagy flux shown in (D). The levels of LC3BII were normalized to loading control and shown relative to the fed (fresh DMEM) in the absence of BafA1. 
(F) Representative western blots of indicated proteins from SKBR3 parental and multiple CASP3 + 7 DKO cell lines (CRISPR-Cas9 mediated), cultured in fed conditions (fresh DMEM) or subjected to amino acid starvation in EBSS for 8 h. BafA1 (50 nM) was added for the final 2 h of culture. (G) Representative western blots of indicated proteins from MDA-MB-231 cells cultured in fed conditions (fresh DMEM) or subjected to amino acid starvation in EBSS for various time periods, in the absence (top) or presence (bottom) of BafA1(50 nM) for the final 2 h. (H) Quantification of LC3B-based autophagy flux shown in (G). The levels of LC3BII (in the presence of BafA1) were normalized to loading control and shown relative to the fed (fresh DMEM) control. (I) Representative western blots of indicated proteins from MDA-MB-231 cells transfected with scramble, CASP3 and/or CASP7 siRNAs (48 h) and then continued to be cultured in fed conditions (fresh DMEM) or starved in EBSS for 8 h. BafA1 (50 nM) was added for the final 2 h. (J) Quantification of LC3BII-based autophagy flux shown in (I). The levels of LC3BII in starved cells were normalized to loading control and shown relative to the starved scramble-siRNA control. (K) Representative western blots of indicated proteins from MDA-MB-231 parental and CASP3 + 7 DKO cells cultured in fed conditions (fresh DMEM) or starved in EBSS for 8 or 24 h. BafA1 (50 nM) was added for the final 2 h. (L) Quantification of LC3BII-based autophagy flux shown in (K). The levels of LC3BII in starved cells were normalized to loading control and shown relative to the parental control (all from starved conditions in the presence of BafA1). (M) Representative phase contrast microscopy images of SKBR3 parental and CASP3 + 7 DKO cells starved in EBSS for 8 h. Scale bars, 50 µm. (N) Quantification of cell size (area) in cells shown in (M). n = 2, in each replicate, cells in 10 random images were measured. In graphs, all data are shown as mean ± SEM. 
n = 3 independent experiments. *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001, NS, not significant. In H, J, and L, one-way ANOVA with Dunnett’s post-test. In N, one-way ANOVA with Tukey’s post-test. The numerical data presented in this figure can be found in S2 Data. https://doi.org/10.1371/journal.pbio.3003034.s001 (TIF) S2 Fig. Loss of CASP3 and CASP7 suppresses compensatory autophagy induced by MG132 or Bortezomib in MDA-MB-231 and SKBR3 cells, sensitizing cells to proteasome inhibitors; related to Fig 2. (A) Representative western blots of indicated proteins from MDA-MB-231 cells treated with proteasome inhibitor (MG132) at increasing dosage for 24 h in the presence (top) or absence (bottom) of BafA1(50 nM) for the final 2 h. (B) Quantification of LC3B-based autophagy flux in MDA-MB-231 shown in (A). The levels of LC3BII were normalized to loading control and shown relative to the untreated control (NT). (C) Graph showing proteasome activity in MDA-MB-231 cells in response to increasing concentrations of MG132 as depicted by chymotrypsin-like, caspase-like, and trypsin-like activity measured by the Proteasome glow assay kit. (D) Representative western blots of indicated proteins from MDA-MB-231 cells transfected with scramble, CASP3 and/or CASP7 siRNAs (48 h) and continued to be cultured in vehicle DMSO or MG132 (0.5 μM) treated fresh DMEM for 24 h. BafA1(50 nM) was added for the final 2 h. (E) Quantification of LC3B-based autophagy flux in proteasome inhibitor (MG132) treated MDA-MB-231 cells shown in (D). The levels of LC3BII in MG132-treated cells were normalized to loading control and shown relative to the MG132-treated scramble-siRNA control. (F) Representative western blots of indicated proteins from SKBR3 cells transfected with scramble, CASP3 and/or CASP7 siRNAs (48 h) and continued to be cultured in vehicle DMSO or Bortezomib- (1 nM) treated fresh DMEM for 24 h. BafA1 (50 nM) was added for the final 2 h. 
(G) Quantification of LC3B-based autophagy flux in proteasome inhibitor- (Bortezomib) treated SKBR3 cells shown in (F). The levels of LC3BII in Bortezomib-treated cells were normalized to loading control and shown relative to the Bortezomib-treated scramble-siRNA control. (H–M) Representative images of crystal violet assay plates and quantification of percentage of cell viability. The indicated siRNA transfected cells were grown for 2 days or the Parental and DKO cells were grown in normal media for 2 days and treated with indicated concentrations of proteasome inhibitors (MG132 or Bortezomib) for 24 h and were then continued to be cultured in drug free media for another 3 days before subjected to crystal violet assay. Graphs show the percentage of stained (viable) cells at each concentration normalized to respective untreated cells. MDA-MB-231 cells transfected with scramble or CASP3 + 7siRNA and treated with MG132 (H), and its quantification is in (I). Bortezomib-treated SKBR3 parental and CASP3 + 7 DKO cells (J) and its quantification (K). MDA-MB-231 cells transfected with scramble or CASP3,7 siRNAs and treated with Bortezomib (L) and its quantification (M). (N–O) Representative images of crystal violet assay plates and quantification of percentage of cell viability. The indicated siRNA transfected cells were grown for 1 day, replaced with fresh media or subjected to starvation for indicated duration before subjected to crystal violet assay. Graphs show the percentage of stained (viable) cells at each starvation time point normalized to fed cells. In graphs, all data are shown as mean ± SEM. n = 3 independent experiments. *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001, NS, not significant. In B, C, E, and G, one-way ANOVA with Dunnett’s post-test. In I, M, and O, two-way ANOVA with Sidak’s post-test. In K mixed model with Sidak’s post-test. The numerical data presented in this figure can be found in S2 Data. 
https://doi.org/10.1371/journal.pbio.3003034.s002 (TIF) S3 Fig. Rapamycin-induced autophagy does not depend on CASP3 and CASP7; related to Fig 3. (A, B) Representative western blots of indicated proteins from SKBR3 (A) or MDA-MB-231 (B) parental and CASP3 + 7 DKO cells untreated or treated with mTOR inhibitor rapamycin (Rap) 10 nM for 24 h in the absence or presence of BafA1 (50 nM) for the final 2 h. The levels of phosphorylated and unphosphorylated p70S6K were used as markers for mTOR activity. (C) Representative western blots showing the effects of rapamycin on the formation of CASP7-p29/p30 bands in MDA-MB-231 cells. Cells grown in normal culture media for 4 days were continued to be cultured in fed (fresh DMEM) conditions, starved in EBSS for 4 h, treated with 10 nM rapamycin (Rap) for 24 h or subjected to both starvation and Rap treatment for 24 h. The p70S6K and p70S6K-PO4 immunolabeling was used as an mTOR activity reporter. In each n = 2 independent experiments. https://doi.org/10.1371/journal.pbio.3003034.s003 (TIF) S4 Fig. During non-lethal stress, non-canonical cleavage of CASP7 occurs at calpain cleavage sites; related to Fig 4. (A) Representative western blot showing CASP7 immunolabeling in CASP3 + 7 DKO SKBR3 cells stably expressing CASP7-WT, CASP7 ΔPro, and CASP7 prodomain cleavage blocked constructs cultured in fed conditions (fresh DMEM) or starved in EBSS for 4 h. Arrowhead indicates CASP7-ΔPro fragment (33.5 kDa) and arrow indicates CASP7-p29/30 fragments. n = 1. (B) Representative western blot showing immunoprecipitation of CASP7-p29/30 fragment(s) for Edman sequencing. CASP7 KO SKBR3 cells stably expressing CASP7-WT construct were cultured in fed conditions (fresh DMEM) or starved in EBSS for 2 h prior to immunoprecipitation with CASP7 antibody. Lysate are from CASP7-WT expressing and CASP7-KO cells. IP with anti-mouse CASP7 and anti-mouse IgG antibodies are shown. n = 2 independent experiments. 
(C) A schematic showing mouse CASP7 with predicted calpain cleavage sites identified by the Deepcalpain algorithm. (D) Representative western blots of indicated proteins from CASP3 KO SKBR3 cells transfected with scramble, calpain 1, or calpain 2 siRNAs (48 h) and then continued to be cultured in vehicle DMSO or MG132 (0.5 μM) in fresh DMEM for 24 h and with BafA1(50 nM) in the final 2 h. (E) Quantification of LC3B-based autophagy flux in MG132-treated cells shown in (D) relative to MG132-treated scramble-siRNA control. (F) Schematic showing the amino acid sequence between two non-canonical cleavage sites in CASP7. PARP binding site is shown. In graph (E), data are shown as mean ± SEM. n = 3 independent experiments. *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001, NS, not significant, with one-way ANOVA with Dunnett’s post-test. The numerical data presented in this figure can be found in S2 Data. https://doi.org/10.1371/journal.pbio.3003034.s004 (TIF) S5 Fig. Non-canonical CASP7 cleavage sites flank a PARP1 binding site; related to Fig 5. (A–E) Reverse transcription polymerase chain reaction (RT-qPCR) analyses of key ATG gene expression in SKBR3 parental or CASP3 + 7 DKO cells under well-fed conditions. (F) RT-qPCR analyses of ATG7 in SKBR3 parental, CASP3 + 7 DKO, or CASP3 + 7 DKO cells re-expressing CASP3 + 7-WT constructs, treated with vehicle DMSO (NT) or 0.5 μM of MG132 for 24 h. (G) Representative western blots of indicated proteins from SKBR3 parental or CASP3 + 7 DKO cells, transiently transfected with indicated Myc-tagged PARP constructs (PARP1-WT, PARP1-DEVA, PARP1-AEVA, or catalytically inactive PARP1-CI) or vector control (v) for 48 h. endo = endogenous PARP1. (H-I) Quantification of cleaved-PARP1 (H) and PAR levels (I) shown in (G). 
(J) RT-qPCR analyses of LC3B transcripts in basal conditions in SKBR3 parental or CASP3 + 7 DKO cells transiently transfected with indicated Myc-tagged PARP constructs (PARP1-WT, PARP1-DEVA, PARP1-AEVA, or PARP1-CI) or vector control (v). Data of one replicate with relatively high LC3B levels is circled in red. (K) Representative western blots of indicated proteins from SKBR3 parental or CASP3 + 7 DKO cells, transiently transfected with indicated Myc-tagged PARP constructs (PARP1-WT, PARP1-DEVA, PARP1-AEVA, or PARP1-CI) or vector control (v) for 48 h and with BafA1 (50 nM) in the final 2 h. endo = endogenous PARP1 (L) Quantification of LC3B-based autophagic flux in basal conditions, shown in (K). (M-P) Representative western blots of indicated proteins from SKBR3 cells transfected with scramble, calpain 1, calpain 2, CASP6, cathepsin B (cathB), cathepsin D (cathD) or CASP8 siRNAs (48 h) and treated with vehicle DMSO or MG132 (0.5 μM) for 24 h (M and N) or grown in untreated media (in O and P). Each with at least n = 2 independent experiments. In graphs, unless otherwise noted, all data are shown as mean ± SEM. n = 3 or more independent experiments. *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001, NS, not significant. A-E with Student T test (unpaired). In F, H, I, J, and L, one-way ANOVA with Dunnett’s post-test. The numerical data presented in this figure can be found in S2 Data. https://doi.org/10.1371/journal.pbio.3003034.s005 (TIF) S6 Fig. Cell viability is unaffected in CASP3 + 7 DKO cells expressing CASP7-CCM, p30 or p29; related to Fig 6. (A) Representative western blot of CASP7 immunolabeling from SKBR3 parental or CASP3 + 7 DKO cells stably expressing CASP7-CCM, CASP7-p30 or CASP7-p29 constructs, treated with vehicle DMSO or proteasome inhibitor MG132 (0.5 μM) for 24 h. n = 2 independent experiments. (B) Representative images of crystal violet assay plates and quantification in cells expressing CASP7-CCM, CASP7-p30 or CASP7-p29 constructs. 
Cells were grown in standard media for 2 days and treated with indicated concentrations of proteasome inhibitor Bortezomib for 24 h and continued to grow in drug free media for another 3 days before subjected to crystal violet assay. (C) Quantification of cell viability data presented in (B). The percentage of stained (viable) cells at each concentration was normalized to respective untreated cells. Data are shown as mean ± SEM. n = 3 independent experiments. NS, not significant. Two-way ANOVA with Tukey’s post-test. (D) Graphical representation of current working model. The numerical data presented in this figure can be found in S2 Data. https://doi.org/10.1371/journal.pbio.3003034.s006 (TIF) S7 Fig. in silico analyses identify CASP3 and CASP7 genetic interactions; related to Fig 7. (A) Low and high levels of CASP3 and CASP7 expressing cancer cell lines identified and analyzed in Fig 7H and I. (B) Heatmap showing Jaccard index similarities between GO terms. Based on annotated GO biological processes of genes that were candidate co-essential or genetic interactors (unadjusted p-value < 0.05). GO terms were summarized into 16 distinct clusters using Jaccard index-based hierarchical clustering. The numerical data presented in this figure can be found in S2 Data. The code related to S7A and S7B Fig is publicly available in a GitHub repository (https://github.com/MarraLab/Caspase_GRETTA_analysis) and archived on Zenodo (https://doi.org/10.5281/zenodo.14722298) https://doi.org/10.1371/journal.pbio.3003034.s007 (TIF) S1 Raw Images. Original western blots used for Figs 1A, 1C, 1E, 1G, 2A, 2D, 2F, 2H, 3A-J, 4A, 4C-E, 5A-B, 5E-F, 5J, 5N, 6A, 6E, 6G, 6M, 6O, 7C-D, S1A-D, S1F-G, S1I, S1K, S2A, S2D, S2F, S3A-C, S4A-B, S4D, S5G, S5K, S5M-P and S6A. The green box indicates the area of the blot that was used for the figure when only a portion of the blot was used. Lanes have not been rearranged; the order of the lanes is same as the order shown in figures. 
https://doi.org/10.1371/journal.pbio.3003034.s008 (PDF) S1 Data. The underlying numerical data in Figs 1B, 1D, 1F, 1H, 1J, 2B-C, 2E, 2G, 2I, 2K, 4F-I, 5C-D, 5G-H, 5I, 5K-M, 5O, 6B, 6D, 6F, 6H, 6J, 6L, 6N, 6P, 7B, 7E and 7G-I. https://doi.org/10.1371/journal.pbio.3003034.s009 (XLS) S2 Data. The underlying numerical data in S1E, S1H, S1J, S1L, S1N, S2B-C, S2E, S2G, S2I, S2K, S2M, S2O, S4E, S5A-F, S5H-J, S5L, S6C and S7B Figs. https://doi.org/10.1371/journal.pbio.3003034.s010 (XLSX) S1 Table. siRNAs used in knockdown experiments. https://doi.org/10.1371/journal.pbio.3003034.s011 (DOCX) S2 Table. Forward and Reverse primer sequences used for qRT-PCR. https://doi.org/10.1371/journal.pbio.3003034.s012 (DOCX) S3 Table. Key resources table. https://doi.org/10.1371/journal.pbio.3003034.s013 (DOCX) Acknowledgments The authors would like to thank Arlene Gidda for valuable comments on the manuscript, Gorski lab members for helpful discussions, Dr. Peter Eirew for assistance with providing the PDX samples, Dr. Nancy dos Santos and Nicole Wretham for murine tissues and Dr. Guy S. Salvesen for FLAG-tagged CASP7 constructs, Susanna Y. Chan for technical advice for Comet assay, and Dr. Stephanie McInnis for project management support. Graphical representation of current working model was created using BioRender.com.

journal article

Open Access Collection

Hippocampal damage disrupts the latent decision-making processes underlying approach-avoidance conflict processing in humans

Duc, Willem Le;Butler, Christopher R.;Argyropoulos, Georgios P. D.;Chu, Sonja;Hutcherson, Cendri;Ruocco, Anthony C.;Ito, Rutsuko;Lee, Andy C. H.

2025 PLoS Biology

doi: 10.1371/journal.pbio.3003033 pmid: 39932954

Introduction Approach-avoidance conflict (AAC) arises when potential outcomes of reward and punishment are experienced simultaneously, leading to competing tendencies to engage or retreat [1]. In nonhuman animals, this dilemma is classically illustrated by the prey animal who, in deciding whether to forage for food, must balance the need for resources with the possibility of being exposed to predators. Successful AAC resolution is essential to survival, and dysregulation of approach and avoidance tendencies is suggested to be a characteristic of various mental health disorders [2–5]. A substantial body of rodent research has identified the ventral hippocampus (vHPC) as a key region in the arbitration of AAC [6]. Specifically, gross vHPC damage or inhibition has been shown to increase approach responses to motivationally conflicting stimuli [7], while subfield-specific inactivation can differentially impact approach or avoidance behavior [8,9]. Complementing this work, excitotoxic lesions to the HPC in nonhuman primates, impacting both the anterior HPC (aHPC), the primate homologue of the rodent vHPC, as well as the posterior HPC, have been demonstrated to facilitate the retrieval of reward located near a potential predator [10]. Corresponding human evidence comes primarily from neuroimaging studies that have reported greater activity in the aHPC during high AAC [11,12]. Human HPC dysfunction has also been associated with a greater propensity to approach reward in the presence of threat, although these findings are limited by the use of a paradigm with hippocampal-dependent spatial navigation demands [11] and/or assessment of a single focal bilateral HPC case [13]. 
Crucially, although the involvement of the HPC in AAC is clear, the aforementioned work has typically focused on behavioral measures (e.g., exploration time, number/proportion of approach responses, response latency) that provide limited insight into the latent computational processes that underpin the observed behavior. Thus, the precise contributions of this structure remain opaque and it is unknown whether the involvement of the HPC in AAC pertains primarily to its role in mnemonic processing or to decision-making processes per se. For example, greater approach behavior under motivational conflict following gross HPC damage may reflect disrupted retrieval of conflicting stimulus valences. Alternatively, lesions to the HPC may alter how evidence is used to guide decision-making, for example, by decreasing attention to negative outcomes and slowing the accumulation of evidence in support of avoidance, and/or decreasing response caution by reducing the amount of evidence necessary to make a decision. Computational models that incorporate choice and response latency data offer a compelling means to address this issue but, although these methods are being applied increasingly to the study of AAC [3,5,14–17], there has been very limited work on the HPC particularly in conjunction with brain lesion cases. To this end, we recruited a group of 8 individuals with focal hippocampal damage and 25 neurologically healthy controls (see Table 1 for medial temporal lobe structure volumes of hippocampal damage participants and Table 2 for background neuropsychology) and administered a computerized AAC paradigm adapted from previous fMRI work [12], in which participants first learned the reward/punishment outcomes of individual visual images and were then asked to approach or avoid the same items presented as motivationally conflicting or non-conflicting pairs (Fig 1). 
Two versions of this task, each employing scene or object images, were used to examine the possibility that, in keeping with its role in spatial cognition, the involvement of the HPC in AAC is restricted to spatial/contextual information [18,19]. Besides inspecting standard behavioral indices of AAC decision-making, we employed a Bayesian implementation of the drift diffusion model, the hierarchical drift diffusion model (hDDM) [20,21], which allowed us to investigate the impact of HPC damage on estimates of baseline propensity towards approach or avoidance (bias), rates of evidence accumulation towards approach or avoidance (drift rate), the amount of evidence needed for a given decision (threshold), and the recruitment of non-decision cognitive processes (i.e., non-decision time). Lastly, in light of work demonstrating impaired responding under response conflict in epilepsy patients with HPC sclerosis [22,23], participants were also administered computerized classic Stroop and Go/No-Go tasks. This allowed us to examine whether the HPC plays a wider role in conflict processing beyond value-based decision-making.

Fig 1. (A) The Object and Scene AAC tasks each involved 4 unique stimuli, with 2 assigned to be positive and 2 assigned to be negative in valence (note: to comply with license restrictions, one of the scene task stimuli (bottom right) has been replaced with a similar image for display purposes). (B) During an initial learning phase, participants learned to approach or avoid these stimuli in order to gain or avoid the loss of game points, respectively. An example positive trial from the Object AAC task (top) and a negative trial from the Scene AAC task (bottom) are depicted. Participants were presented with a feedback screen after each response, showing the outcome of their response and their total accumulated points. 
(C) During a subsequent decision phase, participants were then asked to approach or avoid pairs of stimuli in the absence of feedback, with each pair composed of stimuli with non-conflicting positive, non-conflicting negative, or conflicting valences. Examples of Object No-Conflict Positive (left), Object No-Conflict Negative (middle), and Scene Conflict (right) pairs from the Object and Scene AAC tasks are shown. AAC, approach-avoidance conflict. https://doi.org/10.1371/journal.pbio.3003033.g001

Table 1. Hippocampal damage participant volume differences, expressed as Z-scores, for individual medial temporal lobe regions compared to age-matched controls (mean = 61.18 years old (SD = 8.94); 20 M:8 F). Data for autoimmune limbic encephalitis (aLE) patients and controls are taken from [45]. Medial temporal lobe epilepsy (MTLE) patient volumetrics were derived using the same methodology described in [45]. In brief, the HPC and Amyg were manually delineated in native space using guidelines described in [77], while PRC, ERC, and PHC were manually delineated in native space using guidelines described in [78]. The HPC was split into anterior (aHPC) and posterior (pHPC) portions, with the aHPC comprising the HPC head extending from the most anterior coronal slice to the first appearance of the uncal apex, and the pHPC comprising the HPC body and tail. All volumes were corrected for intracranial volume prior to Z-score transformation. https://doi.org/10.1371/journal.pbio.3003033.t001

Table 2. Mean hippocampal damage participant and control test scores (raw or %) for standard neuropsychological tests including the MoCA [63], the WMS-III [64] and WMS-IV [65], the D and P [66], the RCFT [67], the WASI-II [62], and the VOSP [68]. Performance qualitative descriptors are taken from published test norms. 
https://doi.org/10.1371/journal.pbio.3003033.t002

Results

Approach-avoidance tasks

Linear mixed model analyses of choice and response time data. Learning Phase–Participants first learned the valences of 4 scene (Scene task) or 4 object (Object task) stimuli over 120 trials by making approach or avoid key presses to individually presented stimuli. Approaching a positive stimulus led to the award of game points, whereas approaching a negative stimulus led to the loss of game points. An avoid response had no impact on game points regardless of stimulus valence. The proportion of correct responses was analyzed using a linear mixed model (LMM) with fixed effects of group (Hippocampal Damage versus Control), valence (Positive versus Negative), block (1 to 10, with 12 trials [i.e., 3 repetitions of each stimulus] per block), stimuli (Object versus Scene), and the interactions between them as predictors. We additionally modeled random intercepts and slopes for valence per participant. The selected model’s formula was: accuracy ~ group * valence * block * stimuli + (1 + valence | participant). Table 3A summarizes the selected model’s outputs, as well as post hoc estimated marginal mean (EMM) comparisons (adjusted for multiple comparisons using Tukey’s honestly significant difference (HSD) method). To correct for multiple LMMs being conducted in this study (7 in total), all p-values for this model and subsequent LMMs were additionally adjusted with a Bonferroni correction (pcorr = p × 7).

Table 3. Object and Scene AAC task learn phase LMM results for (A) Accuracy; and (B) Response time. All post hoc EMM comparisons are adjusted for multiple comparisons using Tukey’s HSD (pHSD). To correct for multiple LMMs being conducted in this study (7 in total), all p-values have additionally been adjusted with a Bonferroni correction (pcorr). 
Significant Bonferroni corrected findings are highlighted in bold (* < 0.05; ** < 0.01, *** < 0.001). https://doi.org/10.1371/journal.pbio.3003033.t003 Based on our predictions, post hoc pairwise comparisons of EMMs, selected a priori, were performed to probe relevant main and interaction effects. Specifically, we sought to find out whether: (1) participants showed significantly improved accuracy from Block 1 to Block 10; (2) both groups showed similar accuracy at Block 10 on both tasks; and (3) performance was similar for negative and positive stimuli for the Scene and Object tasks at Block 10. We found a main effect of block (pcorr < 0.001) and a significant block-by-group (pcorr = 0.007) interaction. There were also trends for the block-by-group-by-stimuli (pcorr = 0.063) and valence-by-block-by-stimuli (pcorr < 0.091) interactions, but no significant interaction between valence, block, and group (pcorr = 1.000). Comparing EMMs at Block 1 and at Block 10 revealed, as expected, that the proportion of accurate responses increased significantly from Block 1 to Block 10 (pcorr < 0.001; Fig 2A). When compared at Block 10, both groups had similar accuracies on both the Scene (pcorr = 1.000) and Object tasks (pcorr = 1.000). Collapsing across groups, there was no significant difference in performance at Block 10 between negative and positive stimuli in the Scene task (pcorr = 1.000) or Object task (pcorr = 1.000). These results indicate that both groups learned the images’ valences over the course of learning trials and had achieved comparable knowledge of them by the end of the learning phase. Moreover, learning of negative and positive valenced stimuli was similar. We next used a LMM to analyze response times to determine whether participants showed increased familiarity with the stimulus images by the end of the learning phase. We used identical fixed predictors to the abovementioned LMM but modeled random intercepts and slopes for stimuli per participant. 
The formula for the selected model was: RT ~ group * valence * block * stimuli + (1 + stimuli | participant). Table 3B summarizes the selected model’s outputs as well as post hoc EMM comparisons (adjusted for multiple comparisons using Tukey’s HSD). As expected, we observed a significant main effect of block (pcorr < 0.001). To determine whether this reflected faster response times over the course of the task, we compared EMMs for response time at Block 1 and Block 10. Indeed, participants responded significantly faster at Block 10 than they did at Block 1 (pcorr < 0.001; Fig 2B), suggesting they had become more familiar with the task stimuli by the end of the learning phase. A block-by-group interaction was observed (pcorr < 0.001), although a post hoc comparison revealed that the groups did not differ in their response times at Block 10 (pcorr = 1.000). There was no significant interaction between valence, block, and group (pcorr = 1.000) or block, group, and stimuli (pcorr = 1.000).

Fig 2. Both participant groups successfully acquired the stimulus valences across trial blocks during the learning phase, as reflected in (A) accuracy and (B) response times (RT) (means ± SE). Importantly, the individuals with hippocampal damage and controls demonstrated comparable learning by the final block (all pcorr = 1.000). In the decision phase, (C) individuals with hippocampal damage responded similarly to controls on No-Conflict Positive and No-Conflict Negative trials (both pcorr = 1.000) but approached significantly more often in response to Conflict image pairs (pcorr < 0.001). There were no significant differences in (D) response times between groups (all pcorr = 1.000) (individual data points with EMMs ±SE). Underlying data for these figures and associated analyses are available from https://doi.org/10.5683/SP3/C4GWZU. EMM, estimated marginal mean. 
https://doi.org/10.1371/journal.pbio.3003033.g002

Decision phase–Following the learning phase, participants made approach or avoid key presses to pairs of scenes (Scene task) or objects (Object task) across 108 trials. Two thirds of the trials contained pairs composed of stimuli with the same valence (No-Conflict Positive and No-Conflict Negative trials) and a third of the trials involved a positive stimulus paired with a negative stimulus (Conflict trials). Participants were told that approaching a Conflict pair would lead to a 50% chance of receiving a gain or loss of game points. We analyzed decision phase approach/avoid responses with a LMM with group (Hippocampal Damage versus Control), condition (No-Conflict Positive versus No-Conflict Negative versus Conflict), and stimuli (Object versus Scene), as well as the interactions between them as fixed effects, with random intercepts per participant. The formula for the selected model was: response ~ group + positive_vs_conflict + negative_vs_conflict + stimuli + group * positive_vs_conflict + group * negative_vs_conflict + group * positive_vs_conflict * stimuli + group * negative_vs_conflict * stimuli + (1 | participant). Table 4A summarizes the selected model’s outputs, effect tests for multi-parameter predictors, and post hoc EMM comparisons (adjusted for multiple comparisons using Tukey’s HSD).

Table 4. Object and Scene AAC task decision phase LMM results for (A) Proportion of approach responses; and (B) Response time. Since there were 3 trial types (No-Conflict Positive, No-Conflict Negative, Conflict), condition was coded as a contrast between Conflict and each of the No-Conflict conditions (Positive, Negative). Significant predictors involving these contrasts were then explored further with 2 multi-parameter effects tests, one to assess the significance of the main effect of condition and another the interaction between group and condition. 
All post hoc EMM comparisons are adjusted for multiple comparisons using Tukey’s HSD (pHSD). To correct for multiple LMMs being conducted in this study (7 in total), all p-values have additionally been adjusted with a Bonferroni correction (pcorr). Significant Bonferroni corrected findings are highlighted in bold (* < 0.05; ** < 0.01, *** < 0.001). https://doi.org/10.1371/journal.pbio.3003033.t004 Multi-parameter tests revealed a significant main effect of condition (pcorr < 0.001) as well as a significant group-by-condition interaction (pcorr < 0.001). Probing the main effect of condition by comparing EMMs for all 3 conditions to one another revealed that rates of approach response differed significantly across all 3 conditions (all pcorr < 0.001), suggesting that participants retained the stimulus identities and their respective valences acquired during the learning phase, with proportions of approach close to 0 for No-Conflict Negative trials, close to 1 for No-Conflict Positive trials, and in between for Conflict trials (Fig 2C). To probe the group-by-condition interaction, we next compared group EMMs for each condition. Groups did not differ significantly in their rates of approach responses on either No-Conflict Positive trials (pcorr = 1.000) or No-Conflict Negative (pcorr = 1.000), indicating comparable retention of stimulus-valence associations at test. Importantly, consistent with our hypotheses, the hippocampal damage group approached significantly more often than the control group on Conflict trials (pcorr < 0.001). Contrary to our hypotheses, however, stimulus type did not interact significantly with group or trial type (both p = 1.000), indicating that the group differences observed in approach behavior on Conflict trials were not specific to either objects or scenes. 
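The condition contrasts in the decision-phase model formula can be illustrated with a small sketch. It assumes plain treatment (dummy) coding with Conflict as the reference level, which is one common way to obtain predictors such as `positive_vs_conflict` and `negative_vs_conflict`; the exact coding scheme and software used in the study are not stated in this section, and the helper names `code_condition` and `build_design_rows`, the group coding, and the example trials are hypothetical.

```python
# Hypothetical illustration of coding a 3-level condition against Conflict.
# Assumption: treatment (dummy) coding with Conflict as the reference level.

CONDITIONS = ("positive", "negative", "conflict")


def code_condition(condition):
    """Return the two contrast columns for one trial's condition label."""
    if condition not in CONDITIONS:
        raise ValueError(f"unknown condition: {condition!r}")
    return {
        "positive_vs_conflict": 1 if condition == "positive" else 0,
        "negative_vs_conflict": 1 if condition == "negative" else 0,
    }


def build_design_rows(trials):
    """Add contrast columns and group-by-contrast products to each trial."""
    rows = []
    for trial in trials:
        row = dict(trial)
        row.update(code_condition(trial["condition"]))
        # Assumption: group coded 1 for hippocampal damage, 0 for control.
        group = 1 if trial["group"] == "hippocampal_damage" else 0
        row["group_code"] = group
        row["group_x_positive"] = group * row["positive_vs_conflict"]
        row["group_x_negative"] = group * row["negative_vs_conflict"]
        rows.append(row)
    return rows


# Made-up trials, for illustration only (response: 1 = approach, 0 = avoid).
example_trials = [
    {"participant": "c01", "group": "control", "condition": "conflict", "response": 0},
    {"participant": "c01", "group": "control", "condition": "positive", "response": 1},
    {"participant": "h01", "group": "hippocampal_damage", "condition": "negative", "response": 0},
    {"participant": "h01", "group": "hippocampal_damage", "condition": "conflict", "response": 1},
]

design = build_design_rows(example_trials)
for row in design:
    print(row)
```

Under this assumed coding, the model intercept corresponds to Conflict trials in the control group, and each contrast coefficient estimates how one No-Conflict condition differs from Conflict.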
Next, we analyzed response times to determine whether the task paradigm successfully elicited AAC and to investigate whether groups differed in the impact of Conflict on their response speed. To this end, we used a LMM with the same fixed and random effects structure as that described above, implementing the formula: RT ~ group + conflict_vs_positive + conflict_vs_negative + stimuli + group * conflict_vs_positive + group * conflict_vs_negative + group * conflict_vs_positive * stimuli + group * conflict_vs_negative * stimuli + (1 | participant). Table 4B summarizes the selected model’s outputs, effect tests for multi-parameter predictors, and post hoc EMM comparisons (adjusted for multiple comparisons using Tukey’s HSD). A multi-parameter test revealed a main effect of condition on response time (pcorr < 0.001) and a comparison of EMMs revealed no significant difference between No-Conflict Positive and No-Conflict Negative trials (pcorr = 1.000). However, Conflict trials had significantly longer response times compared to No-Conflict Positive (pcorr < 0.001) and No-Conflict Negative trials (pcorr < 0.001; Fig 2D), suggesting that these trials successfully elicited AAC. We also observed a group-by-condition interaction (p = 0.018), although this did not survive Bonferroni correction (pcorr = 0.126). Exploratory post hoc comparisons of EMMs found similar response times between groups on all 3 conditions (all pcorr ≥ 1.000). Hierarchical drift diffusion model analyses The candidate hDDM model that converged successfully and showed best fit for the decision phase data made separate estimates for each parameter, varying by condition (i.e., drift rate, threshold, and non-decision time), except bias (deviance information criterion (DIC) = 9848.59; all alternative models DIC > 9862). Bias was modeled collapsed across conditions because it is conceptualized as the starting point for evidence accumulation before participants are exposed to the condition of each trial. 
We also modeled participant-wise estimates for every parameter, except for bias, which we modeled only at the group level to achieve convergence.

Within-group comparisons. Parameter estimates largely differed between task conditions as expected. In both groups, model parameter estimates suggested that non-decision times on No-Conflict conditions were likely near-identical (Hippocampal Damage: P(Positive > Negative) = 0.493; Controls: P(Positive > Negative) = 0.378; Fig 3A) but were almost certainly longer on Conflict trials (Hippocampal Damage: P(Conflict > Positive) = 0.984, P(Conflict > Negative) = 0.982; Controls: P(Conflict > Positive) > 0.999, P(Conflict > Negative) > 0.999). Posterior group estimates also indicated that drift rates differed across the 3 task conditions, such that No-Conflict Positive and Negative trials were respectively associated with more positive and negative drift rates than the other conditions (in both groups P(Positive > Negative) = 0.999, P(Positive > Conflict) > 0.999, P(Negative < Conflict) > 0.999; Fig 3B). There was weak evidence that Conflict was associated with numerically smaller threshold values than either of the No-Conflict trials, with this difference being more likely for the comparison between No-Conflict Positive and Conflict trials in the hippocampal damage group (Hippocampal Damage: P(Positive > Conflict) = 0.956, P(Negative > Conflict) = 0.927; Controls: P(Positive > Conflict) = 0.915, P(Negative > Conflict) = 0.856; Fig 3C). Importantly, though, threshold values for No-Conflict conditions were similar in both groups (Hippocampal Damage: P(Positive > Negative) = 0.621; Controls: P(Positive > Negative) = 0.441). In aggregate, these findings are consistent with the notion that AAC decision-making, relative to No-Conflict decision-making, is characterized by increased recruitment of non-decision cognitive processes and slower evidence accumulation, and possibly with lower decision thresholds. 
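To make the four model parameters concrete, the following is a minimal forward simulation of a single-subject drift diffusion process. It is a sketch under stated assumptions, not the hierarchical Bayesian estimation used in the study: the upper boundary is taken to mean "approach" and the lower boundary "avoid", the noise standard deviation is fixed at 1, and every parameter value below is invented for illustration rather than taken from the fitted estimates. The function names `simulate_trial` and `approach_rate` are hypothetical.

```python
import random


def simulate_trial(drift, threshold, bias, non_decision_time,
                   dt=0.001, noise_sd=1.0, max_time=10.0, rng=random):
    """Simulate one drift diffusion trial with Euler steps.

    Evidence starts at bias * threshold and moves between 0 and threshold.
    Returns (choice, rt): "approach" for the upper boundary, "avoid" for the
    lower boundary, or (None, None) if no boundary is reached in max_time.
    """
    evidence = bias * threshold
    elapsed = 0.0
    step_sd = noise_sd * dt ** 0.5
    while elapsed < max_time:
        evidence += drift * dt + rng.gauss(0.0, step_sd)
        elapsed += dt
        if evidence >= threshold:
            return "approach", elapsed + non_decision_time
        if evidence <= 0.0:
            return "avoid", elapsed + non_decision_time
    return None, None


def approach_rate(n_trials, seed, **params):
    """Proportion of decided trials that ended at the approach boundary."""
    rng = random.Random(seed)
    choices = [simulate_trial(rng=rng, **params)[0] for _ in range(n_trials)]
    decided = [c for c in choices if c is not None]
    return sum(c == "approach" for c in decided) / len(decided)


# Invented parameter sets (NOT the study's estimates): one with a strongly
# negative drift, and one with drift near zero, a lower threshold, and a
# slightly more positive starting bias.
negative_drift = approach_rate(
    300, seed=1, drift=-1.5, threshold=2.0, bias=0.50, non_decision_time=0.4)
near_zero_drift = approach_rate(
    300, seed=1, drift=0.0, threshold=1.6, bias=0.55, non_decision_time=0.4)

print("approach rate, strongly negative drift:", negative_drift)
print("approach rate, drift near zero:", near_zero_drift)
```

In this toy setting, a more negative drift rate pushes most trials to the avoid boundary, whereas a drift near zero leaves the choice largely determined by the starting bias and noise; non-decision time shifts response times without affecting the choice.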
Fig 3. hDDM posterior distributions of means for (A) Non-decision time; (B) Drift rate; and (C) Threshold for hippocampal damage participants (green) and controls (red/orange). Data for hDDM analyses are available from https://doi.org/10.5683/SP3/C4GWZU. hDDM, hierarchical drift diffusion model. https://doi.org/10.1371/journal.pbio.3003033.g003

Between-groups comparisons. On both No-Conflict Positive and No-Conflict Negative trials, posterior group estimates indicated similar values for individuals with hippocampal damage and controls for non-decision time (P(Hippocampal Damage < Controls): No-Conflict Positive = 0.265, No-Conflict Negative = 0.350), drift rate (P(Hippocampal Damage < Controls): No-Conflict Positive = 0.766, No-Conflict Negative = 0.874), and threshold (P(Hippocampal Damage < Controls): No-Conflict Positive = 0.723, No-Conflict Negative = 0.858). However, there was very strong evidence for differences between the groups’ parameter estimates on Conflict trials. Specifically, the hippocampal damage group drift rate was more positive relative to controls (P(Hippocampal Damage > Controls) = 0.995; Fig 4A) and the hippocampal damage group exhibited lower decision thresholds compared to controls (P(Hippocampal Damage < Controls) = 0.982; Fig 4B). We observed little evidence that the groups differed on non-decision time (P(Hippocampal Damage < Controls) = 0.391) on Conflict trials. Lastly, we found strong evidence that starting biases were more positive for hippocampal damage participants than controls (P(Hippocampal Damage > Controls) = 0.992; Fig 4C), suggesting a baseline approach propensity among the hippocampal damage group.

Fig 4. 
Examination of hDDM posterior distributions revealed that there was very strong evidence for differences between the hippocampal damage (green) and control (orange) groups for (A) Conflict drift rate (P(Hippocampal Damage > Controls) = 0.995); (B) Conflict threshold (P(Hippocampal Damage < Controls) = 0.982); and (C) overall starting bias (P(Hippocampal Damage > Controls) = 0.992). Data for hDDM analyses are available from https://doi.org/10.5683/SP3/C4GWZU. hDDM, hierarchical drift diffusion model. https://doi.org/10.1371/journal.pbio.3003033.g004

Taken together, these analyses suggest that individuals with hippocampal damage and controls did not differ markedly in their evidence accumulation processes during decision-making on No-Conflict trials. On Conflict trials, however, individuals with hippocampal damage lacked the rapid evidence accumulation towards avoidance seen in controls (controls’ drift rate estimates were strongly negative, while hippocampal damage participants’ estimates were close to 0), and they were willing to make decisions with less evidence than controls (i.e., lower threshold estimate). There was also a greater baseline bias towards approach decisions in individuals with hippocampal damage compared to controls (i.e., more positive bias estimate).

Response conflict tasks

Stroop task. Participants were administered a computer-based version of the color Stroop task [24] in which, on each trial, they indicated via a key press the color of a rectangle (Control trials) or the color of the lettering of a color-name word. The color of the lettering and color name could either be congruent or incongruent. We analyzed accuracy on this task using a LMM with fixed effects of condition (Control versus Congruent versus Incongruent) and group, as well as the interactions between them, and random intercepts per participant. 
The formula for the selected model was: accuracy ~ incongruent_vs_control + incongruent_vs_congruent + group + group * incongruent_vs_control + group * incongruent_vs_congruent + (1 | participant). Table 5A summarizes the selected model’s outputs, effect tests for multi-parameter predictors, and post hoc EMM comparisons (adjusted for multiple comparisons using Tukey’s HSD).

Table 5. Stroop task LMM results for (A) Accuracy; and (B) RT. Since there were 3 trial types (Control, Congruent, Incongruent), condition was coded as a contrast between Incongruent and each of the other conditions (Control, Congruent). Significant predictors involving these contrasts were then explored further with a multi-parameter effects test to assess the significance of the main effect of condition. All post hoc EMM comparisons are adjusted for multiple comparisons using Tukey’s HSD (pHSD). To correct for multiple LMMs being conducted in this study (7 in total), all p-values have additionally been adjusted with a Bonferroni correction (pcorr). Significant Bonferroni corrected findings are highlighted in bold (* < 0.05; ** < 0.01, *** < 0.001). https://doi.org/10.1371/journal.pbio.3003033.t005

As expected, a multi-parameter test revealed a significant main effect of condition (pcorr < 0.001; Fig 5A). Post hoc comparisons indicated that this was driven by lower accuracy on Incongruent compared to Congruent trials (pcorr < 0.001) as well as to Control trials (pcorr < 0.001), but no difference in accuracy between Congruent and Control trials (pcorr = 1.000), suggesting that Incongruent trials produced greater response conflict than the other conditions. However, there was no significant group effect or group-by-condition interaction (all pcorr = 1.000), suggesting that the groups did not differ in their overall accuracy, nor in their ability to inhibit incorrect responses on Incongruent trials. 
Fig 5. No significant group differences were observed for (A) Stroop task accuracy (pcorr = 1.000); (B) Stroop task response times (pcorr = 1.000); or (C) Go/No-Go Task proportion of inhibition errors (pcorr = 1.000) (individual data points with EMMs ±SE). Underlying data for these figures and associated analyses are available from https://doi.org/10.5683/SP3/C4GWZU. EMM, estimated marginal mean. https://doi.org/10.1371/journal.pbio.3003033.g005

Response time data were analyzed with a LMM identical in structure to that described for accuracy using the following formula: RT ~ incongruent_vs_control + incongruent_vs_congruent + group + group * incongruent_vs_control + group * incongruent_vs_congruent + (1 | participant). Table 5B summarizes the selected model’s outputs, effect tests for multi-parameter predictors, and post hoc EMM comparisons (adjusted for multiple comparisons using Tukey’s HSD). Here too, a multi-parameter test revealed the expected main effect of condition (pcorr < 0.001; Fig 5B). Post hoc comparisons indicated that this was driven by significantly longer response times on Incongruent compared to Congruent trials (pcorr = 0.014) and compared to Control trials (pcorr < 0.001), but no difference between Congruent and Control trials (pcorr = 1.000). This likely reflects the additional deliberation time needed to resolve the response conflict elicited by Incongruent trials compared to the other conditions. As with accuracy, we found no group differences, nor group-by-condition interactions (all pcorr = 1.000), suggesting that both groups responded at similar speeds across all conditions. 
Go/No-Go task

Participants were administered a Cued Go/No-Go task from the literature [25] in which they were presented with a rectangle (in vertical or horizontal orientation) on each trial and were required to either make a key press in response to a “Go” cue (rectangle turning green in color) or withhold from responding following a “No-Go” cue (rectangle turning blue in color). Importantly, a vertically oriented rectangle was associated with a 4:1 Go/No-Go trial ratio whereas the horizontal rectangle had a 1:4 Go/No-Go trial ratio. As we were interested in assessing response inhibition under response conflict, we analyzed the proportion of inhibition errors participants committed on No-Go trials (i.e., the proportion of these trials on which they incorrectly produced a response). To this end, we constructed a LMM with fixed effects of group and cue (Go versus No-Go), as well as the interactions between them, and random intercepts per participant using the following formula: inhibition_errors ~ group * cue + (1 | participant). Table 6 summarizes the selected model’s outputs. As expected, No-Go trials with Go cues were associated with numerically higher error rates than No-Go trials with No-Go cues (Fig 5C), suggesting that the former elicited greater difficulties with response inhibition than the latter. This effect, however, did not survive Bonferroni correction (p = 0.016; pcorr = 0.112). There were no significant group or group-by-cue effects (both pcorr ≥ 0.882), suggesting that the groups did not differ in their ability to inhibit responses on No-Go trials, regardless of cue type.

Table 6. Go/No-Go task inhibition errors LMM results. To correct for multiple LMMs being conducted in this study (7 in total), all p-values have additionally been adjusted with a Bonferroni correction (pcorr). 
Significant Bonferroni corrected findings are highlighted in bold (* < 0.05; ** < 0.01, *** < 0.001). https://doi.org/10.1371/journal.pbio.3003033.t006
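The Bonferroni adjustment applied across the 7 LMMs (pcorr = p × 7) can be checked with a few lines of arithmetic. The sketch below assumes that corrected values are capped at 1, which is consistent with the many reported values of pcorr = 1.000 but is not stated explicitly; the function name `bonferroni` is hypothetical. The first two uncorrected p-values are the ones reported in the text (0.016 and 0.018), and the third is an invented value to show the cap.

```python
def bonferroni(p_values, n_tests):
    """Multiply each p-value by the number of tests, capping at 1.0."""
    return [min(1.0, p * n_tests) for p in p_values]


N_LMMS = 7  # number of linear mixed models reported in the study

# 0.016 and 0.018 are reported in the text; 0.5 is an invented example.
uncorrected = [0.016, 0.018, 0.5]
corrected = bonferroni(uncorrected, N_LMMS)

for p, p_corr in zip(uncorrected, corrected):
    print(f"p = {p:.3f} -> pcorr = {p_corr:.3f}")
```

With 7 tests, an uncorrected p-value must fall below 0.05 / 7 (about 0.0071) for the corrected value to stay under 0.05, which is why effects with p = 0.016 and p = 0.018 do not survive correction.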
To correct for multiple LMMs being conducted in this study (7 in total), all p-values have additionally been adjusted with a Bonferroni correction (pcorr). Significant Bonferroni-corrected findings are highlighted in bold (* < 0.05; ** < 0.01; *** < 0.001). https://doi.org/10.1371/journal.pbio.3003033.t003

Based on our predictions, post hoc pairwise comparisons of EMMs, selected a priori, were performed to probe relevant main and interaction effects. Specifically, we sought to find out whether: (1) participants showed significantly improved accuracy from Block 1 to Block 10; (2) both groups showed similar accuracy at Block 10 on both tasks; and (3) performance was similar for negative and positive stimuli for the Scene and Object tasks at Block 10. We found a main effect of block (pcorr < 0.001) and a significant block-by-group interaction (pcorr = 0.007). There were also trends for the block-by-group-by-stimuli (pcorr = 0.063) and valence-by-block-by-stimuli (pcorr = 0.091) interactions, but no significant interaction between valence, block, and group (pcorr = 1.000). Comparing EMMs at Block 1 and at Block 10 revealed, as expected, that the proportion of accurate responses increased significantly from Block 1 to Block 10 (pcorr < 0.001; Fig 2A). When compared at Block 10, both groups had similar accuracies on both the Scene (pcorr = 1.000) and Object tasks (pcorr = 1.000). Collapsing across groups, there was no significant difference in performance at Block 10 between negative and positive stimuli in the Scene task (pcorr = 1.000) or Object task (pcorr = 1.000). These results indicate that both groups learned the images’ valences over the course of learning trials and had achieved comparable knowledge of them by the end of the learning phase. Moreover, learning of negatively and positively valenced stimuli was similar.
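
The Bonferroni adjustment used for all LMMs in this study (pcorr = p × 7) can be illustrated with a minimal Python sketch. Capping the product at 1 is an assumption made here; it is the usual convention and is consistent with the reported values of pcorr = 1.000. The function name bonferroni is hypothetical and is not taken from the authors' analysis code. The first two inputs are uncorrected p-values reported in these results (0.016 and 0.018); the third is an arbitrary illustrative value.

```python
def bonferroni(p, n_tests=7):
    """Bonferroni-adjust one p-value: multiply by the number of tests, cap at 1.

    n_tests=7 follows the text (7 LMMs in total); the cap at 1 is an assumption.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    return min(1.0, p * n_tests)


# 0.016 and 0.018 are uncorrected values quoted in the results; 0.5 is made up.
for p in (0.016, 0.018, 0.5):
    print(f"p = {p:.3f} -> pcorr = {bonferroni(p):.3f}")
```

Under this rule, any uncorrected p-value above 1/7 (about 0.143) is reported as pcorr = 1.000, which is why so many non-significant comparisons share that value.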
We next used an LMM to analyze response times to determine whether participants showed increased familiarity with the stimulus images by the end of the learning phase. We used the same fixed predictors as in the LMM described above but modeled random intercepts and slopes for stimuli per participant. The formula for the selected model was: RT ~ group * valence * block * stimuli + (1 + stimuli | participant). Table 3B summarizes the selected model’s outputs as well as post hoc EMM comparisons (adjusted for multiple comparisons using Tukey’s HSD). As expected, we observed a significant main effect of block (pcorr < 0.001). To determine whether this reflected faster response times over the course of the task, we compared EMMs for response time at Block 1 and Block 10. Indeed, participants responded significantly faster at Block 10 than they did at Block 1 (pcorr < 0.001; Fig 2B), suggesting they had become more familiar with the task stimuli by the end of the learning phase. A block-by-group interaction was observed (pcorr < 0.001), although a post hoc comparison revealed that the groups did not differ in their response times at Block 10 (pcorr = 1.000). There was no significant interaction between valence, block, and group (pcorr = 1.000) or block, group, and stimuli (pcorr = 1.000).

Fig 2. Both participant groups successfully acquired the stimulus valences across trial blocks during the learning phase, as reflected in (A) accuracy and (B) response times (RT) (means ± SE). Importantly, the individuals with hippocampal damage and controls demonstrated comparable learning by the final block (all pcorr = 1.000). In the decision phase, (C) individuals with hippocampal damage responded similarly to controls on No-Conflict Positive and No-Conflict Negative trials (both pcorr = 1.000) but approached significantly more often in response to Conflict image pairs (pcorr < 0.001).
There were no significant differences in (D) response times between groups (all pcorr = 1.000) (individual data points with EMMs ± SE). Underlying data for these figures and associated analyses are available from https://doi.org/10.5683/SP3/C4GWZU. EMM, estimated marginal mean. https://doi.org/10.1371/journal.pbio.3003033.g002

Decision phase–Following the learning phase, participants made approach or avoid key presses to pairs of scenes (Scene task) or objects (Object task) across 108 trials. Two-thirds of the trials contained pairs composed of stimuli with the same valence (No-Conflict Positive and No-Conflict Negative trials) and a third of the trials involved a positive stimulus paired with a negative stimulus (Conflict trials). Participants were told that approaching a Conflict pair would result in either a gain or a loss of game points with equal (50%) probability. We analyzed decision phase approach/avoid responses with an LMM with group (Hippocampal Damage versus Control), condition (No-Conflict Positive versus No-Conflict Negative versus Conflict), and stimuli (Object versus Scene), as well as the interactions between them as fixed effects, with random intercepts per participant. The formula for the selected model was: response ~ group + positive_vs_conflict + negative_vs_conflict + stimuli + group * positive_vs_conflict + group * negative_vs_conflict + group * positive_vs_conflict * stimuli + group * negative_vs_conflict * stimuli + (1 | participant). Table 4A summarizes the selected model’s outputs, effect tests for multi-parameter predictors, and post hoc EMM comparisons (adjusted for multiple comparisons using Tukey’s HSD).

Table 4. Object and Scene AAC task decision phase LMM results for (A) Proportion of approach responses; and (B) Response time.
Since there were 3 trial types (No-Conflict Positive, No-Conflict Negative, Conflict), condition was coded as a contrast between Conflict and each of the No-Conflict conditions (Positive, Negative). Significant predictors involving these contrasts were then explored further with 2 multi-parameter effects tests, one to assess the significance of the main effect of condition and another the interaction between group and condition. All post hoc EMM comparisons are adjusted for multiple comparisons using Tukey’s HSD (pHSD). To correct for multiple LMMs being conducted in this study (7 in total), all p-values have additionally been adjusted with a Bonferroni correction (pcorr). Significant Bonferroni-corrected findings are highlighted in bold (* < 0.05; ** < 0.01; *** < 0.001). https://doi.org/10.1371/journal.pbio.3003033.t004

Multi-parameter tests revealed a significant main effect of condition (pcorr < 0.001) as well as a significant group-by-condition interaction (pcorr < 0.001). Probing the main effect of condition by comparing EMMs for all 3 conditions to one another revealed that rates of approach response differed significantly across all 3 conditions (all pcorr < 0.001), suggesting that participants retained the stimulus identities and their respective valences acquired during the learning phase, with proportions of approach close to 0 for No-Conflict Negative trials, close to 1 for No-Conflict Positive trials, and in between for Conflict trials (Fig 2C). To probe the group-by-condition interaction, we next compared group EMMs for each condition. Groups did not differ significantly in their rates of approach responses on either No-Conflict Positive trials (pcorr = 1.000) or No-Conflict Negative trials (pcorr = 1.000), indicating comparable retention of stimulus-valence associations at test. Importantly, consistent with our hypotheses, the hippocampal damage group approached significantly more often than the control group on Conflict trials (pcorr < 0.001).
Contrary to our hypotheses, however, stimulus type did not interact significantly with group or trial type (both pcorr = 1.000), indicating that the group differences observed in approach behavior on Conflict trials were not specific to either objects or scenes.

Next, we analyzed response times to determine whether the task paradigm successfully elicited AAC and to investigate whether groups differed in the impact of Conflict on their response speed. To this end, we used an LMM with the same fixed and random effects structure as that described above, implementing the formula: RT ~ group + conflict_vs_positive + conflict_vs_negative + stimuli + group * conflict_vs_positive + group * conflict_vs_negative + group * conflict_vs_positive * stimuli + group * conflict_vs_negative * stimuli + (1 | participant). Table 4B summarizes the selected model’s outputs, effect tests for multi-parameter predictors, and post hoc EMM comparisons (adjusted for multiple comparisons using Tukey’s HSD). A multi-parameter test revealed a main effect of condition on response time (pcorr < 0.001) and a comparison of EMMs revealed no significant difference between No-Conflict Positive and No-Conflict Negative trials (pcorr = 1.000). However, Conflict trials had significantly longer response times compared to No-Conflict Positive (pcorr < 0.001) and No-Conflict Negative trials (pcorr < 0.001; Fig 2D), suggesting that these trials successfully elicited AAC. We also observed a group-by-condition interaction (p = 0.018), although this did not survive Bonferroni correction (pcorr = 0.126). Exploratory post hoc comparisons of EMMs found similar response times between groups on all 3 conditions (all pcorr = 1.000).
Hierarchical drift diffusion model analyses

The candidate hDDM model that converged successfully and showed best fit for the decision phase data made separate estimates for each parameter, varying by condition (i.e., drift rate, threshold, and non-decision time), except bias (deviance information criterion (DIC) = 9848.59; all alternative models DIC > 9862). Bias was modeled collapsed across conditions because it is conceptualized as the starting point for evidence accumulation before participants are exposed to the condition of each trial. We also modeled participant-wise estimates for every parameter, except for bias, which we modeled only at the group level to achieve convergence.

Within-group comparisons.
Parameter estimates largely differed between task conditions as expected. In both groups, model parameter estimates suggested that non-decision times on No-Conflict conditions were likely near-identical (Hippocampal Damage: P(Positive > Negative) = 0.493; Controls: P(Positive > Negative) = 0.378; Fig 3A) but were almost certainly longer on Conflict trials (Hippocampal Damage: P(Conflict > Positive) = 0.984, P(Conflict > Negative) = 0.982; Controls: P(Conflict > Positive) > 0.999, P(Conflict > Negative) > 0.999). Posterior group estimates also indicated that drift rates differed across the 3 task conditions, such that No-Conflict Positive and Negative trials were respectively associated with more positive and negative drift rates than the other conditions (in both groups P(Positive > Negative) = 0.999, P(Positive > Conflict) > 0.999, P(Negative < Conflict) > 0.999; Fig 3B). There was weak evidence that Conflict was associated with numerically smaller threshold values than either of the No-Conflict trials, with this difference being more likely for the comparison between No-Conflict Positive and Conflict trials in the hippocampal damage group (Hippocampal Damage: P(Positive > Conflict) = 0.956, P(Negative > Conflict) = 0.927; Controls: P(Positive > Conflict) = 0.915, P(Negative > Conflict) = 0.856; Fig 3C). Importantly, though, threshold values for No-Conflict conditions were similar in both groups (Hippocampal Damage: P(Positive > Negative) = 0.621; Controls: P(Positive > Negative) = 0.441). In aggregate, these findings are consistent with the notion that AAC decision-making, relative to No-Conflict decision-making, is characterized by increased recruitment of non-decision cognitive processes and slower evidence accumulation, and possibly with lower decision thresholds.

Fig 3.
hDDM posterior distributions of means for (A) Non-decision time; (B) Drift rate; and (C) Threshold for hippocampal damage participants (green) and controls (red/orange). Data for hDDM analyses are available from https://doi.org/10.5683/SP3/C4GWZU. hDDM, hierarchical drift diffusion model. https://doi.org/10.1371/journal.pbio.3003033.g003

Between-groups comparisons.

On both No-Conflict Positive and No-Conflict Negative trials, posterior group estimates indicated similar values for individuals with hippocampal damage and controls for non-decision time (P(Hippocampal Damage < Controls): No-Conflict Positive = 0.265, No-Conflict Negative = 0.350), drift rate (P(Hippocampal Damage < Controls): No-Conflict Positive = 0.766, No-Conflict Negative = 0.874), and threshold (P(Hippocampal Damage < Controls): No-Conflict Positive = 0.723, No-Conflict Negative = 0.858). However, there was very strong evidence for differences between the groups’ parameter estimates on Conflict trials. Specifically, the hippocampal damage group drift rate was more positive relative to controls (P(Hippocampal Damage > Controls) = 0.995; Fig 4A) and the hippocampal damage group exhibited lower decision thresholds compared to controls (P(Hippocampal Damage < Controls) = 0.982; Fig 4B). We observed little evidence that the groups differed on non-decision time (P(Hippocampal Damage < Controls) = 0.391) on Conflict trials. Lastly, we found strong evidence that starting biases were more positive for hippocampal damage participants than controls (P(Hippocampal Damage > Controls) = 0.992; Fig 4C), suggesting a baseline approach propensity among the hippocampal damage group.

Fig 4.
Examination of hDDM posterior distributions revealed that there was very strong evidence for differences between the hippocampal damage (green) and control (orange) groups for (A) Conflict drift rate (P(Hippocampal Damage > Controls) = 0.995); (B) Conflict threshold (P(Hippocampal Damage < Controls) = 0.982); and (C) overall starting bias (P(Hippocampal Damage > Controls) = 0.992). Data for hDDM analyses are available from https://doi.org/10.5683/SP3/C4GWZU. hDDM, hierarchical drift diffusion model. https://doi.org/10.1371/journal.pbio.3003033.g004

Taken together, these analyses suggest that individuals with hippocampal damage and controls did not differ markedly in their evidence accumulation processes during decision-making on No-Conflict trials. On Conflict trials, however, individuals with hippocampal damage lacked the rapid evidence accumulation towards avoidance seen in controls (controls’ drift rate estimates were strongly negative, while hippocampal damage participants’ estimates were close to 0), and they were willing to make decisions with less evidence than controls (i.e., lower threshold estimate). There was also a greater baseline bias towards approach decisions in individuals with hippocampal damage compared to controls (i.e., more positive bias estimate).
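
To make the roles of drift rate, threshold, starting bias, and non-decision time concrete, the following is a minimal simulation sketch of a single two-boundary drift diffusion process. It is not the hierarchical Bayesian model fitted in this study. The parameter letters (v, a, z, t0), the unit noise scaling, the function names, and all parameter values are illustrative assumptions rather than estimates from the paper; the two parameter sets only mimic the qualitative pattern described for Conflict trials (strongly negative drift in controls versus drift near 0, a lower threshold, and a more positive bias in the hippocampal damage group).

```python
import math
import random


def simulate_trial(v, a, z, t0, rng, dt=0.005, max_t=10.0):
    """Simulate one two-boundary diffusion trial (illustrative sketch).

    v: drift rate (positive values favor approach)
    a: threshold, i.e. distance between the avoid (0) and approach (a) bounds
    z: starting bias as a fraction of a (0.5 means unbiased)
    t0: non-decision time in seconds
    Returns (choice, response_time); choice is None if no bound is reached.
    """
    x = z * a                    # evidence starts between the two bounds
    t = 0.0
    step_sd = math.sqrt(dt)      # assumed unit diffusion noise per sqrt(second)
    while t < max_t:
        x += v * dt + rng.gauss(0.0, step_sd)
        t += dt
        if x >= a:
            return "approach", t0 + t
        if x <= 0.0:
            return "avoid", t0 + t
    return None, t0 + t


def approach_rate(v, a, z, t0=0.3, n_trials=500, seed=0):
    """Proportion of simulated trials that end at the approach bound."""
    rng = random.Random(seed)
    choices = [simulate_trial(v, a, z, t0, rng)[0] for _ in range(n_trials)]
    return choices.count("approach") / n_trials


# Hypothetical parameter sets, chosen only to mirror the direction of the
# reported group differences on Conflict trials.
control_like = approach_rate(v=-1.0, a=2.0, z=0.5)
lesion_like = approach_rate(v=0.0, a=1.5, z=0.6)
print(f"control-like approach rate: {control_like:.2f}")
print(f"lesion-like approach rate:  {lesion_like:.2f}")
```

In this sketch, a drift near 0 combined with a lower threshold and an approach-leaning starting point yields a higher proportion of approach choices than a strongly negative drift, which is the direction of the behavioral difference reported on Conflict trials.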
Response conflict tasks

Stroop task.

Participants were administered a computer-based version of the color Stroop task [24] in which they indicated the color of a rectangle (Control trials) or the lettering of words of color names presented on each trial via a key press. The color of the lettering and color name could either be congruent or incongruent. We analyzed accuracy on this task using an LMM with fixed effects of condition (Control versus Congruent versus Incongruent) and group, as well as the interactions between them, and random intercepts per participant. The formula for the selected model was: accuracy ~ incongruent_vs_control + incongruent_vs_congruent + group + group * incongruent_vs_control + group * incongruent_vs_congruent + (1 | participant). Table 5A summarizes the selected model’s outputs, effect tests for multi-parameter predictors, and post hoc EMM comparisons (adjusted for multiple comparisons using Tukey’s HSD).

Table 5. Stroop task LMM results for (A) Accuracy; and (B) RT.
Since there were 3 trial types (Control, Congruent, Incongruent), condition was coded as a contrast between Incongruent and each of the other conditions (Control, Congruent). Significant predictors involving these contrasts were then explored further with a multi-parameter effects test to assess the significance of the main effect of condition. All post hoc EMM comparisons are adjusted for multiple comparisons using Tukey’s HSD (pHSD). To correct for multiple LMMs being conducted in this study (7 in total), all p-values have additionally been adjusted with a Bonferroni correction (pcorr). Significant Bonferroni-corrected findings are highlighted in bold (* < 0.05; ** < 0.01; *** < 0.001). https://doi.org/10.1371/journal.pbio.3003033.t005

As expected, a multi-parameter test revealed a significant main effect of condition (pcorr < 0.001; Fig 5A). Post hoc comparisons indicated that this was driven by lower accuracy on Incongruent compared to Congruent trials (pcorr < 0.001) as well as to Control trials (pcorr < 0.001), but no difference in accuracy between Congruent and Control trials (pcorr = 1.000), suggesting that Incongruent trials produced greater response conflict than the other conditions. However, there was no significant group effect or group-by-condition interaction (all pcorr = 1.000), suggesting that the groups did not differ in their overall accuracy, nor in their ability to inhibit incorrect responses on Incongruent trials.

Fig 5. No significant group differences were observed for (A) Stroop task accuracy (pcorr = 1.000); (B) Stroop task response times (pcorr = 1.000); or (C) Go/No-Go Task proportion of inhibition errors (pcorr = 1.000) (individual data points with EMMs ± SE). Underlying data for these figures and associated analyses are available from https://doi.org/10.5683/SP3/C4GWZU. EMM, estimated marginal mean.
https://doi.org/10.1371/journal.pbio.3003033.g005

Response time data were analyzed with a LMM identical in structure to that described for accuracy using the following formula: RT ~ incongruent_vs_control + incongruent_vs_congruent + group + group * incongruent_vs_control + group * incongruent_vs_congruent + (1 | participant). Table 5B summarizes the selected model’s outputs, effect tests for multi-parameter predictors, and post hoc EMM comparisons (adjusted for multiple comparisons using Tukey’s HSD). Here too, a multi-parameter test revealed the expected main effect of condition (pcorr < 0.001; Fig 5B). Post hoc comparisons indicated that this was driven by significantly longer response times on Incongruent compared to Congruent trials (pcorr = 0.014) and compared to Control trials (pcorr < 0.001), but no difference between Congruent and Control trials (pcorr = 1.000). This likely reflects the additional deliberation time needed to resolve the response conflict elicited by Incongruent trials compared to the other conditions. As with accuracy, we found no group differences, nor group-by-condition interactions (all pcorr = 1.000), suggesting that both groups responded at similar speeds across all conditions.
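The Bonferroni adjustment applied across the 7 LMMs can be illustrated with a short sketch. It assumes the conventional form of the correction (each raw p-value multiplied by the number of tests and capped at 1); the function name and the second and third input values are hypothetical, while the raw value 0.016 is the one the article reports for the Go/No-Go cue effect (pcorr = 0.112).

```python
def bonferroni(p_values, n_tests):
    """Multiply each raw p-value by the number of tests, capping the result at 1."""
    return [min(1.0, p * n_tests) for p in p_values]


N_LMMS = 7  # number of LMMs conducted in the study, per the table notes

# 0.016 is taken from the article; the other two values are hypothetical.
raw_p = [0.016, 0.0001, 0.5]
adjusted_p = bonferroni(raw_p, N_LMMS)

for p, p_corr in zip(raw_p, adjusted_p):
    print(f"p = {p:.4f} -> pcorr = {p_corr:.3f}")
```

Under this rule, any raw p-value of about 0.143 or larger is reported as pcorr = 1.000, which may be why so many non-significant effects share that value.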
Go/No-Go task Participants were administered a Cued Go/No-Go task from the literature [25] in which they were presented with a rectangle (in vertical or horizontal orientation) on each trial and were required to either make a key press in response to a “Go” cue (rectangle turning green in color) or withhold from responding following a “No-Go” cue (rectangle turning blue in color). Importantly, a vertically oriented rectangle was associated with a 4:1 Go/No-Go trial ratio whereas the horizontal rectangle had a 1:4 Go/No-Go trial ratio. As we were interested in assessing response inhibition under response conflict, we analyzed the proportion of inhibition errors participants committed on No-Go trials (i.e., the proportion of these trials on which they incorrectly produced a response). To this end, we constructed a LMM with fixed effects of group and cue (Go versus No-Go), as well as the interactions between them, and random intercepts per participant using the following formula: inhibition_errors ~ group * cue + (1 | participant). Table 6 summarizes the selected model’s outputs. As expected, No-Go trials with Go cues were associated with numerically higher error rates than No-Go trials with No-Go cues (Fig 5C), suggesting that the former elicited greater difficulties with response inhibition than the latter. This effect, however, did not survive Bonferroni correction (p = 0.016; pcorr = 0.112). There were no significant group or group-by-cue effects (both pcorr ≥ 0.882), suggesting that the groups did not differ in their ability to inhibit responses on No-Go trials, regardless of cue type.

Table 6. Go/No-Go task inhibition errors LMM results. To correct for multiple LMMs being conducted in this study (7 in total), all p-values have additionally been adjusted with a Bonferroni correction (pcorr). Significant Bonferroni corrected findings are highlighted in bold (* < 0.05; ** < 0.01, *** < 0.001). https://doi.org/10.1371/journal.pbio.3003033.t006

Discussion We compared the behavior of individuals with HPC lesions to that of healthy controls on an AAC paradigm and employed computational modeling to elucidate the underlying latent cognitive processes. Both groups approached and avoided No-Conflict positive and negative trials at similar rates, and exhibited longer response times on Conflict trials, suggesting the successful establishment of motivational conflict. Crucially, however, the individuals with hippocampal damage approached significantly more often than control participants on these trials associated with high AAC. There was limited evidence for differences in HDDM parameters across groups under No-Conflict conditions, with strong evidence for group differences on Conflict trials only. Specifically, there was strong evidence that relative to controls, individuals with hippocampal damage exhibited lower decision thresholds during AAC while controls exhibited faster evidence accumulation towards avoidance than hippocampal damage participants. The hippocampal damage group also demonstrated a stronger general positive bias, indicating a greater overall predilection towards approach decisions, regardless of condition. Taken together, our findings provide strong evidence that structural damage to the HPC potentiates approach behavior under AAC and that this is driven by an increased baseline propensity to approach, a willingness to initiate decisions with less accumulated evidence than controls, and a slower-than-typical drift toward avoidance. Although we sought to minimize mnemonic demands, participants were nevertheless required to learn and recall stimulus-valence associations in our AAC paradigm. Our findings cannot, however, be attributed to group differences in memory ability, a critical point given the role of the HPC in mnemonic processing [26,27].
Both groups showed similar accuracy in valence judgments of the individual stimulus images at the end of the learning phases and crucially, similarly approached No-Conflict Positive pairs and avoided No-Conflict Negative pairs nearly 100% of the time during the decision phase. This indicates that individuals with hippocampal damage and controls were able to appropriately recall the valences of individual images to inform their decisions and that the hippocampal damage participants’ more frequent approach behavior on Conflict trials cannot be explained by poorer stimulus valence memory (e.g., for negative stimuli). Given that the individuals with hippocampal damage in our study were selected on the basis of relatively circumscribed volume loss to the HPC, the observed AAC behavioral differences can be attributed to structural alterations to this area. Our causal findings dovetail with correlative evidence from human neuroimaging studies and lesion findings from nonhuman primates highlighting a role for the HPC in arbitrating AAC decisions [10,12,28]. Moreover, they are not inconsistent with previous findings that restricted lesioning of the rodent vHPC potentiates approach behavior [7], as the hippocampal damage participants’ volume loss was numerically slightly greater within the anterior portion of the HPC compared to the posterior portion. The current data also add considerably to existing neuropsychological studies on AAC conflict in humans, which have involved behavioral tasks with a spatial navigation component and/or a single patient participant with circumscribed bilateral HPC damage [11,13]. Notably, the present study provides novel insight into the role of the HPC in AAC by demonstrating that it is critically involved in the evidence accumulation processes that underlie AAC decision-making. One open question is why a disruption to HPC-dependent evidence accumulation leads to a potentiation of approach rather than avoidance under conditions of AAC. 
The former is suggestive of an inhibitory role of the HPC in resolving AAC, which is consistent with interpretations of previous rodent work, wherein the vHPC has been shown to be involved in anxiogenic behaviors and cost-benefit evaluations, with vHPC ablation producing disinhibited behavior [29–38]. One theoretical model suggests that the HPC acts as a comparator between current and predicted event states (e.g., outcome), and that conflict is detected when there is an incongruency between the two (e.g., reward versus punishment), resulting in preferential strengthening of representations of negative possible outcomes and subsequently in behavioral inhibition [39]. Broadly speaking, our finding of group differences in drift rates is consistent with this idea: whereas normal HPC functioning under conflict is associated with rapid integration of information to inform an avoidant decision, structural damage to the HPC appears to blunt this process, resulting in more neutral drift rates. Indeed, the strongly negative drift rates observed in controls may reflect the predominant retrieval of evidence associated with undesirable outcomes. Likewise, the finding of reduced decision thresholds under conflict within the hippocampal damage group is in keeping with a behavioral inhibition model. While our hDDM analyses offered some evidence that AAC may generally be associated with lower decision thresholds, this was especially true within the hippocampal damage group. That the individuals with hippocampal damage were initiating responses on Conflict trials with less evidence than controls may reflect a disruption of hippocampally mediated comparison of possible outcome states. Although not explicitly predicted, the finding of a greater general approach bias among individuals with hippocampal damage is not incompatible with a behavioral inhibition viewpoint and suggests that the HPC may exert some inhibitory influence even prior to the initiation of evidence accumulation. 
Although speculative, it may be that, even at baseline, the HPC holds representations of possible aversive outcomes associated with goal-relevant stimuli. Damage to the HPC may disrupt these representations and produce a more reward-seeking and less cautious disposition. It may also be that the disruption of these representations hampers the very detection of motivational conflict, as their inclusion in the comparison of possible outcome states is theoretically integral to this process. It is important to note that, while our findings are broadly consistent with a behavioral inhibition model, it is also clear that the HPC’s role in AAC processing is not fully captured by this viewpoint. For instance, ventral CA1 inactivation has been reported to increase avoidance, with CA3 inactivation increasing approach under AAC conditions [8]. Likewise, increased aHPC fMRI activity has been observed for approach behavior, which is somewhat unexpected if this region is involved in the wholesale inhibition of approach responses [12]. Furthermore, human electrophysiological work has demonstrated that firing rates in the HPC following rewarding and punishing outcomes during an AAC task predict subsequent approach and avoid decisions [40]. Finally, the HPC has also been implicated in evidence sampling to support choice behavior in the face of competing positive outcomes [41]. Taken together, these findings suggest that the involvement of the v/aHPC in AAC processing is likely more complex than the exertion of inhibitory control, and that subregions within the aHPC, along with their distinct extra-hippocampal connectivity [9], contribute differentially to AAC resolution in ways that remain to be elucidated. 
Considering prior work demonstrating that the HPC is preferentially involved in the processing of spatial and contextual stimuli, and that perirhinal cortex (PRC) is preferentially involved in the processing of objects [42–44], it is surprising that the group differences on the AAC paradigm were not specific to the Scene task, particularly considering the absence of significant PRC volume group differences (Table 1). Indeed, recent human neuroimaging and rodent optogenetic studies have found that PRC, rather than the v/aHPC, is predominantly involved in the resolution of AAC associated with discrete objects [18,19]. Although the reason for this contradiction is unknown, it may relate to findings that have highlighted that the relationship between volume loss and impaired memory in hippocampal amnesia is mediated by abnormalities in functional connectivity [45]. In a similar fashion, network abnormalities, in addition to HPC volume loss, may be contributing to the observed patterns of AAC behavior. For instance, the v/aHPC and d/pHPC and their subregions possess different patterns of connectivity [46,47] and thus, it is conceivable that the impact of HPC damage on AAC processing is dependent on the location and extent of cell loss, and the associated disruption to wider anatomical and functional networks. Due to our selection criteria for participants with hippocampal damage, we did not have sufficient numbers to analyze network functioning in this study and future research will need to address this issue by potentially incorporating a less mnemonically demanding AAC task to allow the inclusion of a greater number of hippocampal damage participants. One important question moving forwards is to what extent the present findings can be generalized to other AAC behavioral paradigms. 
For instance, while the current study used a secondary reinforcer (i.e., game points) and stimuli for which the incentive values had to be learned, many nonhuman animal AAC tasks involve innate rewards or threats, including ethological tests of anxiety or behavioral tasks involving predator stimuli. Since prior experience of innately conflicting stimuli may be very limited, it is possible that impaired conflict detection following HPC dysfunction may be the primary contributor towards increased approach behavior in these latter paradigms [10,33,48] compared to the disrupted retrieval of outcome evidence for learned AAC scenarios. With respect to human AAC behavior, a broad range of tasks have been used including human adaptations of rodent ethological tests of anxiety such as the open field test [11,49] and elevated plus maze [50], operant conflict tasks in which a lever/button press can be associated with both reward and punishment [51–53], and gambling-like tasks [28], to name a few. While there are fundamental characteristics that are shared between these tasks (e.g., the possibility of receiving reward and punishment), it is also evident that there are clear differences such as the nature of conflict elicited, and the type and schedule of reinforcers used. Whether the HPC contributes to AAC processing in a similar manner across these paradigms, therefore, remains to be investigated. Indeed, extending this line of thought more broadly, to what extent the current HPC findings are relevant to mental health disorders, in particular anxiety, is also unclear. The relationship between AAC and clinical anxiety is underexplored and likely complex, and limitations pervade the behavioral paradigms that have typically been used to study this disorder in preclinical and clinical models [54–56]. Lastly, we also administered Stroop and Go/No-Go tasks to determine whether group differences in AAC behavior extended to response conflict tasks. 
Across a number of measures, no significant group differences were found on either task, which contrasts with previous work that has reported greater HPC activity during high response conflict Stroop [22,57], Flanker, and Garner filter interference [58] trials, as well as higher response error rates on high response conflict Stroop trials, which were correlated with right HPC volume loss in medial temporal lobe epilepsy (MTLE) patients [23]. Although accounting for this discrepancy is beyond the aims of this study, it is worth noting that greater HPC activity during response conflict may reflect, at least in part, greater incidental memory encoding [59] (although see also [60]). In summary, our findings indicate that structural damage to the HPC in humans results in changes in AAC decision-making behavior and the associated underlying evidence accumulation processes. By assessing multiple participants with relatively circumscribed HPC lesions and leveraging computational modeling techniques, our study provides novel, robust causal evidence of a role for the HPC in arbitrating AAC behavior. Materials and methods Participants Ten participants were recruited as part of the hippocampal damage group. One of these participants has medial temporal lobe epilepsy (MTLE) with sclerosis to the HPC (Table 1) and mild amnesia. The other 9 individuals had previously participated in a larger neuroimaging study [45] following a rare form of autoimmune limbic encephalitis (aLE) that is associated with an increase in antibodies against the voltage-gated potassium channel (VGKC) complex [61]. These aLE patients were selected for this study on the basis of their relatively circumscribed focal HPC atrophy (Table 1) and their mild amnestic profile as captured by standard neuropsychological tests (Table 2). Two of these patients, however, struggled with the mnemonic demands of the experimental tasks and were therefore excluded. 
Of the remaining 7 aLE patients, one patient’s data set was excluded from analyses for the Scene AAC task due to a failure to learn the stimulus valences and another was removed from the Stroop task due to a misunderstanding of task instructions—the data for these patients were otherwise retained for all other analyses. Twenty-six adults were recruited as healthy control participants (2 to 3 age-matched participants per participant with hippocampal damage) and were required to have normal or corrected-to-normal vision and hearing, and no previous or current neurological condition or traumatic brain injury. One individual had recently recovered from a stroke and was, therefore, ineligible to be included. One control data set was excluded from each of the Scene and Object AAC tasks due to a failure to learn stimulus valences, and one data set was removed from each of the Stroop and the Go/No-Go tasks due to a misunderstanding of task instructions—all other data sets from these control participants were otherwise included. Our final sample, therefore, comprised 8 individuals with hippocampal damage (7 male, 1 female; Age: M = 63.90, SD = 8.32; IQ: M = 109.00, SD = 14.90) and 25 control participants (14 male, 11 female; Age: M = 68.5, SD = 9.24; IQ: M = 107.84, SD = 11.10). The groups did not differ significantly in either age (t(13.02) = 1.34, p = 0.204) or IQ as measured by the Wechsler Abbreviated Scale of Intelligence, Second Edition (WASI-II) [62] (t(9.63) = 0.20, p = 0.844). Experimental task data were collected from the individuals with hippocampal damage over the course of a single session, either in their homes or at the John Radcliffe Hospital in Oxford, United Kingdom. Control participants underwent 2 testing sessions on separate days to collect experimental and background neuropsychological task data. Due to COVID-19–related restrictions, 13 controls completed one or both testing sessions virtually, during a synchronous Zoom session (https://zoom.us). 
The remaining 12 participants were tested in person at the University of Toronto Scarborough campus. All participants gave informed written consent prior to participation and received monetary compensation plus travel/parking expenses, if applicable. This study received ethical approval from the University of Toronto Research Ethics Board (#26827) and the South Central Oxford Research Ethics Committee (#08/H0606/133) and was conducted in accordance with the ethical principles expressed in the Declaration of Helsinki. Background neuropsychology The individuals with hippocampal damage had previously undergone a standard neuropsychological battery (aLE patient data from [45]). A comparable battery was devised for the control group, which included subtests selected from the following neuropsychology tests: the Montreal Cognitive Assessment (MoCA) [63], the Wechsler Memory Scale, Fourth Edition (WMS-IV) [65], the Doors and People test battery (D and P) [66], the Rey Complex Figure Task (RCFT) [67], the WASI-II [62], and the Visual Object Spatial Perception Battery (VOSP) [68]. Due to technical difficulties and 2 participants not returning for a second session, not all control participants completed all neuropsychology tests. Considering performance across neuropsychology subtests, all controls retained in the sample were deemed neurologically healthy, performing in aggregate in the “Average” to “Superior” range across all subtests (Table 2). Experimental procedure Participants completed 2 experimental AAC tasks (Object and Scene) and 2 response conflict tasks (Stroop and Go/No-Go) within the same testing session, the order of which was counter-balanced across participants. Control participants additionally completed the MoCA before the experimental tasks in the first session, and the remaining neuropsychological tests after the experimental tasks in the remainder of the first session and in the second session. 
In-person versus online testing sessions Data collection was interrupted by the COVID-19 pandemic, which precluded in-person testing. The testing protocol was therefore adapted to allow for synchronous remote experimental sessions. For in-person testing, we programmed AAC tasks using E-Studio version 2.0.10.252 from the E-Prime 2 Professional Suite (https://pstnet.com) while the Stroop and Go/No-Go tasks were programmed and administered in Inquisit version 5.0.12.0 (https://www.millisecond.com). Tasks were administered on a 12-inch Lenovo ThinkPad X240 laptop (2.2 GHz Intel Core i7-4600U processor; 8GB RAM; 1,366 × 768-pixel monitor) and occupied a 1,024 × 768-pixel window on-screen. For synchronous remote data collection, participants connected with the experimenter via Zoom videoconferencing software for the duration of both sessions. AAC, Stroop, and Go/No-Go tasks were re-created using PsychoPy Experiment Builder v2020.2.8 [69] and administered in participants’ browsers through Pavlovia (https://pavlovia.org). Neuropsychology tasks were adapted to replicate in-person testing conditions as faithfully as possible. Images normally presented in stimulus books were scanned and presented using Microsoft PowerPoint (https://www.microsoft.com) via screen-sharing. If participants were required to draw as part of a task, the image was held up to the camera and screen-captured by the experimenter, and then mailed or scanned and emailed to the experimenter. Approach-avoidance conflict tasks AAC processing was assessed using 2 different tasks (Object and Scene) to determine whether HPC-mediated AAC processing effects are specific to certain classes of complex everyday stimuli, based on prior work showing regional specificity in visual stimulus class processing [42–44], which may extend to AAC processing [18,19].
These tasks were simplified versions of previously used AAC paradigms [12,16,18] and were identical in structure—consisting of an initial learning phase and a subsequent decision phase—but differed in the stimuli presented (Fig 1A). Both tasks consisted of 3 learning blocks of 40 trials each, and 3 test blocks of 36 trials each. On the Object task, the stimulus images depicted unfamiliar computer-generated objects (in-person image dimensions = 384 × 288 pixels; remote dimensions = 50% × 37% participants’ display height, DH). On the Scene task, the stimuli consisted of photographs of real-world unfamiliar scenes that were easily recognizable, but otherwise nondescript (i.e., no famous landmarks or monuments, or people present; in-person dimensions = 653 × 357 pixels; remote dimensions = 85% × 46% DH). Participants were given instructions verbally and via written text using Microsoft PowerPoint prior to each task, as well as a text-based refresher of the instructions prior to each block of trials (on-screen text: black Courier New font, size 18, over white background). On all blocks across both tasks, participants were instructed to earn as many game points as possible. In the learning phases, participants were required to learn the valences associated with stimulus images (positive or negative) through trial-and-error. To minimize mnemonic demands on these tasks, participants learned only 4 stimulus-valence associations per task (i.e., 2 per valence; Fig 1A). On each learning phase trial (Fig 1B), a single stimulus image was presented in the center of the screen, enclosed in a black line border (live testing: border size = 768 × 576 pixels, line size = 10 pixels; remote testing: 100% × 75% DH, line size = 10 pixels). Participants chose whether to approach or to avoid the image by pressing the “1” or “2” keys, respectively. Image-valence associations were arbitrary and predetermined. 
Approaching a positive image resulted in a gain of 100 game points, while avoiding it resulted in no gain in points. Approaching a negative image resulted in a loss of 100 points, while avoiding it resulted in no loss of points. Therefore, to maximize their score, participants needed to approach positive images, and avoid negative ones. Stimulus images remained on-screen until participants responded, after which (latency = 500 ms) participants were shown text-based feedback on the outcome of their response (duration = 1,500 ms), which included the number of points lost/gained, a running sum of their score, and, if they avoided, a message indicating whether they had correctly avoided a negative image (“Good”) or had mistakenly avoided a positive image (“Miss”). A 1,000 ms inter-stimulus interval consisting of a white screen then ensued. The 4 stimulus images were presented in a random order in mini blocks of 4, for a total of 30 times each (10 per block of 40 trials) in the learning phase. In the decision phase (Fig 1C), stimuli from the learning phase were combined into 3 possible pairs: No-Conflict Positive (2 positive images) (12 trials per block), No-Conflict Negative (2 negative images) (12 trials per block), or Conflict (1 image of each valence) (12 trials per block). Trials consisted of the concurrent presentation of 2 images, enclosed in a black line border (dimensions identical to those described above). Image pairs remained onscreen until participants chose to approach/avoid them via a key press, after which a 1,000 ms ISI (white screen) occurred. Participants were told that approaching a No-Conflict Positive pair or a No-Conflict Negative pair would result in a gain or loss of 100 points, respectively, and that avoiding any pair would result in no change in score. As such, the appropriate responses were to approach No-Conflict Positive pairs and to avoid No-Conflict Negative Pairs. 
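The scoring rules described so far (single images in the learning phase and No-Conflict pairs in the decision phase) can be written out as a small sketch. The function and label names are hypothetical and are not taken from the authors' task code; Conflict pairs, whose outcome is described next, are deliberately left out.

```python
POINTS = 100  # points gained or lost on an approach response


def trial_outcome(valence, response):
    """Points earned for one image or one No-Conflict pair.

    valence:  'positive' or 'negative'
    response: 'approach' or 'avoid'
    """
    if response == "avoid":
        return 0  # avoiding never changes the score
    return POINTS if valence == "positive" else -POINTS


def avoid_feedback(valence, response):
    """Learning-phase message shown after an avoid response, else None."""
    if response != "avoid":
        return None
    return "Good" if valence == "negative" else "Miss"


# Hypothetical mini block of 4 learning trials (2 images per valence).
mini_block = [
    ("positive", "approach"),
    ("negative", "approach"),
    ("negative", "avoid"),
    ("positive", "avoid"),
]

running_score = 0
for valence, response in mini_block:
    running_score += trial_outcome(valence, response)
    print(valence, response, running_score, avoid_feedback(valence, response))
```

The score-maximizing policy under these rules is to approach every positive item and avoid every negative one, as the text states.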
Participants were told that approaching on Conflict trials was equally likely to result in either a gain or loss of 100 points. Participants were therefore instructed to decide whether to approach, weighing the risk of losing points with the possibility of gaining points. They were also told that to maximize their score, they would have to approach on at least some Conflict trials. However, participants’ scores were not tracked. To prevent learning and consistent with our previous work [12,16,18], no feedback was given following their responses during the decision phase. Response conflict tasks Stroop task. Participants completed a computer-based version of the Stroop task [24]. On every trial, either a word spelling out the name of a color (live and remote testing: font size = 7% DH) or a colored rectangle (live and remote testing: dimensions = 20% × 10% DH) appeared on the screen. The word could appear in either green, red, blue, or black lettering. Participants were instructed to indicate the color of the lettering (while ignoring the name of the color the word spelled) or the rectangle by pressing a corresponding key (“f,” “d,” “j,” and “k,” respectively). There were 3 conditions. In the Congruent condition, the color name and the color of the lettering were the same (e.g., “green” spelled in green letters). In the Incongruent condition, the color name and the color of the lettering were different (e.g., “black” spelled in red letters). Trials with rectangles were the Control condition. Participants completed 4 trials in each of the 3 conditions, over 7 repetitions, for a total of 28 trials per condition, and 84 trials overall. If the participant responded correctly, the next trial was immediately initiated (ITI = 200 ms). If they responded incorrectly, a red “X” flashed in the center of the screen (duration = 400 ms). 
Participants were given as much time as they needed to respond but were instructed to work as quickly as they could, while making as few mistakes as possible. Cued Go/No-Go task. We implemented a Cued Go/No-Go task from the literature [25]. On every trial, participants viewed a fixation cross (duration = 500 ms; live and remote dimensions = 10% × 10% DH), followed by a white rectangle (i.e., the cue; delay = 300 ms). The rectangle appeared in either a horizontal or vertical orientation (live and remote dimensions = 30% × 10% DH or vice-versa, depending on orientation). After a brief pause (SOA = [100, 500 ms]), the rectangle turned either green or blue. If the rectangle turned green, participants were instructed to press the spacebar as quickly as possible. If the rectangle turned blue, they were instructed to do nothing and wait for the next trial (ITI = 400 ms). Trials ended after 700 ms, if participants did not respond. The orientation of the cue rectangle related to the likelihood of a Go or No-Go trial: vertical cues had a 4:1 Go/No-Go trial ratio while horizontal cues had a 1:4 Go/No-Go trial ratio. Participants completed 250 trials. Statistical analyses Linear mixed models. We analyzed individual trial choice and response time data for the AAC, Stroop, and Go/No-Go tasks using LMMs, as implemented by the lme4 package [70] in R 4.0.4 (R Core Team, 2021) (to be precise, generalized LMMs were used for choice data given their categorical nature, e.g., approach versus avoid, but will be collectively referred to as LMMs for simplicity). Besides the consideration of random effects, the use of LMMs allowed us to account for the unbalanced nature of our groups, potential inequalities in variance between groups, and the fact that some participants’ data were excluded for one of the AAC tasks. LMMs were constructed iteratively, such that the greatest number of desired random effects were modeled, while achieving convergence and appropriate fit. 
We constructed models based on the protocol described in [71], evaluating the most complex models first. When a model converged and showed appropriate fit, it was compared both to a purely fixed effects model and to a simplified nested mixed effects model with a likelihood ratio test to determine whether the inclusion of random effects significantly improved model fit. In cases where multiple candidate models converged and showed appropriate fit, they were compared to one another with a likelihood ratio test, and the model with the best fit was selected. In cases where model fit did not differ significantly, the model of highest theoretical interest (i.e., with the highest number of random effects) was chosen. Since all our predictors were categorical and we modeled the interactions between them, a deviation coding scheme was adopted as this facilitates the interpretation of main effects [72]. In models of AAC decision phase and Stroop task data, the predictors for trial condition each comprised 3 levels (No-Conflict Positive, No-Conflict Negative, Conflict; Congruent, Control, Incongruent) and therefore were coded using 2 dummy variables. In the AAC decision phase data, these represented contrasts between the Conflict condition and one of the No-Conflict conditions. In the Stroop task, these represented contrast between the Incongruent condition and the Congruent and Control conditions. Because the main and interaction effects involving condition in these analyses were represented by multiple model terms, we evaluated significant model predictors involving condition by performing multi-parameter tests, wherein, using a likelihood ratio test, the selected model was compared to an identical model with the terms for the relevant effects removed. If the specified model showed better fit than its counterpart with the relevant effect terms removed, the main or interaction effect was considered significant. 
Significant main effects and interactions were probed via post hoc comparisons between relevant EMMs using the emmeans package [73] and correcting p-values for multiple comparisons using Tukey’s HSD. Lastly, since multiple LMMs were conducted across multiple behavioral tasks and response measures (7 LMMs in total), we accounted for the increased probability of Type I errors occurring by additionally adjusting all p-values using a Bonferroni correction (i.e., p × 7). Both uncorrected and corrected p-values are reported in the statistical tables (Tables 3–6). Hierarchical drift diffusion modeling. In addition to analyzing rates of approach responding, we were interested in whether the underlying decision-making processes differed across groups. To this end, we employed the hDDM. DDM approaches are a type of sequential sampling techniques, which have long been used to model two-choice behavior and response times [74]. These models assume that decision-making occurs by means of accumulation of noisy information (i.e., evidence) about the stimulus. In DDMs, evidence is continuously evaluated while it is collected, until sufficient information has been gathered to cross a decision threshold. DDMs produce parameter estimates for decision thresholds (a), the rate at which evidence toward either decision threshold is accumulated (i.e., drift rate; v), and the distance between the information gathering “starting point” and either threshold (i.e., bias; z). Additionally, these models provide estimates of non-decision time (t), or time between the presentation of the stimulus and the initiation of evidence accumulation. HDDM methods hold a significant advantage over traditional DDMs, in that they handle nested data structures more effectively. 
They also use Bayesian inference to produce full posterior distributions of parameter estimates, providing both the most likely value for a given estimate (i.e., the distribution’s mode) and an indication of the relative certainty of the estimate (i.e., distribution’s spread). We used the hDDM software package in Python v3.8.10 (Python Software Foundation, 2021), which allows flexible construction of hDDM models, and which uses the PyMC package [75] to generate parameter distributions [21]. We generated a single model incorporating both groups and collapsing across Object and Scene tasks since we did not observe a significant effect of stimulus type in our LMM analyses of choice and response time data (see Results). Model selection was based on an iterative process, wherein the most complex models were evaluated first using several methods to assess model fit and convergence. These included visual inspection of parameter estimates posterior distribution and trace plots, examining whether parameter Gelman–Rubin values fell below a specified cut-off of 1.01 (with = 1 representing perfect convergence) and comparing each parameter’s Monte Carlo (MC) error statistic, to its posterior distribution’s standard deviation, with MC error values less than 1% of the posterior considered to show poor convergence. For each specified model, we generated 55,000 samples from posterior distributions, of which 5,000 were discarded. Of the remaining 50,000 samples, we saved one of every five, yielding a net sampling trace length of 10,000 samples. The most complex and theoretically desirable model estimated separate parameters per trial condition for each participant. If a model did not achieve convergence, it was simplified (i.e., by assigning parameters with poor convergence indices to be estimated only at the group level) and re-evaluated. 
The selected model was compared to alternative models (i.e., with fewer parameters estimated per participant and per condition) to determine whether the inclusion of additional parameter estimates improved model fit. We assessed relative fit across models by comparing their DIC values. DIC is a Bayesian measure of model fit and is defined as a classical estimate of fit plus twice the effective number of parameters [76]. Lower DIC values indicate better fit, and models whose values differ by 10 or more are considered to show significantly different fit. To test hypotheses about differences in parameter estimates within and between groups, we examined the overlap between their posterior distributions to determine the probability, denoted by P, that a value drawn from either distribution was greater than the other. To illustrate, a hypothesis test of PPositive > PNegative = 0.493 indicates that a randomly drawn value from the posterior distribution for No-Conflict Positive trials has a 49.3% probability of being greater than a value drawn from the posterior distribution for the No-Conflict Negative condition. In other words, the estimated distributions are sufficiently overlapping as to provide little evidence for a difference between No-Conflict Positive and Negative trials. Participants Ten participants were recruited as part of the hippocampal damage group. One of these participants has medial temporal lobe epilepsy (MTLE) with sclerosis to the HPC (Table 1) and mild amnesia. The other 9 individuals had previously participated in a larger neuroimaging study [45] following a rare form of autoimmune limbic encephalitis (aLE) that is associated with an increase in antibodies against the voltage-gated potassium channel (VGKC) complex [61]. These aLE patients were selected for this study on the basis of their relatively circumscribed focal HPC atrophy (Table 1) and their mild amnestic profile as captured by standard neuropsychological tests (Table 2). 
Two of these patients, however, struggled with the mnemonic demands of the experimental tasks and were therefore excluded. Of the remaining 7 aLE patients, one patient’s data set was excluded from analyses for the Scene AAC task due to a failure to learn the stimulus valences and another was removed from the Stroop task due to a misunderstanding of task instructions—the data for these patients were otherwise retained for all other analyses. Twenty-six adults were recruited as healthy control participants (2 to 3 age-matched participants per participant with hippocampal damage) and were required to have normal or corrected-to-normal vision and hearing, and no previous or current neurological condition or traumatic brain injury. One individual had recently recovered from a stroke and was, therefore, ineligible to be included. One control data set was excluded from each of the Scene and Object AAC tasks due to a failure to learn stimulus valences, and one data set was removed from each of the Stroop and the Go/No-Go tasks due to a misunderstanding of task instructions—all other data sets from these control participants were otherwise included. Our final sample, therefore, comprised 8 individuals with hippocampal damage (7 male, 1 female; Age: M = 63.90, SD = 8.32; IQ: M = 109.00, SD = 14.90) and 25 control participants (14 male, 11 female; Age: M = 68.5, SD = 9.24; IQ: M = 107.84, SD = 11.10). The groups did not differ significantly in either age (t(13.02) = 1.34, p = 0.204) or IQ as measured by the Wechsler Abbreviated Scale of Intelligence, Second Edition (WASI-II) [62] (t(9.63) = 0.20, p = 0.844). Experimental task data were collected from the individuals with hippocampal damage over the course of a single session, either in their homes or at the John Radcliffe Hospital in Oxford, United Kingdom. Control participants underwent 2 testing sessions on separate days to collect experimental and background neuropsychological task data. 
Due to COVID-19–related restrictions, 13 controls completed one or both testing sessions virtually, during a synchronous Zoom session (https://zoom.us). The remaining 12 participants were tested in person at the University of Toronto Scarborough campus. All participants gave informed written consent prior to participation and received monetary compensation plus travel/parking expenses, if applicable. This study received ethical approval from the University of Toronto Research Ethics Board (#26827) and the South Central Oxford Research Ethics Committee (#08/H0606/133) and was conducted in accordance with the ethical principles expressed in the Declaration of Helsinki. Background neuropsychology The individuals with hippocampal damage had previously undergone a standard neuropsychological battery (aLE patient data from [45]). A comparable battery was devised for the control group, which included subtests selected from the following neuropsychology tests: the Montreal Cognitive Assessment (MoCA) [63], the Wechsler Memory Scale, Fourth Edition (WMS-IV) [65], the Doors and People test battery (D and P) [66], the Rey Complex Figure Task (RCFT) [67], the WASI-II [62], and the Visual Object Spatial Perception Battery (VOSP) [68]. Due to technical difficulties and 2 participants not returning for a second session, not all control participants completed all neuropsychology tests. Considering performance across neuropsychology subtests, all controls retained in the sample were deemed neurologically healthy, performing in aggregate in the “Average” to “Superior” range across all subtests (Table 2). Experimental procedure Participants completed 2 experimental AAC tasks (Object and Scene) and 2 response conflict tasks (Stroop and Go/No-Go) within the same testing session, the order of which was counter-balanced across participants. 
Control participants additionally completed the MoCA before the experimental tasks in the first session, and the remaining neuropsychological tests after the experimental tasks in the remainder of the first session and in the second session. In-person versus online testing sessions Data collection was interrupted by the COVID-19 pandemic, which precluded in-person testing. The testing protocol was therefore adapted to allow for synchronous remote experimental sessions. For in-person testing, we programmed AAC tasks using E-Studio version 2.0.10.252 from the E-Prime 2 Professional Suite (https://pstnet.com) while the Stroop and Go/No-Go tasks were programmed and administered in Inquisit version 5.0.12.0 (https://www.millisecond.com). Tasks were administered on a 12-inch Lenovo ThinkPad X240 laptop (2.2 GHz Intel Core i7-4600U processor; 8 GB RAM; 1,366 × 768-pixel monitor) and occupied a 1,024 × 768-pixel window on-screen. For synchronous remote data collection, participants connected with the experimenter via Zoom videoconferencing software for the duration of both sessions. AAC, Stroop, and Go/No-Go tasks were re-created using PsychoPy Experiment Builder v2020.2.8 [69] and administered in participants’ browsers through Pavlovia (https://pavlovia.org). Neuropsychology tasks were adapted to replicate in-person testing conditions as faithfully as possible. Images normally presented in stimulus books were scanned and presented using Microsoft PowerPoint (https://www.microsoft.com) via screen-sharing. If participants were required to draw as part of a task, the image was held up to the camera and screen-captured by the experimenter, and then mailed or scanned and emailed to the experimenter. 
Approach-avoidance conflict tasks AAC processing was assessed using 2 different tasks (Object and Scene) to determine whether HPC-mediated AAC processing effects are specific to certain classes of complex everyday stimuli, based on prior work showing regional specificity in visual stimulus class processing [42–44], which may extend to AAC processing [18,19]. These tasks were simplified versions of previously used AAC paradigms [12,16,18] and were identical in structure—consisting of an initial learning phase and a subsequent decision phase—but differed in the stimuli presented (Fig 1A). Both tasks consisted of 3 learning blocks of 40 trials each, and 3 test blocks of 36 trials each. On the Object task, the stimulus images depicted unfamiliar computer-generated objects (in-person image dimensions = 384 × 288 pixels; remote dimensions = 50% × 37% participants’ display height, DH). On the Scene task, the stimuli consisted of photographs of real-world unfamiliar scenes that were easily recognizable, but otherwise nondescript (i.e., no famous landmarks or monuments, or people present; in-person dimensions = 653 × 357 pixels; remote dimensions = 85% × 46% DH). Participants were given instructions verbally and via written text using Microsoft PowerPoint prior to each task, as well as a text-based refresher of the instructions prior to each block of trials (on-screen text: black Courier New font, size 18, over white background). On all blocks across both tasks, participants were instructed to earn as many game points as possible. In the learning phases, participants were required to learn the valences associated with stimulus images (positive or negative) through trial-and-error. To minimize mnemonic demands on these tasks, participants learned only 4 stimulus-valence associations per task (i.e., 2 per valence; Fig 1A). 
On each learning phase trial (Fig 1B), a single stimulus image was presented in the center of the screen, enclosed in a black line border (live testing: border size = 768 × 576 pixels, line size = 10 pixels; remote testing: 100% × 75% DH, line size = 10 pixels). Participants chose whether to approach or to avoid the image by pressing the “1” or “2” keys, respectively. Image-valence associations were arbitrary and predetermined. Approaching a positive image resulted in a gain of 100 game points, while avoiding it resulted in no gain in points. Approaching a negative image resulted in a loss of 100 points, while avoiding it resulted in no loss of points. Therefore, to maximize their score, participants needed to approach positive images, and avoid negative ones. Stimulus images remained on-screen until participants responded, after which (latency = 500 ms) participants were shown text-based feedback on the outcome of their response (duration = 1,500 ms), which included the number of points lost/gained, a running sum of their score, and, if they avoided, a message indicating whether they had correctly avoided a negative image (“Good”) or had mistakenly avoided a positive image (“Miss”). A 1,000 ms inter-stimulus interval consisting of a white screen then ensued. The 4 stimulus images were presented in a random order in mini blocks of 4, for a total of 30 times each (10 per block of 40 trials) in the learning phase. In the decision phase (Fig 1C), stimuli from the learning phase were combined into 3 possible pairs: No-Conflict Positive (2 positive images) (12 trials per block), No-Conflict Negative (2 negative images) (12 trials per block), or Conflict (1 image of each valence) (12 trials per block). Trials consisted of the concurrent presentation of 2 images, enclosed in a black line border (dimensions identical to those described above). 
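The learning-phase rules described above (mini-blocks in which each of the 4 images appears once in random order, a gain or loss of 100 points for approaching, and no change for avoiding) can be sketched as follows. This is a minimal illustration, not the task code used in the study: the image labels, function names, and random seed are hypothetical.

```python
import random

POINTS = 100  # points gained or lost on an approach response, per the task description


def learning_outcome(valence, response):
    """Return the point change for one learning-phase trial.

    valence is "positive" or "negative"; response is "approach" or "avoid".
    """
    if response == "avoid":
        return 0  # avoiding never changes the score
    return POINTS if valence == "positive" else -POINTS


def learning_block(images, n_mini_blocks=10, rng=None):
    """Build one learning block: every mini-block shows each image once, shuffled."""
    rng = rng or random.Random()
    trials = []
    for _ in range(n_mini_blocks):
        mini_block = list(images)
        rng.shuffle(mini_block)
        trials.extend(mini_block)
    return trials


# Hypothetical image labels; the real stimuli were objects or scenes.
IMAGES = {
    "pos_1": "positive",
    "pos_2": "positive",
    "neg_1": "negative",
    "neg_2": "negative",
}

block = learning_block(list(IMAGES), rng=random.Random(0))

# A participant who has learned the valences approaches positive images
# and avoids negative ones on every trial.
ideal_score = sum(
    learning_outcome(IMAGES[img], "approach" if IMAGES[img] == "positive" else "avoid")
    for img in block
)
print(len(block), ideal_score)
```

With 10 mini-blocks of 4 images, one block contains 40 trials and each image appears 10 times, matching the counts given in the text.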
Image pairs remained on-screen until participants chose to approach/avoid them via a key press, after which a 1,000 ms ISI (white screen) occurred. Participants were told that approaching a No-Conflict Positive pair or a No-Conflict Negative pair would result in a gain or loss of 100 points, respectively, and that avoiding any pair would result in no change in score. As such, the appropriate responses were to approach No-Conflict Positive pairs and to avoid No-Conflict Negative pairs. Participants were told that approaching on Conflict trials was equally likely to result in either a gain or loss of 100 points. Participants were therefore instructed to decide whether to approach, weighing the risk of losing points against the possibility of gaining points. They were also told that to maximize their score, they would have to approach on at least some Conflict trials. However, participants’ scores were not tracked. To prevent learning, and consistent with our previous work [12,16,18], no feedback was given following responses during the decision phase. Response conflict tasks Stroop task. Participants completed a computer-based version of the Stroop task [24]. On every trial, either a word spelling out the name of a color (live and remote testing: font size = 7% DH) or a colored rectangle (live and remote testing: dimensions = 20% × 10% DH) appeared on the screen. The word could appear in green, red, blue, or black lettering. Participants were instructed to indicate the color of the lettering (while ignoring the name of the color the word spelled) or the rectangle by pressing a corresponding key (“f,” “d,” “j,” and “k,” respectively). There were 3 conditions. In the Congruent condition, the color name and the color of the lettering were the same (e.g., “green” spelled in green letters). In the Incongruent condition, the color name and the color of the lettering were different (e.g., “black” spelled in red letters). 
Trials with rectangles were the Control condition. Participants completed 4 trials in each of the 3 conditions, over 7 repetitions, for a total of 28 trials per condition, and 84 trials overall. If the participant responded correctly, the next trial was immediately initiated (ITI = 200 ms). If they responded incorrectly, a red “X” flashed in the center of the screen (duration = 400 ms). Participants were given as much time as they needed to respond but were instructed to work as quickly as they could, while making as few mistakes as possible. Cued Go/No-Go task. We implemented a Cued Go/No-Go task from the literature [25]. On every trial, participants viewed a fixation cross (duration = 500 ms; live and remote dimensions = 10% × 10% DH), followed by a white rectangle (i.e., the cue; delay = 300 ms). The rectangle appeared in either a horizontal or vertical orientation (live and remote dimensions = 30% × 10% DH or vice-versa, depending on orientation). After a brief pause (SOA = [100, 500 ms]), the rectangle turned either green or blue. If the rectangle turned green, participants were instructed to press the spacebar as quickly as possible. If the rectangle turned blue, they were instructed to do nothing and wait for the next trial (ITI = 400 ms). Trials ended after 700 ms, if participants did not respond. The orientation of the cue rectangle related to the likelihood of a Go or No-Go trial: vertical cues had a 4:1 Go/No-Go trial ratio while horizontal cues had a 1:4 Go/No-Go trial ratio. Participants completed 250 trials. Statistical analyses Linear mixed models. We analyzed individual trial choice and response time data for the AAC, Stroop, and Go/No-Go tasks using LMMs, as implemented by the lme4 package [70] in R 4.0.4 (R Core Team, 2021) (to be precise, generalized LMMs were used for choice data given their categorical nature, e.g., approach versus avoid, but will be collectively referred to as LMMs for simplicity). Besides the consideration of random effects, the use of LMMs allowed us to account for the unbalanced nature of our groups, potential inequalities in variance between groups, and the fact that some participants’ data were excluded for one of the AAC tasks. LMMs were constructed iteratively, such that the greatest number of desired random effects were modeled, while achieving convergence and appropriate fit. We constructed models based on the protocol described in [71], evaluating the most complex models first. When a model converged and showed appropriate fit, it was compared both to a purely fixed effects model and to a simplified nested mixed effects model with a likelihood ratio test to determine whether the inclusion of random effects significantly improved model fit. In cases where multiple candidate models converged and showed appropriate fit, they were compared to one another with a likelihood ratio test, and the model with the best fit was selected. In cases where model fit did not differ significantly, the model of highest theoretical interest (i.e., with the highest number of random effects) was chosen. Since all our predictors were categorical and we modeled the interactions between them, a deviation coding scheme was adopted as this facilitates the interpretation of main effects [72]. 
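To make the deviation coding idea concrete, the sketch below builds sum-to-zero contrast codes for a three-level predictor such as AAC trial condition. The exact contrast weights used in the analyses are not stated here, so this assumes the common sum-to-zero form (equivalent to R's `contr.sum`), with Conflict chosen as the reference level purely for illustration. The function name `sum_contrasts` is hypothetical.

```python
def sum_contrasts(levels, reference):
    """Return {level: [codes]} for sum-to-zero (deviation) coding.

    Each non-reference level gets +1 in its own column and 0 elsewhere;
    the reference level gets -1 in every column.
    """
    others = [lvl for lvl in levels if lvl != reference]
    coding = {}
    for lvl in levels:
        if lvl == reference:
            coding[lvl] = [-1] * len(others)
        else:
            coding[lvl] = [1 if lvl == other else 0 for other in others]
    return coding


levels = ["No-Conflict Positive", "No-Conflict Negative", "Conflict"]
coding = sum_contrasts(levels, reference="Conflict")
for level, codes in coding.items():
    print(f"{level:22s} {codes}")

# Each column sums to zero across the levels, so the model intercept
# estimates the unweighted mean of the condition means rather than the
# mean of a single baseline condition.
column_sums = [sum(col) for col in zip(*coding.values())]
print(column_sums)
```

Because the codes are centered on zero, a coefficient for another predictor (for example, group) is evaluated at the average over conditions instead of at one reference condition, which is why this kind of coding eases the interpretation of main effects in models with interactions.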
In models of AAC decision phase and Stroop task data, the predictors for trial condition each comprised 3 levels (No-Conflict Positive, No-Conflict Negative, Conflict; Congruent, Control, Incongruent) and therefore were coded using 2 dummy variables. In the AAC decision phase data, these represented contrasts between the Conflict condition and one of the No-Conflict conditions. In the Stroop task, these represented contrasts between the Incongruent condition and the Congruent and Control conditions. Because the main and interaction effects involving condition in these analyses were represented by multiple model terms, we evaluated significant model predictors involving condition by performing multi-parameter tests, wherein, using a likelihood ratio test, the selected model was compared to an identical model with the terms for the relevant effects removed. If the specified model showed better fit than its counterpart with the relevant effect terms removed, the main or interaction effect was considered significant. Significant main effects and interactions were probed via post hoc comparisons between relevant EMMs using the emmeans package [73] and correcting p-values for multiple comparisons using Tukey’s HSD. Lastly, since multiple LMMs were conducted across multiple behavioral tasks and response measures (7 LMMs in total), we accounted for the increased probability of Type I errors occurring by additionally adjusting all p-values using a Bonferroni correction (i.e., p × 7). Both uncorrected and corrected p-values are reported in the statistical tables (Tables 3–6). Hierarchical drift diffusion modeling. In addition to analyzing rates of approach responding, we were interested in whether the underlying decision-making processes differed across groups. To this end, we employed the hDDM. DDM approaches are a type of sequential sampling technique, which have long been used to model two-choice behavior and response times [74]. 
These models assume that decision-making occurs by means of accumulation of noisy information (i.e., evidence) about the stimulus. In DDMs, evidence is continuously evaluated while it is collected, until sufficient information has been gathered to cross a decision threshold. DDMs produce parameter estimates for decision thresholds (a), the rate at which evidence toward either decision threshold is accumulated (i.e., drift rate; v), and the distance between the information gathering “starting point” and either threshold (i.e., bias; z). Additionally, these models provide estimates of non-decision time (t), or time between the presentation of the stimulus and the initiation of evidence accumulation. HDDM methods hold a significant advantage over traditional DDMs, in that they handle nested data structures more effectively. They also use Bayesian inference to produce full posterior distributions of parameter estimates, providing both the most likely value for a given estimate (i.e., the distribution’s mode) and an indication of the relative certainty of the estimate (i.e., distribution’s spread). We used the hDDM software package in Python v3.8.10 (Python Software Foundation, 2021), which allows flexible construction of hDDM models, and which uses the PyMC package [75] to generate parameter distributions [21]. We generated a single model incorporating both groups and collapsing across Object and Scene tasks since we did not observe a significant effect of stimulus type in our LMM analyses of choice and response time data (see Results). Model selection was based on an iterative process, wherein the most complex models were evaluated first using several methods to assess model fit and convergence. 
These included visual inspection of parameter estimates’ posterior distributions and trace plots, examining whether parameter Gelman–Rubin values fell below a specified cut-off of 1.01 (with a value of 1 representing perfect convergence), and comparing each parameter’s Monte Carlo (MC) error statistic to its posterior distribution’s standard deviation, with MC error values greater than 1% of the posterior standard deviation considered to show poor convergence. For each specified model, we generated 55,000 samples from posterior distributions, of which 5,000 were discarded. Of the remaining 50,000 samples, we saved one of every five, yielding a net sampling trace length of 10,000 samples. The most complex and theoretically desirable model estimated separate parameters per trial condition for each participant. If a model did not achieve convergence, it was simplified (i.e., by assigning parameters with poor convergence indices to be estimated only at the group level) and re-evaluated. The selected model was compared to alternative models (i.e., with fewer parameters estimated per participant and per condition) to determine whether the inclusion of additional parameter estimates improved model fit. We assessed relative fit across models by comparing their DIC values. DIC is a Bayesian measure of model fit and is defined as a classical estimate of fit plus twice the effective number of parameters [76]. Lower DIC values indicate better fit, and models whose values differ by 10 or more are considered to show significantly different fit. To test hypotheses about differences in parameter estimates within and between groups, we examined the overlap between their posterior distributions to determine the probability, denoted by P, that a value drawn from one distribution was greater than a value drawn from the other. 
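The posterior comparison just described can be sketched as the proportion of sample pairs in which a draw from one posterior exceeds a draw from the other. The code below is a minimal illustration under stated assumptions: the traces are simulated normal samples standing in for real posterior traces, the all-pairs comparison is one simple way to estimate P (it is not necessarily the computation used in the study), and the name `prob_greater` is hypothetical.

```python
import random


def prob_greater(samples_a, samples_b):
    """Estimate P(A > B) as the share of all (a, b) pairs with a > b."""
    wins = sum(1 for a in samples_a for b in samples_b if a > b)
    return wins / (len(samples_a) * len(samples_b))


rng = random.Random(1)

# Simulated stand-ins for posterior traces; real traces would come from
# the fitted hierarchical model.
overlapping_a = [rng.gauss(0.0, 1.0) for _ in range(1000)]
overlapping_b = [rng.gauss(0.0, 1.0) for _ in range(1000)]
separated_a = [rng.gauss(3.0, 1.0) for _ in range(1000)]

p_overlap = prob_greater(overlapping_a, overlapping_b)
p_separated = prob_greater(separated_a, overlapping_b)
print(round(p_overlap, 3), round(p_separated, 3))
```

Two heavily overlapping posteriors give a value near 0.5, meaning little evidence for a difference, whereas well-separated posteriors give a value near 0 or 1.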
To illustrate, a hypothesis test of PPositive > PNegative = 0.493 indicates that a randomly drawn value from the posterior distribution for No-Conflict Positive trials has a 49.3% probability of being greater than a value drawn from the posterior distribution for the No-Conflict Negative condition. In other words, the estimated distributions are sufficiently overlapping as to provide little evidence for a difference between No-Conflict Positive and Negative trials. Linear mixed models. We analyzed individual trial choice and response time data for the AAC, Stroop, and Go/No-Go tasks using LMMs, as implemented by the lme4 package [70] in R 4.0.4 (R Core Team, 2021) (to be precise, generalized LMMs were used for choice data given their categorical nature, e.g., approach versus avoid, but will be collectively referred to as LMMs for simplicity). Besides the consideration of random effects, the use of LMMs allowed us to account for the unbalanced nature of our groups, potential inequalities in variance between groups, and the fact that some participants’ data were excluded for one of the AAC tasks. LMMs were constructed iteratively, such that the greatest number of desired random effects were modeled, while achieving convergence and appropriate fit. We constructed models based on the protocol described in [71], evaluating the most complex models first. When a model converged and showed appropriate fit, it was compared both to a purely fixed effects model and to a simplified nested mixed effects model with a likelihood ratio test to determine whether the inclusion of random effects significantly improved model fit. In cases where multiple candidate models converged and showed appropriate fit, they were compared to one another with a likelihood ratio test, and the model with the best fit was selected. In cases where model fit did not differ significantly, the model of highest theoretical interest (i.e., with the highest number of random effects) was chosen. 
Since all our predictors were categorical and we modeled the interactions between them, a deviation coding scheme was adopted, as this facilitates the interpretation of main effects [72]. In models of AAC decision phase and Stroop task data, the predictors for trial condition each comprised 3 levels (No-Conflict Positive, No-Conflict Negative, Conflict; Congruent, Control, Incongruent) and were therefore coded using 2 dummy variables. In the AAC decision phase data, these represented contrasts between the Conflict condition and each of the No-Conflict conditions. In the Stroop task, these represented contrasts between the Incongruent condition and the Congruent and Control conditions. Because the main and interaction effects involving condition in these analyses were represented by multiple model terms, we evaluated model predictors involving condition by performing multi-parameter tests, wherein the selected model was compared, using a likelihood ratio test, to an identical model with the terms for the relevant effect removed. If the specified model showed better fit than its counterpart with the relevant effect terms removed, the main or interaction effect was considered significant. Significant main effects and interactions were probed via post hoc comparisons between relevant EMMs using the emmeans package [73], correcting p-values for multiple comparisons using Tukey’s HSD. Lastly, since multiple LMMs were fitted across multiple behavioral tasks and response measures (7 LMMs in total), we accounted for the increased probability of Type I errors by additionally adjusting all p-values using a Bonferroni correction (i.e., p × 7). Both uncorrected and corrected p-values are reported in the statistical tables (Tables 3–6). Hierarchical drift diffusion modeling. In addition to analyzing rates of approach responding, we were interested in whether the underlying decision-making processes differed across groups. 
To this end, we employed the hDDM. DDMs are a class of sequential sampling models, which have long been used to model two-choice behavior and response times [74]. These models assume that decision-making occurs through the accumulation of noisy information (i.e., evidence) about the stimulus. In DDMs, evidence is continuously evaluated while it is collected, until sufficient information has been gathered to cross a decision threshold. DDMs produce parameter estimates for decision thresholds (a), the rate at which evidence toward either decision threshold is accumulated (i.e., drift rate; v), and the distance between the information gathering “starting point” and either threshold (i.e., bias; z). Additionally, these models provide estimates of non-decision time (t), or the time between the presentation of the stimulus and the initiation of evidence accumulation. hDDM methods hold a significant advantage over traditional DDMs in that they handle nested data structures more effectively. They also use Bayesian inference to produce full posterior distributions of parameter estimates, providing both the most likely value for a given estimate (i.e., the distribution’s mode) and an indication of the relative certainty of the estimate (i.e., the distribution’s spread). We used the hDDM software package in Python v3.8.10 (Python Software Foundation, 2021), which allows flexible construction of hDDM models and uses the PyMC package [75] to generate parameter distributions [21]. We generated a single model incorporating both groups and collapsing across Object and Scene tasks, since we did not observe a significant effect of stimulus type in our LMM analyses of choice and response time data (see Results). Model selection was based on an iterative process, wherein the most complex models were evaluated first using several methods to assess model fit and convergence. 
These included visual inspection of the posterior distributions and trace plots of parameter estimates, checking whether each parameter’s Gelman–Rubin statistic fell below a specified cut-off of 1.01 (with a value of 1 representing perfect convergence), and comparing each parameter’s Monte Carlo (MC) error statistic to the standard deviation of its posterior distribution, with MC error values less than 1% of the posterior standard deviation considered to show adequate convergence. For each specified model, we generated 55,000 samples from the posterior distributions, of which 5,000 were discarded. Of the remaining 50,000 samples, we saved one of every five, yielding a net sampling trace length of 10,000 samples. The most complex and theoretically desirable model estimated separate parameters per trial condition for each participant. If a model did not achieve convergence, it was simplified (i.e., by assigning parameters with poor convergence indices to be estimated only at the group level) and re-evaluated. The selected model was compared to alternative models (i.e., with fewer parameters estimated per participant and per condition) to determine whether the inclusion of additional parameter estimates improved model fit. We assessed relative fit across models by comparing their DIC values. DIC is a Bayesian measure of model fit and is defined as a classical estimate of fit plus twice the effective number of parameters [76]. Lower DIC values indicate better fit, and models whose values differ by 10 or more are considered to show significantly different fit. To test hypotheses about differences in parameter estimates within and between groups, we examined the overlap between their posterior distributions to determine the probability, denoted by P, that a value drawn from one distribution was greater than a value drawn from the other. 
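The chain bookkeeping and the posterior comparison described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' analysis code: it assumes the 5,000 discarded samples are the initial burn-in, it estimates P by pairing samples index by index, and the helper names `thin_trace` and `prob_greater` as well as the toy traces are hypothetical.

```python
from typing import Sequence


def thin_trace(trace: Sequence[float], burn: int, thin: int) -> list:
    """Drop the first `burn` samples, then keep every `thin`-th one."""
    return list(trace[burn::thin])


def prob_greater(trace_a: Sequence[float], trace_b: Sequence[float]) -> float:
    """Estimate P(A > B) as the share of paired samples where a > b."""
    if len(trace_a) != len(trace_b) or len(trace_a) == 0:
        raise ValueError("traces must be non-empty and of equal length")
    wins = sum(a > b for a, b in zip(trace_a, trace_b))
    return wins / len(trace_a)


# 55,000 draws, 5,000 discarded (assumed burn-in), every fifth one retained.
kept = thin_trace(range(55_000), burn=5_000, thin=5)
print(len(kept))  # 10000

# Toy posterior traces for two conditions (made-up numbers).
positive = [0.9, 1.2, 1.0, 1.4]
negative = [1.0, 1.1, 1.3, 1.2]
print(prob_greater(positive, negative))  # 0.5
```

A value near 0.5, as in this toy case, corresponds to heavily overlapping posteriors, whereas values close to 0 or 1 indicate that one distribution sits almost entirely above the other.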
To illustrate, a hypothesis test result of P(Positive > Negative) = 0.493 indicates that a randomly drawn value from the posterior distribution for No-Conflict Positive trials has a 49.3% probability of being greater than a value drawn from the posterior distribution for the No-Conflict Negative condition. In other words, the estimated distributions overlap so much that they provide little evidence for a difference between No-Conflict Positive and Negative trials.

Acknowledgments We thank all participants for their time, and Yi Yang Teoh for help with the hDDM analyses.

journal article

Open Access Collection

A call for broadening the altmetrics tent to democratize science outreach

Jarić, Ivan;Pipek, Pavel;Novoa, Ana

2025 PLoS Biology

doi: 10.1371/journal.pbio.3003010 pmid: 39919128

The need to measure and understand the societal impact of research, together with the widespread use of online platforms to disseminate scientific content, has led to efforts to develop indicators of scholarly work uptake beyond traditional scientometrics. These metrics, known as altmetrics, represent web-based measures of societal attention and engagement with scientific publications. Altmetrics provide a complementary measure of research impact by tracking diverse online sources, such as social media, blogs, and news articles, to capture the broader societal uptake of research outputs [1]. Various platforms providing scientific article-level altmetrics have been developed by different companies and organizations, such as Altmetric (Altmetric.com), Plum Analytics (PlumX), Crossref (Event Data), and OurResearch (ImpactStory), while several publishers provide their own sets of altmetrics indices for their journals. Altmetrics have already been implemented in some major scientific databases [2], and research institutions and scientific journals use them to assess their performance [3]. Altmetrics are receiving increasing scientific attention (Fig 1A), and they are being used in job and grant evaluations, as well as for academic rankings [4,5]. The importance of the information collected by altmetrics is likely to be further strengthened with the wider adoption of the Declaration on Research Assessment (DORA), launched in 2012 with the aim of moving away from the use of the impact factor by introducing a wider range of evaluation criteria (https://sfdora.org/read/).

Fig 1. Impact of research on altmetrics, and the structure and temporal instability of Altmetric scores. (A) Publication output and impact of research related to altmetrics, based on a Web of Science Core Collection search using the query “altmetric*” to search within topics. 
(B) Average proportion of Altmetric scores for papers published in PLOS Biology during 2021–2024 representing different online media sources. Left and right Altmetric badges represent publications with Altmetric scores of at least 10 and 100, respectively. (C) Temporal decline in Altmetric scores and X/Twitter mentions between November 2021 and November 2024 (i.e., following the acquisition of Twitter by Elon Musk in 2022 and the US presidential election in 2024) of the top five most mentioned papers published in the journal Biological Invasions. https://doi.org/10.1371/journal.pbio.3003010.g001

Despite their growing influence and value, altmetrics have significant limitations. First, stakeholders (e.g., online news and other media content creators) are not always aware of the relevance of altmetrics and of how they work; for example, scientific papers often feature on online platforms without their DOI, and such posts are not recognized by altmetrics platforms. Effective tracking of scholarly impact would require improved stakeholder awareness and cooperation. Second, altmetrics are often presented as a single composite score of societal impact, or at best as a set of a few metrics. However, the way such scores are calculated is not fully transparent, and it is consequently challenging to interpret them meaningfully [6]. The greatest value of altmetrics should instead lie in highlighting the actual content and context of public engagement with scientific outputs [6], considering that high altmetrics scores can also be generated by negative reception of the results by the community. This could be achieved automatically, for example, with the use of Large Language Models [7], such as ChatGPT or Claude, which can interpret texts in a contextually meaningful way and extract topics and sentiments associated with the altmetrics scores. Third, most altmetrics platforms only consider a narrow selection of mainstream media sources. 
This is especially true for social media coverage. For instance, the prominent altmetrics platform PlumX currently collects data exclusively from Facebook, while Altmetric, until very recently, tracked just three additional major social media platforms: X (formerly Twitter), Reddit and YouTube. Many other social networking services are not covered, including open-source and decentralized platforms such as Mastodon. Additionally, altmetrics platforms are heavily biased towards English content, with dissemination through non-English channels (such as large social media platforms in China and India, often with hundreds of millions of users) being particularly neglected [8]. While we acknowledge the challenges linked to the issue of state censorship on some of those platforms [9], such a selective focus raises concerns about the inclusivity, accuracy and usefulness of altmetrics scores. By ignoring online communication and outreach activities beyond a few selected media platforms, altmetrics are effectively failing to capture the full spectrum of scholarly communication and true scholarly impact, and are underrepresenting research disseminated through alternative media platforms. Biases in social media coverage by major altmetrics platforms can lead to a significant monopolization of the social media landscape, by inadvertently making scholars feel pressured to confine their dissemination and communication efforts to a handful of platforms recognized by altmetrics systems, to ensure that their research impact will be properly acknowledged. This is especially true considering that dissemination activities and discussions on social media tend to represent the major part of the altmetrics scores (Fig 1B). Growing concerns about platforms like X, due to issues of content moderation and disinformation [10], further highlight the urgency of enhancing altmetrics coverage. 
Following the recent platform takeover by Elon Musk, there has been a surge in migration of scientists from X to other platforms such as Mastodon and Bluesky, which intensified further following the 2024 US presidential election [11]. We acknowledge that broadening the coverage of social media platforms is not an easy task, but we are optimistic that this may happen in the future. For example, Bluesky has recently been included in Altmetric scoring, and it remains to be seen whether other altmetrics platforms such as PlumX will follow suit. However, this problem needs to be addressed in a more systematic way, by accounting for a wider range of online media sources, and especially by providing more balanced regional and linguistic coverage. Finally, the ongoing migration from X to other social media platforms points to another weakness of altmetrics: their temporal instability [5]. When an X account is deactivated, all of its posts that mentioned scientific publications cease to be accessible, which automatically leads to a reduction of altmetrics scores (Fig 1C). Altmetrics scores can also be affected by online platforms being closed or banned, or by changes in altmetrics companies’ coverage of online sources, as happened, for example, when PlumX ceased tracking X in 2023 or when Altmetric ceased following Pinterest (in 2013), LinkedIn (in 2014) or Weibo (in 2015). To address the issue of temporal instability, altmetrics scores should always include a timestamp. Furthermore, it would be important to ensure an adequate level of interoperability among social media platforms, allowing users to transfer content between platforms [12]. Such measures would help preserve social media content when users switch between platforms, as well as help diminish the present monopolization of the social media landscape. 
Furthermore, wider adoption of common protocols, such as ActivityPub in the Fediverse, or of bridging services, such as Bridgy Fed, would allow greater visibility, communication and data sharing among media platforms. The issue is complex overall, and our aim is not to cover all the relevant aspects exhaustively. Our intention is to initiate a constructive discussion that should involve the scientific community and key stakeholders. We believe that, to support a more inclusive and democratic picture of scholarly impact, altmetrics platforms should broaden their scope to include a wider array of social media platforms and more non-English content, address the temporal instability of altmetrics through timestamping, and be more transparent about how they track and quantify public attention. Together, these changes would better capture the global reach and societal impact of research, and support a more democratic and diverse scientific discourse.