Background: While CRISPR-Cas systems hold tremendous potential for engineering the human genome, it is unclear how well each system performs against one another in both non-homologous end joining (NHEJ)-mediated and homology-directed repair (HDR)-mediated genome editing. Results: We systematically compare five different CRISPR-Cas systems in human cells by targeting 90 sites in genes with varying expression levels. For a fair comparison, we select sites that are either perfectly matched or have overlapping seed regions for Cas9 and Cpf1. Besides observing a trade-off between cleavage efficiency and target specificity for these natural endonucleases, we find that the editing activities of the smaller Cas9 enzymes from Staphylococcus aureus (SaCas9) and Neisseria meningitidis (NmCas9) are less affected by gene expression than the other larger Cas proteins. Notably, the Cpf1 nucleases from Acidaminococcus sp. BV3L6 and Lachnospiraceae bacterium ND2006 (AsCpf1 and LbCpf1, respectively) are able to perform precise gene targeting efficiently across multiple genomic loci using single-stranded oligodeoxynucleotide (ssODN) donor templates with homology arms as short as 17 nucleotides. Strikingly, the two Cpf1 nucleases exhibit a preference for ssODNs of the non-target strand sequence, while the popular Cas9 enzyme from Streptococcus pyogenes (SpCas9) exhibits a preference for ssODNs of the target strand sequence instead. Additionally, we find that the HDR efficiencies of Cpf1 and SpCas9 can be further improved by using asymmetric donors with longer arms 5′ of the desired DNA changes. Conclusions: Our work delineates design parameters for each CRISPR-Cas system and will serve as a useful reference for future genome engineering studies. 
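As a rough illustration of what a "perfectly matched" target site means in this study, the sketch below scans one strand of a sequence for protospacers flanked by a 5′ TTTN PAM (for the Cpf1 nucleases) and a 3′ NGGRRT or NGGNGATT PAM (the combined Cas9 PAMs described in the Results). The function names, the toy sequence, and the choice to scan only the given strand are assumptions made for illustration; this is not the authors' site-selection procedure.

```python
import re

# IUPAC nucleotide codes needed for the PAMs quoted in this article.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "N": "[ACGT]"}


def iupac_to_regex(motif):
    """Translate an IUPAC motif such as 'NGGRRT' into a regex fragment."""
    return "".join(IUPAC[base] for base in motif.upper())


def find_matched_sites(seq, cas9_pam="NGGRRT", cpf1_pam="TTTN",
                       min_len=17, max_len=23):
    """Return sorted (start, protospacer) pairs found on the given strand.

    A hit has the form cpf1_pam + protospacer + cas9_pam, so the same
    protospacer could be targeted by Cpf1 (5' PAM) and Cas9 (3' PAM).
    Hypothetical helper: the reverse strand is not scanned.
    """
    seq = seq.upper()
    left, right = iupac_to_regex(cpf1_pam), iupac_to_regex(cas9_pam)
    hits = []
    for length in range(min_len, max_len + 1):
        # Zero-width lookahead so that overlapping candidates are all reported.
        pattern = re.compile(f"(?={left}([ACGT]{{{length}}}){right})")
        for match in pattern.finditer(seq):
            hits.append((match.start() + len(cpf1_pam), match.group(1)))
    return sorted(hits)


# Toy sequence (made up): TTTA + 20-nt protospacer + TGGAGT.
toy = "CC" + "TTTA" + "GACCTGAACGTCAGCTAGCA" + "TGGAGT" + "CC"
sa_hits = find_matched_sites(toy)                       # SpCas9/SaCas9 combined PAM
nm_hits = find_matched_sites(toy, cas9_pam="NGGNGATT")  # SpCas9/NmCas9 combined PAM
```

Here `sa_hits` holds the single 20-nt protospacer and `nm_hits` is empty, because the toy sequence carries no NGGNGATT motif. Scanning one spacer length at a time keeps the 17 to 23 nt window explicit.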
Keywords: CRISPR, Genome editing, Non-homologous end joining (NHEJ), Homology-directed repair (HDR), Cas9 nucleases, Cpf1 nucleases * Correspondence: firstname.lastname@example.org; email@example.com Yuanming Wang and Kaiwen Ivy Liu contributed equally to this work. Equal contributors School of Chemical and Biomedical Engineering, Nanyang Technological University, Singapore 637459, Singapore Genome Institute of Singapore, Agency for Science Technology and Research, Singapore 138672, Singapore Full list of author information is available at the end of the article © The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. Wang et al. Genome Biology (2018) 19:62 Page 2 of 16 Background Given the widespread occurrence of CRISPR-Cas in The CRISPR (clustered regularly interspaced short palin- bacteria, users are currently uncertain about the relative dromic repeats)-Cas system is a versatile tool that has performance of each system in engineering mammalian been successfully used to modify the genome of myriad genomes. Despite the abundant literature available, it is organisms [1–10]. 
In this system, an endonuclease (typic- difficult to directly compare the studies that employ dif- ally Cas9 or Cpf1) is recruited to a specific genomic locus ferent Cas nucleases due to inconsistencies in cellular by a chimeric single guide RNA (sgRNA), which com- context, target sites, protein expression levels, and other prises both a crRNA spacer that recognizes the target site experimental conditions. Two recent reports attempted by reverse complementary base pairing and a stem to assess SpCas9 with Cpf1 nucleases, but they were lim- loop-containing scaffold for the nuclease [11–13]. Add- ited in scope and focused mainly on the specificities of itionally, the target locus must carry a short sequence Cpf1 [19, 20]. Here, we rigorously characterize three signature, known as the protospacer adjacent motif Cas9 nucleases (SpCas9, SaCas9, and NmCas9) and two (PAM), before DNA cleavage can occur. Hence, the PAM Cpf1 nucleases (also known as Cas12a) from Acidamino- places a constraint on which parts of a genome may be coccus sp. BV3L6 and Lachnospiraceae bacterium cut by a particular Cas nuclease. Nevertheless, Cas ND2006 (AsCpf1 and LbCpf1, respectively) across a enzymes from different bacteria species generally plethora of target sites. Our data provide much-needed recognize different PAMs. Therefore, by incorporating guidance to others who are keen to leverage the various CRISPR-Cas systems into our engineering toolbox, CRISPR-Cas technology to perform genome editing in we can expand the range of sites to target in a genome. human cells and potentially also in other organisms. Distinct endogenous DNA repair pathways are har- nessed to achieve desired genome engineering outcomes Results [12, 13]. 
Typically, after the Cas endonuclease cleaves Framework for fair evaluation of different CRISPR-Cas the DNA at the target site, the double-stranded break is systems repaired by either the non-homologous end joining We sought to establish an evaluation framework that (NHEJ) pathway or the homology-directed repair (HDR) allowed an unbiased assessment of the five Cas endonu- pathway. The former is activated in the absence of a re- cleases (SpCas9, SaCas9, NmCas9, AsCpf1, and LbCpf1). pair template. Being error-prone, it frequently introduces Every protein was fused to two nuclear localization sig- insertions or deletions (indels) during the repair process, nals (NLS) and an identical V5 epitope tag. Additionally, which may result in frameshift mutations and gene we expressed each enzyme and its cognate sgRNA from knockouts. On the other hand, the latter is activated the same plasmid backbone. The CAG or the EF1α pro- when a homologous repair template is supplied. Precise moter was used to drive the expression of the Cas nucle- DNA changes are specified in the template and are ase, while the same U6 promoter was used to drive the hence incorporated into the target locus with high fidel- expression of the sgRNA. After cloning, we verified the ity by the HDR pathway. However, NHEJ is the domin- activity of each construct using target sites that were ant repair pathway in higher eukaryotes. Consequently, known to be edited robustly by the respective nucleases precise genome editing via homologous recombination (Additional file 1: Figure S1 and Tables S1 and S2). (HR) usually occurs at a very low frequency. The various Cas enzymes should ideally be targeting Several CRISPR-Cas systems from different bacterial identical genomic loci in order for the results to be com- species have been deployed for genome engineering in parable. As each endonuclease requires a different PAM human cells [1, 2, 4, 14–17]. 
So far, the vast majority of for efficient cleavage and the PAMs for SaCas9 and studies utilize SpCas9 for both NHEJ-mediated and NmCas9 are incompatible, we initially selected 61 HDR-mediated genome editing largely because it is the matched target sites that are flanked by TTTN (PAM first Cas endonuclease to be successfully used in human for AsCpf1 and LbCpf1) and either NGGRRT (combined cells [1, 2, 4] and is also the best characterized enzyme PAM for SpCas9 and SaCas9) or NGGNGATT (com- to date. Additionally, the nuclease’s relatively simple bined PAM for SpCas9 and NmCas9) (Additional file 1: NGG PAM requirement contributes to its popularity for Table S3). The sites ranged in length from 17 to 23 nu- genome engineering. However, a major disadvantage of cleotides (nt). Additionally, since each Cas nuclease SpCas9 is its relatively large size (1368 amino acids), may be differentially affected by chromatin accessibil- which hinders certain in vivo therapeutic applications. ity, we targeted genes with varying expression levels Some Cas9 homologs, including SaCas9 (1053 amino in HEK293T cells because gene transcription is largely acids) and NmCas9 (1082 amino acids), help alleviate controlled by the underlying chromatin architecture the size issue, but they typically require more complex . Based on our RNA-seq data, the chosen genes PAMs, such as NNGRRT for SaCas9 and NNNNGATT showed more than 4000-fold difference in expression for NmCas9 . (Additional file 1: Figure S2a). Consistently, we also Wang et al. Genome Biology (2018) 19:62 Page 3 of 16 observed higher levels of H3K27ac at the promoters seed regions (PAM-proximal 7 nt; Fig. 1c, d and of more actively transcribed genes (Additional file 1: Additional file 1: Figure S5c, d and Table S4). However, we Figure S2b). still observed similar trends with these new sites. 
We asked whether our evaluation studies may be in- Notably, NmCas9 performed poorly at most of the fluenced by the choice of promoter used to express the target sites irrespective of spacer lengths, with editing Cas enzymes. We checked the expression of each endo- frequencies considerably lower than the other nucleases nuclease from either the CAG or the EF1α promoter by (Fig. 1b and Additional file 1: Figure S5b). We also quantitative real-time PCR (qRT-PCR) and found that observed that with our chimeric sgRNAs, NmCas9 did transcript levels were approximately 1.5-fold higher not show a preference for longer spacer lengths, consist- under the latter promoter (Additional file 1: Figure S2c). ent with a recent study on the usage of NmCas9 in However, the cleavage efficiencies of Cas nucleases mammalian genome editing . Nevertheless, since the expressed under the CAG promoter were highly corre- length of naturally occurring crRNA spacers in N. men- lated with the cleavage efficiencies of the enzymes ingitides was found to be 24 nt , we selected nine expressed under the EF1α promoter, regardless of new 24 nt- or 25 nt-long target sites that are flanked by whether T7 endonuclease I (T7E1) mismatch cleavage the PAMs for Cpf1 and NmCas9 (Additional file 1: assays or Illumina deep sequencing experiments were Figure S7a, b and Table S5). Moreover, these sites are used to measure the rate of indel formation (Pearson R in highly expressed genes to ensure accessibility of = 0.75 or 0.96, respectively) (Additional file 1: Figure the chromatin. When we quantified editing efficien- S2d). The data obtained from the CAG promoter was cies at these new genomic loci by T7E1 cleavage as- also not significantly different from the data obtained says (Additional file 1: Figure S7c, d) and Illumina from the EF1α promoter (P > 0.5, Wilcoxon rank sum deep sequencing experiments (Fig. 1e, f) in HEK293T test; Additional file 1: Figure S2e). 
This may be because cells, we again found that the editing activity of both promoters were strong enough to produce suffi- NmCas9 was lower than those of both AsCpf1 and cient amounts of Cas proteins, so that enzyme concen- LbCpf1 at all nine matched target sites. We further tration in the cells was no longer a limiting factor. verified the poorer performance of NmCas9 in other Hence, we pooled the data obtained using the CAG pro- cell lines (Additional file 1: Figure S8). Collectively, moter with the data obtained using the EF1α promoter our results suggest that NmCas9 might not be an to perform a combined analysis. ideal Cas nuclease for many genome editing applica- tions, such as multiplex gene targeting. Performance of CRISPR-Cas in NHEJ-mediated genome Next, we asked whether the editing efficiency of each editing Cas endonuclease may be affected by local chromatin con- We first examined the editing activities of the CRISPR-Cas text. To increase the statistical power of our analysis, we systems without any repair template. Both the T7E1 cleav- selected 18 additional target sites that contain NGGRRT age assays (Additional file 1: Figure S3) and Illumina deep at their 3′ ends and are of length 21 nt, which is within sequencing experiments (Additional file 1:FigureS4) were the optimal spacer lengths for both SpCas9 and SaCas9 used to assess activity at the 61 selected genomic loci in (Additional file 1: Table S6). Six of these sites are in lowly HEK293T cells. Overall, SpCas9 exhibited the highest cleav- expressed genes, while the remaining 12 sites are in highly age efficiencies for spacer lengths between 17 and 20 nt in- expressed genes (Additional file 1: Figure S9). We assayed clusive (Fig. 1a, b and Additional file 1:FigureS5a,b). 
In the activity of each enzyme by the T7E1 assay and by deep particular,itwas theonlynucleasethatwas consistently sequencing the targeted loci (Additional file 1:Figure active with short 17-nt spacers, which we further confirmed S10). When we considered all the selected sites together, in other cell lines (Additional file 1:FigureS6).Incontrast, we found that the editing efficiencies of SpCas9, AsCpf1, SaCas9 and LbCpf1 gave the highest amount of genome and LbCpf1 were significantly affected by the expression modifications for spacer lengths between 21 and 23 nt in- of the targeted genes (P < 0.05, Wilcoxon rank sum test; clusive (Fig. 1a, b and Additional file 1:FigureS5a,b). Simi- Fig. 2a and Additional file 1: Figure S11a), consistent with lar results were obtained regardless of whether the matched previous studies that showed that chromatin structure target sites were present in introns (Additional file 1: may influence the efficacy of CRISPR-mediated genome Figures S3a–eand S4a–e) or in protein-coding regions editing [24–27]. The same results were obtained when we (Additional file 1:Figures S3f–hand S4f–h). We also noted restricted our analysis to only the sgRNAs of optimal that the PAM-proximal seed region of the DNA target is lengths for every enzyme (Fig. 2b and Additional file 1: more critical for proper recruitment of the CRISPR-Cas Figure S11b). Interestingly, however, the efficacy of SaCas9 system, but the PAMs for Cpf1 and Cas9 are on opposite and NmCas9 in human cells appeared to be unaffected by sides of each protospacer. Hence, we selected new target gene expression levels, especially when we considered only sites where the Cpf1 and Cas9 nucleases had overlapping the sgRNAs of optimal lengths (Fig. 2b and Wang et al. Genome Biology (2018) 19:62 Page 4 of 16 a c b d e f Fig. 1 Evaluation of various CRISPR-Cas systems in NHEJ-mediated genome editing using perfectly matched spacers or spacers with overlapping seeds. 
a, b Summary of matched target site activities (see Additional file 1: Figure S4) for SpCas9, either a SaCas9 or b NmCas9, AsCpf1, and LbCpf1 based on deep sequencing. Each horizontal bar indicates the mean of the editing activities for the indicated enzyme and range of spacer lengths. c, d Extent of genome modifications at a target locus in the c CACNA1D or d PPP1R12C gene whereby the Cas9 and Cpf1 nucleases had overlapping seed regions. Three different spacer lengths (17, 20, and 23 nt) were tested. The editing efficiencies were determined by deep sequencing. The cells were harvested 24 h after transfection. Data represent mean ± standard error of the mean (s.e.m.; n ≥ 6). e The editing activity of NmCas9 and the two Cpf1 nucleases at nine new target sites (C1–C9) of the form TTTN-N -NNNNGATT (see Additional file 1: Table S5). The cells were harvested 24-25 24 h after transfection and then the editing frequencies were quantified by deep sequencing. Data represent mean ± s.e.m. (n ≥ 4). f Strip chart summarizing the editing efficiencies of NmCas9, AsCpf1, and LbCpf1 at perfectly matched target sites of longer lengths (24–25 nt) Additional file 1: Figure S11b), possibly because they are other two nucleases (P < 0.05, Wilcoxon rank sum test; smaller enzymes and hence may be able to access Fig. 2c and Additional file 1: Figure S11c). Nevertheless, nucleosome-bound DNA or heterochromatin more easily. despite its overall weaker editing activity, AsCpf1 While AsCpf1 performed generally well in our showed the lowest tolerance to single mismatches NHEJ-mediated genome editing experiments, it was usu- between the sgRNA and the target DNA (Fig. 2d, e and ally surpassed by some other enzyme at most target Additional file 1: Figure S11d, e). Hence, our results sug- sites, regardless of whether they are located in lowly gest that there is a compromise between cleavage effi- expressed or highly expressed genes. 
When we carried ciency and specificity of naturally occurring Cas out a four-way comparison of the different Cas nucleases endonucleases. using spacers that were either perfectly matched or con- tained matched seed regions, we found that AsCpf1 was Performance of CRISPR-Cas in HDR-mediated genome the best performing enzyme at only a minority of the editing with single-stranded oligodeoxynucleotide donor sites, even for optimal spacer lengths (Additional file 1: We sought to determine how well the various Figure S12). When we carried out a pairwise comparison CRISPR-Cas systems perform in HDR-mediated precise of AsCpf1 with either SpCas9 or LbCpf1 alone, focusing genome editing. We again targeted the two genomic loci only on the sgRNAs of optimal lengths for both enzymes containing matched seeds for Cas9 and Cpf1 nucleases, under consideration, we also found that AsCpf1 exhib- but here we co-transfected donor single-stranded oligo- ited significantly lower cleavage efficiencies than the deoxynucleotide (ssODN) with our CRISPR plasmids in Wang et al. Genome Biology (2018) 19:62 Page 5 of 16 ab c Fig. 2 Relationship of DNA cleavage efficiency with gene expression and target specificity. a Impact of gene expression on editing efficiency. We divided the target sites into those that occur in lowly expressed genes (FPKM < 25, blue boxplots) and those that occur in highly expressed genes (FPKM ≥ 25, red bloxplots) using our RNA-seq data. The FPKM value of 25 was chosen to divide the target sites into two groups of roughly equal sizes for the five Cas nucleases. Here, all sgRNAs were considered in the analysis. Overall, we found from our deep sequencing experiments that SpCas9, AsCpf1, and LbCpf1 were able to edit highly expressed genes more efficiently than lowly expressed genes (P < 0.05, Wilcoxon rank sum test). In contrast, the two smaller nucleases, SaCas9 and NmCas9, were less affected by gene expression. 
b Similar analysis to a, except that only sgRNAs of optimal lengths were considered. In the current study, we set the optimal lengths of SpCas9 as 17–22 nt inclusive, SaCas9 as ≥ 21 nt, NmCas9 as ≥ 19 nt (based on our results in Fig. 1b and Additional file 1: Figure S5b as well as a previous report ), AsCpf1 as ≥ 19 nt, and LbCpf1 as ≥ 19 nt. c Comparison of AsCpf1 with either SpCas9 (left boxplot) or LbCpf1 (right boxplot). Only sgRNAs of the optimal lengths for SpCas9 and the Cpf1 nucleases (19–22 nt inclusive) were considered. From deep sequencing analysis, we found that the editing activity of AsCpf1 was significantly lower than that of both SpCas9 and LbCpf1 (P < 0.001, Wilcoxon rank sum test). d To assess the specificities of SpCas9, AsCpf1, and LbCpf1, we examined the tolerance of these enzymes to single mismatches along the spacer targeting the A17 site in the NF1 gene. Red letters indicate the mutated bases. e Using the spacers indicated in d, we determined the editing activities of SpCas9, AsCpf1, and LbCpf1 by deep sequencing. The cells were harvested 24 h after transfection. For all three nucleases, we observed an increased tolerance to mismatches around the middle of the spacer. Importantly, while SpCas9 and LbCpf1 exhibited higher cleavage efficiencies than AsCpf1 with a perfect matched (PM) spacer, they also showed an overall higher tolerance to mismatches between the spacer and the target DNA. Data represent mean ± standard error of the mean (n ≥ 4) order to introduce a XbaI restriction site between the LbCpf1 gave significantly more digested products than cleavage sites of Cas9 and Cpf1 (Fig. 3a, b and SpCas9 (P < 0.05, Student’s t-test; Additional file 1: Additional file 1: Figure S13a, b). Every ssODN con- Figure S13a, b). SaCas9 and NmCas9 yielded almost no tained the restriction site flanked by 47 nt of homology detectable shorter fragments after restriction digest re- on each side. 
The donor templates were also comple- gardless of spacer lengths, possibly because they cleaved mentary to the target strands. Expectedly, restriction less efficiently than the other Cas nucleases at the two fragment length polymorphism (RFLP) analysis revealed targeted loci (Fig. 1c, d and Additional file 1: Figure S5c, that only SpCas9 was able to consistently insert the XbaI d). We further confirmed the results by deep sequencing site when the spacer length was just 17 nt. However, for to ensure that the restriction site was correctly inserted spacers that were 20 or 23 nt long, both AsCpf1 and (Fig. 3a, b). When we reduced the homology arm length Wang et al. Genome Biology (2018) 19:62 Page 6 of 16 a b c d e f Fig. 3 Evaluation of various CRISPR-Cas systems in HDR-mediated genome editing using symmetric ssODN donor templates and spacers with overlapping seeds. a, b Extent of XbaI restriction site (depicted in green) insertion into a coding exon of the a CACNA1D or b PPP1R12C gene. The brown horizontal bars represent the 47-nt homology arms of the donor template and NT indicates that the donor is of the non-target strand sequence. Three different spacer lengths (17, 20, and 23 nt) were tested. The cells were harvested 72 h after transfection and the gene-targeting efficiencies were determined by Illumina deep sequencing. Data represent mean ± standard error of the mean (s.e.m.; n ≥ 4). *P < 0.05, **P < 0.01; Student’s t-test. c, d Extent of precise gene editing by SpCas9, AsCpf1, and LbCpf1 when ssODNs of different homology arm lengths (27–47 nt) were used together with 20-nt spacers targeting c CACNA1D or d PPP1R12C. The cells were harvested 72 h after transfection and the gene targeting efficiencies were determined by Illumina deep sequencing. Data represent mean ± s.e.m. (n ≥ 4). *P < 0.05, **P < 0.01, ***P < 0.001; Student’s t-test. 
e, f Concurrent T7E1 assays and RFLP analysis of cells co-transfected with the indicated CRISPR plasmids and donor ssODNs containing 47-nt long homology arms. We used 20 or 23 nt long spacers targeting either e CACNA1D or f PPP1R12C. The cells were harvested 72 h after transfection. Overall, the cleavage efficiencies of SpCas9 were comparable to those of AsCpf1 and LbCpf1, as determined by the T7E1 assays. However, the extent of XbaI integration into the target sites was lower for SpCas9 compared to AsCpf1 and LbCpf1, as determined by RFLP analysis of the donor template from 47 to 27 nt, the editing effi- Figure S15). Moreover, although we detected a small ciency of each enzyme was unaffected and the Cpf1 nu- amount of incorrect XbaI integration from our se- cleases continued to exhibit significantly higher HDR quencing data,it was,onaverage,6.4-foldand frequencies than SpCas9 (P < 0.05, Student’s t-test; 12.4-fold lower than the rate of correct integration at Fig. 3c, d and Additional file 1: Figure S13c, d). Compar- the CACNA1D and PPP1R12C loci, respectively able results were obtained when we varied the (Additional file 1:FigureS16). amount of donor templates between 100 and 300 ng Subsequently, we selected six perfectly matched target (Additional file 1: Figure S14). Additionally, we ob- sites, namely A3, A11, A12, B4, B8, and B18, that could served that the HDR efficiencies of all Cas nucleases be cleaved robustly by at least SpCas9, AsCpf1, and increased with time after transfection (Additional file 1: LbCpf1 (Additional file 1: Figures S3 and S4) to perform Wang et al. Genome Biology (2018) 19:62 Page 7 of 16 additional HDR-mediated genome editing experiments Figure S19). Since the six additional sites are located in with ssODNs as donor templates. 
For A12 and B4, the genes of varying expression levels, the higher HDR effi- ssODNs contained 47-nt homology arms flanking either ciency exhibited by Cpf1 appears to be independent of a XbaI or HindIII recognition sequence, while for the the underlying chromatin architecture. Importantly, the remaining target sites, the ssODNs contained 27-nt editing efficiency of each Cas endonuclease at the B8 homology arms instead. Moreover, for the B8 target site, locus was not compromised even when we reduced the we also tested extra donor templates with even shorter homology arm length down to 17 nt. This result is con- homology arms (27, 25, 23, 21, 19, and 17 nt). All donor sistent with a previous study that found that zinc finger templates were of the non-target strand sequence. Over- nucleases could perform precise gene editing with tem- all, we observed that the Cpf1 nucleases exhibited sig- plates containing only around 30–40 total bases of hom- nificantly higher HDR efficiencies at all the six target ology . sites than SpCas9 in RFLP assays and deep sequencing We wondered whether the results from our experiments (P < 0.05, Student’s t-test; Fig. 4a–c and HDR-mediated editing experiments might be due to dif- Additional file 1: Figures S17 and S18). The frequency of ferences in cleavage efficiencies. After co-transfecting erroneous restriction site integrations was much lower ssODNs with our CRISPR plasmids, we performed T7E1 than the rate of correct integrations (Additional file 1: assays and RFLP analysis on the same genomic DNA b c Fig. 4 Evaluation of various CRISPR-Cas systems in HDR-mediated genome editing using symmetric ssODN donor templates and perfectly matched spacers. a Intended DNA changes at the A3 (in ALK), A11 (in EGFR), B8 (in EGFR), and B18 (in STAG2) target sites. Each red vertical line indicates the cleavage site of Cas9 nucleases, which occurs 3 bp upstream of their PAM. 
Each blue vertical line indicates the cleavage site of Cpf1 nucleases on one DNA strand, which occurs 18 nt downstream of their PAM. The HindIII restriction site is indicated in green. b Extent of correctly incorporating the HindIII recognition sequence into the A3, A11, or B18 target locus. Donor ssODNs with 27-nt homology arm lengths were used. The donor templates were complementary to the target DNA strand. Cells were harvested for deep sequencing analysis 72 h post-transfection. Both the Cpf1 endonucleases consistently exhibited higher levels of precise gene targeting than SpCas9. Data represent mean ± standard error of the mean (s.e.m.; n ≥ 5). ***P < 0.001; Student’s t-test. c Extent of precise gene editing by SpCas9, AsCpf1, and LbCpf1 at the B8 locus when ssODNs of different homology arm lengths (17–27 nt inclusive) were used. The donor templates were complementary to the target DNA strand. Cells were harvested 72 h after transfection and the gene targeting efficiencies were determined by Illumina deep sequencing. Data represent mean ± s.e.m. (n ≥ 3). *P < 0.05, **P < 0.01, ***P < 0.001; Student’s t-test. d Concurrent T7E1 assays and RFLP analysis of cells co-transfected with the indicated CRISPR plasmids and donor ssODNs containing 27 nt long homology arms for the A3, A11, B8, and B18 target sites. Overall, the cleavage efficiencies of SpCas9 were comparable to those of AsCpf1 and LbCpf1, as determined by the T7E1 assays. However, the extent of HindIII integration into the target sites was lower for SpCas9 compared to AsCpf1 and LbCpf1, as determined by RFLP analysis Wang et al. Genome Biology (2018) 19:62 Page 8 of 16 samples. Overall, we observed that SpCas9 generated An earlier study showed that the strand bias of SpCas9 indels as efficiently as AsCpf1 and LbCpf1 in the T7E1 at an AAVS1 genomic locus became more obvious with assays, but yet it produced weaker cleavage bands than longer donor templates . 
Hence, to better detect any the Cpf1 nucleases after restriction digest with XbaI or such bias, we next used ssODNs with 37-nt homology HindIII (Figs. 3e, f and 4d). Additionally, we sequenced arm lengths to edit the CACNA1D and PPP1R12C loci. the targeted genomic loci and examined the sequencing In agreement with previous work [29–31], we found reads. Strikingly, SpCas9 produced random indels at from deep sequencing experiments (Fig. 5) and RFLP least as efficiently as AsCpf1 and LbCpf1 at all the tested analysis (Additional file 1: Figure S25) that SpCas9 ex- loci (Additional file 1: Figure S20), but clearly fewer hibited significantly higher HDR efficiencies at both gen- sequencing reads had the desired restriction site cor- omic loci when donor DNA complementary to the rectly incorporated (Additional file 1: Figure S21). non-target strand was used (P < 0.05, Student’s t-test). In Hence, the lower efficiency of precise genome editing contrast, we also now observed that the NT ssODNs exhibited by SpCas9 compared to the Cpf1 nucleases were consistently more effective than the T ssODNs at when ssODNs of non-target strand sequences were introducing precise edits at both loci for the Cpf1 nucle- used was not simply due to a poorer ability to cut ases. Hence, Cas9 and Cpf1 prefer donor templates of the target sites. opposite orientations. Subsequently, we sought to determine whether the Optimization of ssODN donor templates structure of the ssODN could further impact on the The design of the ssODN donor template can influence editing efficiency of the Cas enzymes. A previous study HDR efficiency [29–32]. So far, all our experiments had demonstrated that homology-directed editing by SpCas9 relied on symmetric ssODNs of the non-target strand se- could be enhanced by using asymmetric donor templates quence. Hence, we first sought to explore the extent to . 
Here, to create such asymmetric donors, we ex- which the editing activity of each CRISPR-Cas system tended either the PAM-proximal or the PAM-distal side may be influenced by the orientation of the donor tem- of each ssODN from 37 to 77 nt (Fig. 5a). Again, we plate. To this end, we targeted the CACNA1D and tested donor DNA that was complementary to either the PPP1R12C loci as well as the A3, A11, B8, and B18 loci target or the non-target strand of the CACNA1D or using ssODNs that were complementary to either the PPP1R12C locus. Consistent with the published report target or the non-target strand. All the ssODNs con- , we found that for SpCas9, extending the homology tained 27-nt homology arms. We also tested ssODNs arm at the PAM-distal side of the T ssODN could im- with 17-nt arms for the B8 locus. Surprisingly, we did prove HDR efficiency, while extending the homology not detect a consistent strand bias for each Cas nuclease arm at the PAM-proximal side was either neutral or det- by deep sequencing experiments (Additional file 1: rimental to the performance of the enzyme (Fig. 5b, c Figure S22) or by RFLP analysis (Additional file 1: Figure and Additional file 1: Figure S25). In contrast, we discov- S23). Instead, at five out of the six targeted sites, we ob- ered that for the Cpf1 nucleases, extending the hom- served a trend for the editing activity of all the enzymes ology arm at the PAM-proximal side of the NT ssODN to change in the same direction when we altered the instead led to an increase in HDR frequency, while orientation of the donor template, thereby suggesting extending the homology arm at the PAM-distal side de- that each genomic locus may have an inherent ssODN creased the rate of HDR. Overall, LbCpf1 still exhibited strand preference. 
For example, at the PPP1R12C locus, a higher HDR efficiency than SpCas9 at the CACNA1D the HDR frequencies of all the enzymes showed an in- locus when all possible types of donor DNA had been crease when we switched from the original ssODN tem- considered, but at the PPP1R12C locus, the HDR rate plate that was of the non-target strand sequence (NT) to exhibited by SpCas9 with its optimal ssODN template a new donor that was of the target strand sequence (T), was significantly higher than that exhibited by LbCpf1 although this increase was much larger for SpCas9 with its optimal donor template (P < 0.05, Student’s (Additional file 1: Figures S22b and S23b). Conversely, at t-test). Taken together, our results indicate that both the A11 locus, the HDR frequencies of SpCas9, AsCpf1, SpCas9 and LbCpf1 may be used for ssODN-mediated and LbCpf1 all decreased when we used T ssODNs in editing, but strand preferences of the genomic locus and place of the original NT ssODNs, although this the enzyme as well as the structure of the donor tem- reduction was more significant for the Cpf1 nucleases plate need to be carefully considered. (Additional file 1: Figures S22d and S23d). Furthermore, the changes in HDR frequencies were not simply due to Enhancement of error-prone repair with long single- differences in cleavage rates as every nuclease yielded stranded DNA similar amounts of indels in the presence of either the Our deep sequencing data afforded us an opportunity to NT or the T ssODNs (Additional file 1: Figure S24). examine the cleavage efficiencies of the Cas enzymes in Wang et al. Genome Biology (2018) 19:62 Page 9 of 16 Fig. 5 Evaluation of multiple symmetric and asymmetric ssODN donor designs used in combination with different CRISPR-Cas systems. a Various types of ssODN donor templates tested. 
Each single-strand DNA donor is complementary to either the target strand (and hence is of the non-target strand sequence and is denoted “NT”) or the non-target strand (and hence is of the target strand sequence and is denoted “T”). The NT ssODN donor is colored red, while the T ssODN donor is colored blue. The green box between the homology arms indicates the restriction site to be integrated into the target genomic locus. The 37/77 T ssODN has previously been found to be optimal for SpCas9-induced HDR. b, c Extent of precise gene editing by SpCas9, AsCpf1, and LbCpf1 at the b CACNA1D or c PPP1R12C locus. Cells were harvested 72 h after transfection and the gene targeting efficiencies were determined by Illumina deep sequencing. Data represent mean ± standard error of the mean (s.e.m.; n ≥ 4). *P < 0.05, **P < 0.01, ***P < 0.001, N.S. not significant; Student’s t-test

the presence of various types of donor DNA. Overall, the presence of symmetric ssODNs with homology arm lengths ranging from 17 to 47 nt (corresponding to single-stranded DNA of lengths 40 to 100 nt) did not affect the frequency of indel formation significantly (Additional file 1: Figures S20, S24, S26). However, we observed that the rate of such error-prone repair outcomes tended to increase when we used the longer asymmetric ssODNs, whose total length was 120 nt. This increase was observed at both the CACNA1D (Fig. 6a, b) and the PPP1R12C (Fig. 6c, d) loci for all the Cas nucleases and appeared to be more significant for ssODNs with a longer PAM-proximal homology arm (Fig. 6b, d). Additionally, the higher indel frequencies were unlikely to account for the increased HDR rates achieved with optimized ssODN donor templates (Fig. 5 and Additional file 1: Figure S25) because suboptimal asymmetric donors that caused a decrease in HDR rates could also boost the frequencies of indel formation. We further noted from a previous study that even non-homologous 127-mer single-stranded DNA could stimulate gene disruption by SpCas9. Collectively, our results suggest a strategy whereby the efficiency of gene knockout may be enhanced by introducing a long ssODN donor that contains a frameshift or a nonsense mutation flanked by asymmetric homology arms, so that the target gene could be inactivated not only by a stimulated error-prone repair pathway but also by the HDR pathway using an optimized single-stranded DNA donor.

Fig. 6 Gene disruption efficiencies in the presence of different single-stranded DNA donors. a, c Extent of indel formation at the a CACNA1D or c PPP1R12C genomic locus when various ssODN donor templates were used in combination with SpCas9, AsCpf1, or LbCpf1. The rates were quantified by Illumina deep sequencing. NT and T indicate donors of non-target and target strand sequences, respectively. Data represent mean ± standard error of the mean (n ≥ 4). *P < 0.05, **P < 0.01, ***P < 0.001; Student’s t-test. b, d Boxplot summarizing the rates of indel formation at the b CACNA1D or d PPP1R12C locus for oligonucleotides of different lengths and structures. There was no significant difference in the indel frequencies between NT and T ssODN donors and hence their data were pooled
[Fig. 6 panels: y-axis, % Indels; groups, SpCas9, AsCpf1, and LbCpf1 with NT or T donors; donor types, 80-nt oligo (37/37), 120-nt oligo (37/77), and 120-nt oligo (77/37). P values annotated in the plots: P = 0.0441 and P = 0.000201 (panels a, b); P = 0.00304 and P = 0.000693 (panels c, d).]

Performance of CRISPR-Cas in HDR-mediated genome editing with plasmid donor

Finally, we asked how well SpCas9 would perform against AsCpf1 and LbCpf1 in HDR-mediated genome editing with a linearized plasmid donor, which is commonly used to integrate an epitope tag into an endogenous target locus. Here, we aimed to fuse enhanced green fluorescent protein (eGFP) to the C-terminus of CLTA and GLUL, which were selected because the SpCas9 and Cpf1 nucleases could theoretically cleave both genes at overlapping sites close to the translation stop codon (Fig. 7a, b). FACS analysis revealed that Cpf1 did not give a higher rate of eGFP integration than SpCas9 when differences in cleavage efficiencies (Fig. 7c) were taken into account. For CLTA, the relative HDR efficiency of the three Cas endonucleases paralleled the relative cleavage efficiency observed in T7E1 assays. For GLUL, SpCas9 exhibited a significantly higher knockin rate than both AsCpf1 and LbCpf1 because AsCpf1 cleaved significantly more poorly than SpCas9 at this target site (P < 0.05, Student’s t-test) and also possibly because the blunt cut created by SpCas9 is overall nearer to the stop codon than the staggered cut created by Cpf1 and CRISPR-facilitated gene tagging is known to be more efficient closer to the break site. Similar results were obtained when we varied the amount of donor plasmids between 300 and 900 ng (Additional file 1: Figure S27). We further confirmed by PCR the correct integration of eGFP into the CLTA and GLUL genomic loci regardless of the Cas nuclease used (Additional file 1: Figure S28). Collectively, our results indicate that SpCas9 performs

Fig. 7 Evaluation of various CRISPR-Cas systems in HDR-mediated genome editing using linearized plasmid donor templates. a, b Fusing eGFP to the C-terminus of a CLTA and b GLUL via a P2A self-cleaving peptide.
In the schematics of the targeted loci, the blue boxes depict the exons (E), the brown horizontal bars indicate the homology arms (1000–1500 nt each), and the small triangle in the left homology arm of the GLUL donor indicates that there are several third base pair (bp) mutations towards the end of the coding region to prevent re-targeting by the Cas nucleases. Within the nucleotide sequences, the red vertical lines indicate the Cas9 cleavage sites, while the blue vertical lines indicate the Cpf1 cleavage sites. Expectedly, the percentages of GFP-positive cells were very low when no sgRNA was transfected (red bars), but showed a clear increase in the presence of the appropriate sgRNA for all three Cas nucleases (green bars). Data represent mean ± standard error of the mean (s.e.m.; n ≥ 5). *P < 0.05, **P < 0.01, N.S. not significant; Student’s t-test. c Rate of indel formation as determined by T7E1 assays. The cleavage efficiencies observed could largely explain the rates of eGFP knockin shown in a, b. Data represent mean ± s.e.m. (n ≥ 8). *P < 0.05, **P < 0.01, N.S. not significant; Student’s t-test

favorably compared to the Cpf1 enzymes in precision genome engineering when linearized plasmids are used as donor templates.

Discussion

CRISPR-Cas is a powerful technology for engineering the complex genomes of plants and animals, but users are frequently befuddled by which particular system to adopt. Direct comparisons of published studies performed by different laboratories are unreliable due to numerous confounding factors, including differences in cellular contexts and target sites. Consequently, most users tend to rely on SpCas9 as a default. Here, we conducted a survey of five distinct CRISPR-Cas systems across numerous target loci to obtain a comprehensive view of how each Cas endonuclease performs against one another in both NHEJ- and HDR-mediated genome editing. From our extensive evaluation study, we derived a set of guidelines to help users of the CRISPR-Cas technology select the most appropriate system for their applications (Fig. 8). Comfortingly, SpCas9 did exhibit the highest cleavage efficiency at more target sites than the other nucleases when the spacer length was constrained to 17–20 nt (Additional file 1: Figure S12). Furthermore, when we compared SpCas9 with LbCpf1 using only the sgRNAs of optimal lengths for the two enzymes, we found that SpCas9 and LbCpf1 generated indels at comparable frequencies in both lowly expressed and highly expressed genes (Additional file 1: Figure S29). Hence, for the purpose of generating routine gene knockouts via the NHEJ pathway, we would recommend utilizing either SpCas9 or LbCpf1. However, there may be situations where target specificity is an issue. For example, one may want to target a repetitive genomic region or a gene with several close paralogs. In such cases, we would recommend AsCpf1 due to its lower tolerance for mismatches between the sgRNA and the target DNA (Fig. 2d, e and Additional file 1: Figure S11d, e). Alternatively, SaCas9 may be used as it requires longer spacers (at least 21 nt) for activity.

The ideal CRISPR-Cas system is one with both high cleavage efficiency and high target specificity. Two recent studies reported that AsCpf1 and LbCpf1 appeared to satisfy both these criteria [19, 20], suggesting that they may be the model Cas endonucleases to pursue in future applications. However, our analysis indicates that there may be a compromise between editing activity and target specificity in naturally occurring Cas enzymes (Fig. 2c–e and Additional file 1: Figure S11c–e). Specifically, both SpCas9 and LbCpf1 showed more robust editing activity than AsCpf1, but they also exhibited higher tolerance for mismatches between their sgRNA and the target DNA. It may be possible that the genome-wide methods used in the two recent studies [19, 20] have some limitations that preclude comprehensive detection of all the off-target sites cleaved by the Cpf1 nucleases. Digenome-seq [34, 35] requires very high sequencing depth to capture cleavage sites, while GUIDE-seq requires the incorporation of blunt double-stranded oligodeoxynucleotides (dsODNs) into DNA breakage sites, which may inherently be biased against the staggered cuts generated by Cpf1. Indeed, the efficiency of tag integration for AsCpf1 and LbCpf1 was found to be lower than that for SpCas9. Hence, additional work is needed to fully investigate the relationship between editing activity and target specificity of all promising natural CRISPR-Cas systems. Newer and more sensitive methods of detecting off-target effects, such as CIRCLE-seq and SITE-Seq, may help to resolve the issue. We further note that one may also evolve natural Cas enzymes into variants that achieve both high editing efficiency and targeting specificity, as demonstrated recently for SpCas9.

Fig. 8 Flowchart to guide CRISPR-Cas users in selecting the appropriate system for their experiments. The choice of Cas endonuclease depends on several considerations, such as the type of genome modification desired or the type of repair template to be utilized. For HDR-mediated editing with ssODNs, we recommend asymmetric donors that are complementary to either the target or non-target strand when used in combination with LbCpf1 or SpCas9, respectively
[Fig. 8 flowchart. Type of genome editing: NHEJ-mediated editing, then type of target site: distinct genomic locus, LbCpf1 or SpCas9; several similar genomic loci, AsCpf1 or SaCas9. HDR-mediated editing, then type of donor template: asymmetric ssODN, LbCpf1 or SpCas9; linearized plasmid, SpCas9.]

HDR-mediated precise genome editing typically occurs at low frequencies. This roadblock needs to be overcome before CRISPR-Cas can realize its full potential in gene therapy, whereby accurate correction of disease-causing mutations can lead to a permanent cure. As a result, there have been numerous efforts over the past few years to improve its efficiency [29, 31, 40–46]. Our data indicate that AsCpf1 and LbCpf1 are able to introduce precise genome edits in human cells efficiently when ssODNs are used as donor templates (Figs. 3, 4 and 5 and Additional file 1: Figures S13, S17, S18, S22, S23, S25). We found that the Cpf1 nucleases prefer single-stranded DNA donors that are complementary to the target strand, in contrast to SpCas9, which prefers donor templates that are complementary to the non-target strand instead. In addition, we observed that asymmetric donors with a longer PAM-proximal homology arm could further improve Cpf1-mediated editing. From these results, we propose that Cpf1 may asymmetrically release the 3′ end of the cleaved target strand, thereby allowing the shorter arm of the optimized donor template to anneal and consequently enabling the longer arm of the template to invade and displace the base-paired non-target DNA strand at the other side of the break. We further note that our data, which were obtained in human cells, are in agreement with another recent study performed in zebrafish embryos.

It is tempting to speculate that the Cpf1 nucleases should perform precise genome editing more efficiently than SpCas9. First, Cpf1 generates a staggered cut that may facilitate HDR, while Cas9 generates a blunt cut. Second, Cpf1 cleaves outside the critical seed region and hence repeated targeting may occur, while Cas9 cleaves within the seed region and hence re-targeting is less likely to happen because indel mutations will prevent any subsequent recognition by the enzyme. Indeed, our results showed that even when we used ssODNs of the optimal strand sequence, both AsCpf1 and LbCpf1 still yielded higher HDR frequencies than SpCas9 at five out of the six target sites tested (Additional file 1: Figures S22 and S23). However, we note that there might be other confounding factors. For example, our data indicate the presence of an inherent strand bias at each target genomic locus, possibly due to native chromatin context. This localized bias can play in favor of SpCas9’s preference for a ssODN template that is complementary to the non-target strand, which is what we observed at the PPP1R12C locus. Furthermore, we found that SpCas9 and Cpf1 performed comparably when linearized plasmids were used as donor templates (Fig. 7), suggesting that it is not the nature of the cut per se that affects HR efficiency. While this may be explained by the fact that ssODN-mediated editing and plasmid-mediated editing are resolved through different DNA repair pathways, further work is needed to carefully dissect the underlying mechanisms.

Conclusions

We systematically assessed the ability of different CRISPR-Cas systems to perform NHEJ- and HDR-mediated genome editing in human cells. We targeted numerous genomic loci with matched spacers or matched seeds to obtain a clearer and fairer picture of how the various Cas enzymes compared against one another. Our extensive survey enabled us to formulate a set of rules and design parameters that others may follow to carry out their genome editing experiments with CRISPR-Cas (Fig. 8). We anticipate that the guidelines will evolve with time as the technology matures and more Cas nucleases are discovered and characterized in the future.

Methods

Cell culture and transfection

All cell lines were cultured in Dulbecco’s modified Eagle medium (DMEM) supplemented with 10% FBS, 2 mM L-glutamine, and 1% penicillin/streptomycin. Cells were incubated at 37 °C in a humidified 5% CO2 air incubator. The cell lines were routinely checked by PCR for mycoplasma contamination using the following primers: forward, GGG AGC AAA CAG GAT TAG ATA CCC T; reverse, TGC ACC ATC TGT CAC TCT GTT AAC CTC.

Transfections were performed using either Turbofect (Thermo Scientific), jetPRIME (Polyplus), or Lipofectamine 2000 (Life Technologies), according to the manufacturers’ instructions. We seeded 350,000 or 120,000 cells in each well of a 12-well plate or a 24-well plate, respectively, one day prior to transfection, so that the cells would be approximately 60% confluent the next day. For NHEJ-mediated editing experiments, HEK293T cells were transfected with 500 ng CRISPR plasmids in 12-well plates and sorted 24 h post-transfection for fluorescent cells. For HDR-mediated editing experiments with ssODNs, HEK293FT cells were co-transfected with 300 ng CRISPR plasmids and 300 ng ssODNs, which were purchased from Integrated DNA Technologies, in 24-well plates and sorted 3 days post-transfection for fluorescent cells. Sequences of the donor ssODNs are provided in Additional file 1: Appendix S1. For HDR-mediated editing experiments with donor plasmids, HEK293FT cells were co-transfected with 300 ng CRISPR plasmids and 300 ng donor templates, which were linearized with either KpnI or SalI, in 24-well plates and analyzed by flow cytometry for GFP-positive cells 9 days post-transfection.

T7E1 assay and RFLP analysis

Genomic DNA was isolated using QuickExtract (Epicentre) according to the manufacturer’s instructions. The loci-of-interest were then amplified using Q5 High-Fidelity DNA Polymerase (New England Biolabs) and the following PCR parameters: 98 °C for 30 s, 98 °C for 10 s, 63–65 °C for 30 s, 72 °C for 20 s (repeat from step 2 for 34 more cycles), and 72 °C for 2 min. Sequences of the primers used are provided in Additional file 1: Table S7. Subsequently, the PCR products were purified using the GeneJET Gel Extraction Kit (Thermo Scientific).

For the T7E1 assay, 200 ng PCR products were incubated at 95 °C for 5 min in 1× NEBuffer 2 and then slowly cooled at a rate of −0.1 °C/second. After annealing, 5 U T7 endonuclease I (New England Biolabs) was added to each sample and the reactions were incubated at 37 °C for 50 min. The T7E1-digested products were separated on a 2.5% agarose gel stained with GelRed (Biotium) and the gel bands were quantified using ImageJ. For the RFLP analysis, 200 ng PCR products were digested overnight with either XbaI or HindIII-HF (New England Biolabs) in CutSmart buffer. The reactions were separated on a 2% agarose gel stained with GelRed (Biotium) and the gel bands were quantified using ImageJ.

FACS analysis

Flow cytometry was performed on FACSAriaIII (Becton Dickinson). FSC and SSC were used to separate singlets from cell aggregates. Subsequently, RFP- and GFP-positive cells were gated relative to untransfected control cells. Data analysis was performed using FACSDiva (Becton Dickinson) and FlowJo.

Illumina deep sequencing analysis

Sequencing libraries were constructed as previously described. Sequences of the PCR primers used to amplify the loci-of-interest are provided in Additional file 1: Table S8. To process the data, we first built a local reference library comprising the amplicon sequences of the targeted genes. The sequenced reads were then mapped against this reference library with BWA-MEM (mismatch penalty = 2 and clipping penalty = 8). The uniquely mapped reads with mapping quality ≥ 20 were sorted and assigned group information using Picard. GATK toolkit was used to perform local realignment and recalibration. The ‘CIGAR’ string of the BAM file was used to classify the reads as ‘Insertion’, ‘Deletion’, and ‘Match’. We used only the portions of the reads within 40 bp upstream and downstream of the spacer for the calculation of indel percentages. Randomly selected reads were also aligned with the relevant reference sequences using the Needleman-Wunsch algorithm. These alignments were then manually compared with our script outputs to ensure the accuracy of our analysis.

For the characterization of HDR-mediated genome editing, we built another local reference library consisting of the modified genes. The procedures described above were repeated, but the mapping was done to the newly built reference library (consisting of the modified genes). The reads were then classified into three categories for calculating the HDR incorporation percentages: ‘Correct incorporation’, when the read is tagged as a match and the restriction site is present; ‘Wrong incorporation’, when the read is tagged as an insertion or deletion and the restriction site is present; ‘No incorporation’, when the restriction site is not present.

Quantitative real-time PCR

First, RNA was isolated using Direct-zol RNA Miniprep kit (Zymo Research) according to manufacturer’s instructions. cDNA was then synthesized using qScript cDNA Supermix (QuantaBio). Finally, PCR was performed using Perfecta SYBR Fastmix (QuantaBio) according to manufacturer’s instructions and the following primers: OFP set 1 forward, TGA GCA AAA ACG TGA GCG TG; OFP set 1 reverse, ACC ATA CTG AAA TGC CGT GGT; OFP set 2 forward, AAA CGG GGT TCT TGT TGG CT; OFP set 2 reverse, TCA GTC TGC TCA ACC GTC TT.

Statistical analysis

Statistical tests, including Student’s t-test, Wilcoxon rank sum test, and Kolmogorov-Smirnov test, were performed as described in the figure captions. All P values were calculated with either the R software package or Microsoft Excel.

Additional file

Additional file 1: Supplementary figures, tables, and appendix. (PDF 5236 kb)

Acknowledgements

The authors thank Shyam Prabhakar and Talal Bin Amin for sharing the H3K27ac ChIP-seq data and Yue Wan for critical reading of the manuscript.

Funding

M.H.T. is supported by an Agency for Science Technology and Research’s Joint Council Office grant (1431AFG103), a National Medical Research Council grant (OFIRG/0017/2016), National Research Foundation grants (NRF2013-THE001-046 and NRF2013-THE001-093), a Ministry of Education Tier 1 grant (RG50/17 (S)), a startup grant from Nanyang Technological University, and funds for the International Genetically Engineered Machine (iGEM) competition from Nanyang Technological University. T.B. is supported by a student grant from Studierendenwerk Tübingen-Hohenheim.

Availability of data and materials

The deep sequencing data from this study have been deposited in the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) under accession code SRP144449.

Authors’ contributions

MHT conceived the project and supervised the research. YW, KIL, JL, TB, and MHT designed the experiments. YW, KIL, NABS, JZ, JL, CRJL, HX, RS, HTY, KHO, TB, YYRP, SML, NNBI, NABS, STVL, JL, and MHT performed the experiments. HS, FZ, JNF, and MHT analyzed the Illumina sequencing data. MHT wrote the manuscript. All authors read and approved the final manuscript.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Author details

School of Chemical and Biomedical Engineering, Nanyang Technological University, Singapore 637459, Singapore. Genome Institute of Singapore, Agency for Science Technology and Research, Singapore 138672, Singapore. School of Biological Sciences, Nanyang Technological University, Singapore 637551, Singapore. Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore 636921, Singapore. School of Applied Science, Republic Polytechnic, Singapore 738964, Singapore. School of Life Sciences and Chemical Technology, Ngee Ann Polytechnic, Singapore 599489, Singapore.

Received: 17 June 2017 Accepted: 7 May 2018

References

1. Cong L, Ran FA, Cox D, Lin S, Barretto R, Habib N, Hsu PD, Wu X, Jiang W, Marraffini LA, Zhang F. Multiplex genome engineering using CRISPR/Cas systems. Science. 2013;339:819–23.
2. Mali P, Yang L, Esvelt KM, Aach J, Guell M, DiCarlo JE, Norville JE, Church GM. RNA-guided human genome engineering via Cas9. Science. 2013;339:823–6.
3. Hwang WY, Fu Y, Reyon D, Maeder ML, Tsai SQ, Sander JD, Peterson RT, Yeh JR, Joung JK. Efficient genome editing in zebrafish using a CRISPR-Cas system. Nat Biotechnol. 2013;31:227–9.
4. Jinek M, East A, Cheng A, Lin S, Ma E, Doudna J. RNA-programmed genome editing in human cells. elife. 2013;2:e00471.
5. Li D, Qiu Z, Shao Y, Chen Y, Guan Y, Liu M, Li Y, Gao N, Wang L, Lu X, et al. Heritable gene targeting in the mouse and rat using a CRISPR-Cas system. Nat Biotechnol. 2013;31:681–3.
6. Friedland AE, Tzur YB, Esvelt KM, Colaiacovo MP, Church GM, Calarco JA. Heritable genome editing in C. elegans via a CRISPR-Cas9 system. Nat Methods. 2013;10:741–3.
7. Bassett AR, Tibbit C, Ponting CP, Liu JL. Highly efficient targeted mutagenesis of Drosophila with the CRISPR/Cas9 system. Cell Rep. 2013;4:220–8.
8. Li JF, Norville JE, Aach J, McCormack M, Zhang D, Bush J, Church GM, Sheen J. Multiplex and homologous recombination-mediated genome editing in Arabidopsis and Nicotiana benthamiana using guide RNA and Cas9. Nat Biotechnol. 2013;31:688–91.
9. Blitz IL, Biesinger J, Xie X, Cho KW. Biallelic genome modification in F(0) Xenopus tropicalis embryos using the CRISPR/Cas system. Genesis. 2013;51:827–34.
10. Harel I, Benayoun BA, Machado B, Singh PP, Hu CK, Pech MF, Valenzano DR, Zhang E, Sharp SC, Artandi SE, Brunet A. A platform for rapid exploration of aging and diseases in a naturally short-lived vertebrate. Cell. 2015;160:1013–26.
11. Mali P, Esvelt KM, Church GM. Cas9 as a versatile tool for engineering biology. Nat Methods. 2013;10:957–63.
12. Doudna JA, Charpentier E. Genome editing. The new frontier of genome engineering with CRISPR-Cas9. Science. 2014;346:1258096.
13. Hsu PD, Lander ES, Zhang F. Development and applications of CRISPR-Cas9 for genome engineering. Cell. 2014;157:1262–78.
14. Zetsche B, Gootenberg JS, Abudayyeh OO, Slaymaker IM, Makarova KS, Essletzbichler P, Volz SE, Joung J, van der Oost J, Regev A, et al. Cpf1 is a single RNA-guided endonuclease of a class 2 CRISPR-Cas system. Cell. 2015;163:759–71.
15. Ran FA, Cong L, Yan WX, Scott DA, Gootenberg JS, Kriz AJ, Zetsche B, Shalem O, Wu X, Makarova KS, et al. In vivo genome editing using Staphylococcus aureus Cas9. Nature. 2015;520:186–91.
16. Hou Z, Zhang Y, Propson NE, Howden SE, Chu LF, Sontheimer EJ, Thomson JA. Efficient genome engineering in human pluripotent stem cells using Cas9 from Neisseria meningitidis. Proc Natl Acad Sci U S A. 2013;110:15644–9.
17. Kim E, Koo T, Park SW, Kim D, Kim K, Cho HY, Song DW, Lee KJ, Jung MH, Kim S, et al. In vivo genome editing with a small Cas9 orthologue derived from Campylobacter jejuni. Nat Commun. 2017;8:14500.
18. Komor AC, Badran AH, Liu DR. CRISPR-based technologies for the manipulation of eukaryotic genomes. Cell. 2017;168:20–36.
19. Kleinstiver BP, Tsai SQ, Prew MS, Nguyen NT, Welch MM, Lopez JM, McCaw ZR, Aryee MJ, Joung JK. Genome-wide specificities of CRISPR-Cas Cpf1 nucleases in human cells. Nat Biotechnol. 2016;34:869–74.
20. Kim D, Kim J, Hur JK, Been KW, Yoon SH, Kim JS. Genome-wide analysis reveals specificities of Cpf1 endonucleases in human cells. Nat Biotechnol. 2016;34:863–8.
21. Clapier CR, Cairns BR. The biology of chromatin remodeling complexes. Annu Rev Biochem. 2009;78:273–304.
22. Lee CM, Cradick TJ, Bao G. The Neisseria meningitidis CRISPR-Cas9 system enables specific genome editing in mammalian cells. Mol Ther. 2016;24:645–54.
23. Zhang Y, Heidrich N, Ampattu BJ, Gunderson CW, Seifert HS, Schoen C, Vogel J, Sontheimer EJ. Processing-independent CRISPR RNAs limit natural transformation in Neisseria meningitidis. Mol Cell. 2013;50:488–503.
24. Hinz JM, Laughery MF, Wyrick JJ. Nucleosomes Inhibit Cas9 Endonuclease Activity in Vitro. Biochemistry. 2015;54:7063–6.
25. Horlbeck MA, Witkowsky LB, Guglielmi B, Replogle JM, Gilbert LA, Villalta JE, Torigoe SE, Tjian R, Weissman JS. Nucleosomes impede Cas9 access to DNA in vivo and in vitro. eLife. 2016;5:e12677.
26. Isaac RS, Jiang F, Doudna JA, Lim WA, Narlikar GJ, Almeida R. Nucleosome breathing and remodeling constrain CRISPR-Cas9 function. eLife. 2016;5:e13450.
27. Daer RM, Cutts JP, Brafman DA, Haynes KA. The impact of chromatin dynamics on Cas9-mediated genome editing in human cells. ACS Synth Biol. 2017;6:428–38.
28. Chen F, Pruett-Miller SM, Huang Y, Gjoka M, Duda K, Taunton J, Collingwood TN, Frodin M, Davis GD. High-frequency genome editing using ssDNA oligonucleotides with zinc-finger nucleases. Nat Methods. 2011;8:753–5.
29. Lin S, Staahl BT, Alla RK, Doudna JA. Enhanced homology-directed human genome engineering by controlled timing of CRISPR/Cas9 delivery. elife. 2014;3:e04766.
30. Yang L, Guell M, Byrne S, Yang JL, De Los AA, Mali P, Aach J, Kim-Kiselak C, Briggs AW, Rios X, et al. Optimization of scarless human stem cell genome editing. Nucleic Acids Res. 2013;41:9049–61.
31. Richardson CD, Ray GJ, DeWitt MA, Curie GL, Corn JE. Enhancing homology-directed genome editing by catalytically active and inactive CRISPR-Cas9 using asymmetric donor DNA. Nat Biotechnol. 2016;34:339–44.
32. Moreno-Mateos MA, Fernandez JP, Rouet R, Vejnar CE, Lane MA, Mis E, Khokha MK, Doudna JA, Giraldez AJ. CRISPR-Cpf1 mediates efficient homology-directed repair and temperature-controlled genome editing. Nat Commun. 2017;8:2024.
33. Richardson CD, Ray GJ, Bray NL, Corn JE. Non-homologous DNA increases gene disruption efficiency by altering DNA repair outcomes. Nat Commun. 2016;7:12463.
34. Kim D, Bae S, Park J, Kim E, Kim S, Yu HR, Hwang J, Kim JI, Kim JS. Digenome-seq: genome-wide profiling of CRISPR-Cas9 off-target effects in human cells. Nat Methods. 2015;12:237–43.
35. Kim D, Kim S, Kim S, Park J, Kim JS. Genome-wide target specificities of CRISPR-Cas9 nucleases revealed by multiplex Digenome-seq. Genome Res. 2016;26:406–15.
36. Tsai SQ, Zheng Z, Nguyen NT, Liebers M, Topkar VV, Thapar V, Wyvekens N, Khayter C, Iafrate AJ, Le LP, et al. GUIDE-seq enables genome-wide profiling of off-target cleavage by CRISPR-Cas nucleases. Nat Biotechnol. 2015;33:187–97.
37. Tsai SQ, Nguyen NT, Malagon-Lopez J, Topkar VV, Aryee MJ, Joung JK. CIRCLE-seq: a highly sensitive in vitro screen for genome-wide CRISPR-Cas9 nuclease off-targets. Nat Methods. 2017;14(6):607–614.
38. Cameron P, Fuller CK, Donohoue PD, Jones BN, Thompson MS, Carter MM, Gradia S, Vidal B, Garner E, Slorach EM, et al. Mapping the genomic landscape of CRISPR-Cas9 cleavage. Nat Methods. 2017;14(6):600–606.
39. Hu JH, Miller SM, Geurts MH, Tang W, Chen L, Sun N, Zeina CM, Gao X, Rees HA, Lin Z, Liu DR. Evolved Cas9 variants with broad PAM compatibility and high DNA specificity. Nature. 2018;556(7699):57–63.
40. Chu VT, Weber T, Wefers B, Wurst W, Sander S, Rajewsky K, Kuhn R.
Increasing the efficiency of homology-directed repair for CRISPR-Cas9- induced precise gene editing in mammalian cells. Nat Biotechnol. 2015;33: 543–8. 41. Maruyama T, Dougan SK, Truttmann MC, Bilate AM, Ingram JR, Ploegh HL. Increasing the efficiency of precise genome editing with CRISPR-Cas9 by inhibition of nonhomologous end joining. Nat Biotechnol. 2015;33:538–42. 42. Yu C, Liu Y, Ma T, Liu K, Xu S, Zhang Y, Liu H, La Russa M, Xie M, Ding S, Qi LS. Small molecules enhance CRISPR genome editing in pluripotent stem cells. Cell Stem Cell. 2015;16:142–7. 43. Pinder J, Salsman J, Dellaire G. Nuclear domain ‘knock-in’ screen for the evaluation and identification of small molecule enhancers of CRISPR-based genome editing. Nucleic Acids Res. 2015;43:9379–92. 44. Song J, Yang D, Xu J, Zhu T, Chen YE, Zhang J. RS-1 enhances CRISPR/Cas9- and TALEN-mediated knock-in efficiency. Nat Commun. 2016;7:10548. 45. Gutschner T, Haemmerle M, Genovese G, Draetta GF, Chin L. Post- translational regulation of Cas9 during G1 enhances homology-directed repair. Cell Rep. 2016;14:1555–66. 46. Zhang JP, Li XL, Li GH, Chen W, Arakaki C, Botimer GD, Baylink D, Zhang L, Wen W, Fu YW, et al. Efficient precise knockin with a double cut HDR donor after CRISPR/Cas9-mediated double-stranded DNA cleavage. Genome Biol. 2017;18:35. 47. Liu KI, Ramli MN, Woo CW, Wang Y, Zhao T, Zhang X, Yim GR, Chong BY, Gowher A, Chua MZ, et al. A chemical-inducible CRISPR-Cas9 system for rapid control of genome editing. Nat Chem Biol. 2016;12(11):980–987. 48. Wang Y, Liu KI, Sutrisnoh NAB, Srinivasan H, Zhang J, Li J, Zhang F, Lalith CRJ, Xing H, Shanmugam R, et al. Systematic evaluation of CRISPR-Cas systems reveals design principles for genome editing in human cells. https://www.ncbi.nlm.nih.gov/sra/?term=SRP144449.
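
The read classification described under "Illumina deep sequencing analysis" in the Methods can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' script: the function names (`classify_read`, `classify_hdr`, `percentages`), the use of 0-based half-open window coordinates, the rule that a read is labelled by the first indel that touches the window, and the example XbaI site (TCTAGA) are all hypothetical choices. The Methods state only that CIGAR strings were used to tag reads as 'Insertion', 'Deletion', or 'Match' within 40 bp of the spacer, and that HDR reads were then sorted into three categories by tag and by presence of the restriction site.

```python
import re
from collections import Counter

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def classify_read(cigar, ref_start, window_start, window_end):
    """Tag one aligned read as 'Insertion', 'Deletion', or 'Match'.

    Only indels touching the reference window [window_start, window_end)
    are counted. ref_start is the 0-based reference position of the first
    aligned base. Assumption: the first indel found decides the label.
    """
    pos = ref_start
    for length, op in CIGAR_RE.findall(cigar):
        length = int(length)
        if op == "I":
            # An insertion sits between reference bases pos-1 and pos.
            if window_start <= pos <= window_end:
                return "Insertion"
        elif op == "D":
            # A deletion spans reference bases [pos, pos + length).
            if pos < window_end and pos + length > window_start:
                return "Deletion"
            pos += length
        elif op in ("M", "=", "X", "N"):
            pos += length
        # S, H, and P consume no reference bases.
    return "Match"


def classify_hdr(read_tag, read_seq, site="TCTAGA"):
    """Apply the three HDR categories given in the Methods.

    site is the restriction site carried by the donor; TCTAGA (XbaI) is
    used here purely as an example value.
    """
    if site not in read_seq:
        return "No incorporation"
    if read_tag == "Match":
        return "Correct incorporation"
    return "Wrong incorporation"


def percentages(labels):
    """Convert a list of category labels into percentages per category."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: 100.0 * n / total for label, n in counts.items()}
```

In practice, the window would be set to the spacer coordinates extended by 40 bp on each side, and the CIGAR string and alignment start would be read from the BAM records produced by the mapping steps described above.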
– Springer Journals
Published: May 29, 2018