Unexpected Mutations after CRISPR in vivo editing – post-commentary

You might have heard or participated in the global discussion over the recently published Nature Commentary that described >1000 off-target mutations in CRISPR-edited mice.

The paper reported a small study involving three mice but gained enough virality online to trigger a significant drop in share prices of companies founded based on CRISPR gene-editing – Editas Medicine, CRISPR Therapeutics and Intellia Therapeutics.

Here is a summary of the study, with respective concerns raised by the scientific community regarding the validity of the findings. These are marked with an asterisk (*Concern 1 to 4) and explained further below:

  • FVB/NJ mice were used in the study. These mice are a highly inbred strain (F87 in December 2002) originating from the NIH but transferred to The Jackson Laboratory for maintenance and sale. They are homozygous for the Pde6b^rd1 allele, which causes early-onset retinal degeneration.

 

  • The same authors previously published a pretty decent paper where they functionally characterized a rescue of the retinal degeneration by correcting what was thought to be a nonsense mutation (Y347X, C>A) in exon 7 of the Pde6β subunit. The same “rescued” mice edited by CRISPR (F03 and F05), along with a co-housed control mouse that did not undergo editing, were used in this subsequent sequencing study. *Concern 1

 

  • The CRISPR mutation was performed by introducing the sgRNA via a pX335 plasmid (which would co-express Cas9D10A nickase) into FVB/NJ zygotes, alongside a single-stranded oligo which acts as a donor to introduce a controlled mutation at the Pde6b locus. WT Cas9 protein was also introduced. *Concern 2

 

  • DNA was isolated from spleen of the mice and whole genome sequencing was performed with an Illumina HiSeq 2500 sequencer with a 50X coverage for CRISPR-treated mice and 30X coverage for the control mouse.

 

  • The authors used three different algorithms to detect variants – Mutect, Lofreq and Strelka. The number of single nucleotide variants (SNVs) and insertions/deletions (indels) detected that were absent in the control mouse are shown below for the two CRISPR-edited mice.

   

Overlap of SNV/indels detected in two CRISPR-edited mice – F03 mouse (blue), F05 mouse (green).

 

  • Each of the variants was filtered against the FVB/NJ genome in the mouse dbSNP database (v138) and also against 36 other mouse strains from the Mouse Genome Project (v3). As none of the variants detected were found in these database genomes, the authors concluded they had to arise through CRISPR-editing. *Concern 3

 

  • Interestingly, the top 50 predicted off-target sites showed no mutations. And in sites where mutations were detected, there was no significant sequence homology against the sgRNA used. The authors conclude in silico modelling fails to predict off-target sites. *Concern 4.

A number of criticisms have been raised regarding the study and the four main concerns highlighted are explained below:

Concern 1: The study only involved three mice, hence is too underpowered to draw any statistically significant conclusions. Further, the choice of control mouse simply being a co-housed mouse (no mention of its background) may fail to capture any genetic alterations induced by the experimental procedure or by genetic drift within a colony.

More appropriate controls may have included a mouse produced with a sham-injected zygote, a mouse where only Cas9 was introduced without an sgRNA, and a mouse with only sgRNA and ssDNA donor.

Parent mice should also have been sequenced to check if variants detected were already in the existing strain.

Concern 2: Cas9 was introduced both as a protein and in a plasmid. Talk about overkill! Though the plasmid form of Cas9 is the nickase version, where 2 sgRNAs are required to produce a double-strand break, having high levels of active Cas9 floating about has been demonstrated to increase the incidence of off-target effects.

Concern 3: Even though the authors filtered the variants found against mouse genome databases, this may not be sufficient to capture the extent of genetic drift that occurs over multiple generations of in-breeding.

Gaetan Burgio wrote that from his experience, the reference genomes found in databases often fail to capture the number of variants that are specific to every breeding facility. Often large numbers of reference mice (100 mouse exomes from > 50 founders) have to be sequenced to determine if SNPs were specific to the mouse strain and not induced by the test condition.

Editas and George Church’s group from Harvard also highlighted the high degree of overlap in SNVs/indels between the two CRISPR-edited mice which, in their words:

“strongly suggests the vast majority of these mutations were present in the animals of origin. The odds of  the exact nucleotide changes occurring in the exact same position of the exact same gene at the exact same ratios in almost every case are effectively zero.”

Concern 4: Apart from the flaw that only one sgRNA was studied, Church’s group also claim the sgRNA studied had a high off-target profile. This sgRNA would apparently have failed their criteria for use as a therapeutic candidate. The table below shows the number of predicted off-target sites when allowing for 1-3 mismatches from the sgRNA sequence.

Predicted off-target profile of sgRNA used in study

Number of mismatches   Predicted off-target sites
1                      1
2                      1
3                      24

 

What was surprising from the study, however, was that despite the high off-targeting potential, mutations were not seen at predicted off-target sites.

The consensus, therefore, of both Church’s group and the authors of the study was that one cannot rely on in silico prediction alone to account for off-target effects.

Calls are now being made to validate the study using the appropriate controls, or to compare the variants obtained with other more updated mouse genome SNP databases. I expect we will not hear the last of this study.

The study does, however, reinforce our message in a previous blogpost of validating CRISPR experiments with other techniques to establish gene function. It also highlights the extensive genetic heterogeneity seen not only between cell lines, but also within mouse strains. As always, we recommend not being swept up in the hype, but remaining scientifically skeptical.

Making sense of siGENOME deconvolution

As discussed previously, deconvoluted Dharmacon siGENOME pools often give surprising results.  (Deconvolution is the process of testing the 4 siRNAs in a pool individually.  This is usually done in the validation phase of siRNA screens.)

One way to compare the relative contribution of target gene and off-target effects is to calculate the correlation between reagents having the same target gene or the same seed sequence.  One of the first things we do when analysing single siRNA screens is to calculate a robust form of the intraclass correlation (rICC, see discussion at bottom for more about this).

Recently we were analysing deconvolution data from Adamson et al. (2012) and calculated the following rICC’s.  (The phenotype measured was relative homologous recombination.)

Grouping variable  rICC    95% confidence interval

Target gene        0.040   -0.021 to 0.099
Antisense 7mer     0.383   0.357 to 0.413
Sense 7mer         0.093   0.054 to 0.129

Besides the order of magnitude difference between target gene and antisense seed correlation (which is commonly observed in RNAi screens), what stands out is the ~2-fold difference between the correlation by target gene and sense seed.

Very little of the sense strand should be loaded into RISC, if the siRNAs were designed with appropriate thermodynamic considerations (the 5′ antisense end should be less stable than the 5′ sense end, to ensure that the antisense strand is preferentially loaded into RISC).

The above correlations suggest that a non-trivial amount of sense strand is making it into RISC.

Here is the distribution of delta-delta-G for siPOOLs and siGENOME siRNAs targeting the same 500 human kinases (see bottom of post for discussion of calculation).  A positive delta-delta G means that the sense end is more thermodynamically stable than the antisense end, favouring the loading of the antisense strand into RISC.

 

 

This discrepancy in delta-delta G is also consistent with comparison of mRNA knockdown:

The siGENOME knockdown data comes from 774 genes analysed by qPCR in Simpson et al. (2008).  The siPOOL knockdown data is from 223 genes where we have done qPCR validation.

Of note, the siGENOME pools were tested at 100 nM, whereas siPOOLs were tested at 1 nM.

(It should be mentioned that, although consistent with the observed differences in ddG, this is only an indirect comparison, and delta-delta G is not the only determinant of functional siRNAs.)

 

Notes on intraclass correlation

Intraclass correlation measures the agreement between multiple measurements (in this case, multiple siRNAs with the same target gene, or multiple siRNAs with the same seed sequence).   One could also pair off all the repeated measures and calculate correlation using standard methods (parametrically using Pearson’s method, or non-parametrically using Spearman’s method).  The main problem with such an approach is that there is no natural way to determine which measure goes in the x or y column.  Correlations are normally between different variables (e.g. height and weight).  In a case of repeated measures, there is no natural order, so the intraclass correlation (ICC) is the more correct way to measure the similarity of within-group measurements.  As ICC depends on a normal distribution, datasets must first be examined, and if necessary, transformed beforehand.

Robust methods have the advantage of permitting the use of untransformed data, which is especially useful when running scripts across hundreds of screening dataset features.  The algorithm we use calculates a robust approximation of the ICC by combining resampling and non-parametric correlation.

Here is the algorithm, in a nutshell:

  1. Group observations (e.g. cell count) by the grouping variable (e.g. target gene or antisense seed)
  2. Randomly assign one value of each group to the x column and another to the y column (groups with only 1 observation are skipped)
    • for example, if the grouping variable is target gene and siRNAs targeting PLK1 had the values 23, 30, 37, 45, the program would randomly choose 1 of the values for the x column and another for the y column
  3. Calculate Spearman’s rho (non-parametric measure of correlation)
  4. Repeat steps 1-3 a set number of times (e.g. 300) and store the calculated rhos
  5. Calculate the mean of the rho values from step 4.  This is the robust approximation of the ICC (rICC).
    • Values from step 4 are also used to calculate confidence intervals.
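The steps above can be sketched in a few lines of Python. This is an illustration only, not the actual program: the function names (`ranks`, `spearman_rho`, `robust_icc`) are hypothetical, the confidence interval is taken as simple percentiles of the stored rho values (an assumption, since the interval method is not described), and a constant column is treated as zero correlation.

```python
import random
from collections import defaultdict
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman_rho(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a constant column counts as uncorrelated
    return cov / (var_x * var_y) ** 0.5


def robust_icc(observations, n_iter=300, seed=0):
    """observations: (group, value) pairs. Returns (rICC, (low, high))."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for group, value in observations:            # step 1: group observations
        groups[group].append(value)
    usable = [v for v in groups.values() if len(v) >= 2]  # skip singletons
    if len(usable) < 2:
        raise ValueError("need at least two groups with two or more values")
    rhos = []
    for _ in range(n_iter):                      # step 4: repeat
        pairs = [rng.sample(v, 2) for v in usable]   # step 2: random x and y
        x = [p[0] for p in pairs]
        y = [p[1] for p in pairs]
        rhos.append(spearman_rho(x, y))          # step 3: Spearman's rho
    rhos.sort()
    low = rhos[int(0.025 * (n_iter - 1))]        # assumed percentile interval
    high = rhos[int(0.975 * (n_iter - 1))]
    return mean(rhos), (low, high)               # step 5: mean rho = rICC
```

Calling `robust_icc` once with target gene as the group and once with the antisense seed as the group gives the two correlations that are compared in the table above.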

The program that calculates this is available upon request.

Notes on calculating delta-delta G

Delta-delta G was calculated using the Vienna RNA package, as detailed here: https://www.biostars.org/p/58979/ (in answer by Brad Chapman).

The delta-delta G was calculated using 3 terminal bps.  We found that the ddG of the terminal 3 bps had the strongest correlation with observed knockdown.  Others (e.g. Schwarz et al., 2003 and Khvorova et al., 2003) have also used the terminal 4 bps.

 

How reproducible are CRISPR screens?

The reproducibility of different CRISPR or RNAi reagents targeting the same gene is sometimes cited as prima facie evidence for the superiority of CRISPR screens to RNAi screens.

A landmark paper by Shalem et al. showed that different gRNAs inhibit gene expression much more consistently than do different shRNAs:

But does this ensure that CRISPR screens are more reliable (as determined by assay reproducibility) than RNAi screens?  Not necessarily.

Shalem et al. performed two pooled CRISPR screens in parallel, and found substantial overlap between the top hits.

How does this overlap compare to that between replicate RNAi screens?

In 2010, Barrows et al. tested the reproducibility between genome-wide siRNA screens conducted 5 months apart.  Using the sum of ranks hit selection algorithm, they found 75 and 82 hits from the first and second screens, respectively, with 43 hits overlapping.

If we take the top 75 and top 82 hits from the Shalem replicate screens, we only find 17 genes overlapping.
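To put the two overlaps on the same scale, the hit counts quoted above can be turned into overlap fractions. The snippet below is a small sketch; the helper name `overlap_stats` and the choice of the Jaccard index as a summary are ours, not something used in either paper.

```python
def overlap_stats(n_first, n_second, n_shared):
    """Summarise the overlap between two hit lists from their sizes alone."""
    union = n_first + n_second - n_shared
    return {
        "jaccard": n_shared / union,              # shared / all distinct hits
        "fraction_of_first": n_shared / n_first,
        "fraction_of_second": n_shared / n_second,
    }


# Counts quoted in the text: 75 and 82 hits, with 43 (siRNA) or 17 (CRISPR) shared
barrows = overlap_stats(75, 82, 43)
shalem = overlap_stats(75, 82, 17)

for name, stats in (("Barrows siRNA", barrows), ("Shalem CRISPR", shalem)):
    print(name, {key: round(value, 3) for key, value in stats.items()})
```

With these counts the Jaccard index is roughly 0.38 for the replicate siRNA screens and roughly 0.12 for the replicate CRISPR screens.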

It’s important to note that the Shalem and Barrows assays were different, as were the screening formats: arrayed (siRNA) vs. pooled (CRISPR).  And this was one of the earliest CRISPR libraries.  Much has been learned about optimising gRNA efficiency and specificity since the Shalem screen.

However, it is also important to note that consistent inhibition of gene expression does not guarantee consistent phenotypes.  The above analysis suggests that care is needed in interpreting the results of CRISPR screens.  RNAi screens possess advantages, e.g. ease of arrayed screening, that will make them useful for many years to come.

The final RNAiL?

A recent article in The Scientist asks whether, in light of a paper by Lin et al. showing phenotypic discrepancies between RNAi and CRISPR, this is not ‘the last nail in the coffin for RNAi as a screening tool’?

The paper in question found that a gene (MELK) that had been shown by many RNAi-based studies to be critical for several cancer types shows no effect when knocked out via CRISPR.  They also report that in relevant published genome-wide screens, MELK was not at the top of the hit lists.

Does this mean that the papers that used RNAi were unlucky and off-target effects were responsible for their observed phenotypes?

Gray et al. identified MELK as a gene of interest based on microarray experiments.  They then designed RNAi experiments to test its role in proliferation.  Assuming that this study and the subsequent ones followed good RNAi experimental design (using reagents with varying seed sequences, testing the correlation between gene knockdown and phenotypic strength, etc.), we can be fairly confident that MELK is involved in proliferation.  It might not be the most essential player, which would explain why it is not at the top of screening hit lists.  And screening lists have the draw-back of enriching for off-target hits.

Another possibility is that Lin et al. have observed a known complicating feature of knock-out screens: genetic compensation.  Although they undertake experiments to address this issue, it could be that compensation takes place too quickly for their experiments to rule it out.  Furthermore, they could have addressed this issue by testing knock-down reagents themselves, and checking whether genes they hypothesise as responsible for the supposed off-target effect in the published RNAi work are in fact down-regulated.  C911 reagents could also be used to test for off-target effects.  This is extra work, but given that they are disputing the results in many published studies, this seems justified.

As regards the role of RNAi in screening, The Scientist concludes with the following (suggesting that their answer to the question of whether this is the final nail is also No):

In the meantime, one obvious solution to the problem of target identification and validation is to use both CRISPR and RNAi to validate a target before it moves into clinical research, rather than relying on a single method. “We have CRISPR and short hairpin reagents for every gene in the human genome,” said Bernards. “So when we see a phenotype with CRISPR, we validate with short hairpin, and the other way around. I think that would be ideal.”

Although we agree that validating CRISPR hits with RNAi reagents is important (especially if druggability is a concern), one has to be careful with RNAi reagents, like single siRNAs/shRNAs or low-complexity pools, that are susceptible to seed-based off-target effects.  For validating CRISPR screening hits, siPOOLs provide the best protection against unwanted off-target effects, saving you time, money, and disappointment during the validation phase.

 

CRISPR – what can go wrong and how to deal with it

CRISPR is a gene editing technique based on tools and principles learnt from the bacterial immune system. Gaining immense popularity world-wide, many are trying to establish CRISPR in their favourite model systems to study gene function. Here, we highlight issues to be aware of when using CRISPR and what one can do to counter or manage them.

To simplify matters, we have classified what could go wrong while performing CRISPR into three main categories, accompanied by associated exclamations one may hear in the process:

  1. “Hmm… I don’t see anything.” – Absence of phenotype
  2. “This is taking wayyy too long.” – Inefficient editing
  3. “What the *@#?!” – Unexpected phenotypes

First, some key terms…

Cas9: The bacterial RNA-guided endonuclease that mediates cutting of the DNA. The most commonly used Cas9 ortholog is from Streptococcus pyogenes and can be introduced into cells in the form of DNA, mRNA, or protein.

sgRNA: single guide RNA composed of a 17-20 base long guide RNA (gRNA) which hybridizes to its complementary DNA sequence on the genome, defining the target site. This is often joined to a ~70-80 base long trans-activating crRNA (tracrRNA), a constant region that mediates recruitment of Cas9. sgRNAs can be introduced as one unit or as separate components – gRNA and tracrRNA – in the form of DNA or RNA.

PAM: protospacer adjacent motif, a trinucleotide sequence 3’ adjacent to the gene editing site required for Cas9 to bind and mediate cleavage. The sequence is NGG for Cas9 from Streptococcus pyogenes, though NAG is often recognized as well. PAM sequences differ between various forms of Cas enzymes.

 

  1. “Hmm… I don’t see anything.” – Absence of phenotype

The anti-climax of a null result may stem from adaptation where the cell or organism alters other gene pathways to compensate for the loss-of-function of the target gene.

This problem is most visible to those maintaining Drosophila stocks, as the strength of a phenotype typically decreases over multiple generations. The phenomenon is also well-documented in other models such as yeast (Teng X et al., 2013), zebrafish (Rossi et al., 2015, covered in a previous blogpost) and mice (Barbaric et al., 2007). A notable Developmental Cell paper recently reported adaptation in cells (Cerikan et al., 2016) where prolonged knock-down (KD) or knock-out (KO) yielded no visible phenotype as opposed to acute KD by RNAi.

Multiple cell passages increase genetic drift, providing opportunities for the system to adapt to counter the disruptive effects of a gene knock-out. It is therefore prudent to preserve early passages of clones during clonal selection and limit multiple passages prior to assay measurement.

Besides adaptation, redundancy may also account for an absence of phenotype. Paralogous genes (i.e. genes closely related in structure or function) often exist in model systems that can fully or partially compensate for the loss-of-function of the target gene. About 50% of mouse genes and at least 17% of human genes have paralogues that may mask loss-of-function phenotypes.

One can find paralogous genes arising from gene duplication with this database and by checking existing literature. If they do exist, a co-knock-out/knock-down approach may be necessary.

 

  2. “This is taking wayyy too long.” – Inefficient editing

Despite the high efficiency of Cas9-mediated cleavage, obtaining the desired gene knock-out can still be a tedious and time-consuming process, with wide-ranging overall efficiencies of 1-79% (Unniyampurath et al., 2016).

These challenges often stem from issues associated with the cell line of choice. Because many standard cell lines are polyploid (containing multiple copies of chromosomes), every copy of the gene has to be disrupted to obtain a complete (homozygous) knock-out. Transfection efficiency, how well the cell line tolerates clonal selection, and the impact of the gene modification on cell viability can also affect outcomes. If performing homology-directed repair (HDR) to introduce a new sequence at the cut site, clone screening efforts have to be amplified due to the lower frequency of HDR events compared to indels.
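To get a feel for how gene copy number inflates the screening effort, here is a deliberately simple toy model. It assumes that each copy of the gene is disrupted independently with the same probability, which real cell lines will not obey exactly; the function names and the example numbers are hypothetical.

```python
import math


def complete_knockout_probability(per_copy_efficiency, copies):
    """Probability that every copy is disrupted, assuming independent copies."""
    return per_copy_efficiency ** copies


def clones_to_screen(per_copy_efficiency, copies, confidence=0.95):
    """Clones needed for the given chance of at least one complete knock-out."""
    q = complete_knockout_probability(per_copy_efficiency, copies)
    if q <= 0:
        raise ValueError("a complete knock-out is impossible with these inputs")
    if q >= 1:
        return 1
    return math.ceil(math.log(1 - confidence) / math.log(1 - q))


# Hypothetical example: 50% disruption per copy, compared across copy numbers
for copies in (2, 3, 4):
    print(copies, "copies:", clones_to_screen(0.5, copies), "clones")
```

Even in this idealised model, going from two to four copies roughly quadruples the number of clones that need to be screened.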

Understanding the characteristics of your cell line and ensuring sufficient numbers of clones are screened is essential to avoid mindless weeks repeating experiments!

Editing efficiency may also be hindered by genomic accessibility. gRNAs targeting transcriptional start sites or promoters were found to be more efficient than intergenic sites due to the open chromatin structure in these areas (Liu X et al., 2016). Numerous design criteria have been recommended to ensure high cutting efficiency but performance of gRNAs may still vary. Therefore it is advisable to use at least 3 sgRNAs per gene to increase chances of success.

Sidenote: Looking for someone who can design CRISPR sgRNAs for you? siTOOLs Biotech’s CRISPR sgRNA design service couples our long-standing experience in off-target filtering with published gRNA design criteria to generate reliable gRNA sequences. Send us your enquiry and we will get back to you.

 

  3. “What the *@#?!” – Unexpected phenotypes

Unexpected results can stem from off-target effects or in some cases, may be a real effect that requires some brain rattling to make sense of.

Off-target effects are still a cause of concern for CRISPR and vary widely with gRNA sequence: one report found between 0 and 150 off-target sites per gRNA (Tsai et al., 2015). In another study, ~10 to > 1000 off-target binding sites were found, again depending on sgRNA sequence (Kuscu et al., 2014).

Toxicity correlated with increased off-targeting (Morgens et al., 2017) and the use of safe-targeting controls (i.e. where gRNAs are directed towards sites where cleavage is predicted to have minimal impact) was recommended. This served as a more appropriate measure of nuclease-induced toxicity as opposed to non-targeting controls that might not lead to cleavage.

Some other strategies to minimize off-targets:

  • Use the Cas9 recombinant protein/mRNA rather than a plasmid or keep DNA transfection amounts low (plasmid-driven prolonged Cas9 expression increased off-targeting events as reported by Liang et al., 2015)
  • Use truncated gRNAs of 17-18 nucleotides
  • Use D10A Cas9 nickase and paired gRNAs
  • Use a Cas9 ortholog with a longer PAM requirement

Despite our efforts to predict off-target effects, two reported sources of potential off-targets make prediction challenging:

a) Single nucleotide variants from clonal heterogeneity

b) Cas9 effects on mRNA translation

 

a) Single nucleotide variants from clonal heterogeneity

Table 1: Spontaneous SNVs and indels generated over clonal selection in human pluripotent stem cells.

Two studies (Smith et al., 2014; Veres et al., 2014) carried out in pluripotent stem cells to detect off-targets saw a higher specificity of Cas9 in these cells compared to cancer cell lines but, shockingly, rather large clonal heterogeneity (Table 1).  Each clone generated from the parental cell line had on average 100 unique SNVs and 2-5 indels that were not induced by the gene modification but arose spontaneously during cell culture.

Target and off-target indel frequencies

Number of mismatches   Number of genomic sites   Cas9 targeting efficiency
0                      1                         53.9%
1                      0                         -
2                      0 → 1                     36.7%
3                      32                        ~0.15% per site

Table 2: Editing efficiencies at off-target sites with 0-3 mismatches. The 2-mismatch row (0 → 1 sites) is the condition in which an SNV enhanced editing efficiency.

Yang et al. (2014) then go on to demonstrate how an SNV at the wrong place at the wrong time can produce a high-efficiency off-target site. The said SNV corrected a mismatch at an off-target site, reducing the mismatch number from 3 to 2, which increased Cas9-mediated indel frequency to ~37%!

To manage clonal heterogeneity, we recommend performing deep sequencing to fully characterize the knock-out clone and its parental wild-type cell line. Once the locations of SNVs are identified, these can be aligned with potential off-target gRNA binding sites to check for interference. Check locations of identified unique SNVs or indels to see if they are impacting genes that may play a relevant role in your studied phenotype.
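As a rough illustration of that alignment check, the sketch below counts mismatches between a guide and a candidate off-target site before and after a clone-specific SNV is applied. The sequences, the SNV position and the helper names are made up for the example, and the PAM is ignored for brevity.

```python
def count_mismatches(guide, site):
    """Number of positions at which the guide and the genomic site differ."""
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    return sum(g != s for g, s in zip(guide.upper(), site.upper()))


def apply_snv(site, position, alt_base):
    """Return the site with the base at a 0-based position replaced."""
    return site[:position] + alt_base + site[position + 1:]


# Hypothetical 20-nt guide and a reference off-target site with 3 mismatches
guide = "GACGTTACCGGATCAATGCA"
reference_site = "GACCTTACCGGTTCAATGGA"

# Hypothetical clone-specific SNV (T>A at position 11) that removes one mismatch
clone_site = apply_snv(reference_site, 11, "A")

print("reference genome:", count_mismatches(guide, reference_site), "mismatches")
print("sequenced clone: ", count_mismatches(guide, clone_site), "mismatches")
```

A site that drops from 3 to 2 mismatches in the sequenced clone, as in the scenario described by Yang et al., is worth checking directly for indels.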

b) Cas9 effects on mRNA translation

A Scientific Reports study (Liu Y et al., 2016) reported a worrying finding that Cas9 could be recruited by gRNAs to mRNAs and block their translation. Neither PAM sequences nor Cas9 enzyme activity was required for this and the effect varied with gRNA sequence. Cas9-mediated mRNA translation suppression produced a 30-60% decrease in protein levels, sufficient to impact downstream phenotypes. For example, a gRNA targeting VEGFA with an off-target binding site to the mRNA of oncogene, B3GNT8, produced a nearly 50% drop in B3GNT8 protein levels with a corresponding drop in cell viability. This was partially rescued by overexpressing B3GNT8 with a vector.

It is still unclear to what extent this phenomenon occurs. There have been limited reports on this mechanism so far, but if confirmed, it would have a far-ranging impact. The study found that gRNAs with single base mismatches at positions 8-20 were still able to carry out Cas9-mediated translation repression. This low hybridization stringency requirement would make such off-targets very difficult to predict.

CRISPR is no doubt a powerful technology, but it still brings many unknowns. After its discovery in the 1990s, RNAi experienced a similar exponential uptake and use by the scientific community. It took several years for the problem of siRNA off-targets to become visible. Unfortunately by that time, enormous resources and energy had been sunk into large RNAi screens, which yielded numerous false hits and difficult-to-interpret data.

Figure 1. Pubmed Citations (1999-2015) with CRISPR or RNAi in Title/Abstract/Summary

Thankfully we now have siPOOLs, or high-complexity defined siRNA pools (from siTOOLs Biotech). These custom-designed pools of 30 unique siRNAs counter the off-target effects often seen with single siRNAs or low-complexity siRNA pools of 3-4 siRNAs (Marine et al., 2012, Hannus et al., 2014). Efficient at 1 nM in standard cell lines, they are the optimal RNAi reagent for highly specific, efficient and robust gene knock-down.

In order not to repeat past mistakes, it is imperative to proceed with caution and use multiple methods to establish gene function.

References:

Barbaric, I., Miller, G. & Dear, T. N. Appearances can be deceiving: Phenotypes of knockout mice. Briefings Funct. Genomics Proteomics 6, 91–103 (2007).

Cerikan, B. et al. Cell-Intrinsic Adaptation Arising from Chronic Ablation of a Key Rho GTPase Regulator. Dev. Cell 39, 28–43 (2016).

Kuscu, C., Arslan, S., Singh, R., Thorpe, J. & Adli, M. Genome-wide analysis reveals characteristics of off-target sites bound by the Cas9 endonuclease. Nat Biotechnol 32, 677–683 (2014).

Hannus, M. et al. siPools: highly complex but accurately defined siRNA pools eliminate off-target effects. Nucleic Acids Res. 42, 8049–61 (2014).

Liang, X. et al. Rapid and highly efficient mammalian cell engineering via Cas9 protein transfection. J. Biotechnol. 208, 44–53 (2015).

Liu, X. et al. Sequence features associated with the cleavage efficiency of CRISPR/Cas9 system. Sci. Rep. 6, 19675 (2016).

Liu, Y. et al. Targeting cellular mRNAs translation by CRISPR-Cas9. Sci. Rep. 6, 29652 (2016). doi:10.1038/srep29652

Marine, S., Bahl, A., Ferrer, M. & Buehler, E. Common seed analysis to identify off-target effects in siRNA screens. J. Biomol. Screen. 17, 370–8 (2012).

Rossi, A. et al. Genetic compensation induced by deleterious mutations but not gene knockdowns. Nature 524, 230–233 (2015).

Smith, C. et al. Whole-Genome Sequencing Analysis Reveals High Specificity of CRISPR/Cas9 and TALEN-Based Genome Editing in Human iPSCs. Cell Stem Cell (2014). doi:10.1016/j.stem.2014.06.011

Teng, X. et al. Genome-wide Consequences of Deleting Any Single Gene. Mol. Cell 52, 485–494 (2013).

Tsai, S. Q. et al. GUIDE-seq enables genome-wide profiling of off-target cleavage by CRISPR-Cas nucleases. Nat Biotech 33, 187–197 (2015).

Unniyampurath, U., Pilankatta, R. & Krishnan, M. N. RNA Interference in the Age of CRISPR: Will CRISPR Interfere with RNAi? Int. J. Mol. Sci. 17, 291 (2016). doi:10.3390/ijms17030291

Veres, A. et al. Low incidence of Off-target mutations in individual CRISPR-Cas9 and TALEN targeted human stem cell clones detected by whole-genome sequencing. Cell Stem Cell 15, 27–30 (2014).

Yang, L. et al. Targeted and genome-wide sequencing reveal single nucleotide variations impacting specificity of Cas9 in human stem cells. Nat. Commun. 5, 1–6 (2014).

Further helpful reading:

Housden, B. E. et al. Loss-of-function genetic tools for animal models: cross-species and cross-platform differences. Nat. Rev. Genet. (2016). doi:10.1038/nrg.2016.118

 

Where’s the beef?

In our last blog entry, we discussed a classic RNAi screening paper from 2005 that showed that the top 3 screening hits were due to off-target effects.

In this post, we analyse a more recent genome-wide RNAi screen by Hasson et al., looking in more detail at what proportion of top screening hits are due to on- vs. off-target effects.

Hasson et al. used the Silencer Select library, a second-generation siRNA library designed to optimise on-target knock down, and chemically modified to reduce off-target effects.  Each gene is covered by 3 different siRNAs.

To begin the analysis, we ranked the screened siRNAs in descending order of % Parkin translocation, the study’s main readout.

We then performed a hypergeometric test on all genes covered by the ranked siRNAs.  For example, if gene A has three siRNAs that rank 30, 44, and 60, we calculate a p-value for the likelihood of having siRNAs that rank that highly (more details provided at bottom of this post).  It’s the underlying principle of the RSA algorithm, widely used in RNAi screening hit selection.  If the 3 siRNAs for gene B have a ranking of 25, 1000, and 1500, the p-value will be higher (worse) than for gene A.

The same type of hypergeometric testing was done for the siRNA seeds in the ranked list.  For example, if the seed ATCGAA was found in siRNAs having ranks of 11, 300, 4000, and 6000, we would calculate the p-value for those rankings.  Seeds that are over-represented in siRNAs at the top of the ranked list will have lower p-values.

After doing these hypergeometric tests, we had a gene p-value and a seed p-value for each row in the ranked list.  We could then look at each row in the ranked list and estimate whether the phenotype is due to an on- or off-target effect by comparing the gene and seed p-values.  [As a cutoff, we said that the effect is due to either gene or seed if the difference in p-value is at least two orders of magnitude.  If the difference is less than this, the cause was considered ambiguous.]
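The cutoff rule can be written down compactly. The sketch below labels each row as gene, seed or ambiguous and then keeps a running percentage of each label down the ranked list; the function names and the example p-values are hypothetical, and p-values are assumed to be strictly greater than zero.

```python
import math


def classify_effect(gene_p, seed_p, min_log10_gap=2.0):
    """Label a row by whichever p-value is lower by at least two orders of magnitude."""
    gap = math.log10(seed_p) - math.log10(gene_p)
    if gap >= min_log10_gap:
        return "gene"        # gene p-value is at least 100x smaller
    if gap <= -min_log10_gap:
        return "seed"        # seed p-value is at least 100x smaller
    return "ambiguous"


def cumulative_percent(labels):
    """Percentage of each label among the top 1, 2, ... rows of the ranked list."""
    counts = {"gene": 0, "seed": 0, "ambiguous": 0}
    rows = []
    for position, label in enumerate(labels, start=1):
        counts[label] += 1
        rows.append({key: 100 * value / position for key, value in counts.items()})
    return rows


# Hypothetical (gene p-value, seed p-value) pairs for the top four siRNAs
ranked_rows = [(1e-2, 1e-6), (1e-3, 1e-7), (1e-8, 1e-3), (1e-4, 1e-5)]
labels = [classify_effect(gene_p, seed_p) for gene_p, seed_p in ranked_rows]
print(labels)
print(cumulative_percent(labels)[-1])
```

Each entry returned by `cumulative_percent` corresponds to one position in the ranked list, which is the form needed for a stacked area chart.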

After assigning the effect as gene/seed/ambiguous, we then calculated the cumulative percent of hits by effect at each position in the ranked list.   Those fractions were then plotted as a stacked area chart (here, looking at the top 200 siRNAs from the screen):

 

The on-target effect is sandwiched between the massive ‘bun’ of off-target effects and ambiguous cause.  We are reminded of these classic commercials from the 80s:

 

Note on p-value calculations:

P-values were calculated using the cumulative hyper-geometric test (tests the probability of finding that many or more instances of members belonging to the particular group, in our case a particular gene or seed sequence).  The p-value associated with a gene or seed is the best p-value for all the performed tests.  For example, assume a gene had siRNAs with the following ranks: 5, 20, 1000.  The first test calculates the p-value for finding 1 (of the 3) siRNAs when taking a sample of 5 siRNAs.  The next test calculates the p-value for finding 2 (of 3) siRNAs when taking a sample of 20 siRNAs.  And the last is the probability of getting 3 (of 3) siRNAs when taking a sample of 1000.  If the best p-value came from the second test (2 of 3 siRNAs found in a sample size of 20), that is the p-value that the gene receives.  This is also the approach used by the RSA (redundant siRNA activity) algorithm.  One advantage of RSA is that it can compensate for variable knock down efficiency of the siRNAs covering a gene (e.g. if 1 of 3 gives little knockdown).
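The calculation described in this note can be sketched with the Python standard library. This is a minimal illustration in the spirit of RSA, not the RSA implementation itself and not the code used for this post. The function names and the library size of 10,000 siRNAs are assumptions for the example, since the size of the screened library is not stated here.

```python
from math import comb

def hypergeom_sf(k, N, K, n):
    """P(X >= k): probability of finding k or more group members when
    sampling n items without replacement from N items, K of which
    belong to the group (e.g. the siRNAs for one gene or one seed)."""
    total = comb(N, n)
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total

def best_pvalue(ranks, library_size):
    """Smallest cumulative hypergeometric p-value over all rank cutoffs.

    For the i-th best rank r, test the chance of seeing at least i of the
    group's siRNAs among the top r of the ranked list.
    """
    ranks = sorted(ranks)
    K = len(ranks)
    return min(hypergeom_sf(i, library_size, K, r)
               for i, r in enumerate(ranks, start=1))

LIBRARY_SIZE = 10_000  # assumed value, for illustration only

print(best_pvalue([5, 20, 1000], LIBRARY_SIZE))    # example ranks from this note
print(best_pvalue([30, 44, 60], LIBRARY_SIZE))     # gene A from the main text
print(best_pvalue([25, 1000, 1500], LIBRARY_SIZE)) # gene B from the main text
```

Under the assumed library size, the ranks 5, 20, 1000 get their best p-value from the second test (2 of 3 siRNAs in the top 20), and gene A scores better than gene B, as described in the main text. Exact integer arithmetic via `math.comb` keeps the sketch dependency-free. For a full-size library, a statistics package would be a more practical choice.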

Classic Papers Series: Lin et al. show RNAi screen dominated by seed effects

Over the coming months, we will highlight a number of seminal papers in the RNAi field.

The first such paper is from 2005 by Lin et al. of Abbott Laboratories, who showed that the top hits from their RNAi screen were due to seed-based off-target effects, rather than the intended (and at that time, rather expected) on-target effect.

The authors screened 507 human kinases with 1 siRNA per gene, using a HIF-1 reporter assay to identify genes regulating hypoxia-induced HIF-1 response.

In the validation phase of their screen, they tested new siRNAs for hit genes, but found that they failed to reproduce the observed effect, even when using siRNAs that had a better on-target knock down than the pass 1 siRNAs.

Figure 1A.  Left panel shows on-target knock down of pass 1 siRNA for GRK4 (O) and the new design (N).  Centre panel shows Western blot of protein levels.  Right panel shows HIF-1 reporter activity for positive control (HIF1A) and the original (O) and new (N) siRNAs.

The on-target knock down is much improved for the new design, yet its reporter activity is indistinguishable from the negative control.  Meanwhile, the pass 1 siRNA with poor knock down gives almost as strong a result as HIF1A (positive control).

By qPCR, they then showed that GRK4(O) and another one of the top 3 siRNAs silence HIF1A (the positive control gene).  Using a number of different target constructs they also nicely show that it was due to seed-based targeting in the 3′ UTR.

Although the authors screened at a high initial concentration (100 nM), the observed off-targets persisted at 5 nM, suggesting that just screening at lower concentrations would not have improved their results.

The authors conclude:

In addition, due to the large percentage of the off-target hits generated in the screening, using a redundant library without pooling in the primary screen could significantly reduce the efforts required to eliminate off-target false positives and therefore, will be a more efficient design than using a pooled library.

This is true for low-complexity pools, but high-complexity pools can overcome this problem by providing a single reliable result for each screened gene.

Simplicity is the ultimate sophistication

The beauty of the siPOOL strategy is its simplicity.

In this presentation from the  (relatively) early days of Apple, Steve Jobs says that his company’s goal is to serve the one-on-one relationship between a user and his/her computer.

 

Similarly, siPOOLs are designed to serve the one-on-one relationship between a scientist and his/her RNAi results.

By providing an interpretable result without the need for extensive follow-up work and off-target corrections, siPOOLs make it possible for a scientist to use a single gene list to gain insight into biological function.

We believe, as stated in the brochure to market the Apple II,  that  Simplicity is the ultimate sophistication.

(Note: this quote is popularly, though apparently falsely, attributed to Leonardo da Vinci.)

 

Seed effects persist in hyperdimensional space

Work from the Carpenter lab suggests that attempts to shake seed-based off-targets by going to  ‘phenotypic hyperspace’ will not work.

They performed a high-content assay with 315 shRNAs covering 41 genes.  A 1301-dimensional profile was created for each well, and compressed to 205 principal components that captured  99% of the variance.

The hope would be that by examining a wider phenotypic space, the gene-specific effects of RNAi reagents would become more prominent.

However, the profile correlations between shRNAs targeting the same gene are only slightly better than those between random shRNAs, while shRNAs sharing the same seed sequence have much more similar profiles.

(figure shows percent of significant profile correlations for different pairings)

Off-target phenotypes can only be escaped by using a reagent that exclusively knocks down the target gene.

Knocking out the phenotype

Consistent with the work of Rossi et al. (discussed previously),  another recent paper shows a lack of phenotypic response when knocking out a gene that gives a phenotypic response when knocked down.

Knocking out klf2a does not result in any discernible difference from wild-type (whereas knock-down has been shown to produce a range of cardiovascular phenotypes).

The authors conclude:

In summary, our work shows that even in the face of clear evidence of a potentially disruptive mutation induced in a gene of interest, it is currently very difficult to be certain that this leads to loss-of-function, and hence to be confident about the role of the gene in embryonic development.

Using a knock-down reagent that prevents off-target effects is the best way to be confident about your phenotypes.

 
