Author: Andrew Walsh

Comparing silencing for CDS and 3′ UTR siRNAs

CDS vs. 3′ UTR

In some cases, it may be preferable to only target the 3′ UTR of an mRNA.

For example, the CDS may be highly similar to a related paralog gene that should not be co-targeted.

Or a rescue experiment will be performed, and one would rather use the native CDS for rescue.

(Note that we can also provide rescue constructs for siPOOLs that target CDS. The rescue sequence uses the most common alternative codon in the targeted regions to ensure there is no silencing of the rescue construct.)

One common question is whether siRNAs against the 3′ UTR are as effective as those against the CDS.

Early results from 2004

An early paper by Hsieh et al. (2004) showed that siRNAs targeting the 3′ UTR (indicated as region 5 in the plot below) were as effective as those targeting CDS (indicated as regions 1-4):

siPOOL silencing

We have performed thousands of qPCR validation experiments on our siPOOLs and have also not found much difference between siPOOLs that target only CDS versus those that target only 3′ UTR.

The following plot shows the % remaining mRNA for siPOOLs that target only 3′ UTR (left) or only CDS (right):

And if we group siPOOLs by the number of constituent siRNAs that target CDS, there is no observable trend. The horizontal line in the following plot shows the median siPOOL silencing (12.9% remaining RNA).

Design space

Given the focus on CDS for gene function, there may be an assumption that the 3′ UTR is relatively small and thus may not provide enough design space for siPOOL design.

The difference in CDS and 3′ UTR lengths for RefSeq coding genes is actually not that great:

The median CDS is ~1.3 kb, whereas the median 3′ UTR is ~1.0 kb. The mean is actually higher for 3′ UTR (~1.7 kb) than for CDS (~1.2 kb).

It should be noted, however, that there are some very short 3′ UTRs, and in those cases a 3′ UTR-only siPOOL will not be possible.

Conclusion

siPOOLs targeting only 3′ UTR should be as effective as ones only targeting CDS.

If our current design targets CDS, we can also do a redesign that only targets 3′ UTR.

Complex siRNA pooling is especially important for silencing lncRNAs

One advantage of pooling siRNAs is that the pool tends to silence about as well as any single siRNA. That is why independent complex siRNA pools (siPOOLs) for the same gene have much more similar and better average silencing than single siRNAs or mini-pools (Dharmacon), as discussed in our last blog post.

Another advantage of complex siRNA pools is that you can cover more regions of the target gene. This may be especially important when silencing lncRNAs, which can be very long and whose structure may cause certain regions of the transcript to be inaccessible to RISC.

MALAT1

A good exemplar of this phenomenon is MALAT1, whose transcripts are nearly 9 kb and whose secondary and tertiary structure is important for its cellular function.

Two groups who used single siRNAs or mini-pools (Dharmacon) found poor silencing for MALAT1.

Stojic et al. (2018) used a Lincode mini-pool (Dharmacon) and saw almost no silencing:

That was despite using a very high siRNA concentration (50 nM) in a cell line where RNAi normally works very well (HeLa).

Lennox and Behlke (2016) used several single siRNAs at two concentrations (1 nM and 10 nM), again in HeLa cells. They found highly variable silencing, where most siRNAs had poor silencing, and a small number worked fairly well at the higher concentration (10 nM):

Nuclear lncRNA: MALAT1

If one were to randomly choose an siRNA, there is only a 25% chance (3 / 12) that it would give decent silencing at 10 nM (using 30% remaining RNA as cutoff for decent silencing). And none would be considered decent at 1 nM.

Of note, this paper is often cited as a reason for not using RNAi for lncRNAs (the authors, from IDT, recommend using ASOs).

The results from Stojic et al. were quite poor (the Lincode mini-pool hardly silenced), and could be due to reagent quality.

siPOOLs effectively silence MALAT1

The results from Lennox and Behlke are more in line with what we’ve observed when researching the silencing of individual siRNAs versus complex siRNA pools (siPOOLs). Some siRNAs are much better than others. And, as mentioned earlier, a complex pool of siRNAs tends to silence like the best constituent siRNAs.

Given the best-silencing-siRNA selection from complex siRNA pooling and the increased transcript coverage, we would expect siPOOLs to give better silencing than what these authors found.

Our 2 independent siPOOLs for MALAT1 give silencing of 16.5% and 24.4% remaining RNA, respectively, at 1 nM:

MALAT1 Silencing by siPOOLs

If we compare our reagents to those used by Lennox and Behlke, we see that there is much broader transcript coverage. As mentioned, this could be especially important for lncRNAs like MALAT1 that have extensive secondary and tertiary structure.

Lennox and Behlke siRNAs (used individually):

siPOOL #1 siRNAs (used together):

Conclusion

Two factors make complex siRNA pools (siPOOLs) especially well suited for silencing lncRNAs:

  1. Complex siRNA pools tend to silence like their best constituent siRNAs. A single siPOOL silences about as well as the best single siRNA. For targets with high silencing variability, siPOOLs are far superior to single siRNAs or mini-pools (Dharmacon).
  2. By using a complex siRNA pool, more of the gene can be covered. For targets with lots of secondary and tertiary structure, siPOOLs give you the best chance of targeting an accessible region.

siPOOLs: robust reagents for gene silencing

Although we talk a lot about off-targets, one of the main advantages of siPOOLs (complex siRNA pools) compared to single siRNAs or mini-pools (Dharmacon) is that they provide near-optimal silencing of target genes. Two siPOOLs for the same gene give very similar knockdown levels, and their silencing is close to that of the best single siRNA. Given how many candidate siRNAs there are for a gene, and how difficult it is to accurately predict silencing levels, this makes siPOOLs the best choice for gene silencing.

The following plot, comparing independent siPOOLs and siRNAs for the same target gene, shows that siPOOLs for the same gene give more similar silencing than do siRNAs (these are Ambion Silencer Select siRNAs).

We see that the correlation for independent siPOOLs is nearly twice that for independent siRNAs.

(Note that for siRNAs we are doing all pairwise comparisons for 3 siRNAs per target gene. Randomly selecting 2 siRNAs per gene gives similar R values.)
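
The all-pairs comparison described above can be sketched in a few lines of Python. This is only an illustration: the function names and the input layout (a dict mapping each gene to the % remaining mRNA values of its independent reagents) are assumptions, not the actual analysis code behind the plot.

```python
from itertools import combinations
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def reagent_pairs(by_gene):
    """All pairwise combinations of reagents for the same gene.

    by_gene is a hypothetical layout: {gene: [value for each reagent]}.
    With 3 siRNAs per gene this yields 3 pairs per gene.
    """
    pairs = []
    for values in by_gene.values():
        pairs.extend(combinations(values, 2))
    return pairs

def pairwise_r(by_gene):
    """Correlation between independent reagents for the same gene."""
    pairs = reagent_pairs(by_gene)
    return pearson([a for a, _ in pairs], [b for _, b in pairs])
```

Note that which reagent lands on the x-axis and which on the y-axis is arbitrary in this sketch; it simply follows the order in which the values are listed for each gene.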

In the above plot, we removed 3 siRNAs that did not work, for the gene TRIB1. TRIB1 has some association with the nucleus and has a short mRNA half-life, both of which are factors associated with poor gene silencing.

The following plot shows the TRIB1 siPOOLs and siRNAs.

Note that including these non-functional siRNAs actually improves the reagent correlation, though not for a good reason!

We also see that independent TRIB1 siPOOLs give very similar silencing and it’s much better than for the siRNAs. In our experience, if a siPOOL does not work well for a gene, designing a second siPOOL does not substantially improve things, as the poor silencing is normally a feature of the target gene itself. ~50% silencing is probably about the best one can expect for this gene.

Just because siRNAs do not give any on-target silencing, this does not mean they can’t show up as hits in screening assays. Because most of the downregulation is in off-target genes (due to the seed effect), each of those TRIB1 siRNAs may silence nearly 100 genes.

We looked at a genome-wide RNAi screen that included these 3 Silencer Select siRNAs. We see that one of them gives a fairly strong phenotype (Z-score < -2 for cell count), even though the siRNAs do not silence their on-target gene.

Screening with siPOOLs is the smarter alternative, as you can be confident that they provide near-optimal on-target silencing and have fewer off-target effects.

Cutting the Gordian Knot of RNAi off-targets

The C911 siRNA control generated a lot of excitement in the RNAi world when it emerged ~11 years ago. A former colleague, who was a pioneer in the commercialisation of RNAi, described it then as the biggest breakthrough in the last 10 years of RNAi research.

The idea of the C911 control is to get rid of the on-target effect of the siRNA by using the complement of bases 9-11, while retaining any off-target (seed-based) effects of the siRNA, which are mostly dictated by the bases in positions 2-8.
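
As a rough illustration of the idea, the following Python sketch builds a C911-style control from a guide sequence. It assumes that positions are counted 1-based from the 5′ end of the guide strand and that the sequence is given as RNA; the function names and the example sequence are made up for illustration.

```python
# Base pairing used to complement individual RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def seed(guide):
    """Seed region: guide positions 2-8 (1-based)."""
    return guide.upper()[1:8]

def c911(guide):
    """Return the guide with bases 9-11 (1-based) replaced by their complement.

    Hypothetical helper: positions 1-8 and 12 onwards are left unchanged,
    so the seed (positions 2-8) is identical to that of the original guide.
    """
    guide = guide.upper()
    if len(guide) < 11:
        raise ValueError("guide must be at least 11 nt long")
    middle = "".join(COMPLEMENT[base] for base in guide[8:11])
    return guide[:8] + middle + guide[11:]

example = "UUCGAACGUGUCACGUACG"   # made-up 19-mer, not a real siRNA
control = c911(example)           # "UUCGAACGACACACGUACG"
```

Because only positions 9-11 change, `seed(control)` equals `seed(example)`, which is the property the control relies on.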

If the observed phenotype of the siRNA is due to an off-target effect (rather than silencing of the on-target gene), the C911 version will show the same phenotype. That is, because the C911 version does not silence the target gene, a shared phenotype must come from an off-target effect.

Despite the initial excitement, the C911 approach did not become that widely used. There are a number of drawbacks to the strategy, perhaps foremost being that new reagents must be ordered and the assay set up to run again. We’ve compared the validation of low-complexity RNAi reagents to the old lady who swallowed a fly.

The best strategy is to avoid getting entangled in off-targets in the first place. And that seems to be the approach preferred by the research community.

The following plot shows Google Scholar citations for siTOOLs (i.e., papers using our reagents) and the C911 method paper.

We see that after an initial adoption period, use of C911s tapered off and it has levelled out in recent years.

None of this suggests that C911s are bad. For single siRNAs or Dharmacon pools, they are indeed an effective control. But the inconvenience of the method has probably hindered its adoption.

The convenience and robustness of the siPOOL are its great advantages. The siPOOL approach ensures maximum on-target silencing and a minimum of off-target effects. We look forward to supporting more great research in the coming years.

RNAi vs CRISPR: RNAi even better at finding essential genes

Which technology is better, RNAi or CRISPR?

The best answer to this question, like so many others, is: it depends.

If cells can adapt and compensate for loss of the gene, or you want to titrate gene levels (important in drug discovery), then RNAi will be better.

If a gene’s transcripts have lots of secondary structure and must be silenced to 99.9% in order to see an assay phenotype, then CRISPR may be better.

We have used two large datasets to attempt to answer the following question: is RNAi or CRISPR better at identifying essential genes?

The first dataset is the Broad Institute’s Dependency Map (DepMap). It has both RNAi (shRNA) and CRISPR (Cas9) screens from over 700 human cell lines, using hundreds of thousands of reagents. Both types of reagents were used to do pooled screening for cell viability.

The second dataset, also from the Broad Institute, is called gnomAD. It has genome and exome sequencing for over 100K humans. Based on how frequently mutations are found in the sequenced genomes/exomes (and what type of mutations are preferred), an essentiality score can be assigned to every human gene. It’s the ultimate test (within ethical limits) of whether a gene is essential to humans.

Our approach was as follows:

  • for each gene, get the median DepMap viability score across the 700+ cell lines
    • done separately for RNAi and CRISPR screens
  • for each gene, retrieve the gnomAD pLI score (probability that loss-of-function is not tolerated)
    • higher values mean the gene is considered more essential
    • genes with pLI > 0.9 are classified by gnomAD as essential
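
The steps above can be sketched as follows. This is a minimal illustration rather than the actual analysis code: the input layout (a dict of per-cell-line scores for each gene) and the function names are assumptions.

```python
from statistics import median

PLI_ESSENTIAL = 0.9  # gnomAD threshold quoted above

def median_score_per_gene(scores):
    """Median viability score across cell lines, computed per gene.

    scores is a hypothetical layout: {gene: [score in each cell line]}.
    Call it once for the RNAi screens and once for the CRISPR screens.
    """
    return {gene: median(values) for gene, values in scores.items()}

def is_gnomad_essential(pli):
    """True if the pLI score is above the 0.9 threshold."""
    return pli > PLI_ESSENTIAL
```

The median is used here simply because it is what the text describes; it also has the practical benefit of being insensitive to a few outlier cell lines.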

If we look at the top 200 genes in each of the RNAi and CRISPR datasets (note: 70 genes are common to both lists), we see that the top 200 genes from RNAi screening are more essential (as measured by pLI) than the top 200 genes from CRISPR screening. (Note that the curves show the running mean for 30 genes.)
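
A minimal sketch of the ranking and smoothing, under two assumptions that are not stated in the post: that a more negative viability score means a stronger loss of viability, and that the running mean is a simple trailing window. The function names are hypothetical.

```python
def top_genes(median_scores, n=200):
    """Genes with the strongest viability effect.

    Assumes more negative score = stronger reduction in viability.
    """
    return sorted(median_scores, key=median_scores.get)[:n]

def running_mean(values, window=30):
    """Trailing running mean; returns len(values) - window + 1 points."""
    if window < 1:
        raise ValueError("window must be at least 1")
    out = []
    total = 0.0
    for i, value in enumerate(values):
        total += value
        if i >= window:
            total -= values[i - window]   # drop the value leaving the window
        if i >= window - 1:
            out.append(total / window)
    return out
```

To reproduce a curve of this kind, one would take the pLI scores of the ranked genes in order, e.g. `running_mean([pli[g] for g in top_genes(scores)], window=30)`, where `pli` and `scores` are hypothetical dicts keyed by gene.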

Eventually the curves do converge, but for the top genes, we see that those found by RNAi are more essential.

Alternatively, if we group the genes from the CRISPR and RNAi screens into deciles for cell viability score, we again see that the results from the RNAi screens are more consistent with gnomAD.

In the following plots, we look at the number of gnomAD essential genes (defined as pLI > 0.9) in each of the deciles. Decile 1 has the top 10% of genes for reducing cell viability (most essential), whereas Decile 10 has the bottom 10% (least essential).
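
The decile counts could be computed along these lines. Again this is a sketch with assumed conventions: more negative scores are taken to mean stronger reduction in viability, and genes without a pLI score are treated as non-essential. The function name and inputs are hypothetical.

```python
def essential_counts_by_decile(median_scores, pli, threshold=0.9, n_bins=10):
    """Count gnomAD-essential genes (pLI > threshold) in each decile.

    median_scores: {gene: median viability score} (hypothetical layout)
    pli:           {gene: pLI score}
    Index 0 of the result is Decile 1 (strongest reduction in viability).
    """
    ranked = sorted(median_scores, key=median_scores.get)
    n = len(ranked)
    counts = [0] * n_bins
    for rank, gene in enumerate(ranked):
        decile = rank * n_bins // n          # 0 .. n_bins - 1
        # Assumption: a gene with no pLI score counts as non-essential.
        if pli.get(gene, 0.0) > threshold:
            counts[decile] += 1
    return counts
```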

For CRISPR screens, we see that the top 2 deciles show markedly more gnomAD essential genes. But after that, the counts flatten out. There is little difference in the number of gnomAD essential genes in deciles 3 through 10.

The results from RNAi screening show a fairly steady decline in gnomAD essential genes from decile 1 through decile 10, which is what one would expect. Genes whose knockdown increases cell count should tend to be less essential, so decile 10 should have the smallest number of essential genes. That is what we see with the RNAi screens, but not with the CRISPR screens.

Conclusion

RNAi and CRISPR screens can both pull out genes found to be essential in the gnomAD dataset.

However, the top genes from RNAi screening tend to be a bit more essential in real-life experiments (i.e., the humans from the gnomAD dataset).

Furthermore, the trend for gnomAD essential gene counts through the ranked datasets makes more sense for RNAi screens than for CRISPR screens.

CRISPR may be a newer technology, but that does not necessarily make it better than RNAi.

Both have their advantages and disadvantages, which we will discuss more in future blog posts.

It should also be noted that two of the main disadvantages of RNAi screening (seed-based off-target effects, and variability in silencing between different siRNAs) have been addressed by siPOOLs.

In an upcoming blog post, we will take a closer look at genes that gave different results in the DepMap RNAi and CRISPR screens.

Chemical modifications only shift the siRNA seed profile

In the last post, we saw that chemically modified ON-TARGETplus siRNAs still have a strong seed effect.

The seed-based off-target effects (measured by correlation of reagents with the same 7mer seed) were as strong for chemically modified ON-TARGETplus (R = 0.50) and Silencer Select (R = 0.59) as what we typically see with unmodified siRNAs (Qiagen, siGENOME, or Silencer).

Chemical modification must not prevent seed-based target recognition, because RISC uses the seed to scan the transcriptome for target sites. Because of how RISC presents the guide strand seed region for target scanning, the binding energy for finding an on-target site (19-base complementarity) versus an off-target site (6/7-base complementarity) is nearly the same. It’s not like a microarray oligo, where more extensive complementarity leads to stronger binding. The seed is driving this site recognition, so any modification that eliminates its binding will make the siRNA ineffective.

The chemical modifications added by Ambion and Dharmacon do not prevent seed binding, but instead change the efficiency of different bases at certain positions, in effect changing the seed profile of off-target sites.

The following heatmap shows the cell viability scores from 9 genome-wide siRNA screens. The average viability score for all siRNAs with a specific base at a specific position was calculated (shown are guide positions 1-9). If the value is red, it means siRNAs with the base at that position tend to be more lethal.
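
The per-position averaging behind such a heatmap can be sketched as below. The input layout (pairs of guide sequence and viability score) and the function name are assumptions made for illustration; each guide must be at least 9 nt long.

```python
from collections import defaultdict

def mean_score_by_position_base(sirnas, positions=range(1, 10)):
    """Average viability score for every (guide position, base) combination.

    sirnas: iterable of (guide_sequence, viability_score) pairs
            (hypothetical layout); positions are 1-based guide positions.
    Returns {(position, base): mean score}.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for guide, score in sirnas:
        for pos in positions:
            key = (pos, guide[pos - 1].upper())
            sums[key] += score
            counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}
```

One such table per screen gives one column of the heatmap, with rows labelled by position and base (e.g. 2C, 6G, 6U).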

The first 4 columns are from screens using chemically modified, Silencer Select siRNAs (S+). The next 2 columns are from screens using unmodified, Silencer siRNAs (S). And the last 3 are from screens using unmodified, Qiagen siRNAs (Q).

We see that for some bases (e.g. 2C, top row), siRNAs tend to be non-toxic regardless of whether or not they are chemically modified (S+, S, and Q all show deep blue).

But there are other positions where the chemically modified siRNAs are very different from the unmodified siRNAs.

For example, the bottom row shows that 6G tends to be very toxic in unmodified siRNAs, but is not toxic in Silencer Select (chemically modified) siRNAs. On the other hand, 6U (towards the middle row) looks to be toxic for Silencer Select siRNAs but to have the opposite effect for unmodified siRNAs.

Whatever the chemical modification for Silencer Select is (it has not been made public), it appears to make seed off-targets stronger when position 6 is a U, and weaker when position 6 is a G.

If we compare the effect on cell viability of Silencer Select vs ON-TARGETplus siRNAs from the Tan and Martin screen (subject of last post), we also see strong differences in the effect of having a U or a G at position 6.

The following plot shows the toxicity rank of seed bases in Silencer Select siRNAs vs ON-TARGETplus siRNAs. Bases towards the origin (e.g., 2C) tend to make siRNAs non-toxic for both types, whereas bases towards the top right (e.g., 2G) tend to make siRNAs toxic for both types. Bases that fall off the diagonal tend to be toxic for one type and non-toxic for the other.

We see that 6U is toxic for Silencer Select siRNAs (as also seen in the heat map), whereas ON-TARGETplus siRNAs with 6U, like the unmodified siRNAs from the heat map, tend to be non-toxic. The effect for 6G also matches the heat map: non-toxic for Silencer Select and toxic for ON-TARGETplus (and for unmodified siRNAs in the heat map).

Conclusion

Chemical modification does not get rid of seed effects, as evidenced by the strong phenotypic correlation of modified siRNAs with the same seed sequence. Rather, modifications tend to change the effectiveness for specific bases in eliciting seed-based silencing.

One suggestion would be to design a chemically modified siRNA library that avoids bases that tend to be toxic (e.g., 6U for Silencer Select).

However, there are a few problems:

  • The heat maps and scatterplot only show tendencies. There is still variation within those positions. While 2C tends to be non-toxic for Silencer Select, there are still lots of toxic siRNAs with that sequence.
  • Bases that reduce toxicity may be doing so because they tend to reduce target recognition. For example, 2C is also associated with poorer on-target silencing. Using only 2C for siRNAs could thus result in a library that is not as efficient at on-target silencing.
  • Finally, these suppliers have already produced their siRNA libraries, i.e., those bases have already been used.

The only reliable way to both reduce the off-target effect (via dilution of seeds) and maintain robust on-target silencing is by using siRNA pools (siPOOLs).

ON-TARGETplus siRNAs have strong off-target effects (despite chemical modification)

History of chemical modifications

Chemical modification has long been proposed as a way to limit the off-target effects of siRNAs.

The earliest siRNAs from the two main commercial suppliers (siGENOME from Dharmacon/Horizon Discovery, and Silencer from Ambion/ThermoFisher) were quickly replaced with new chemically-modified siRNAs (ON-TARGETplus from Dharmacon, and Silencer Select from Ambion).

We have already seen that Silencer Select siRNAs, despite their chemical modification, maintain a strong off-target seed effect.

The phenotypic correlation between siGENOME (unmodified) and ON-TARGETplus (chemically modified) low-complexity (4-siRNA) pools for the same gene was shown to be very poor.

However, showing a direct seed effect of ON-TARGETplus siRNAs using published data is not straightforward, since Dharmacon (unlike Ambion) has not made their siRNA sequences publicly available.

Here, for the first time, we show massive seed-based off-target effects from ON-TARGETplus siRNAs.

Seed off-target effects from ON-TARGETplus siRNAs

Tan and Martin (2016) provide a dataset that includes 4 different ON-TARGETplus siRNAs for nearly 700 genes, screened for their effect on nuclear area.

We were also able to find a paper that provides sequences for ON-TARGETplus siRNAs. Those sequences were assigned to the siRNAs from the Tan and Martin screen (details on sequence assignment provided at end of post).

The intraclass correlation (ICC) is a measure of reproducibility of measures of the same group, e.g. siRNAs with the same target gene, or siRNAs with the same 7mer seed.

The ICC for ON-TARGETplus siRNAs with the same gene was only 0.09.

However, the ICC for ON-TARGETplus siRNAs with the same 7mer seed was much higher: 0.50.
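
The post does not say which ICC variant was used. As one possibility, the sketch below computes a one-way random-effects ICC from ANOVA mean squares, with the usual adjustment for unequal group sizes. The function name and input layout are assumptions.

```python
def icc_oneway(groups):
    """One-way random-effects ICC (an assumed variant, see note above).

    groups: list of lists of phenotype values, one inner list per group
            (e.g. all siRNAs for the same gene, or with the same 7mer seed).
    """
    g = len(groups)
    n_total = sum(len(x) for x in groups)
    if g < 2 or n_total <= g:
        raise ValueError("need at least 2 groups and some replicated groups")
    grand_mean = sum(sum(x) for x in groups) / n_total
    means = [sum(x) / len(x) for x in groups]
    # Between-group and within-group sums of squares.
    ssb = sum(len(x) * (m - grand_mean) ** 2 for x, m in zip(groups, means))
    ssw = sum((v - m) ** 2 for x, m in zip(groups, means) for v in x)
    msb = ssb / (g - 1)
    msw = ssw / (n_total - g)
    # Effective group size; equals the group size when all groups are equal.
    k0 = (n_total - sum(len(x) ** 2 for x in groups) / n_total) / (g - 1)
    return (msb - msw) / (msb + (k0 - 1) * msw)
```

Values near 1 mean that members of the same group give nearly identical phenotypes; values near 0 mean that group membership explains little of the variation.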

Despite chemical modification, the phenotype of ON-TARGETplus siRNAs is still mostly driven by off-target seed effects.

To show these ICCs graphically, here is a plot with pairs of siRNAs for the same target gene (2 of 4 siRNAs chosen randomly for each gene). (Note that some outliers were removed to assist comparison with same-seed siRNAs.)

And here is the plot with pairs of siRNAs with the same 7mer seed:

Conclusion

Chemical modification does not get rid of seed-based off-target effects.

The only effective way to robustly eliminate these effects is with high-complexity (30+ siRNA) pools (siPOOLs).

Technical notes

In order to determine the sequence of the ON-TARGETplus siRNAs from the Tan and Martin screen, the sequences from the supplementary materials of Kim et al. were assigned in order to the siRNAs sorted by catalog number. It is possible that some of the sequences thus assigned were not correct (e.g. Tan and Martin may have used different siRNAs from those listed in Kim et al. for some of the genes), in which case the observed seed effect is actually underestimated.

Similar seed effects in independent siRNA screens

A 2013 study on Parkin translocation used genome-wide siRNA libraries from Ambion (single Silencer Select siRNAs) and Dharmacon (pools of 4 siGENOME siRNAs).

The correlation between results for the same on-target gene from the two libraries was very low (R = 0.09). (Each point in the following plot is for a gene.)

% Parkin Translocation (PPT) for Ambion vs. Dharmacon siRNAs grouped by same 7mer seed

The correlation between results for the same 7mer seed was higher (R = 0.26), providing another example of the Iron Law of RNAi Screening. (Each point in the following plot is for a 7mer seed.)

It is also worth noting that the seed-based correlation would likely have been much higher, had the Dharmacon siRNAs been screened individually (see details below).

Conclusion

The only effective way to avoid off-target effects in RNAi screening is to use high-complexity reagents like siPOOLs, which dilute away off-target effects while maintaining strong on-target silencing.

Analysis details

To calculate the Ambion by-gene value, the mean PPT value was taken for the 3 on-target siRNAs for the gene. (The Dharmacon pooled library only has 1 value per gene, so no further calculation is necessary.)

To calculate the Ambion by-seed value, the mean PPT value was taken for all siRNAs with the 7mer. For Dharmacon, the pool value was assigned to each siRNA, and then siRNAs were grouped by their 7mer seed in order to calculate the seed mean. This means that the Dharmacon siRNA seed value is actually the average from 4 different siRNAs (with different seeds). Had the Dharmacon siRNAs been screened individually, the correlation with Ambion seed results would have been higher.
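
The by-seed calculation can be sketched as follows, taking the 7mer seed to be guide positions 2-8. The function names and input layouts are hypothetical, and the sequences in the usage note are made up.

```python
from collections import defaultdict

def seed7(guide):
    """7mer seed: guide positions 2-8 (1-based)."""
    return guide.upper()[1:8]

def seed_means_single(sirnas):
    """Mean value per seed for individually screened siRNAs.

    sirnas: iterable of (guide_sequence, value) pairs.
    """
    grouped = defaultdict(list)
    for guide, value in sirnas:
        grouped[seed7(guide)].append(value)
    return {s: sum(v) / len(v) for s, v in grouped.items()}

def seed_means_pooled(pools):
    """Mean value per seed when only pool-level values exist.

    pools: iterable of (list_of_guides, pool_value) pairs. The pool value
    is assigned to every siRNA in the pool before grouping by seed.
    """
    expanded = [(guide, value) for guides, value in pools for guide in guides]
    return seed_means_single(expanded)

def shared_seed_pairs(a, b):
    """(value in a, value in b) for every seed present in both libraries."""
    return [(a[s], b[s]) for s in sorted(a.keys() & b.keys())]
```

For example, `shared_seed_pairs(seed_means_single(ambion), seed_means_pooled(dharmacon))` would give the points for the by-seed plot, where `ambion` and `dharmacon` are placeholder inputs. Because each pooled value is spread over siRNAs with different seeds, the pooled seed means are diluted, which is why the by-seed correlation is expected to be an underestimate.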

The Iron Law of RNAi Screening

This is the lead singer of a band called Iron Law. He looks like a researcher experiencing massive frustration after discovering what we call the Iron Law of RNAi Screening.

This law states that in any screen with low-complexity reagents (single siRNAs like Silencer Selects, or mini-pools like Dharmacon SMARTpools), off-target effects will predominate.

Given that the average lone siRNA will down-regulate nearly 100 off-target genes, but has only a single on-target gene, it is not hard to see how this comes about.

The only effective way to break this law is to use high-complexity reagents like siPOOLs, which dilute away off-target effects while maintaining strong on-target silencing.

Below is a figure showing the reduced off-target effects with a siPOOL (3 nM) after 48 hours in HeLa cells:

Reduce Off Targets effect with siPOOLs

Transcriptome-wide profiling revealed that a single siRNA can induce numerous off-targets (red dots), while a siPOOL against the same target gene (green dot), and containing the non-specific siRNA, had greatly reduced off-target effects.

Low hit validation rate for Dharmacon siGENOME screens

Good experimental design is important when validating hits from RNAi screens. Off-target effects from single siRNAs and low-complexity siRNA pools (e.g. Dharmacon siGENOME) result in high false-positive rates that must be sorted out in validation experiments.

Dharmacon siGENOME pools (SMARTpools) have 4 siRNAs, and the most common form of validation is to test the pool siRNAs individually (deconvolution).

Unfortunately, the results of such deconvolution screening rounds are difficult to interpret.

The pool phenotype could be due to the off-target effects of any single siRNA, or even synthetic off-target effects from pooled siRNAs.

Rather than deconvoluting the pool, a better approach is to test with independent reagents. Should the phenotype be due to the seed effects of an siRNA in the siGENOME pool, the new designs (with presumably different seed sequences) should not show them. (Note that because independent siRNAs have their own potentially complicating off-targets, an even better option would be to use a reagent like siPOOLs that minimises the likelihood of off-target effects.)

Using independent validation reagents was the approach taken by Li et al. in a screen looking for enhancers of antiviral protein ZAP activity.

They first did a genome-wide (18,200 genes) screen with siGENOME pools, looking for pools that increased viral infection rate.

The biggest effect was with the positive control, ZAP (aka ZC3HAV1). Several other pools also stood out as giving large increases in viral infection (Fig 1B):

They identified 90 non-control genes with reproducible Z-scores above 3 in their replicate experiments (~0.5% of screened genes).

These 90 genes were then tested with 3 Ambion Silencer siRNAs. (They also included a few genes in the validation round based on pathway information and off-target analysis; more on this below.)

Of the 90 candidate hit genes, only 11 could be confirmed (Fig 2B). A gene was considered confirmed if 2 of 3 siRNAs had a Z-score > 3. Note that ZC3HAV1/ZAP is the positive control and JAK1 was added to the validation round based on pathway information:
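
The confirmation rule and the resulting validation rate can be written out as a short Python sketch. Reading "2 of 3" as "at least 2 of 3" is an assumption here, and the function names are hypothetical.

```python
def is_confirmed(z_scores, threshold=3.0, min_hits=2):
    """True if at least min_hits siRNAs have a Z-score above threshold.

    Assumption: "2 of 3" is interpreted as "at least 2 of 3".
    """
    return sum(1 for z in z_scores if z > threshold) >= min_hits

def validation_rate(confirmed, tested):
    """Fraction of candidate hit genes that were confirmed."""
    return confirmed / tested

rate = validation_rate(11, 90)   # 11 of 90 candidates, roughly 12%
```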

We also see that only 1 of the 7 top hits from the first round (blue genes in the first figure) was confirmed. This is a common observation in RNAi screens: the strongest phenotypes are mostly due to off-target effects.

Off-target effects are difficult to interpret, even using advanced analysis programs like Haystack or GESS. The authors tested 4 genes identified by Haystack as targets for seed-based off-targeting. None of those genes could be confirmed in the validation round.
