RNA part I: RNA sequencing with Oxford Nanopore Technologies: Direct RNA Q&A
Here you can find the answers (given directly by our guest speakers) to all of the questions asked during the RNA part I: RNA seq with Oxford Nanopore Technologies: direct RNA webinar.
Speaker: Daniel Garalde
1. Will direct RNA sequencing without poly-A be covered in another talk?
We may cover this in a future webinar, but here is a brief overview of the two main approaches for Direct RNA sequencing of RNA without poly-A tails. When you have many different (or unknown) 3' sequences without a poly-A tail (e.g. the prokaryotic transcriptome), the easiest method is to add a poly-A tail to the RNA with poly-A polymerase, then prepare it with the standard Oxford Nanopore Direct RNA sequencing kit. For targeting a single RNA (or a handful of RNAs) with a known 3' sequence (for example, a conserved 3' end of a viral RNA, or rRNA), we provide a protocol that guides you in designing a splint to target the known 3' sequence.
2. Can we also rely on accurate quantification from direct RNA sequencing?
Yes, direct RNA sequencing completely removes PCR, a well-known source of bias, so we expect this method to have the least bias. There is a practical (not technical) trade-off because direct RNA sequencing throughput is currently lower than for our cDNA sequencing methods. This means that you will have fewer reads (less dynamic range) per run with direct RNA. But if you are comparing a similar number of reads, direct RNA should be the most accurate at quantification.
3. Is it possible to sequence SARS-CoV-2 using direct RNA sequencing?
Several groups have done this for cultured virus, and it also allows analysis of RNA modifications. For example:
- Taiaroa et al. Direct RNA sequencing and early evolution of SARS-CoV-2
- Davidson et al. Characterisation of the transcriptome and proteome of SARS-CoV-2 using direct RNA sequencing and tandem mass spectrometry
- Kim et al. The architecture of the SARS-CoV-2 transcriptome
- Nomburg et al. Noncanonical junctions in subgenomic RNAs of SARS-CoV-2 lead to variant open reading frames
- Gribble et al. The coronavirus proofreading exoribonuclease mediates extensive viral recombination
These can all be found within the Resource Centre area of our website.
A large number of SARS-CoV-2 genomes submitted to GISAID, a repository for viral genomes, have been sequenced on Oxford Nanopore platforms using a cDNA approach. See here for more information: https://nanoporetech.com/covid-19/overview
4. Does Oxford Nanopore provide training for first time users?
Absolutely. You can find more information about the training we offer on the website: https://store.nanoporetech.com/uk/services.html
5. For direct RNA sequencing of SARS-CoV-2, what is the accuracy?
Direct RNA sequencing has a per-molecule modal accuracy around 94%, but it is possible to use coverage to create a consensus with much higher accuracy.
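As a rough illustration of why coverage helps, the sketch below computes the chance that a simple majority vote across reads gets one position right. It assumes that every read is independently correct at that position with the same probability, and it treats the 94% figure as a per-position probability; both are simplifications (real nanopore errors are not fully independent, and actual consensus tools are more sophisticated). The function name is hypothetical and not part of any Oxford Nanopore software.

```python
from math import comb


def majority_correct_probability(per_read_accuracy: float, coverage: int) -> float:
    """Probability that a strict majority of reads is correct at one position.

    Assumption (illustration only): each of the `coverage` reads is
    independently correct with probability `per_read_accuracy`.
    """
    if coverage < 1:
        raise ValueError("coverage must be at least 1")
    needed = coverage // 2 + 1  # smallest count that forms a strict majority
    wrong = 1.0 - per_read_accuracy
    return sum(
        comb(coverage, k) * per_read_accuracy**k * wrong ** (coverage - k)
        for k in range(needed, coverage + 1)
    )


if __name__ == "__main__":
    # Odd coverage values avoid ties in the vote.
    for depth in (1, 3, 5, 11, 21):
        print(depth, majority_correct_probability(0.94, depth))
```

Under these assumptions the value rises quickly with depth, which is the intuition behind building a consensus from many reads.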
6. With regards to pipelines used to analyse RNA sequencing data, is it possible to use some short-read sequencing analysis tools for Oxford Nanopore data, e.g. STAR for mapping reads?
Generally, tools for analysing long reads are more suitable, as they are specifically developed for this application. There is a large and growing number of tools for analysing nanopore RNA sequencing data (in fact, both speakers today have made their own custom tools public). There are too many to list here, but you can find a lot of them in our Resource Centre (under the Tools section), and for further details we would be happy to send you a list of popular tools for a particular application – please get in touch via firstname.lastname@example.org.
7. What is the minimum amount of poly-A+ RNA required for a good sequencing run?
The nanopore PCR-cDNA library prep kit recommends 1-5 ng as input, and it amplifies the cDNA products, so it is well suited to limited sample input. The Direct RNA library prep does not involve amplification, so it requires more input; we currently recommend 500 ng of target RNA for direct RNA sequencing.
8. Could you elaborate a little bit more on direct RNA sequencing multiplexing?
We do not currently provide a multiplexing option for direct RNA sequencing, although it is possible to wash a flow cell and run a second sample on the same flow cell if you don't need the full throughput for one sample. There is also a method developed in the Community to barcode and demultiplex direct RNA data (Smith et al. 2019. bioRxiv).
9. How sensitive is direct RNA sequencing to carryover of guanidine/tannins/general inhibitors etc.? Is there a limit on the purity required?
It is best to minimize carryover of guanidine salt, but the Direct RNA library prep includes a few clean-ups, so it's not a common problem.
10. What is the resolution of the pore for RNA modification detection? In other words, if two modifications are close enough together in the strand, will they be detected as a single modification? If yes, what is the minimum distance to avoid detection problems?
The nanopore can detect modifications at single-nucleotide resolution, and this information is in the raw data. However, because of the nature of the samples used to train some modification detection algorithms, some algorithms assume that two of the same modifications within a few bases of each other should be lumped together when making the modification call. If detecting differential modifications between two or more adjacent bases is important in your experiment, it's worth looking at how a particular tool works.
Speaker: Chris Vollmers
1. Can you barcode cDNAs from mixed samples to detect differential isoform expression?
Yes. We can barcode using unique DNA splints for each sample, or oligo-dT primers with sample indexes. If you combine those you could pool >100 samples per library. The most that we have done was to pool ~20 libraries, and spread that over many flow cells to reduce batch effects. We still have to release all those sequences, but that should be in the next few weeks, together with a C3POa version that demultiplexes the data.
2. What are the remaining challenges for full-length transcriptome sequencing?
Sequencing really long transcripts (>10 kb) is a challenge for any long read technology.
3. Do you have tips on generating ds cDNA from RNA with no poly-A tail?
If it is a single gene you are interested in, using gene-specific primers would work fine. If you are interested in a pool of transcripts, I would suggest poly-A-tailing them; there are kits available for that.
4. Have you looked at the level of template switching artifacts in R2C2 data?
Yes. In high-quality RNA, it is very low (low single-digit percent). In degraded RNA, you'll still mostly template-switch at the ends of molecules; it's just that those ends are now the ends of fragments. So it all comes down to RNA integrity. R2C2 is also compatible with methods capturing 5' caps. You just have to know the amplification primers.
5. What is the length limit for R2C2?
We have done experiments with size-selected amplified DNA, and as long as we can PCR amplify it, R2C2 seems to be able to handle it. That said, the longer the insert, the longer the raw reads have to be. For cDNA experiments specifically, the reverse transcriptase won't reverse transcribe anything over 10 kb really, so that is currently the hard limit. PCR-amplifying a pool of mixed length transcripts seems to be the next biggest limitation, limiting insert lengths to <5 kb without size selection. Going longer than this would also reduce consensus accuracy considering our average raw read length of 12 kb.
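The link between insert length and consensus accuracy can be made concrete with a little arithmetic: a raw read of a circularized, rolling-circle-amplified molecule covers the insert roughly (raw read length / insert length) times, and fewer passes leave less to build a consensus from. The sketch below is only that division; the function name is hypothetical, and it ignores the length of the splint in each repeat, so it slightly overestimates the number of passes.

```python
def approximate_passes(raw_read_length_bp: float, insert_length_bp: float) -> float:
    """Rough number of times a raw read covers the insert.

    Assumption (illustration only): each repeat in the raw read is exactly
    one insert long, i.e. the splint sequence between repeats is ignored.
    """
    if raw_read_length_bp <= 0 or insert_length_bp <= 0:
        raise ValueError("lengths must be positive")
    return raw_read_length_bp / insert_length_bp


if __name__ == "__main__":
    average_raw_read_bp = 12_000  # average raw read length quoted above
    for insert_bp in (1_000, 2_500, 5_000, 10_000):
        passes = approximate_passes(average_raw_read_bp, insert_bp)
        print(f"{insert_bp} bp insert: about {passes:.1f} passes")
```

With a 12 kb average raw read, a 5 kb insert is covered only about 2.4 times on average, and a 10 kb insert barely more than once.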
6. Can the technology sequence snRNAs?
As long as you can RT and amplify it, you can probably apply R2C2. You just need known ends for circularization. We have never done anything shorter than 300 bp.