Nanopore sequencing the SARS-CoV-2 genome: introduction to protocol

During this online seminar, Phillip James, Applications Manager for the UK at Oxford Nanopore Technologies, provides an in-depth introduction to sequencing the SARS-CoV-2 genome using their protocol, including advice for best performance.

Answers from the Oxford Nanopore team to your questions during the webinar:

1. The PCR uses a 5 minute extension time. Can you explain why this is necessary to amplify short amplicons of 400 bp?

Annealing/extension times are typically long for multiplex PCRs. Primers continue to anneal to the target during the extension step because it is a 2-step PCR.

2. Is there a reason for amplifying 400 nt amplicons, and not longer amplicons?

We are collaborating with the ARTIC network on this protocol, and they have tremendous expertise from previous outbreaks, such as Ebola and Zika. Their experience indicates that the RNA extracted from such samples is sometimes degraded and of poor quality, so choosing shorter amplicons increases the chance of amplifying across the whole virus genome. If you use longer amplicons with samples that have a low viral load, you will sometimes see dropout in amplification. With shorter amplicons, the effect of a dropout on the total coverage of the genome is smaller than with longer amplicons. This is why 400 bp was chosen as a mid-range length: it ensures that the viral genome is well represented even in samples where the viral load is relatively low. The protocol has also been heavily optimised by the ARTIC network.

3. Have you considered using more processive RTs, such as MarathonRT?

The protocol has been developed and optimised by the ARTIC group. For the RT reaction they are using SuperScript IV. Other reverse transcriptases could potentially be used; as we internally have not tried a different reverse transcriptase, it would be advisable for any lab making changes to the protocol to optimise as appropriate.

4. Is DNase treatment needed prior to the reverse transcription reaction?

Some RNA extraction protocols already include a DNase step. For the COVID-19 protocol, where the cDNA is amplified, we understand that DNase treatment is not necessary. Having said that, we have not compared DNase-treated and untreated samples. Because we are employing PCR, and not going straight into sequencing the native RNA molecules, we assume that including DNase will not have a great effect on the efficiency of the reaction.

5. What is the actual amount of viral RNA that you recommend going into the RT step?

The input RNA recommendation for the RT step is based on the Ct values obtained from the qPCR test. Ct values give a representation of the viral load present in the clinical samples. In the relevant section of the protocol there is a table with recommendations on the dilution factor based on the qPCR Ct values.

6. Can RNA samples isolated from nasal swabs be used for RT and further amplification?

For any RNA extracted from nasal swabs, a qPCR test is advisable before setting up the RT. Based on the viral load of the samples (Ct values) the RT reaction will need to be set up as per protocol recommendations.

7. How are people using the technology right now to understand the outbreak? And what are the main advantages of using Nanopore for COVID sequencing?

Our customers are using Nanopore sequencing to quickly generate SARS-CoV-2 genomes with high consensus accuracy, to track both transmission of COVID-19 and viral evolution over time. The latter is useful to understand how the virus may evolve to potentially evade pending vaccines. Users have deposited their nanopore genomes in shared repositories such as GISAID, from which they are populated on Nextstrain for analysis of putative transmission models. The benefits of nanopore sequencing include the ability to generate data in real time (1 hour of sequencing time is required when using MinION Flow Cells), and to scale sequencing throughput from the Flongle to the high-throughput PromethION.

8. How many samples can be multiplexed?

For multiplexing samples on a flow cell, the protocol is optimised for native barcoding, and at the moment we have 24 barcodes available. We are working towards 96 native barcodes. You can multiplex up to 24 samples on all flow cells and platforms (Flongle, MinION, GridION, and PromethION). When you are multiplexing on a standard MinION/GridION Flow Cell, the sequencing time is still only about 1 hour, maximum 2 hours, to generate sufficient coverage to proceed with the bioinformatic workflows. The ARTIC workflow is about 7 hours long in total, and runs could be staggered to accommodate higher sample throughput.

9. Is it possible to use direct RNA sequencing to sequence SARS-CoV-2?

There are a few papers which have performed native RNA sequencing of SARS-CoV-2 from culture (transfected human cell lines). This has enabled the researchers to investigate the subgenomic mRNA usage of the virus. Doing this directly from clinical samples is likely to depend on your viral titre. This is because, with direct RNA sequencing, you use a poly(T) RNA sequencing adapter which ligates to the end of the poly(A) tail. You are likely to have a lot of human RNA in clinical samples, so without any kind of manipulation or tweaks to the protocol as is, you will probably extend and sequence a lot of human RNA. However, we have not tried this internally so cannot confirm. It is one of the reasons why we opted for the targeted approach at the outset. Therefore, it is definitely possible to obtain direct RNA sequencing data, as has been shown from culture, but for clinical research samples, the jury is out at the moment.

10. Can we use 96 PCR barcodes?

Theoretically yes, and I believe that groups have added on PCA, which is the PCR adapter, and then amplified again using the 96 barcoding kit. Feel free to try that, but keep an eye on your negative controls. You are performing a very sensitive PCR, and then taking your amplified sample through further rounds of amplification. PCRs are very prone to contamination and this protocol is already very sensitive, so this is important to bear in mind. If you try it, let us know how you get on!

11. There have been 3 versions of primer sets - which one should we use and why?

The first ARTIC primers were generated when the first genome was deposited on GISAID. Once more samples were made available, multiple labs worked on iterating and improving these primer sets to address some issues with coverage dropout noted in V1/V2. Please use the V3 primer sets, which will generate coverage of at least 100x across the entire genome if 1000x total coverage is generated.

12. Have you calculated the price per sample if we multiplex 24 samples?

Currently, with 24 Native Barcodes, the cost per sample is about $29/sample for Oxford Nanopore reagents, using one MinION Flow Cell. Please remember that we have released a COVID-19 store which features steeply discounted flow cells ($500 each) to help lower the cost per sample. As we work to release the 96 Native Barcoding Kit, the cost per sample would reduce to about $9/sample. There are additional third party reagent costs as well, which add about $20/sample.
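As a rough illustration of the figures above, the total per-sample cost can be estimated by adding the third-party reagent cost to the Oxford Nanopore reagent cost. This is a minimal sketch that assumes the quoted figures are simply additive; the names used are illustrative and not part of any Oxford Nanopore tool.

```python
# Approximate per-sample costs in USD, taken from the answer above.
ONT_COST_PER_SAMPLE = {24: 29.0, 96: 9.0}  # keyed by number of native barcodes
THIRD_PARTY_COST_PER_SAMPLE = 20.0         # additional third-party reagents


def total_cost_per_sample(n_barcodes: int) -> float:
    """Sum ONT and third-party reagent costs (assumes they are simply additive)."""
    return ONT_COST_PER_SAMPLE[n_barcodes] + THIRD_PARTY_COST_PER_SAMPLE


for n in (24, 96):
    print(f"{n} barcodes: about ${total_cost_per_sample(n):.0f} per sample")
```

Under these assumptions the totals come to about $49 per sample with 24 barcodes and about $29 per sample with 96 barcodes.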

13. Where can we get this slide deck?

Unfortunately we are unable to share the slide deck, but you can view the slides in the recording of the presentation.

14. The LSK109 kit says it can handle 6 rxns per kit. Does this mean we'll need 4 kits of LSK109 for a 24 sample run?

The LSK109 kit contains sufficient reagents to prepare 6 different libraries. Therefore, if you purchase one each of both Native Barcoding Kits and an LSK109 kit, you will have sufficient reagents to run 144 total samples (and would exhaust all reagents in each of the three kits).
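The arithmetic behind this answer can be sketched as follows. It assumes that one library corresponds to one pooled run of up to 24 barcoded samples, which is how the figure of 144 is reached; the function names are illustrative only.

```python
import math

LIBRARIES_PER_LSK109 = 6   # libraries per ligation kit, from the answer above
BARCODES_PER_LIBRARY = 24  # native barcodes available across both barcoding kits


def max_samples_per_kit() -> int:
    """Samples covered by one LSK109 kit when every library is a full pool."""
    return LIBRARIES_PER_LSK109 * BARCODES_PER_LIBRARY


def libraries_needed(n_samples: int) -> int:
    """Number of pooled libraries needed for n_samples (24 samples per pool)."""
    return math.ceil(n_samples / BARCODES_PER_LIBRARY)


print(max_samples_per_kit())  # 6 libraries x 24 barcodes
print(libraries_needed(24))   # a single 24-sample run
```

A 24-sample run therefore consumes one of the six libraries in the kit, not four kits.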

15. Why is the initial genome of the virus not complete?

As we mentioned during the webinar, the ARTIC protocol went through a few protocol/primer iterations for optimisation of genome coverage. The current version of the protocol is V3.

16. Is the accuracy of sequencing using Oxford Nanopore better than with short reads? Do you have any data to show this?

We would like to stress that this protocol has been developed by the ARTIC network, which has extensive experience and expertise in deploying this technology in the sequencing and surveillance of outbreaks, including Zika and Ebola. By generating multiple reads at the same location in the SARS-CoV-2 genome, we achieve a high consensus accuracy, which is an important metric in the context of long reads.

17. Does a Flongle Flow Cell generate enough data when multiplexing, and is there an optimal number of adapters that should be used?

As mentioned during the presentation, the ARTIC protocol using Nanopore is adaptable to all of our devices and to all of our flow cells - you can use it with Flongle, MinION/GridION and PromethION Flow Cells. For all protocols and all flow cells you can use up to 24 native barcodes so for Flongle, yes, you will be able to use barcodes as well to achieve the desired yield. Just to give you an idea of how much you will need and what you will need to take into account when you choose one flow cell rather than the other: The COVID genome is roughly 30 kb; to get to a coverage of around 1000x you will need ~30 Mb. If you want to multiplex 24 barcodes, you will need around 800 Mb, so you can achieve these yields on all the flow cells. What will be different will be the time taken to achieve the yields; on a MinION Flow Cell, for example, you can get this yield in about 1.5 hours. With Flongle, because it has a smaller chip and fewer sequencing pores, you can achieve this yield in around 7-8 hours. So, the answer is yes, you can, it is just that the time taken to achieve the expected yield is different between the different types of flow cells.
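The yield estimate above is a simple product of coverage, genome size, and sample count. A minimal sketch of that calculation, using the approximate 30 kb genome size quoted in the answer (the function name is illustrative):

```python
GENOME_SIZE_BP = 30_000  # approximate SARS-CoV-2 genome length from the answer above


def required_yield_mb(target_coverage: float, n_samples: int,
                      genome_size_bp: int = GENOME_SIZE_BP) -> float:
    """Sequencing yield in Mb needed to reach the target mean coverage per sample."""
    return target_coverage * genome_size_bp * n_samples / 1_000_000


print(f"1 sample at 1000x:   {required_yield_mb(1000, 1):.0f} Mb")
print(f"24 samples at 1000x: {required_yield_mb(1000, 24):.0f} Mb")
```

The exact product for 24 samples is 720 Mb, which is in line with the rounded figure of around 800 Mb given above.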

18. Is there an equal number of primers in Pool A and Pool B?

Almost. They are not exactly equal but they differ by only one or two pairs. There are 18 individual primers altogether in two different pools, and that's for the V3 set which has been published by the ARTIC network.

19. How are Ct numbers usually determined?

In the qPCR assay (threshold cycle).

20. Is VolTRAX useful in this protocol, for library prep and PCR? Is there a specific protocol for that?

There is a team of people at Oxford Nanopore currently looking at how to do this on VolTRAX. There is nothing available at the moment, but this is another avenue of research that is being worked on.

21. What is the total time associated with this protocol (excluding the sequencing and sample prep/extraction time)?

Typically 5 hours.

22. Can we use a subset of primers, just for virus detection?

Yes, you may pick and choose a subset of primers to amplify the characteristic targets.

23. Do these primers work on any COVID-19 strain?

Yes, as far as we are aware.

24. What are the motor proteins that are being used?

We pre-load the motor proteins onto the sequencing adapter and that is a proprietary part of our sequencing technology.

25. Have you tried smaller PCR reaction volumes, say 12.5 ul rather than 25 ul?

We have not tried internally to adjust the reaction volumes. Whilst we do not think this would be an issue, optimisation of the protocol with lower volumes would be advisable.

26. Which qPCR is being considered for this Ct value parameter for dilution?

We do not recommend a certain qPCR protocol. The testing lab can use the method of their choice. As long as Ct values are recorded, this is the essential parameter to set up the dilution factor for the reverse transcription reaction.

27. Where can we find documentation for the Twist standard transcripts? I can't find them on the Twist Bio website.

You can find all the information in this link: https://www.twistbioscience.com/coronavirus-research-tools

28. What is the average length of the cDNA after RT?

As the sample contains both viral and host (human) RNA, it is very difficult to determine the exact size of the viral cDNA. Typically, when random hexamers are used for reverse transcription, you obtain products of different sizes. When the products are run on a gel you would see a smear rather than a clear solid band.

29. What is the percentage of genome coverage with the current protocol?

With the V3 version of the primers, we can achieve full coverage as issues with amplicon dropout have been resolved.

30. What is the turnaround time from library preparation to sequencing?

The total time for library preparation (not including extraction) is 7 hours. The end-to-end workflow includes reverse transcription, PCR, and library preparation with LSK109 and Native Barcoding kits. The sequencing time is dependent on how many samples are multiplexed on a flow cell, and which flow cell types are used. Typically, for a MinION Flow Cell you will require ~1.5 hours of sequencing to achieve an average of 1000x coverage per sample when 24 samples are multiplexed on a flow cell. For a Flongle Flow Cell, the same sequencing yield will be achieved in ~8 hours. For PromethION, this will be ~20 minutes.

31. Do you recommend ribosomal RNA depletion if we expect high human RNA to viral RNA ratio?

Not for the targeted PCR workflow, as we use primers to only amplify the viral genome.

32. Why did you change Blunt/TA ligase in the barcode ligation reaction?

We have changed to the one-pot protocol after barcode ligation so we use the Ligation module to increase ligation efficiency. With the one-pot protocol we do not perform SPRI bead clean up per individual barcoded sample, but pool all samples and clean the pool.

33. Where can we get the primers mentioned in the ARTIC protocol V3?

Any oligo supplier in your region should be able to supply them, but we recommend IDT as they offer the lab-ready version (liquid).

34. How do we check whether all the targets in both pools have been successfully amplified in the PCR?

As the amplicons are of very similar size, the Bioanalyzer trace will show one peak of the expected size.

35. Would this work without the ligation enhancer?

The latest V3 protocol uses a combined end-prep and ligation step (one pot protocol), which increases the efficiency of native barcode ligation and avoids a per-sample bead clean up. We highly recommend following the protocol for optimal results.

36. What quantity of RNA from swabs provides good sequencing results?

Prior to setting up the ARTIC protocol, a lab needs to perform qPCR to identify the presence of the viral genome and determine the viral load. Based on Ct values, we recommend adjusting the input RNA into the RT reaction accordingly. This information is captured in the protocol under the reverse transcription section.

37. Is it necessary to perform quality control of the RNA before carrying out reverse transcription?

It is important to set up qPCR prior to reverse transcription to determine the viral load. As the clinical research samples will have both host and viral RNA, you would not get a clear representation of the actual viral genome without this quantification step.

38. Are there any disadvantages in using Axygen™ AxyPrep Mag™ PCR Clean-up Kits (Catalog No.14-223-151) instead of the AMPure XP beads?

The protocol has been optimised and released using AMPure beads from Beckman Coulter, and we do not have data available regarding the comparability of this with alternative clean-up options from different suppliers. For any changes to the protocol, we advise a lab to perform optimisation to ensure comparable efficiency of the library preparation.

39. I have checked a few coronavirus genomes submitted to GISAID, and some of them contain long stretches of Ns, especially at the ends of the genome sequences. A couple of genomes even have Ns in the middle of the sequences. Do you know what might be the reason for this?

Firstly, the primers don't quite cover the very ends of the untranslated regions (UTRs) - around 20 bases either end of the genomic sequence. Secondly, the V1 and V2 primers supplied by the ARTIC network had some drop out which would be seen as Ns in the final consensus, but the V3 version is much more robust. See Figure 1 in the following document, which shows why Ns will be seen using earlier versions: https://artic.network/resources/ncov/ncov-amplicon-v3.pdf.
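To see where Ns fall in a given consensus sequence, the runs of N can be located with a short script. This is a minimal sketch using only the Python standard library; the FASTA record shown is a made-up toy example, not a real genome, and the helper names are illustrative.

```python
import re


def read_fasta(text: str) -> dict:
    """Parse FASTA-formatted text into {record_name: sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {key: "".join(parts) for key, parts in records.items()}


def n_runs(sequence: str) -> list:
    """Return 0-based, half-open (start, end) coordinates of each run of N."""
    return [(m.start(), m.end()) for m in re.finditer(r"[Nn]+", sequence)]


# Toy record for illustration only.
toy_fasta = ">toy_consensus\nNNNNACGTACGT\nNNACGTNNN\n"
for record, seq in read_fasta(toy_fasta).items():
    print(record, n_runs(seq))
```

Runs at the very start and end of a sequence correspond to the uncovered ends described above, while internal runs would point to amplicon dropout.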

40. Is there an approach to enrich/target the RNA of interest using beads or something similar?

Internally we have not tried such enrichment methods; these methods are not supported out of the box and would require significant research, development, optimisation, and testing.

41. For sample preparation, does this require phenol:chloroform extraction?

The webinar did not cover RNA extraction methods. The Nanopore ARTIC workflow includes all steps from reverse transcription to the generation of sequencing reads and data analysis methods. Extraction and qPCR protocols will vary between labs. Internally, we have not optimised an extraction protocol for SARS-CoV-2 viral RNA.

42. Can automation be used for RNA extraction and library preparation?

We are working internally on automation workflows for library preparation. All news and updates will be announced on the COVID-19 page in the Nanopore Community.

43. Has anyone synthesised the primers in their research centre?

Not as far as we know. All of our customers are using the primers developed by the ARTIC network. We encourage you, and others, to post in the Nanopore Community.

44. Will rapid library preparation work for this protocol?

The rapid chemistry uses a transposase which cleaves DNA randomly. This protocol is under active research, and further optimisation is needed. One thing to consider with rapid chemistry is that transposase fragments the library further, and fragmentation of short amplicons is less efficient than fragmentation of longer molecules.

45. What type of quantification is done before the sample pooling to ensure an equimolar quantity of each sample?

The Qubit fluorometer is used.

46. How long is extracted RNA stable for, prior to sequencing?

RNA degradation rates vary depending on storage conditions, the extraction method used, and freeze/thaw cycles. If the RNA is to be stored for any length of time, we recommend storing it at -80 °C and minimising freeze/thaw cycles. Proper RNA handling techniques should be followed to minimise the potential for nuclease contamination. We recommend visiting the "Nanopore know-how" section of the Community (https://community.nanoporetech.com/knowledge/know-how) for advice on nucleic acid storage and stability.

47. What is the loading volume required for the MinION and GridION?

All details about library loading volumes, and detailed instructions on library loading for the MinION/GridION/PromethION, are highlighted within the corresponding protocols. All protocols can be found in the Nanopore Community under the "Knowledge" section.

48. Is it correct that you need to obtain 100,000 reads per barcoded sample?

Nick Loman has provided a release note for the V3 protocol, stating that 100,000 reads are needed, per sample (per barcode), to achieve 50x minimum depth of coverage across all targets. Please see here for the release note: https://artic.network/resources/ncov/ncov-amplicon-v3.pdf.
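For a sense of scale, the mean depth implied by 100,000 reads can be estimated from the read count, read length, and genome size. This is a back-of-the-envelope sketch that assumes every read spans one ~400 bp amplicon and uses the approximate 30 kb genome size quoted elsewhere in this Q&A; it gives a mean only and says nothing about the minimum depth per amplicon.

```python
GENOME_SIZE_BP = 30_000   # approximate genome length (assumption from this Q&A)
AMPLICON_LENGTH_BP = 400  # approximate ARTIC amplicon length (assumption)


def mean_depth(n_reads: int,
               read_length_bp: int = AMPLICON_LENGTH_BP,
               genome_size_bp: int = GENOME_SIZE_BP) -> float:
    """Mean depth of coverage if reads were spread evenly over the genome."""
    return n_reads * read_length_bp / genome_size_bp


print(f"Mean depth from 100,000 reads: about {mean_depth(100_000):.0f}x")
```

The mean comes out well above 1000x, so the 50x figure is a floor for the least-covered target rather than a typical depth; the gap presumably reflects uneven amplification across amplicons.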

49. Can the flow cell used for SARS-CoV-2 sequencing be reused for other applications?

The flow cell can be reused, yes, but always ensure that you wash and store the flow cells following Oxford Nanopore guidelines. All such protocols can be found in the Nanopore Community.

Authors: Phillip James, Applications Manager - UK, Oxford Nanopore Technologies Ltd.