Thidathip Wongsurawat: Nanopore goes viral (RNA)

Spotlight session

Thidathip Wongsurawat, of the University of Arkansas for Medical Sciences, presented her work using direct RNA sequencing on the MinION device to capture multiple layers of genetic information for mixed populations of RNA viruses.

Tip described how RNA viruses are the major cause of emerging diseases in humans, with ssRNA viruses responsible for diseases including Ebola hemorrhagic fever, dengue fever and influenza. ssRNA viruses are “very diverse at both the structure and sequence level”: their genomes can be segmented or non-segmented, polyA-tailed or non-polyA-tailed, and positive- or negative-stranded. Subgenomic RNA also plays an important functional role. Tip outlined her team’s goal: to “sequence everything in one run.” With direct RNA nanopore sequencing, full-length, strand-specific transcripts can be sequenced whilst preserving modifications; Tip highlighted how, in short-read sequencing, the requirement for reverse transcription and amplification can lead to bias and artefacts. She noted that Matthew Keller et al. (Scientific Reports volume 8, Article number: 14408 (2018)) had successfully sequenced the coding-complete influenza A virus genome to 100% coverage; however, this required designing probes to capture conserved RNA termini. Sequencing a mixed, unknown population of RNA virus transcriptomes and subgenomes needs an approach that does not require prior knowledge of the sequences present.

First, the team optimised their protocol, starting by testing whether they could sequence RNA directly without the need for reverse transcription. They prepared Mayaro virus (MAYV) and Venezuelan equine encephalitis virus (VEEV) samples for sequencing with the Direct RNA Sequencing Kit (SQK-RNA001) without a reverse transcription step: this generated 100% coverage for each virus genome, with the longest reads for each virus spanning nearly the full length of the genome. However, when they sequenced the two viruses in the same run, pooled in equal amounts (ng), they found that 6% of data mapped to MAYV and 81% to VEEV. They then tested whether a polyA tail could be enzymatically added to non-polyA genomic RNA: they developed a modified protocol using NEB E. coli poly(A) polymerase, which was used to polyA-tail a sample of the bacteriophage MS2 that was then sequenced to 100% coverage. However, this approach can also target host rRNA. Depletion of host rRNA with the Ribo-Zero rRNA Removal Kit did not reduce the presence of host RNA when sequencing Zika, so it was not included in the protocol.
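The "percentage of data mapped" figures quoted for the pooled run are per-reference shares of all reads. As an illustration only, a minimal Python sketch of that arithmetic follows. It assumes each read has already been assigned to a reference name (or to None if unmapped) by some aligner. The function name `mapped_fractions` and the toy labels `virus_A` and `virus_B` are hypothetical, not taken from the talk, and the toy numbers are not the team's data.

```python
from collections import Counter


def mapped_fractions(assignments):
    """Return the fraction of all reads assigned to each reference.

    `assignments` holds one entry per read: the name of the reference
    the read mapped to, or None for an unmapped read. Unmapped reads
    count towards the total but get no entry in the result.
    """
    counts = Counter(assignments)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {ref: n / total for ref, n in counts.items() if ref is not None}


# Toy input for illustration only (hypothetical labels, not data from the talk)
toy_assignments = ["virus_A"] * 3 + ["virus_B"] * 5 + [None] * 2
fractions = mapped_fractions(toy_assignments)
for ref, frac in sorted(fractions.items()):
    print(f"{ref}: {frac:.0%} of reads")
```

Dividing by the total read count, rather than by mapped reads only, matches figures that need not sum to 100%, as with the 6% and 81% reported above.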

Tip and her team used their optimised protocol to prepare a pool of six diverse ssRNA viruses, including positive- and negative-stranded, polyA- and non-polyA-tailed RNAs from a wide range of families, and sequenced it on the MinION device. She showed how reads belonging to all six viruses were identified within two minutes; complete genome coverage was achieved for each virus in two hours. Due to the method of polyA tail addition, 72% of data mapped to host rRNA: Tip plans to investigate alternative methods of polyA tailing next. She also demonstrated how the stranded nature of direct RNA sequencing enabled separation of data into transcriptomic mRNA and genomic RNA.

Tip concluded that this method enabled the capture of several layers of genetic information from multiple RNA viruses in parallel, eliminating the complexity of protocols and allowing for rapid sequencing.