Redefining the transcriptional complexity of viral pathogens using direct RNA sequencing
Daniel Depledge (New York University School of Medicine) spoke about how nanopore Direct RNA sequencing is enabling new insights into herpes viruses. Eight herpes viruses are known to infect humans; between them they are responsible for conditions ranging from mild to severe, including cold sores, chickenpox, glandular fever, cancer, blindness and encephalitis. Around 65% of people are infected with the herpes virus HSV-1, and Daniel showed how, in different parts of the world, the infection rate is increasing significantly. HSV-1 has a comparatively small 152 kb dsDNA genome, yet it encodes over 100 polyA RNAs and over 80 proteins. The virus primarily infects and replicates in epithelial cells, but can also latently infect neurons; in some cases, latent HSV-1 in neurons can reactivate. Daniel noted that HSV-1 has multiple ways of evading the immune system.
Daniel then displayed a Circos plot of an HSV-1 genome, showing that herpes virus genomes are "incredibly compact"; he pointed out the known protein-coding genes, but noted that it is hard to know whether we are seeing everything that is going on: "spoiler alert - yes, there's a lot more going on that's not been seen before." To demonstrate the challenges of decoding the genome of HSV-1, he showed three transcripts that have different start sites but share the same conserved RNA cleavage site, and that each encode different products. Complex splicing patterns, overlapping reading frames and other features make analysis difficult. "This is where MinION has been revolutionary, from our point of view", Daniel added.
As a sci-fi fan, Daniel introduced the MinION with a quote from Arthur C. Clarke: "any sufficiently advanced technology is indistinguishable from magic." RNA from HSV-1 was prepared for sequencing on the MinION device using the Direct RNA Sequencing Kit. Daniel compared the long, native RNA nanopore reads with the results of a short-read technology in which RNA was converted to cDNA. The comparison demonstrated the high-resolution capture of the herpesviral transcriptome: he said that transcript structures could be distinguished more easily and more clearly than with short-read data. From this data, transcription start sites (TSS) and cleavage and polyadenylation sites (CPAS) could be estimated "very easily" by mapping reads and constructing pile-ups; among the sites identified, he highlighted a peak corresponding to a new TSS. Daniel noted that one of the most exciting features of nanopore RNA sequencing was that "1 read = 1 RNA."
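The pile-up idea can be illustrated with a minimal Python sketch. It is not the pipeline used in the talk: it assumes reads have already been mapped and reduced to hypothetical `(start, end, strand)` tuples (0-based, end-exclusive), and the function names, the `window` of 10 nt and the `min_reads` threshold of 3 are illustrative choices. Because each read represents one RNA, the 5' end of a read approximates a TSS and the 3' end approximates a CPAS; in practice read ends scatter by a few nucleotides, which is why nearby positions are merged.

```python
from collections import Counter

def end_pileups(alignments):
    """Count read 5' ends (TSS candidates) and 3' ends (CPAS candidates).

    alignments: iterable of (start, end, strand), 0-based, end-exclusive.
    Returns two Counters keyed by (strand, position).
    """
    tss, cpas = Counter(), Counter()
    for start, end, strand in alignments:
        if strand == "+":
            five_prime, three_prime = start, end - 1
        else:  # on the minus strand the RNA 5' end is the rightmost base
            five_prime, three_prime = end - 1, start
        tss[(strand, five_prime)] += 1
        cpas[(strand, three_prime)] += 1
    return tss, cpas

def call_peaks(pileup, window=10, min_reads=3):
    """Greedy peak calling: repeatedly take the best-supported position,
    merge all same-strand positions within +/- window, and keep the
    cluster if its total read support reaches min_reads."""
    remaining = dict(pileup)
    peaks = []
    while remaining:
        strand, pos = max(remaining, key=lambda k: (remaining[k], -k[1]))
        members = [k for k in remaining
                   if k[0] == strand and abs(k[1] - pos) <= window]
        support = sum(remaining[k] for k in members)
        for k in members:
            del remaining[k]
        if support >= min_reads:
            peaks.append((strand, pos, support))
    return sorted(peaks)

# Toy alignments (made-up coordinates, not HSV-1 data)
reads = [
    (100, 500, "+"), (101, 500, "+"), (100, 498, "+"), (100, 300, "+"),
    (900, 1200, "-"), (900, 1199, "-"), (901, 1200, "-"),
    (2000, 2100, "+"),
]
tss, cpas = end_pileups(reads)
tss_peaks = call_peaks(tss)
cpas_peaks = call_peaks(cpas)
print(tss_peaks)
print(cpas_peaks)
```

On real data the tuples would come from an alignment file, and the thresholds would need tuning to sequencing depth; the lone read at position 2000 shows how a single unsupported end is filtered out.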
Daniel then explored the use of short-read sequencing data for error correction, using Proovread. This was used to increase the aligned portion of the read, rescue unmapped (spliced) reads and improve the detection fidelity of open reading frames. However, he pointed out that the method only works if the sequences come from the same RNA as the reference genome, and may lead to some clipping of reads. He demonstrated that the error-corrected data produced good ORF predictions from 90-95% of sequences.
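To show why error correction affects ORF detection, here is a minimal Python sketch of a naive ORF scan. The summary does not name the ORF prediction tool used in the talk, so this is an assumed stand-in: the function `longest_orf`, its `min_codons` parameter and the toy sequences are hypothetical. It scans the three forward frames only, on the assumption that a direct RNA read is already in the sense orientation.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def longest_orf(seq, min_codons=30):
    """Return (start, end, n_codons) of the longest ATG..stop ORF in the
    three forward frames, or None if no ORF has at least min_codons codons.
    start is 0-based, end is exclusive and includes the stop codon;
    n_codons excludes the stop codon."""
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA alphabet
    best = None
    for frame in range(3):
        orf_start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if orf_start is None and codon == "ATG":
                orf_start = i
            elif orf_start is not None and codon in STOP_CODONS:
                n_codons = (i - orf_start) // 3
                if n_codons >= min_codons and (best is None or n_codons > best[2]):
                    best = (orf_start, i + 3, n_codons)
                orf_start = None
    return best

# Toy read: 2 nt leader, ATG, five GCT codons, stop, 2 nt trailer
clean = "CC" + "ATG" + "GCT" * 5 + "TAA" + "GG"
# The same read with a single-base deletion inside the ORF
with_deletion = clean[:6] + clean[7:]

orf_clean = longest_orf(clean, min_codons=3)
orf_deleted = longest_orf(with_deletion, min_codons=3)
print(orf_clean, orf_deleted)
```

A single deleted base shifts the reading frame so the stop codon is no longer in frame and the scan finds nothing, which is the kind of failure that correcting indel errors before ORF prediction is meant to avoid.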
Daniel then described how the long-read RNA data is helping to redefine transcriptional complexity for HSV-1. He showed data for an ICP0-gL fusion mRNA, which gives rise to a fusion protein. Daniel noted that ICP0 is an important transcript, and that the fusion transcript seemed to accumulate in the later stages of infection, when cells are "going to pieces": at this point, read-through transcription occurs, resulting in the fusion of the two transcripts and the production of a new protein. This protein was subsequently detected by Western blot. As the transcript is seen to increase in abundance during infection, Daniel and his team are keen to further characterise this product.
Daniel concluded with reasons to use Oxford Nanopore direct RNA sequencing. He highlighted that direct RNA sequencing needs no recoding or amplification, which increases its fidelity and makes it "incredibly reproducible." He also noted that the long reads enable the mapping of fine detail, even across complex loci, and the analysis of alternative splicing, TSS, RNA cleavage sites, polyA tail length and RNA modifications, all of which can be further improved with the careful use of error correction. In his view, the data "changes how we think about and investigate viral biology."