Dynamic Transcriptome Profiling Dataset of Vaccinia Virus Obtained from Long-read Sequencing Techniques

Background
Poxviruses are large DNA viruses that infect humans and animals. Vaccinia virus (VACV) has been used as a live vaccine against smallpox, which was eradicated by 1980 as a result of worldwide vaccination. VACV also serves as the prototype poxvirus in studies of molecular pathogenesis. Short-read sequencing methods have revolutionized transcriptomics; however, they are inefficient at distinguishing between RNA isoforms and overlapping transcripts. Long-read sequencing (LRS) is much better suited to solving these problems and also allows direct RNA sequencing. Despite the scientific relevance of VACV, no LRS data have been generated for the viral transcriptome so far.

Findings
For the deep characterization of the VACV RNA profile, we applied various LRS platforms and library preparation approaches. The raw reads were mapped to the VACV reference genome and to the host (Chlorocebus sabaeus) genome. The Pacific Biosciences RSII and Sequel platforms together yielded 937,531 mapped reads of inserts (1.42 Gb), while the MinION device from Oxford Nanopore Technologies yielded 2,160,348 aligned reads (1.75 Gb) across the different library preparation methods.

Conclusions
By applying cutting-edge technologies, we generated a large dataset that can serve as a valuable resource for investigating the dynamic VACV transcriptome, virus-host interactions and RNA base modifications. These data can provide useful information for novel gene annotations in the VACV genome. Our dataset can also be used to evaluate the currently available LRS platforms, library preparation methods and bioinformatics pipelines.

Authors: Dóra Tombácz, István Prazsák, Attila Szűcs, Béla Dénes, Michael Snyder, Zsolt Boldogkői