Anthony Bayega: Transcriptome landscape of the developing olive fruit fly embryo delineated by Oxford Nanopore long-read RNA-Seq


The olive fly is an under-studied non-model organism, yet it is one of the most important pests of cultivated and wild olive fruits, costing over 200 million dollars in crop loss. It is a diploid organism with a genome of 450 Mb and 6 chromosomes, and a number of its developmental mechanisms are characterised by dramatic transcriptional changes over short time periods. Anthony explained how sex determination occurs within the first 6 hours of embryo development and is mediated through alternative splicing events; however, the male-determining factor remains elusive.

Anthony stated that his aims were to use nanopore long-read technology to identify complete transcripts, perform de novo transcriptome assembly, and compare the results with the current gene models predicted by NCBI. His final aim was to delineate the temporal transcript kinetics that occur in the first two hours of development.

Using a time-series experimental design, total RNA was extracted from olive fly embryos at 1 – 6 hours post oviposition and reverse transcribed using a strand-switching approach. With the LSK-108 ligation sequencing kit, over 31 million cDNA reads were generated across the entire experiment. Compared with cDNA sequences generated by short-read technology, 40-fold fewer ONT reads and 7.9-fold fewer bases were required to detect the same number of genes. The two technologies agreed in terms of gene counts, with a Spearman’s rho of 0.739, and the majority of transcripts spanned the 5’ – 3’ ends of the RNA references used.
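To illustrate the kind of agreement measure quoted here, the following is a minimal sketch of Spearman's rho computed on per-gene counts from two platforms. The function names and the toy counts are hypothetical and are not taken from the talk; the talk does not say which software was used for this comparison.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-gene read counts for the same five genes on each platform.
ont_counts = [120, 5, 40, 0, 300]
short_read_counts = [900, 60, 50, 10, 2500]
rho = spearman_rho(ont_counts, short_read_counts)
```

Because the statistic depends only on ranks, it tolerates the very different sequencing depths of the two platforms, which is presumably why a rank correlation was reported.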

Anthony went on to show how the long reads generated in this experiment could be used to correct mis-annotated genes, and that direct absolute normalisation of the RNA data using ERCC spike-ins outperforms the relative normalisation techniques commonly used in this type of experiment. Furthermore, when the mRNA concentration was calculated from sequence data over the first two hours, the theoretical concentration halved. This was validated by qPCR, suggesting that patterns in the sequencing data reflected the actual abundance of mRNA transcripts.
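One common way to perform absolute normalisation with spike-ins is to fit a calibration line between the known spike-in concentrations and their observed read counts, then read gene concentrations off that line. The sketch below assumes a log-log linear fit; the talk does not describe the exact procedure used, and the function names, concentrations, and counts are all hypothetical.

```python
import math


def fit_spike_in_calibration(known_conc, observed_counts):
    """Least-squares fit of log10(concentration) against log10(read count).

    Spike-ins with zero reads are skipped because their log is undefined.
    Returns (slope, intercept) of the fitted line.
    """
    pairs = [(math.log10(c), math.log10(k))
             for k, c in zip(known_conc, observed_counts) if c > 0 and k > 0]
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    slope = (sum((x - mx) * (y - my) for x, y in pairs)
             / sum((x - mx) ** 2 for x, _ in pairs))
    intercept = my - slope * mx
    return slope, intercept


def estimate_concentration(read_count, slope, intercept):
    """Convert a gene's read count into an absolute concentration estimate."""
    return 10 ** (slope * math.log10(read_count) + intercept)


# Hypothetical spike-in mix: known input amounts and the reads they yielded.
spike_conc = [1.0, 10.0, 100.0, 1000.0]
spike_counts = [2, 20, 200, 2000]
slope, intercept = fit_spike_in_calibration(spike_conc, spike_counts)

# A gene with 50 reads, expressed in the same units as the spike-in amounts.
gene_conc = estimate_concentration(50, slope, intercept)
```

Unlike relative measures, an estimate of this kind can fall or rise for every transcript at once, which is what allows a global halving of mRNA concentration to be detected.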

Next, Anthony showed that transcriptional expression profiles for each time point over the first 6 hours of development could be compared in a pairwise fashion, and that those closer in time showed a higher correlation than those at distant time points, identifying clear transitions in the developmental processes taking place. To explore temporal patterns in greater detail, transcripts were clustered using absolute expression profiles over the time course. Specific clusters containing genes used in early development could be seen decreasing in abundance over the course of the experiment, while those involved in late developmental processes were upregulated at later time points.
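The pairwise comparison of time points can be sketched as a correlation matrix over per-gene expression vectors. This is only an illustration: the choice of Pearson correlation, the time-point labels, and the expression values are assumptions, since the talk does not state which correlation measure was used.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def pairwise_correlation(profiles):
    """Correlate every pair of time points.

    `profiles` maps a time-point label to a list of expression values,
    one per gene, with genes in the same order for every time point.
    """
    labels = list(profiles)
    return {(a, b): pearson(profiles[a], profiles[b])
            for a in labels for b in labels}


# Hypothetical expression of four genes: two early genes that decline
# and two late genes that rise over the time course.
profiles = {
    "1h": [10, 8, 1, 1],
    "2h": [8, 7, 2, 3],
    "6h": [1, 2, 9, 10],
}
corr = pairwise_correlation(profiles)
```

In this toy data the 1h profile correlates more strongly with 2h than with 6h, mirroring the pattern described for neighbouring versus distant time points.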

Moving on to the last section of his talk, Anthony described how long-read nanopore sequencing was used to improve the annotation of the doublesex gene over that already generated by short-read sequencing. To highlight why understanding the mechanisms of sex determination in this organism is important, Anthony showed how CRISPR-Cas9 targeting of the doublesex gene in caged mosquitoes achieved complete population suppression, suggesting that this gene is a promising target for population suppression in the olive fly. As a final highlight slide, Anthony mentioned that he has been an early developer of the up-and-coming PCS-109 PCR cDNA sequencing kit, showing that this kit provides a simpler workflow and has doubled the throughput of their sequencing runs.