Transcription dynamics of the developing olive fruit fly embryo case study

The olive fruit fly (Bactrocera oleae) is one of the most important pests of cultivated olive trees, causing an estimated $800 million of damage per year. B. oleae is a diploid organism with a genome of >470 Mb across 6 pairs of chromosomes, including a pair of heterochromatic sex chromosomes, the male being the heterogametic sex. Sex determination is known to be initiated within the first 6 hours of embryonic development, yet the transcription dynamics of the developing embryo have not been explored and the male determining factor remains elusive.

To study the transcriptome of the developing olive fly embryo, Anthony Bayega and colleagues used the MinION to perform full-length cDNA sequencing of poly-A+ RNA from mixed-sex embryos, collected at hourly intervals over the first 6 hours of development (Figure 1). Oxford Nanopore’s long-read technology was utilised for its ability to sequence complete transcripts, with the aims of performing transcriptome analysis of the developing embryo and comparing transcriptome information to current gene models available from the National Center for Biotechnology Information (NCBI).

Figure 1: Schematic of experimental procedure, from sample collection to sequencing.

A total of 31 million reads was obtained by MinION sequencing on R9.4 flow cells (median 4.2 million reads per timepoint) using the SQK-PCS108 PCR-cDNA sequencing kit. Bayega has also been an early developer of the new SQK-PCS109 PCR-cDNA kit, which he has found to provide a simpler workflow and double the sequencing output, suggesting that future transcriptome studies could benefit from using the PCS109 kit.

To focus on full-length transcripts, reads containing a 5’ adapter, a poly-A tail and a 3’ adapter were selected. Over 50% of expressed genes had at least one such full-length read. Compared with short-read technology, this approach required forty-fold fewer reads to detect the same number of genes.
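The selection of full-length reads described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the adapter sequences, the minimum poly-A length and the function names are hypothetical, and exact string matching on a single strand is assumed for simplicity.

```python
# Hypothetical adapter sequences, chosen only for illustration.
# They are NOT the sequences used by any sequencing kit.
FIVE_PRIME_ADAPTER = "ACGTACGTAC"
THREE_PRIME_ADAPTER = "GTCAGTCAGT"
MIN_POLY_A = 10  # assumed minimum length of the poly-A run


def is_full_length(read: str) -> bool:
    """Return True if the read contains a 5' adapter, then a transcript
    ending in a poly-A run, then a 3' adapter (exact matches only)."""
    start = read.find(FIVE_PRIME_ADAPTER)
    if start == -1:
        return False
    insert_start = start + len(FIVE_PRIME_ADAPTER)
    end = read.rfind(THREE_PRIME_ADAPTER)
    if end == -1 or end < insert_start:
        return False
    # Sequence between the two adapters: transcript body plus poly-A tail.
    insert = read[insert_start:end]
    return insert.endswith("A" * MIN_POLY_A)


def select_full_length(reads):
    """Keep only the reads classified as full-length."""
    return [r for r in reads if is_full_length(r)]


if __name__ == "__main__":
    reads = [
        FIVE_PRIME_ADAPTER + "GATTACAGATTACA" + "A" * 12 + THREE_PRIME_ADAPTER,
        "GATTACA" + "A" * 12 + THREE_PRIME_ADAPTER,  # missing 5' adapter
        FIVE_PRIME_ADAPTER + "GATTACA" + THREE_PRIME_ADAPTER,  # no poly-A
    ]
    print(len(select_full_length(reads)), "of", len(reads), "reads kept")
```

Real nanopore reads contain sequencing errors and may come from either strand, so a practical implementation would need error-tolerant adapter matching and reverse-complement handling, both of which this sketch omits.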

De novo transcriptome assembly identified 3,553 novel genes, 8,330 genes matching the predicted NCBI genes, and a total of 79,810 transcripts. Overall, a four-fold increase in transcriptome diversity compared to the NCBI predicted transcriptome was achieved. Furthermore, 38 genes incorrectly modelled by NCBI were corrected with this dataset.

Absolute transcript numbers were determined by adding RNA standards (ERCC) during the cDNA synthesis stage; these were used to generate a standard curve for subsequent conversion of relative read counts to absolute counts of transcript abundance for each embryo. This quantitative approach was validated by qPCR measurement of gene expression.
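As a rough illustration of how a spike-in standard curve can convert read counts to absolute transcript numbers, the sketch below fits a straight line in log10-log10 space by ordinary least squares. The log-log linear model, the function names, the per-embryo division and the example numbers are all assumptions made for illustration; the source does not describe the exact fitting procedure used.

```python
import math


def fit_standard_curve(spike_reads, spike_molecules):
    """Fit log10(molecules) = slope * log10(reads) + intercept by least squares.

    spike_reads: read counts observed for each spike-in standard.
    spike_molecules: known number of molecules added for each standard.
    """
    if len(spike_reads) != len(spike_molecules) or len(spike_reads) < 2:
        raise ValueError("need at least two paired spike-in measurements")
    xs = [math.log10(r) for r in spike_reads]
    ys = [math.log10(m) for m in spike_molecules]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def reads_to_molecules(read_count, slope, intercept, n_embryos=1):
    """Convert a read count to an estimated absolute molecule count,
    divided by the (assumed known) number of embryos in the sample."""
    molecules = 10 ** (slope * math.log10(read_count) + intercept)
    return molecules / n_embryos


if __name__ == "__main__":
    # Made-up spike-in data, for illustration only.
    reads = [10, 100, 1000, 10000]
    molecules = [1e3, 1e4, 1e5, 1e6]
    slope, intercept = fit_standard_curve(reads, molecules)
    estimate = reads_to_molecules(500, slope, intercept, n_embryos=10)
    print(f"slope={slope:.3f} intercept={intercept:.3f} estimate={estimate:.0f}")
```

Fitting in log space is a common choice because spike-in concentrations typically span several orders of magnitude, but other models could equally be used.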

Quantification of absolute transcript numbers was used to explore the maternal-to-zygotic transition, a process that has not been studied in B. oleae before but has been investigated in the closely related dipteran Drosophila melanogaster. During this transition, a large proportion of maternal transcripts and proteins in the developing embryo are cleared and zygotic transcription is induced. In the Drosophila embryo, maternal transcripts are destabilised throughout the first 3 hours of development and zygotic transcription is activated at 2 hours. Similarly, Bayega et al. found that the total number of transcripts per embryo halved between the 1st and 2nd hour of development; it then increased by 143% at 3 hours compared to the 2-hour level.

Using long-read sequences obtained from adult male and female olive fly heads, the isoform complexity of genes involved in sex determination was further explored. A wide range of alternative splicing events could be seen for genes such as doublesex (dsx) and transformer (tra), two master regulators of sex determination. The isoform complexity of dsx was significantly different between embryonic and adult tissues; a longer isoform was most prominent during early embryonic development, shifting to a prevalence of shorter isoforms in the adult brain. In particular, the inclusion of exon 4 was observed only in adult female tissue.

Understanding the mechanisms behind sex determination is important as it might reveal opportunities for genetic control. For example, CRISPR-Cas9 targeting of dsx in Anopheles gambiae mosquitoes has achieved population suppression in the laboratory. A similar method could be used in B. oleae gametes to target exon 4 of dsx, an exon that is absent in males, for genetic control of the olive fly.

This research provides the first insight into the complexity of the olive fly embryo transcriptome and demonstrates how a greater understanding of sex determination has the potential to achieve population control of an agricultural pest of great economic impact.

Find out more about this work during Anthony Bayega's webinar on 14th Feb, 5pm (GMT): 'Transcriptome of an Agricultural Pest Delineated by Oxford Nanopore RNA-Seq'