The tip of the iceberg — Sequencing the lettuce genome

Scientists at KeyGene in the Netherlands are at the forefront of crop innovation. Together with their partners, they work on a variety of economically important crops, including vegetables, field crops and flowers. A significant focus is crop improvement through breeding for traits such as pathogen resistance, longer shelf life, improved taste and colour, and ease of packaging, as well as, more recently, the development of crop varieties that can grow under LED lighting for urban farming. One example is the successful breeding of aphid-resistant lettuce varieties, which has allowed a significant reduction in the use of pesticides [1].

Lettuce (Lactuca sativa) is an important crop, with 74 billion harvested globally every year. To further enhance lettuce-breeding efforts, the team at KeyGene performed whole-genome sequencing of two lettuce lines using the high-yield, high-throughput PromethION platform [1]. Genome coverage of 100x was obtained for both lines, using just 7 and 9 flow cells respectively.*
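To put the coverage figure in context, the short Python sketch below works through the arithmetic. It assumes that coverage is simply total sequenced bases divided by genome size, uses the 2.56 Gb genome size quoted later in this case study, and treats coverage as exactly 100x (the study reports more than 100x, so the results are lower bounds). The function names are illustrative only and do not come from the study.

```python
# Rough yield estimate, assuming coverage = total sequenced bases / genome size.
GENOME_SIZE_GB = 2.56    # lettuce genome size quoted in this case study
TARGET_COVERAGE = 100    # assumed to be exactly 100x; the study reports >100x


def required_yield_gb(genome_size_gb, coverage):
    """Total sequence data (Gb) needed to reach the given fold coverage."""
    return genome_size_gb * coverage


def mean_yield_per_flow_cell(total_gb, n_flow_cells):
    """Average data (Gb) each flow cell must deliver to reach the total."""
    return total_gb / n_flow_cells


total_gb = required_yield_gb(GENOME_SIZE_GB, TARGET_COVERAGE)
print(f"Total data needed: ~{total_gb:.0f} Gb")
for n_flow_cells in (7, 9):  # flow cell counts reported for the two lines
    per_cell = mean_yield_per_flow_cell(total_gb, n_flow_cells)
    print(f"{n_flow_cells} flow cells: ~{per_cell:.1f} Gb per flow cell")
```

Under these assumptions, each line needed roughly 256 Gb of data, which works out to an average of about 37 Gb per flow cell for the line run on 7 flow cells and about 28 Gb for the line run on 9.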

A de novo assembly approach using minimap2 and miniasm was applied to a subset of the data (40x coverage) for one of the lettuce lines in order to rapidly assess the structural integrity of the genome. The team reported that the de novo assembly comprised 1,169 contigs, with a contig N50 of 7.3 Mb [1], and covered 2.6 Gb, representing nearly the entire lettuce genome. Dr. Alexander Wittenberg, one of the lead scientists in this study, stated that these results compare extremely favourably with the recently published reference genome, which was produced using short-read sequencing technology together with a complex scaffolding approach and resulted in 21,116 contigs and an assembled genome size of 2.21 Gb [1]. He further pointed out that, in contrast to the reference assembly, which took several years to produce, the nanopore assembly was obtained within two months of sample receipt [1].
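For readers unfamiliar with the N50 statistic used above, the following is a minimal Python sketch of how it is commonly computed: contigs are sorted from longest to shortest, and the N50 is the length of the contig at which the running total first reaches half of the total assembly length. The contig lengths in the example are hypothetical toy values, not data from the KeyGene assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or scaffold) lengths.

    The N50 is the length L such that contigs of length >= L together
    contain at least half of all assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths in Mb, for illustration only.
toy_contigs = [8, 5, 4, 2, 1]
print(n50(toy_contigs))
```

A higher N50 for a similar total assembly size means that the genome is represented by fewer, longer pieces, which is why the statistic is used here to compare the contiguity of the nanopore and short-read assemblies.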

To validate their nanopore sequencing results, the team at KeyGene performed optical mapping of the lettuce genome. The largest nanopore contig, at 32 Mb, was shown to align perfectly with the optical map; however, alignment against the short-read reference revealed significant structural differences, indicating likely mistakes in the public reference. Combining the nanopore and optical mapping data allowed the generation of a near chromosome-level assembly of just 34 scaffolds, with a scaffold N50 of 146 Mb [1].

Commenting on this research, Dr. Wittenberg stated:

‘The PromethION is a real game changer, combining ultra-long reads with high sequence output for the production of contiguous, high-quality reference genomes. Using this platform, we sequenced the 2.56 Gb lettuce genome at >100x coverage using just a few flow cells’ [1].

The team now plan to analyse the full 100x coverage data set for both of their lettuce lines in order to support breeding efforts for this important crop.

Figure: Lettuce genome alignments indicate potential incorrect orientation of contigs in the short-read reference assembly. (a) The optical map and nanopore genome assembly show complete concordance while (b) significant structural differences are found between the optical map and short-read reference assembly. Figure courtesy of Dr. Alexander Wittenberg, KeyGene, Netherlands.

* Flow cell yields are expected to increase further, with researchers currently (summer 2018) reporting yields in excess of 100 Gb per PromethION flow cell [2], while internal results at Oxford Nanopore have demonstrated close to 200 Gb of data per flow cell.

This case study was taken from the plant white paper.

  1. Wittenberg, A. PromethION sequencing of complex plant genomes. Presentation. Available at: talk/promethion-sequencing-complex-plant-genomes. [Accessed: 14 June 2018]
  2. Albertsen Lab. >100 Gb on 1 flowcell outside ONT. Online. Available at: [Accessed: 02 July 2018]