Tapestry: assessing small eukaryotic genome assemblies with long-reads

John Davey from the University of York began by introducing the York Genomics and Bioinformatics team and some of the projects in which they have used nanopore sequencing to aid genome assembly. These projects have varied enormously in genome size, from the 2 Mb Sulfolobus genome to the large tetraploid rhubarb genome of 7.6 Gb. John went on to explain that long nanopore reads are helping to produce a number of near-complete assemblies of small eukaryotic genomes (10-50 Mb), but that problems can arise when trying to validate those assemblies.

Genomes can contain complex features such as duplications, translocations, large inversions, and ploidy variations, which can make both assembly and assembly validation difficult. John introduced a tool called Tapestry, which can be used to visually validate small (<50 Mb), almost complete (<100 contigs) eukaryotic genome assemblies. Tapestry takes three inputs: a genome assembly, reads in FASTQ format, and the telomere sequence. It aligns the reads and the contigs to the assembly, then generates summary statistics and an HTML report.
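As a rough illustration of the kind of pre-check and summary described above, the following minimal Python sketch reads an assembly in FASTA format, tests it against the size and contig limits quoted in the talk (<50 Mb, <100 contigs), and counts copies of a telomere repeat near each contig end. This is not Tapestry's code or interface: the function names, the 1,000 bp end window, and the example telomere repeat TTAGGG are all assumptions chosen for illustration.

```python
import io
from typing import Dict, Iterator, TextIO, Tuple

# Complement table for reverse-complementing the telomere repeat.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (contig name, sequence) pairs from a FASTA file handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            parts = line[1:].split()
            name, chunks = (parts[0] if parts else ""), []
        elif name is not None:
            chunks.append(line.upper())
    if name is not None:
        yield name, "".join(chunks)


def telomere_counts(seq: str, telomere: str, window: int = 1000) -> Tuple[int, int]:
    """Count telomere repeats (either strand) in the first and last `window` bases."""
    repeat = telomere.upper()
    motifs = {repeat, reverse_complement(repeat)}
    start, end = seq[:window], seq[-window:]
    return (sum(start.count(m) for m in motifs),
            sum(end.count(m) for m in motifs))


def summarise(handle: TextIO, telomere: str, window: int = 1000,
              max_size: int = 50_000_000, max_contigs: int = 100) -> Dict:
    """Summarise an assembly; the limits are the ones quoted in the talk."""
    contigs = {}
    total = 0
    for name, seq in read_fasta(handle):
        total += len(seq)
        contigs[name] = {"length": len(seq),
                         "telomeres": telomere_counts(seq, telomere, window)}
    return {"contigs": contigs,
            "n_contigs": len(contigs),
            "total_bp": total,
            "small_enough": total < max_size,
            "few_enough_contigs": len(contigs) < max_contigs}


if __name__ == "__main__":
    # Toy assembly; TTAGGG is only a placeholder telomere repeat.
    toy = ">c1\nTTAGGGTTAGGGACGTACGT\nCCCTAACCCTAA\n>c2\nACGT\n"
    print(summarise(io.StringIO(toy), "TTAGGG", window=12))
```

A contig with repeats counted at both ends is a candidate for a complete, telomere-to-telomere chromosome, which is one reason the telomere sequence is a useful input for validation.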

John gave a demo of Tapestry on the Angomonas deanei genome, walking the audience through removing junk contigs, identifying haplotype contigs, and exporting the final assembly in FASTA format. Wrapping up his talk, John said that "nanopore sequencing is making it possible to discover cool new features of the genome we were not able to see previously".

Authors: John Davey