Efficient de novo assembly of telomere-to-centromere human genomes
A 50% improvement in NG50 for the nanopore human genome assembly was achieved with the current Shasta v0.4 compared with the original Shasta version, from ~20 Mb to ~30 Mb; with ultra-long reads this almost doubled to ~58 Mb.
Human genome assembly time was reduced to ~3 h with the current Guppy basecaller and Shasta v0.4, compared with the 6 h originally described in their Nature Biotechnology publication.
Benedict: “with Shasta and PromethION sequencing, we think that we are achieving efficient, cost-effective, highly contiguous de novo assembly, and making that a practical reality”.
With ultra-long nanopore sequencing, telomere-to-centromere chromosome arm assembly is possible for the majority of chromosome arms.
With their diplotyping pipeline, SNV calling performance ‘was actually better than [on] short-read [data]...which is really exciting’.
Benedict: ‘in regions that are defined as low mappability, we clearly have an advantage’; short-read data maps less well than nanopore data in these regions, which explains the poorer short-read SNV calling.
This is the first demonstration that long-read diplotyping can outperform short-read genotyping.
Introducing the Human Pangenome project, Benedict explained: ‘Genomics is failing on diversity – we need to increase the number of complete genomes that we have from a diversity of different human populations, to more fully understand our genetic heritage’.
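For readers unfamiliar with the NG50 metric quoted above: NG50 is the length of the contig at which the cumulative length of contigs, sorted longest first, first reaches half of the estimated genome size (N50 uses half of the total assembly length instead). The following is a minimal sketch of that definition only; the function names and the toy contig lengths are illustrative assumptions, not part of Shasta and not data from the talk.

```python
from typing import Iterable, Optional


def nx50(contig_lengths: Iterable[int], reference_size: int) -> Optional[int]:
    """Return the contig length at which the cumulative length of contigs,
    sorted longest first, reaches half of reference_size.
    Returns None if the contigs never cover half of reference_size."""
    cumulative = 0
    for length in sorted(contig_lengths, reverse=True):
        cumulative += length
        # Integer comparison avoids floating-point rounding on the half-size.
        if 2 * cumulative >= reference_size:
            return length
    return None


def ng50(contig_lengths: Iterable[int], genome_size: int) -> Optional[int]:
    """NG50: threshold is half of the estimated genome size."""
    return nx50(contig_lengths, genome_size)


def n50(contig_lengths: Iterable[int]) -> Optional[int]:
    """N50: threshold is half of the total assembly length."""
    lengths = list(contig_lengths)
    return nx50(lengths, sum(lengths))


# Toy example (hypothetical lengths in Mb, not real assembly data).
toy_contigs = [50, 40, 30, 20, 10]
toy_genome_size = 200

toy_ng50 = ng50(toy_contigs, toy_genome_size)  # cumulative 50, 90, 120 -> 30
toy_n50 = n50(toy_contigs)                     # total 150; cumulative 50, 90 -> 40
```

Because NG50 is normalised to the genome size and not to the assembly size, it allows assemblies of the same genome to be compared directly, which is why it suits version-to-version comparisons like the one reported here.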