A highly contiguous genome for the Golden-fronted Woodpecker (Melanerpes aurifrons) via a hybrid Oxford Nanopore and short read assembly

Background
Woodpeckers are found in nearly every part of the world, absent only from Antarctica, Australasia, and Madagascar. Woodpeckers have been important for studies of biogeography, phylogeography, and macroecology. Woodpecker hybrid zones are often studied to understand the dynamics of introgression between bird species. Notably, woodpeckers are gaining attention for their enriched levels of transposable elements (TEs) relative to most other birds. This enrichment of TEs may have substantial effects on woodpecker molecular evolution. The Golden-fronted Woodpecker (Melanerpes aurifrons) is a member of the largest radiation of New World woodpeckers. However, comparative studies of woodpecker genomes are hindered by the fact that no high-contiguity genome exists for any woodpecker species.

Findings
Using hybrid assembly methods that combine long-read Oxford Nanopore and short-read Illumina sequencing data, we generated a highly contiguous genome assembly for the Golden-fronted Woodpecker. The final assembly is 1.31 Gb and comprises 441 contigs plus a full mitochondrial genome. Half of the assembly is represented by 28 contigs (contig L50), each of which is at least 16 Mb in size (contig N50). High recovery (92.6%) of bird-specific BUSCO genes suggests our assembly is both relatively complete and relatively accurate. Accuracy is also demonstrated by the recovery of a putatively error-free mitochondrial genome. Over a quarter (25.8%) of the genome consists of repetitive elements, with 287 Mb (21.9% of the genome) assignable to the CR1 superfamily of transposable elements, the highest proportion of CR1 repeats reported for any bird genome to date.
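
For readers unfamiliar with contiguity statistics, the following is a minimal sketch of how they are commonly computed. Under the usual convention, contig N50 is the length of the shortest contig in the smallest set of largest contigs that together cover at least half of the assembly, and contig L50 is the number of contigs in that set. The function name `contig_n50_l50` and the toy lengths are illustrative assumptions, not part of the assembly pipeline or its data.

```python
def contig_n50_l50(lengths):
    """Return (N50, L50) for a list of contig lengths in base pairs.

    Hypothetical helper for illustration only; it follows the common
    convention (N50 = a length, L50 = a count of contigs).
    """
    if not lengths:
        raise ValueError("need at least one contig length")

    # Walk through contigs from largest to smallest.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2

    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # The first contig that pushes the running sum to at least half
        # of the assembly defines both statistics.
        if running >= half_total:
            return length, count


# Toy example (made-up lengths, not the woodpecker assembly):
# total = 150, so half = 75; 50 + 40 = 90 reaches it after 2 contigs.
n50, l50 = contig_n50_l50([50, 40, 30, 20, 10])
print(n50, l50)
```

Applied to a real assembly, the input would be the list of contig lengths read from the FASTA file. A high N50 paired with a low L50 indicates that a small number of large contigs account for most of the sequence.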

Conclusion
Our assembly provides a useful tool for comparative studies of molecular evolution and genomics in woodpeckers and allies, a group emerging as important for studies on the role that TEs may play in avian evolution. Additionally, the sequencing and bioinformatic resources used to generate this assembly were relatively low-cost and should provide a direction for the development of high-quality genomes for future studies of animal biodiversity.

Authors: Graham Wiley, Matthew J Miller