The genome of the American groundhog, Marmota monax

We sequenced the genome of the North American groundhog, Marmota monax, also known as the woodchuck. Our sequencing strategy combined short, high-quality Illumina reads with long reads generated on both Pacific Biosciences and Oxford Nanopore instruments. Assembly of the combined data produced a genome of 2.74 Gbp in total length, with a contig N50 of 1,094,236 bp.
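For readers unfamiliar with the N50 statistic, the following is a minimal sketch of how it is commonly computed: the largest length L such that contigs of length L or greater together cover at least half of the total assembly length. The function name `n50` and the toy contig lengths are hypothetical and chosen for illustration only; this is not the pipeline used to produce the figures reported here.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Uses the common definition: walk the contigs from longest to
    shortest and return the length of the contig at which the running
    total first reaches at least half of the summed length.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Toy example (hypothetical lengths, not taken from the assembly):
# the total is 20, and the two longest contigs (6 + 5 = 11) already
# cover at least half of it, so the N50 is 5.
toy_contigs = [2, 3, 4, 5, 6]
print(n50(toy_contigs))
```

In practice the lengths would be read from the assembly FASTA file; a larger N50 indicates that more of the assembly sits in long contigs.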

To annotate the genome, we mapped the genes from another M. monax genome and from the closely related Alpine marmot, Marmota marmota, onto our assembly, resulting in 20,559 annotated protein-coding genes and 28,135 transcripts. The genome assembly and annotation are available in GenBank under BioProject PRJNA587092.

Authors: Daniela Puiu, Aleksey Zimin, Alaina Shumate, Yuchen Ge, Jiabin Qiu, Manoj Bhaskaran, Steven L. Salzberg