A giant genome for a giant crayfish (Cherax quadricarinatus) with insights into cox1 pseudogenes in Decapod genomes

Following on from mitogenomic and transcriptomic studies, we present the first draft genome for the red claw crayfish (Cherax quadricarinatus), based on relatively large volumes of short and long genomic reads from the Illumina and Oxford Nanopore (ONT) platforms. While the assembly is relatively fragmented, it is much better than many of those currently available for decapod species, and the quality of the annotation is equivalent to that of other recently sequenced decapod genomes. The very large size and repetitive structure of the red claw genome made the assembly highly challenging.

However, we demonstrated the value of long Nanopore reads and a hybrid assembly approach for improving an assembly based on short reads alone, which gives encouragement for tackling other crayfish and crustacean taxa with large and complex genomes. This draft genome will be a valuable resource to support ongoing comparative genomic, phylogenomic and molecular-based breeding studies for aquaculture, conservation and biodiversity-related research, and can be improved upon over time with the generation of additional long-read data.

Nevertheless, a major challenge remains in relation to the computational resources needed to assemble large repetitive genomes from predominantly short reads, even when aided by long reads. The assembly, scaffolding and polishing processes needed to achieve the final draft genome took almost 85 processor-weeks, and annotation required another 100 processor-weeks.
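To put these figures in more familiar units, the following minimal sketch converts processor-weeks into CPU-hours and into an idealised wall-clock time. It treats "almost 85" as 85, and the 64-core machine and the assumption of perfect parallel scaling are illustrative only and are not taken from the study; the function names are hypothetical.

```python
HOURS_PER_WEEK = 7 * 24  # 168 hours in one week


def processor_weeks_to_cpu_hours(weeks: float) -> float:
    """Convert processor-weeks to CPU-hours."""
    return weeks * HOURS_PER_WEEK


def wall_clock_days(weeks: float, cores: int) -> float:
    """Idealised wall-clock days, assuming perfect scaling across `cores`.

    Real assembly and annotation pipelines rarely scale perfectly,
    so this is a lower bound rather than a prediction.
    """
    if cores <= 0:
        raise ValueError("cores must be a positive integer")
    return weeks * 7 / cores


# Figures reported in the text ("almost 85" is rounded to 85 here).
assembly_weeks = 85.0
annotation_weeks = 100.0

total_weeks = assembly_weeks + annotation_weeks
total_cpu_hours = processor_weeks_to_cpu_hours(total_weeks)

# Hypothetical 64-core server, chosen only for illustration.
days_on_64_cores = wall_clock_days(total_weeks, cores=64)

print(f"Total: {total_weeks:.0f} processor-weeks = {total_cpu_hours:.0f} CPU-hours")
print(f"Idealised wall-clock time on 64 cores: {days_on_64_cores:.1f} days")
```

Under these assumptions the combined workload is roughly 185 processor-weeks, or about 31,000 CPU-hours, which would occupy a fully utilised 64-core machine for around three weeks at best.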

A worthwhile next step would be to investigate the efficacy of a long-read-led crayfish genome assembly, which is now feasible as a result of the declining cost and improved accuracy of long-read sequencing, and which should be less expensive and greatly reduce the processor time needed for assembly tasks.

Authors: Mun Hua Tan, Han Ming Gan, Yin Peng Lee, Frederic Grandjean, Laurence J. Croft, Christopher M. Austin