Identification of structural variation in chimpanzee using optical mapping and nanopore sequencing
Daniela (University of California, Davis) stated that one of the main questions driving her research is: what is the genetic basis of the phenotypic differences between humans and our closest living relatives, chimpanzees? One approach to answering this question is to compare the human and chimpanzee genomes. Such research has shown that the chimpanzee and human genomes differ by 1.2% if only single nucleotide variants (SNVs) are considered. However, if structural variants (SVs) are considered, the difference is >5%, indicating the importance of SVs in trait divergence. SVs in the chimpanzee genome had not previously been investigated using nanopore technology, and this is what Daniela’s team set out to do.
Using 29x chimpanzee genome coverage obtained on the PromethION, combined with optical mapping data, the team called SVs relative to the human reference genome (GRCh38). They demonstrated that nanopore sequencing had higher sensitivity than optical mapping for discovering smaller variants, down to 50 bp, whereas calls from the two methods overlapped more for larger SVs (≥10 kbp). A confidence set of SVs was established from previously reported variants and through validation of their ≥10 kbp SVs in additional individuals. This identified 88 novel deletions and 36 novel insertions.
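To make the idea of overlap between the two call sets concrete, the sketch below measures what fraction of calls from one method are matched by the other above a size cutoff. It is an illustration only: the talk summary does not say how overlap was computed, so the 50% reciprocal-overlap rule, the restriction to reference-spanning events such as deletions, the `SV` record, and the toy coordinates are all assumptions.

```python
from typing import NamedTuple


class SV(NamedTuple):
    """Hypothetical record for a reference-spanning SV call (e.g. a deletion)."""
    chrom: str
    start: int
    end: int
    svtype: str


def reciprocal_overlap(a: SV, b: SV) -> float:
    """Smaller of the two fractions of each call covered by the shared interval."""
    if a.chrom != b.chrom or a.svtype != b.svtype:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start)
    if shared <= 0:
        return 0.0
    return min(shared / (a.end - a.start), shared / (b.end - b.start))


def supported_fraction(calls, other, min_size, min_ro=0.5):
    """Fraction of `calls` of at least `min_size` bp that are matched in `other`.

    The 0.5 default threshold is an assumed convention, not taken from the talk.
    """
    kept = [c for c in calls if c.end - c.start >= min_size]
    if not kept:
        return 0.0
    hits = sum(any(reciprocal_overlap(c, o) >= min_ro for o in other) for c in kept)
    return hits / len(kept)


# Toy coordinates, invented for illustration.
nanopore = [SV("chr1", 1000, 1060, "DEL"), SV("chr1", 50000, 65000, "DEL")]
optical = [SV("chr1", 50500, 64000, "DEL")]

all_sizes = supported_fraction(nanopore, optical, min_size=50)
large_only = supported_fraction(nanopore, optical, min_size=10_000)
```

In this toy data the 60 bp deletion has no optical mapping counterpart while the 15 kbp deletion does, so the supported fraction rises when only calls of 10 kbp or more are counted.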
Investigation of the functional impact of these SVs revealed that deletions and inversion breakpoints were depleted of protein-coding genes and less likely to disrupt domain boundaries (topologically associated domains/TADs). In general, these regions were located near genes that are differentially expressed between chimpanzees and humans. ‘Diving deeper’ into chimpanzee-specific SVs, the team identified 209 chimpanzee-specific deletions and 18 chimpanzee-specific inversions, impacting 56 protein-coding genes. These genes were enriched for involvement in chemoreception. From this research, they produced a list of candidate genes putatively implicated in chimpanzee-specific traits.
In the last section of Daniela’s talk, she described how they combined all their sequencing data (nanopore, optical mapping, and chromatin conformation capture) to de novo assemble a high-quality chimpanzee genome. This produced an assembly of 2.88 Gb with 162 scaffolds, and a scaffold N50 of 82.7 Mb, representing ‘the most contiguous chimpanzee assembly to date’.
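For readers unfamiliar with the metric, scaffold N50 is the length L such that scaffolds of length L or longer together contain at least half of the assembled bases. A minimal Python sketch of that calculation follows; the function name and the toy lengths are illustrative assumptions, not data from the talk.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Walk the lengths from longest to shortest and return the length at which
    the running total first reaches half of the overall total.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


# Toy scaffold lengths (arbitrary units), invented for illustration.
toy_lengths = [80, 70, 50, 40, 30, 20]
toy_n50 = n50(toy_lengths)  # 80 + 70 = 150 >= 290 / 2, so the N50 is 70
```

A higher N50 for a similar total assembly size means that more of the genome sits in long scaffolds, which is why the metric is commonly used as a measure of contiguity.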