Pooled CRISPR Inverse PCR sequencing (PCIP-seq): simultaneous sequencing of retroviral insertion points and the associated provirus in thousands of cells with long reads


Retroviral infections create a large population of cells, each defined by a unique proviral insertion site. Methods based on short-read high-throughput sequencing can identify thousands of insertion sites, but the proviruses within remain unobserved. We have developed Pooled CRISPR Inverse PCR sequencing (PCIP-seq), a method that leverages long reads on the Oxford Nanopore MinION platform to sequence the insertion site and its associated provirus. We have applied the technique to three exogenous retroviruses, HTLV-1, HIV-1 and BLV, as well as endogenous retroviruses in both cattle and sheep. The long reads of PCIP-seq improved the accuracy of insertion site identification in repetitive regions of the genome. The high efficiency of the method facilitated the identification of tens of thousands of insertion sites in a single sample. We observed thousands of SNPs and dozens of structural variants within proviruses and uncovered evidence of viral hypermutation, recombination and recurrent selection.

Authors: Maria Artesi, Vincent Hahaut, Fereshteh Ashrafi, Ambroise Marçais, Olivier Hermine, Philip Griebel, Natasa Arsic, Frank van der Meer, Arsène Burny, Dominique Bron, Carole Charlier, Michel Georges, Anne Van den Broeke, Keith Durkin