Cameron Soulette: Nanopore Sequencing reveals isoform-specific changes associated with U2AF1 S34F

Lightning talk: Cameron Soulette, of the University of California Santa Cruz, discussed his lab’s use of nanopore cDNA sequencing in the identification of cancer-specific mutations in the splicing factor U2AF1. He explained how U2AF1, which plays an essential role in defining the 3’ end of introns, is recurrently mutated in several cancer types, including in 3% of lung adenocarcinomas. He focused on the most common mutation, S34F, which produces significant alternative splicing events in lung adenocarcinoma. This results in the formation of aberrant mRNAs; however, the functional impact of these splicing alterations is poorly understood. Cameron questioned whether this variant could be selected for in lung adenocarcinoma because it confers an advantage in tumour progression or maintenance. He noted that characterising isoforms with short reads is difficult, so the team used nanopore sequencing to generate long-read data for these isoforms. cDNA libraries were generated in triplicate for samples of two wildtype cell lines and two mutated cell lines; these were then sequenced on the MinION device. Data were also generated using a short-read sequencing technology. Full-Length Alternative Isoform analysis of RNA (FLAIR) was performed to characterise the isoforms present, revealing alternative splicing proportions for the wildtype vs S34F samples. Of the S34F-associated isoform changes identified, 30 were found by FLAIR with long-read nanopore sequencing only, 3 by the short-read assembly only, and 5 by both methods. Of the significantly dysregulated isoforms identified, two thirds were not seen via short-read sequencing and were missing from the reference annotation, suggesting novel isoforms. Cameron then displayed data for UPP1 isoforms, associated with an unfavourable outcome in lung cancer: here, complicated, long-range isoforms were picked up by FLAIR analysis with long nanopore reads but not by short-read assembly.
Cameron concluded that the use of long reads here enabled the identification of cancer-related genes for further functional analyses.
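
As a quick arithmetic check on the method comparison quoted above, the three counts (30 long-read only, 3 short-read only, 5 by both) can be tallied as follows. This is only an illustrative sketch: the variable names and the `share` helper are assumptions made here, not part of the talk or of FLAIR.

```python
# Counts of S34F-associated isoform changes, as quoted in the summary above.
long_read_only = 30   # found by FLAIR with nanopore long reads only
short_read_only = 3   # found by the short-read assembly only
both = 5              # found by both methods

total = long_read_only + short_read_only + both
found_by_long = long_read_only + both
found_by_short = short_read_only + both


def share(n, total):
    """Hypothetical helper: n as a percentage of total, to one decimal place."""
    return round(100 * n / total, 1)


print(f"total changes: {total}")
print(f"seen with long reads: {found_by_long} ({share(found_by_long, total)}%)")
print(f"seen with short reads: {found_by_short} ({share(found_by_short, total)}%)")
print(f"long-read only: {long_read_only} ({share(long_read_only, total)}%)")
```

On these figures, long reads recovered 35 of the 38 changes, while the short-read assembly recovered 8.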