London Calling 2017: Day 1 updates
Fri 5th May 2017
It's been a long day at Old Billingsgate in London, where 400 people gathered from 25 countries to share their experiences of nanopore sequencing.
Other talk summaries, in brief:
Plenary: Karen Miga
Karen Miga, a postdoc at the University of California, provided an update on her work creating the first representation of a human centromere. Dr. Miga’s research focuses on sequencing the missing regions of the genome, such as the challenging centromeric regions, which contain long arrays of near-identical tandem repeats. Completing the sequence of a centromeric region is a fundamental milestone in human genomics, as it brings us closer to complete telomere-to-telomere genome assembly. To produce a high-quality linear assembly of the human Y centromere, Dr. Miga’s team used a BAC-based strategy to generate 1D MinION reads basecalled with Albacore, covering 9 BACs, the longest of which was 221.4 kb. Illumina-guided error correction was used to eliminate false positives, resulting in a 346 kb assembly of the Y centromeric region. Going forward, Dr. Miga hopes to move away from a BAC-based approach and instead use 400-500 kb reads to work directly from human genomic DNA. Her team is currently optimizing their 1D longboard strategy, improving base call accuracy, and starting to construct initial maps of chromosome-assigned satellite sequence content and structure.
Plenary: Björn Usadel
Björn Usadel, professor at RWTH Aachen University and director at the Federal Research Center Jülich, presented his work sequencing plant genomes. Large plant genomes often pose a challenge for sequencing because of their size and highly repetitive nature, with many plants having undergone multiple whole-genome duplication events. In addition, secondary metabolites can make it difficult to extract and purify DNA to a high enough standard. Professor Usadel described his work using nanopore technologies to tackle the sequencing of Solanum pennellii, a green-fruited and potentially slightly poisonous wild tomato species. In October 2016, the team started work on the assembly using nanopore flowcells. The team explored a variety of assembly approaches, with the best results gained from Canu pre-processing followed by assembly with SMARTdenovo. The final assembly was polished with short-read data to achieve a contig N50 of 2.5 Mb and high BUSCO completeness scores. These data indicate that long-read sequencing can be used to affordably sequence and assemble gigabase-sized diploid plant genomes in a small laboratory setting.
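For readers unfamiliar with the contig N50 statistic quoted above, the following is a minimal sketch of how it is usually computed: sort the contigs from longest to shortest and report the length of the contig at which the running total first reaches half of the assembly size. The function name and the toy contig lengths are illustrative assumptions, not data from the talk.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    The N50 is the length L such that contigs of length >= L together
    account for at least half of the total assembled bases.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy contig lengths in bases (illustrative only, not from the talk).
toy_contigs = [5_000_000, 3_000_000, 2_500_000, 1_000_000, 500_000]
print(n50(toy_contigs))  # 3000000
```

A higher N50 means more of the assembly sits in long contigs, which is why it is commonly reported alongside completeness measures such as BUSCO.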
Plenary: Jared Simpson
After the break, there was a plenary talk from Professor Jared Simpson of the Ontario Institute for Cancer Research. Professor Simpson provided an update on the latest additions to nanopolish, an open-source analysis tool that works directly from the raw nanopore current signal, using Hidden Markov Models to improve consensus base calling. It can also perform reference-based SNP calling, methylation detection and read phasing. The nanopolish Hidden Markov Models produced one of the most accurate nanopore genomes to date, giving 99.98% sequence identity for the E. coli genome from the MinION R9 flowcell. Single nucleotide polymorphism (SNP) calling and genotyping is a newly developed feature that identifies SNPs in both diploid and polyploid genomes. Testing genotype accuracy on the human genome gave 99.2% at all sites and 94.8% at variable sites. The use of nanopolish to identify DNA methylation will be discussed further by Winston Timp during the afternoon breakout session. Going forward, Simpson aims to replace nanopolish’s Hidden Markov Model with a neural network for higher consensus accuracy.
The nanopolish tool can be found on GitHub: https://github.com/jts/nanopolish
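The gap between the two genotype accuracy figures above is easier to see with a small worked example. The sketch below is not nanopolish code; it assumes genotypes are encoded as the number of non-reference alleles (0, 1 or 2) and that a "variable site" is one whose true genotype is not homozygous reference. Both the encoding and the toy data are assumptions made for illustration.

```python
def genotype_accuracy(truth, calls):
    """Return (accuracy at all sites, accuracy at variable sites).

    Genotypes are encoded as the count of non-reference alleles: 0, 1 or 2.
    A site counts as variable when its truth genotype is not 0
    (an assumption for this sketch).
    """
    if len(truth) != len(calls) or not truth:
        raise ValueError("truth and calls must be non-empty and equal length")
    matches = [t == c for t, c in zip(truth, calls)]
    all_sites = sum(matches) / len(matches)
    # Restrict to sites where the truth genotype carries a variant allele.
    variable = [m for t, m in zip(truth, matches) if t != 0]
    variable_sites = sum(variable) / len(variable) if variable else float("nan")
    return all_sites, variable_sites


# Toy data: ten sites, three of them variable, one variable site miscalled.
truth = [0, 0, 0, 0, 0, 0, 1, 2, 1, 0]
calls = [0, 0, 0, 0, 0, 0, 1, 2, 0, 0]
all_acc, var_acc = genotype_accuracy(truth, calls)
print(f"all sites: {all_acc:.3f}, variable sites: {var_acc:.3f}")
```

Because most sites in a genome match the reference, a single error at a variable site barely moves the all-sites figure but weighs heavily on the variable-sites figure.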
Lightning talks - Session 1
Before lunch, there was a series of lightning talks in which researchers showcased their latest research using Oxford Nanopore technologies. Each talk lasted five minutes, with the presenters kept strictly to time!
Michael Clark from the University of Oxford presented his work elucidating the full-length transcript structure of the neuropsychiatric disease risk gene CACNA1C. Using 2D nanopore sequencing, he identified 18 annotated isoforms and 40 novel isoforms, including novel exons. Michael hopes that this work forms the first step in evaluating CACNA1C as a therapeutic target for treating diseases such as bipolar disorder.
Graziano Pesole from the University of Bari explored DNA methylation in mitochondria, the extent and function of which are still debated. As traditional methylation detection techniques require large amounts of DNA and produce short, fragmented reads, he hoped the longer reads of the MinION would give higher accuracy and fewer false positives. Using the nanopolish tool to profile the methylation pattern of human mitochondrial DNA, he detected 430 methylated cytosines, 93% of which overlapped with known methylated regions.
Raja Mugasimangalam from Genotypic Technology discussed his work elucidating the bacterial composition of yoghurts. There is a range of both commercial and non-commercial yoghurt drinks, and the labelling ranges from an exact list of species composition to no bacterial information at all. Raja used R7 flowcells to study 12 different products. While 70 reads per sample were enough to identify bacteria in monocultures, at least 500 reads were needed for multi-component samples, with the sample going from pot to results in 4-6 hours.
Sebastian Johansson from the SciLifeLab Royal Institute of Technology presented work on phenotyping the human leukocyte antigen, an important immune system component which has implications for human organ transplants. He performed long-range PCR in 8 patients simultaneously, pooling the amplicons sample-wise after barcoding. Filtered and aligned reads covered most of the gene, with better coverage for shorter reads than for longer ones.
Franz-Josef Müller presented the SelectION workflow, which provides an intermediate step between basecalling and alignment. SelectION anchors nanopore reads to specific sections of the genome, speeding up the workflow and cutting down computational costs. The SelectION system was tested for the diagnosis of Fragile X syndrome, a neuropsychiatric disorder caused by repeat expansions. Nanopore sequencing excels at finding the repeat-expanded regions, as short-read sequencers find it difficult to map multiple repeats. For the next step, Müller is looking to map and predict coverage with the PromethION.
Sally James from the University of York gave our first example of the conference of telomere-to-telomere sequencing, in the red alga Galdieria sulphuraria, which lives in thermal springs. G. sulphuraria sequesters heavy metals, which makes it an interesting prospect in biotechnology but creates challenges for DNA extraction. Using a modified extraction protocol with the BluePippin, and Canu correction of the assembly, she generated a 13.3 Mb assembly from 76 contigs. One of these contigs was found to contain a complete chromosome, with telomeric repeat regions at either end of the sequence.
David Eccles from the Malaghan Institute of Medical Research presented his genome sequencing work on Nippostrongylus brasiliensis, a rodent parasite similar to human hookworm. After issues with DNA prep and some disappointing results, David sent worm samples to Oxford, where they were sequenced in-house, providing a full metagenomics profile from a single sequencing run of 35 thousand reads. He is currently working on assembling the full genome using Canu.
Beth Lodge from Oxford Nanopore presented an update on the VolTRAX, an automated library preparation device which requires just a sample and a laptop. Tests of the VolTRAX have shown it is able to extract DNA from an E. coli sample, carry out PCR, and complete 16S identification of both an E. coli test and a sample of Actimel™ yoghurt! VolTRAX also shows high consistency across multiple field runs compared to other systems. The burn-in sample is currently available for user trials.
Plenary: Dan Turner
The afternoon plenary started with a presentation from Dan Turner, head of the applications team at Oxford Nanopore Technologies, showcasing some of the new developments from the teams in Oxford and New York. The work he presented was also available as posters on the venue floor. One of the main advantages of nanopore technology is the long read length, and Dan discussed how to improve read length through appropriate sample preparation and library prep. To aid researchers, Oxford Nanopore is currently working on a protocol selection tool which asks a series of simple questions and suggests the optimal combination of extraction protocol, library preparation and DNA analysis. Other novel research included the current work on the metagenomics assembly platform (which has been tested on both standard bacterial community mixes and probiotic food supplements), identification of structural variation, amplicon-free protein quantification, and a PCR-free cDNA library prep system planned for release in May 2017. His presentation finished with a video of the VolTRAX process in action, with DNA extracted from an Actimel™ sample and sequenced on the MinION, and the bacterial species identified. The whole process was completed within 20 minutes.
Plenary: Eric van der Helm and Lejla Imamovic
The second afternoon plenary presentation was from Dr. Eric van der Helm and Dr. Lejla Imamovic, research scientists at the Novo Nordisk Foundation Center for Biosustainability working on bacterial antibiotic resistance. Hospital patients, particularly those in intensive care units (ICU), are often exposed to high levels of antibiotics for both treatment and prophylaxis and therefore develop a ‘resistome’, the set of antibiotic resistance genes within their microbiome.
Dr. van der Helm and Dr. Imamovic described their work using the poreFUME workflow for functional metagenomic selections, monitoring the development and spread of novel resistance genes in ICU patients. Using nanopore sequencing for functional metagenomics allowed high throughput and high reproducibility, with 98% mean sequence identity compared with Sanger methods. As well as tracking the dynamic changes in known antibiotic resistance genes, poreFUME allowed the monitoring of novel resistance genes. They also presented preliminary data on detecting and monitoring extended-spectrum β-lactamase enzymes. They are currently working on optimizing the existing workflow to move to a PCR-free system, using fosmids to take advantage of the long read lengths provided by the nanopore, and exploring bacteriophage metagenomics.
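As a rough illustration of what a sequence identity figure like the one above measures, the sketch below computes percent identity between two already-aligned sequences. It is not part of poreFUME. It assumes one common convention (matching columns divided by total alignment columns, gaps counted as mismatches), and the sequences are invented for the example.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two aligned sequences of equal length.

    Convention assumed here: matching columns / all alignment columns,
    with gap columns counted as mismatches. Other conventions exist.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be aligned, non-empty, equal length")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    return 100.0 * matches / len(aligned_a)


# Invented example: a nanopore consensus against a Sanger reference.
nanopore = "ACGT-ACGTA"
sanger = "ACGTTACCTA"
print(percent_identity(nanopore, sanger))  # 80.0
```

A mean identity would then be the average of this value over all compared sequences.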
Aleida Hommes de Vos van Steenwijk from Orvion BV showed how the MinION device has allowed their small biotech company to assess and monitor microbial processes in applications such as antibiotic resistance and nitrification in wastewater treatment, groundwater remediation, and biodiversity monitoring of environmental surface waters.
Aleida showed that, while a number of antibiotic resistance genes were removed from wastewater by treatment processes, a large number passed through into the environment. In addition, Aleida and her team detected new antibiotic resistance genes entering the environment, suggesting that these wastewater plants may be a source of microbial resistance genes.
During her talk, Aleida highlighted how the MinION device has allowed her and her team to generate much more detailed data than they were previously able to, without relying on external sequencing companies.
Sarah Stewart Johnson from Georgetown University presented an exciting talk on how nanopore sequencing technology can be used to generate sequencing data in the most extreme of environments. In 2016 Sarah and her team became the first to sequence DNA on the continent of Antarctica, and she spoke about how long-read sequencing could distinguish between genetic material obtained from live cells and from dead frozen cells in the dry valleys of the continent. Overcoming the cold conditions involved performing ligation reactions in a thermal flask filled with hot water and warming the MinION with chemical hand warmers. The flexible sequencing time available on the MinION proved useful during an evacuation due to extreme weather. Sarah concluded her talk by proposing experiments in more extreme environments such as the deep sea or harsh deserts, and finished by suggesting that nanopore technology could be used to detect chemical signatures of life on other planets without assuming that the life is nucleic acid based.
Mick Watson of Edinburgh Genomics has been using the MinION to generate large data sets of long reads from bacteria that reside within the digestive system of ruminants such as cattle and sheep.
Ruminants rely upon their gut microbiome to break down plant material, and understanding these organisms could help resolve the disparity in animal-based food production between the developed and developing worlds. Before getting into the meat of his presentation, Mick issued a warning that “…if you are studying the microbiome and not bead beating, you are not seeing all of your organisms…” and showed convincing evidence of this.
Using bioinformatics tools such as Canu and nanopolish, Mick and his team were able to generate annotated contigs of over 1 Mb which came close to representing whole bacterial genomes. Most interesting was that a number of these high-quality assemblies matched no known bacterial taxa in current databases. It was suggested that these novel genomes represent as yet undiscovered bacterial species, or at least new strains of existing species.
Concluding his talk, Mick stated that the MinION is so easy to use that even a bioinformatician can do it!
Breakout: Consensus accuracy and variant calling
Chris Wright (Oxford Nanopore) presented work he has been doing to improve consensus accuracy, focusing on a combination of existing tools and fast methods. He developed a pipeline for read overlapping, layout and consensus building using Minimap, Miniasm and Racon respectively. Using Chr21 from the CliveOME, he could assemble in 1 hour compared with the 24 hours taken by Canu. Chris went on to discuss using the Read Until function of nanopore sequencing to generate even coverage across the genome using fewer reads. He finished with how Read Until can be combined with real-time assembly, showing that he could assemble E. coli genomes from individual nanopore channels.
Ryan Wick from the University of Melbourne presented his work using a hybrid of nanopore and Illumina reads to create near-perfect assemblies of bacterial genomes. Short reads alone fail to resolve the repeats present in bacterial genomes, resulting in fragmented assemblies that leave you unable to distinguish plasmids from genomes.
Ryan has developed Unicycler, an assembly pipeline designed specifically for bacterial genomes that uses both nanopore and Illumina reads. Unicycler overcomes some issues that other hybrid assemblers suffer from, such as a failure to cope with circular genomes, and creates near-perfect assemblies of bacterial genomes.
Damien Tully from the Ragon Institute of MGH, MIT and Harvard presented his work studying the dynamics of HIV evolution using nanopore sequencing. Damien discussed how he uses a BLT ‘humanized’ mouse model to study the transmission of HIV. In 80% of cases a single virus is transmitted, so learning how this transmission works will help us to block it. He found they were able to generate up to 99.9% consensus accuracy compared to the reference genome. Using two PCR reactions to cover the entire HIV genome, he is able to look at the phasing of SNPs and resolve complex multivariant infections, which is not possible with short reads. Damien is extending these studies to identify the foci of transmission of Hepatitis C virus.
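The idea of using Read Until to even out coverage can be sketched as a simple decision rule: reject a read if every position it would cover has already reached the target depth, and keep it otherwise. The toy simulation below is not the Read Until API or the method shown in the talk. It assumes each read's start position is already known and that all reads have the same length; the function and parameter names are hypothetical.

```python
def read_until_decisions(read_starts, read_length, genome_length, target_depth):
    """Simulate keep/reject decisions that aim for even coverage.

    A read is rejected when every position it spans already has at least
    target_depth coverage; otherwise it is kept and coverage is updated.
    Returns (list of booleans, final per-position depth).
    """
    depth = [0] * genome_length
    decisions = []
    for start in read_starts:
        if not 0 <= start < genome_length:
            raise ValueError(f"read start {start} is outside the genome")
        window = range(start, min(start + read_length, genome_length))
        if all(depth[i] >= target_depth for i in window):
            decisions.append(False)  # region already covered: eject the read
        else:
            for i in window:
                depth[i] += 1
            decisions.append(True)  # region still needs coverage: keep it
    return decisions, depth


# Toy run: 10-base genome, 5-base reads, target depth of 2.
decisions, depth = read_until_decisions([0, 0, 0, 5, 2, 0], 5, 10, 2)
print(decisions)  # [True, True, False, True, True, False]
print(depth)      # [2, 2, 3, 3, 3, 2, 2, 1, 1, 1]
```

In this toy run, two of the six reads are rejected because they would only add depth to regions that are already covered, which frees pore time for under-covered regions.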
Breakout: Epigenetics & methylation
In breakout room 2 there were 3 presentations focusing on epigenetic analyses including DNA methylation and other base modifications. Current methods for epigenetic profiling, such as bisulfite treatment, are often damaging and can result in excessive DNA fragmentation. The development of software to detect epigenetic events from the raw nanopore data allows the analysis of epigenetic modifications across long read-lengths.
Dr. Marcus Stoiber from the Lawrence Berkeley National Laboratory presented two computational tools designed to process and extract epigenetic information from the raw nanopore signal. The first, nanoraw, assigns individual bases to the raw signal with a high level of precision, giving a novel visualisation system and increased power and accuracy to detect modified bases. The newer basecRAWller algorithm is streaming basecalling software which applies a unidirectional recurrent neural network directly to raw 16-bit data acquisition values, providing read sequences in real time. While basecRAWller does not yet perform as well, it has the advantage of being a truly real-time system, and Dr. Stoiber aims to incorporate elements into the nanoraw system using bidirectional neural methods.
Professor Winston Timp of Johns Hopkins University discussed his work on the methylation detection element of nanopolish. Hidden Markov Models are used to distinguish 5-methylcytosine from unmethylated cytosine in E. coli with ~90% accuracy. Applying this model to the NA12878 human genome allowed the measurement of global patterns of DNA methylation and quantification of the methylation status of CpG islands from a single MinION read. An advantage was gained from the length of the MinION reads, which allowed the researchers to examine phased methylation patterns over long stretches of DNA. They were even able to identify haplotype-phased, allele-specific methylation patterns. Preliminary data on the methylation profiles of mitochondrial genomes showed mostly low methylation levels, although a simple clustering analysis appeared to show intriguing patterns. Professor Timp is hoping to expand the epigenetic capabilities of the nanopore into non-CpG methylation, and to use exogenous labelling of DNA to explore other aspects of the epigenome.
Miten Jain presented the latest data from the UC Santa Cruz Genomics Institute. He started with a brief discussion of read length: long nanopore reads are now routine, with up to 200 kb regularly achieved. A 1 Mb read is within reach, which Dr. Jain feels is vital for solving important biological questions. He discussed his work on detecting epigenetic modifications of bacterial DNA, with mid-90% accuracy for detecting methylation events, and further Hidden Markov Model training to detect de novo modifications. He finished by presenting some of his latest work on RNA sequencing, which currently has a median sequence accuracy of 87%, with >50% of reads containing full-length transcripts. He also discussed RNA modifications, including the modification of uridine to pseudouridine, which was detected using the nanopore. This change is not visible on a mass spectrometer, but it can be detected as an incorrect call on the nanopore, which can be trained to recognise the modification.
The presentations were followed by a panel discussion where the 3 speakers answered questions from the audience. Discussions included the expansion of the current models to include identification of different types of methylation, conducting biology in pure signal space, and the expansion of the models into clinical samples for the detection of medically relevant epigenetic effects.