Interview: Sequencing the poplar with PromethION

with Héloïse Bastiaanse and Stephane Rombauts, members of the VIB-UGent Center for Plant Systems Biology

Conducted and written by Jonathan Pugh

“[On] PromethION we can actually sequence 10–12 genomes in one [flow cell], which lowers the cost per genome and speeds up the data production”

Stephane Rombauts of the VIB-UGent Center for Plant Systems Biology, discussing their experimental setup for sequencing 700 genomes of the poplar tree on the PromethION

In my first interview with scientists and staff members from the VIB centres in Belgium, the focus rested upon the development of the team’s expertise over time and how they get maximum performance from their PromethION device through applying their wealth of experience. In this second article, I leave behind the neurology projects and instead move the focus to plants, in particular the poplar tree.

It was news to me (and perhaps some readers) that the poplar is the model tree species for angiosperms. In the early 2000s, the emergence of poplar as a model organism began to be discussed in the literature, further aided by comprehensive short-read sequencing efforts, which over 6 years led to the generation of a reference assembly of the North American poplar Populus trichocarpa. Héloïse Bastiaanse and Stephane Rombauts from the VIB-UGent Center for Plant Systems Biology agree that Populus is an important genus to investigate, but don’t wish to limit themselves to genotype alone. They are exploring the pangenome from the European species Populus nigra as part of a multi-omic approach, supported by sequencing ~700 samples on the PromethION.

More than just a genome

Héloïse, a Postdoctoral fellow of the Bio-Energy and Bio-Aromatics group of Wout Boerjan at VIB, sums up why poplar trees make wonderful model organisms: “if you take a branch of poplar and stick that into a pot it will grow and make a clonal replicate”. Furthermore, “the size of the genome [is] relatively small, and also it grows relatively fast”. In part, Populus nigra was chosen due to the vast number of samples available, collected — from the banks of major watercourses within Europe — through several EU-funded projects and maintained by the French research institute INRAE. Those samples are now growing at the VIB center in Ghent, meaning Héloïse and Stephane can readily access them to screen for metabolites, perform RNA sequencing, and ultimately determine if subpopulations display altered metabolic profiles or genome characteristics.

"The MinION is like a workhorse and the PromethION a racehorse”

The project is being driven and maintained by Héloïse, with support from Stephane in certain areas, notably “from sequencing to assembly to annotation” of a P. nigra reference genome obtained from a single genotype in their sample population. Stephane, a Staff Scientist in the lab of Yves Van de Peer, doesn’t spend all of his time working with just data, however, and is pleased to point out that “thanks to ONT and MinION, I can do it [the experiment] all the way from seed to database with [an] annotated genome, all by myself”. Stephane co-penned the original European Research Council (ERC) proposal for the sequencing work on poplars, and Héloïse is the one putting it into practice under the ERC advanced grant of Principal Investigator Wout Boerjan.

Populus nigra is commonly found along European waterways

Elaborating further on Héloïse’s role underlines the significance of this project. Guided by Stephane’s initial assembly, she will be responsible for coordinating, along with the Neuromics Support Facility team at VIB-UAntwerp (see part one of this interview), the sequencing of all ~700 genomes, followed by assembly and SNP, SV, and methylation profiling of the resultant data. Once that job is complete, she will turn her attention to transcriptomics and gene expression, before finally assessing the metabolites present for each genotype. Sequencing of samples had just begun when we had our discussion, but a staggering amount of optimisation work had already been undertaken by Héloïse to reach this point.

The perfect prep

The significant cohort of samples from all over Europe, including the Netherlands, Germany, Italy, Spain, and France, deserved only the best DNA extraction method, and so Héloïse got to work. “I started with the easy ones … but the problem with Populus nigra is they are rich in polysaccharides, they produce a kind of sap, it’s running out of the leaves”. This, she explains, was “causing a problem of precipitation of DNA”. Many tips and tricks, three to four months, and 50 to 60 protocols later, Héloïse had her perfect prep. “I’m happy that I’ve done it! I combined different protocols, one from a technical paper, [one from] molecular biology …. I hired a little army of students for the full summer and they helped me out with priming and extracting”. When I asked if she had reached out for support, she smiled and responded, “I had no idea there was a Community I could talk to!” This project was her first experience with nanopore sequencing and “it was only when I started to have something that was working that we finally started to contact you [Oxford Nanopore] and Mojca and team [at VIB] … before that I didn’t know about the Community. I wish!” Had she accessed the Nanopore Community, a great starting point would have been our extraction protocols library, where customer-generated and Oxford Nanopore-validated methods are hosted.

During the development process, Héloïse and Stephane regularly assessed sequencing performance using MinION devices located at Ghent. For Stephane, “this last year was a learning curve in terms of library building [and] getting experience in how much to load”, and generally becoming comfortable with nanopore sequencing. “The MinION is like a workhorse and the PromethION a racehorse” Stephane quips. The cohort sequencing has only just got underway on the PromethION but they are already seeing yields of just over 130 Gb per flow cell, with their precisely extracted sample pairing well with the expertise of the sequencing support team in Antwerp, Tim de Pooter and Geert Joris. They are “very, very good” states Héloïse, “Together we have good quality DNA and [they have] good practice of loading and preparing the libraries so [we are] really happy to work together”. Having spoken with both parties, I can really see the mutual respect that runs through the teams and it’s clear how well they have combined to get the most from this project.

Data wrangling

Stephane informs me that the Populus nigra genome is around 400 Mb in size and “[when coupled with] the throughput of the PromethION, we can actually sequence 10–12 genomes in one [flow cell], which lowers the cost per genome and speeds up the data production”. This is still a tiny fraction of the potential throughput for their PromethION, which is capable of running 24 flow cells at once if required. They are generating about 10 genomes every two weeks, which they assemble using Flye as “it’s fairly easy to use and it’s pretty quick … [and compared to Canu] you don’t need to do any cleaning prior to the assembly”. Stephane prefers this approach as he feels that “pre-cleaning can mask similar sequences prior to assembly, which means you collapse some of your data … certainly when you do phasing and all those other things, you don’t want this!” Data from each flow cell of ten genomes is run through the same pipeline to generate a polished assembly:

  1. QC to remove any residual barcodes or adapter sequences
  2. Assembly with Flye, and preliminary polishing
  3. Curation of the highly heterozygous P. nigra genome using Purge haplotigs
  4. Long-read scaffolding to improve contiguity
  5. Nanopolish for comprehensive polishing
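
For readers who like to see the numbers, the figures quoted in this article (a genome of around 400 Mb, yields of just over 130 Gb per flow cell, and 10–12 genomes per flow cell) allow a rough back-of-the-envelope estimate of sequencing depth per genome and of the flow cells needed for ~700 samples. The short Python sketch below is only illustrative: the function names are hypothetical, and it assumes that the yield is split evenly between samples and that every base is usable, neither of which the team stated.

```python
import math

# Figures quoted in the interview (approximate).
GENOME_SIZE_BP = 400_000_000         # P. nigra genome, ~400 Mb
FLOW_CELL_YIELD_BP = 130_000_000_000  # ~130 Gb per PromethION flow cell
COHORT_SIZE = 700                     # ~700 genotypes in the project


def mean_coverage(yield_bp, genome_size_bp, genomes_per_flow_cell):
    """Average depth per genome, ASSUMING the yield is shared evenly
    between samples and all of it is usable (an idealisation)."""
    return yield_bp / (genome_size_bp * genomes_per_flow_cell)


def flow_cells_needed(cohort_size, genomes_per_flow_cell):
    """Minimum number of flow cells, ASSUMING no failed or repeated runs."""
    return math.ceil(cohort_size / genomes_per_flow_cell)


if __name__ == "__main__":
    for n in (10, 12):
        depth = mean_coverage(FLOW_CELL_YIELD_BP, GENOME_SIZE_BP, n)
        cells = flow_cells_needed(COHORT_SIZE, n)
        print(f"{n} genomes per flow cell: ~{depth:.1f}x each, "
              f"{cells} flow cells for {COHORT_SIZE} samples")
```

Under these assumptions each genome would receive somewhere between roughly 27x and 33x coverage. Real runs will vary with barcode balance, read quality, and filtering.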

Key for Stephane has been “optimising all the different steps in the assembly to make sure it maximises the use of the [compute] cluster by parallelising whatever we can”. He continues, “hopefully every poplar genome that we’re going to sequence will be pretty much a draft reference genome”. But they’re not done there: Héloïse cuts in at this point to highlight that there’s also “the gene expression and metabolome as well, don’t forget about that! It’s a huge project, it’s just insane!” This project will produce a unique multi-omic dataset to enable a better understanding of the genetic control of complex quantitative traits in trees, such as biomass, phenology, and even areas such as the wood properties that would be relevant to industry.

With projects of this size, data volumes can become overwhelming, but Stephane has managed to minimise unnecessary storage requirements and make sure they end up with only the core data required. His wish is to have “a database that would contain the genomes, all the comparisons between all the genomes, all the RNA-seq data, all the metabolome data, all together”. This would populate “a system that any PhD student would be able to go and mine without having to bother about statistics and genome sequence”. Héloïse wishes to “work to develop a website where people [who aren’t bioinformaticians] can query and get themselves into the genome”, making the results of their research even more accessible and user-friendly.

Beyond the poplar

This project is Héloïse’s first major work for VIB, but to suggest the same for Stephane would be falling short. In his own words “I have long arms … [I work with] different core facilities and core groups within the VIB in order to help within every little aspect”. We turn to the discussion of soybeans, and the fascinating symbiotic relationship that they employ. Soybean plants need nitrogen but are incapable of extracting it from the atmosphere. Instead, they develop nodules in which bacterial species live and operate. The bacteria “fix” nitrogen for the plant by converting nitrogen gas (N2) into, for example, ammonia (NH3), using a nifty system known as the nitrogenase protein complex (encoded by the Nif genes).

In return, the bacteria are provided food. Clearly with a penchant for large numbers, Stephane explains “there is a project in Flanders, the 1,000 soy project”, which entails planting 1,000 soy plants in gardens, then collecting samples and determining which bacterial species have helped the “best” plants grow. He approached the project’s lead, Sofie Goormachtig, saying: “give me some DNA from a nodule, and I’ll sequence it on a MinION and we’ll see what happens”. They tried a control sample, which worked well, as did the real sample they tried next. “Now they’re very enthusiastic”, beams Stephane. Once his pipeline has completed its metagenomic assembly, they will have “nice circular assemblies” from the best bacteria for the job.

This final story from Stephane cements my respect for VIB as an institution. On the front page of VIB’s website it states “Our entrepreneurial technology transfer approach ensures that scientific discoveries are turned into tangible innovations that benefit society”. I have a strong feeling this is possible due to the nature of the people who make up the teams at VIB as, over the course of these two interviews, I’ve been given just a small insight into the collaborative and friendly but, more importantly, optimistic attitude that they possess. Be it investigating the quest for the prevention of dementia, finding that one tree that offers up metabolites none other can, or even explaining the reason for a soybean plant being hardier than its peers, the VIB teams seem able to get the job done as they share their time and expertise. Whatever else they may investigate, it’s a pleasure to see nanopore sequencing and the PromethION at the heart of just some of these projects.

Jonathan Pugh is an Associate Director at Oxford Nanopore Technologies and has spent almost 10 years developing and introducing nanopore sequencing technology.

Want to learn more?

See more plant-based applications for nanopore sequencing

Meet the Bio-Energy team of Wout Boerjan or find out more about the Van de Peer Lab

Learn more about assembly with nanopore sequence data

Read about the soy in Flanders grand challenge project and meet the project’s lead Sofie Goormachtig