16th March 2018 - BioRxiv
Over the last 35 years the soil bacterium Agrobacterium tumefaciens has been the workhorse tool for plant genome engineering. Replacement of native tumor-inducing (Ti) plasmid elements with customizable cassettes enabled insertion of a sequence of interest called Transfer DNA (T-DNA) into any plant genome. Although these T-DNA transfer mechanisms are well understood, detailed understanding of structure and epigenomic status of insertion events was limited by current technologies. To fill this gap, we analyzed transgenic Arabidopsis thaliana lines from three widely used collections (SALK, SAIL and WISC) with two single molecule technologies, optical genome mapping and nanopore sequencing. Optical maps for four randomly selected T-DNA lines revealed between one and seven insertions/rearrangements, and for the first time the actual length of individual transgene insertions from 27 to 236 kilobases. De novo nanopore sequencing-based genome assemblies for two segregating lines resolved T-DNA structures up to 36 kb into the insertions and revealed large-scale T-DNA associated translocations and exchange of chromosome arm ends. The multiple internally rearranged nature of T-DNA arrays made full assembly impossible, even with long nanopore reads. For the current TAIR10 reference genome, nanopore contigs corrected 83% of non-centromeric misassemblies. This unprecedented nucleotide-level definition of T-DNA insertions enabled the mapping of epigenome data. We identify variable small RNA transgene targeting and DNA methylation. SALK_059379 T-DNA insertions were enriched for 24nt siRNAs and contained dense cytosine DNA methylation. Transgene silencing via the RNA-directed DNA methylation pathway was confirmed by in planta assays. In contrast, SAIL_232 T-DNA insertions are predominantly targeted by 21/22nt siRNAs, with DNA methylation and silencing limited to a reporter, but not the resistance gene. With the emergence of genome editing technologies that rely on Agrobacterium for gene delivery, this study provides new insights into the structural impact of engineering plant genomes and demonstrates the utility of state-of-the-art long-range sequencing technologies to rapidly identify unanticipated genomic changes.