Saturday, February 23, 2013

Illumina Unveils Little About Long Read Technology at AGBT

This year's Advances in Genome Biology and Technology conference in Marco Island, Florida is as packed with exciting talks and announcements as last year's, though notably with less hype about new sequencing technologies.

One eagerly awaited talk, entitled "10 kilobase Reads on a HiSeq," was given by Geoff Smith of Illumina. It covered a few updates to the company's product pipeline but focused mainly on the technology recently acquired from Moleculo, which is intended to enable read lengths of 10 kilobases on standard Illumina instruments.

Reading very long stretches of DNA and RNA at once simplifies a lot of questions about genome abnormalities in inherited diseases and cancers, and about the structure of expressed genes.  In general, most sequencing technologies read 100 to 400 nucleotides at a time, and any one of these reads can't serve as direct evidence for genomic features larger than that.

Extending read length is an arms race between the major sequencer providers: Illumina, PacBio, and Life Technologies.  So with the title of this talk, Illumina had everyone expecting something very profound.

However, the read technology covered was essentially Moleculo's technology, which Illumina acquired last month. Unfortunately, the talk contained less information than this interview with Mickey Kertesz.

The approach basically breaks up long stretches of DNA into smaller, barcoded fragments that can be sequenced using Illumina's current technologies.  These fragments are reassembled computationally to regenerate the original sequence.
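
To make the idea concrete, here is a toy sketch in Python of what "barcode, sequence, reassemble" could look like. This is not Moleculo's or Illumina's actual pipeline, which the talk did not detail. The barcode labels, the function names, and the simple greedy overlap merge are all my own assumptions for illustration, and the sketch ignores sequencing errors, reverse-complement reads, and repeats.

```python
from collections import defaultdict


def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Toy greedy assembler: repeatedly merge the best-overlapping pair."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    # Drop reads that are fully contained in another read.
    contigs = [r for r in contigs
               if not any(r != o and r in o for o in contigs)]
    while len(contigs) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # nothing left to merge; return separate contigs
        merged = contigs[best_i] + contigs[best_j][best_n:]
        contigs = [c for k, c in enumerate(contigs)
                   if k not in (best_i, best_j)] + [merged]
    return contigs


def assemble_by_barcode(tagged_reads, min_len=3):
    """Group (barcode, sequence) pairs by barcode, then assemble each group."""
    groups = defaultdict(list)
    for barcode, seq in tagged_reads:
        groups[barcode].append(seq)
    return {bc: greedy_assemble(seqs, min_len) for bc, seqs in groups.items()}


# Hypothetical short reads from two long fragments, interleaved as they
# might come off the sequencer. The barcodes keep the fragments separate.
tagged = [
    ("BC1", "ATGCGTA"),
    ("BC2", "GGGTTTAAAC"),
    ("BC1", "CGTACCT"),
    ("BC2", "TAAACCCGG"),
    ("BC1", "ACCTGA"),
]
result = assemble_by_barcode(tagged)
print(result)
# {'BC1': ['ATGCGTACCTGA'], 'BC2': ['GGGTTTAAACCCGG']}
```

The point of the barcode is that each small assembly only has to deal with reads from one long fragment, which is a much easier problem than assembling the whole genome at once. It also shows why the result is better described as an assembly than a read: if the overlaps are ambiguous, as in a repeat, the merge step can go wrong.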

We did see some data suggesting 10 kilobase reads as a maximum but the median size of long reads was shown as 8 kilobases.  Smith also showed an example of a 4 kilobase read with several phased heterozygous SNPs, which will simplify a lot of questions involving inheritance of mutations.

This approach may be clever, but it's really not a "10 kilobase read" and shouldn't be presented as such.  In fact, several people I spoke with were slightly let down by the talk.  What I think they're offering is an easy way to generate "10 kilobase assemblies," and some honest data is needed to indicate how often the technology fails, and especially the contexts in which it is or isn't appropriate (repeat regions, tandem duplications in cancer genomic DNA).

All in all, the kit and technology are progress and do add value to Illumina's product line, but the talk made clear that even a term as simple as "read" means different things to different people.