Attention readers: This blog has moved to a new home at https://chenghlee.wordpress.com/.

Saturday, February 18, 2012

Oxford Nanopore sequencer announcements

Oxford Nanopore Technologies (ONT) announced the release of their GridION and MinION systems yesterday, both of which use their nanopore sequencing technology. Luke Jostins, Nick Loman, and Keith Robison have all provided detailed commentary about this release, so I'll only offer a few of my thoughts here.

First off, the current 4% error rate (mostly in the form of deletions) isn't great, but that's almost certainly fixable prior to the production release; we should remember that the Solexa and SOLiD systems probably weren't doing much better when they were first announced either.

But most everything else about these systems looks really promising: e.g., true single molecule sequencing (no amplification needed) with reads in the tens of kilobases (kb) and promises of single reads into the 100-kb range (limited, of course, by the fragment sizes in your sample). Features like these would mean that several currently difficult informatics challenges—such as read mapping, de novo assembly, and phasing/haplotyping (if long single strand reads are achievable)—will be "solved" or at least highly simplified.

So yes, assuming that overall costs per base sequenced (including amortized capital and reagent costs) are highly competitive with the new Illumina and Ion Torrent systems, I'm willing to concede that the ONT systems could be significant "game changers". If nothing else, the types and quantity of data that such systems would make available will keep informaticists like me busy (read "employed") for a long time to come.

1 comment:

Cheng H. Lee said...

Going back to something I mentioned in the post: the error rate is still a little concerning. I'm sure ONT will be able to bring it into the ≤1% range, but from what I gather, most of this fix will be in the form of tweaks to the base calling algorithm and not chemistry changes. Thus, I'm worried that error rates might climb back up if you sequenced something with significantly different content than what was used in the training data.

Additionally, 10-kb+ (and even 100-kb+) reads would be great, but I wonder how consistent error rates will be throughout the reads. A rapid drop in quality in long reads seems to be one of the key issues hindering the release of a competing system (*cough* PacBio *cough*).
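
To put the error rates and read lengths above in perspective, here is a rough back-of-the-envelope sketch. It assumes errors are independent and uniformly distributed along the read, which is exactly the assumption questioned above, so treat the numbers as illustrative only. The function names are mine, not from any ONT tool.

```python
def expected_errors(read_length: int, error_rate: float) -> float:
    """Expected number of erroneous bases in one read, assuming a
    uniform, independent per-base error rate (a simplifying assumption)."""
    return read_length * error_rate


def prob_error_free(read_length: int, error_rate: float) -> float:
    """Probability that a read contains no errors at all, under the
    same uniform, independent-error assumption."""
    return (1.0 - error_rate) ** read_length


# Error rates and read lengths mentioned in the post and comment:
# 4% (current), 1% (hoped for); 10 kb and 100 kb reads.
for rate in (0.04, 0.01):
    for length in (10_000, 100_000):
        errors = expected_errors(length, rate)
        print(f"rate={rate:.0%} length={length:>7,} bp "
              f"expected errors ~ {errors:,.0f}")
```

Even at 1%, a 100-kb read would still carry on the order of a thousand errors under these assumptions, so whether those errors cluster toward the end of the read matters a lot for downstream mapping and assembly.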