Nanopore Sequencing – a New D(N)Awn, a New D(N)Ay?
In 2000, then-President Bill Clinton held a press conference to unveil the recently sequenced human genome. After 10 years of blood, sweat and pipette tips, the project to sequence, assemble and annotate 3.2 Gb of DNA was finished. Well, not really finished, but it was a pretty good draft, and many iterations of the human genome have been published since that day. At a cost of $3B it was not cheap. The project was sold with many promises, most of which have not materialized as advertised, but then again, when is an advertisement accurate?
Researchers didn’t just focus on analyzing every base pair of our genome. Some were working on developing sequencing techniques that were faster and cheaper than Sanger sequencing. These efforts soon gave rise to Solexa (later bought by Illumina), 454 (later bought by Roche, who are now trying to buy Illumina), and SOLiD (Applied Biosystems). Today, Illumina is the standard for de novo whole genome sequencing, whereas SOLiD sequencing has not taken off as hoped. Around the time the Human Genome Project started, people were already theorizing about sequencing in a radically different way: without enzymes or PCR steps, but with semiconductors and nanopores.
Although Oxford Nanopore Technologies’ announcement at the AGBT 2012 meeting lit up the blogosphere last month, Pacific Biosciences had gone public a year earlier with the first single-molecule real-time sequencing machine. That machine is on the larger side (it needs its own room), the current protocol requires a fair amount of starting material, and it takes a long day of preparation to go from your starting DNA to loading a PacBio cell into the sequencer. At AGBT 2012, Oxford Nanopore Technologies introduced the first USB-sized sequencer, which doesn’t need any sample preparation. The promise of a $900 USB-sized sequencer easily overshadows the experimental results that were not shown. The biggest limitation is that it can only sequence up to 900 Mb per sequencer, or roughly 0.3X coverage of a 3.2 Gb human genome. The USB-sized sequencer, called MinION (see picture), and the larger, DVD-player-sized GridION are expected to hit the shelves in the second half of this year. Does this mean that there are no competitors on the market? There are. Shortly after Oxford Nanopore Technologies’ announcement, Genia issued a press release claiming that they will hit the market in 2013 and that their technology will allow people to sequence their genome for less than $100. Of course, a sequenced genome does not mean that you know what you have in your hands, as you still have to analyze it.
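To put the 900 Mb figure in perspective, here is a back-of-the-envelope sketch of the coverage arithmetic. It uses the 3.2 Gb genome size quoted earlier and the 900 Mb per-device yield from the announcement. The function names are made up for illustration, and the 30X target is simply an assumed goal, not a number from either company.

```python
import math

HUMAN_GENOME_BP = 3.2e9   # genome size quoted earlier in this post
MINION_YIELD_BP = 900e6   # maximum yield per device, as announced


def coverage(yield_bp: float, genome_bp: float) -> float:
    """Average number of times each base of the genome is read."""
    return yield_bp / genome_bp


def sequencers_needed(target_coverage: float, yield_bp: float, genome_bp: float) -> int:
    """Devices required to reach a target coverage (assumed goal, e.g. 30X)."""
    return math.ceil(target_coverage * genome_bp / yield_bp)


if __name__ == "__main__":
    print(f"{coverage(MINION_YIELD_BP, HUMAN_GENOME_BP):.2f}X per device")  # 0.28X per device
    print(sequencers_needed(30, MINION_YIELD_BP, HUMAN_GENOME_BP), "devices for 30X")  # 107 devices for 30X
```

In other words, a single device gives a little under 0.3X of a human genome, which is why the small sequencer looks better suited to small genomes or targeted questions than to whole human genomes.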
How does nanopore sequencing work? It uses an array of mutant transmembrane proteins that form the nanopores, which are embedded in a polymer membrane. A DNA strand is ratcheted through each nanopore. A sensor array chip consists of microwells, and each microwell has its own electrode. Application-specific integrated circuits, or ASICs, apply a potential across each nanopore and measure the ionic current flow (see figure). Each base creates a characteristic disturbance of the current flow, and the sequence is determined by following these disturbances. The sensor chip is contained in a cartridge that holds all the reagents needed for an experiment.
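The idea of reading a sequence from current disturbances can be illustrated with a deliberately naive toy. The current levels below are invented numbers, not measurements, and the one-level-per-base model is an assumption made for clarity. A real pore most likely senses several neighboring bases at once, and the signal is noisy, so actual base calling is far more involved than this.

```python
# Hypothetical mean current level (in picoamps) for each base.
# These values are made up purely for illustration.
LEVELS_PA = {"A": 50.0, "C": 40.0, "G": 60.0, "T": 30.0}


def call_base(current_pa: float) -> str:
    """Return the base whose reference level is closest to the measured current."""
    return min(LEVELS_PA, key=lambda base: abs(LEVELS_PA[base] - current_pa))


def call_sequence(events: list[float]) -> str:
    """Translate a series of per-base current measurements into a DNA string."""
    return "".join(call_base(current) for current in events)


if __name__ == "__main__":
    # One (noisy) measurement per base that passed through the pore.
    measured = [51.2, 38.9, 61.5, 29.4, 49.0]
    print(call_sequence(measured))  # ACGTA
```

The closer two levels are to each other, the more easily noise turns one base into another, which is one intuitive way to think about where a per-read error rate comes from.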
This does not mean that sequencing techniques such as Sanger or Illumina will go extinct. If you want to quickly check whether your favorite plasmid is what you think it is, a simple Sanger sequencing run will do. If you want to know where a certain protein binds in the genome, a simple ChIP-chip or ChIP-seq will do. Illumina, 454 and Sanger sequencing will be forced into their own niches, just as microarrays were after Illumina and 454 came on the market.
The future as depicted in the sci-fi movie Gattaca looks to be on the horizon. Assuming that Oxford Nanopore Technologies and Genia will be as successful as they advertise, how will (or can) they change the world as we know it?
Whole Genome Sequencing
The biggest hurdle in dealing with Illumina and 454 sequences is assembling the millions, if not billions, of short reads into usable contigs. The human genome consists of 23 contigs, better known as chromosomes, and each of us has two copies of each chromosome (except for the sex chromosomes in males). Most contigs produced from Illumina and 454 data, as well as from Sanger sequencing, are on the order of Mbs and don’t cover an entire chromosome. To improve the assembly of short reads, an assembly competition called the Assemblathon was created. Such competitions will be rendered useless if nanopore sequencing can produce single DNA reads that are longer than most contigs produced by assembling Sanger, Illumina or 454 data. A 98% accuracy would not be a major hurdle, as 20-30X coverage is already required for assembly of Illumina data. Re-sequencing all (eukaryotic) species that have been sequenced to date might not be a bad idea, as it would greatly increase contig sizes. Sequencing through regions of our genome that cannot currently be assembled, such as the centromeres, would be trivial. Finding duplicated or inverted regions at base pair resolution would be easy as well.
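Why would 98% accuracy not be a major hurdle at 20-30X coverage? A rough sketch: if each read gets a given position wrong 2% of the time, and errors in different reads are independent (a strong assumption, since systematic errors would not average out this way), then a majority vote across reads is only wrong when more than half of them are wrong at the same position. The function below is an illustrative upper-bound calculation under that assumption, not a description of how any real assembler builds its consensus.

```python
from math import comb


def consensus_error(per_read_error: float, depth: int) -> float:
    """Probability that more than half of `depth` independent reads are wrong
    at a position, i.e. a pessimistic bound on majority-vote consensus error."""
    needed = depth // 2 + 1  # smallest number of wrong reads that forms a majority
    return sum(
        comb(depth, k) * per_read_error**k * (1 - per_read_error) ** (depth - k)
        for k in range(needed, depth + 1)
    )


if __name__ == "__main__":
    for depth in (1, 3, 10, 20, 30):
        print(f"{depth:>2}X coverage: consensus error <= {consensus_error(0.02, depth):.3g}")
```

At 1X the error is simply the 2% per-read error, at 3X it already drops to roughly 0.1%, and at 20X it is vanishingly small under these assumptions. The catch is the independence assumption: errors that always occur in the same sequence context are not removed by extra coverage.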
Allele Genetics / Population Genetics
The field of population genetics will change, as it would be possible to sequence each allele individually (you have two copies of each chromosome, one from your father and one from your mother) and trace its origin within a population. Getting a recombination map at base pair resolution and an accurate mutation rate would also be straightforward. A nice feature of Oxford Nanopore’s USB sequencer is that you can sequence your samples on location, so there is no need to prepare samples for shipment anymore.
A lot has been said about personalized medicine and about companies such as 23andMe that push for this development. Imagine you could go to the doctor’s office and, one hour after a biopsy, your MD knows the specifics of your tumor compared to your healthy tissue. Nanopore sequencing would make this possible. Fetal genetic testing would also become more genome based, but this would mean that invasive procedures would become more prevalent, as DNA from the fetus in the uterus would be required. Sequencing your baby’s genome at birth would allow for a quick scan of potential genetic predispositions for certain clinical traits. Although a lot of ethical concerns have been raised, little hard evidence exists to support the hypothesis that people will have problems dealing with personalized genetic information. Keep in mind that everyone (or just about everyone) already knows a fair bit about their genetic risk for certain diseases. If heart disease runs in the family, there is an increased chance that you might get it as well. Selection based on genetic information happens already. Why would you be sexually attracted to one person and not to another? Overall, genetics is already a major part of our lives, whether we know the actual DNA sequence responsible for it or not. Nanopore sequencing would make it possible to have your own genome sequence.
Forensic DNA testing will be dramatically different as well. If you can bring your sequencer with you on the road, put your sample directly on the chip without much processing and plug the sequencer into a USB port on your laptop, it would be possible to have a more accurate DNA picture (not just a fingerprint) of a suspect within 15-30 minutes. Just imagine that a blood speck is found at a murder scene that might be from a suspect. Rather than having to collect the sample, transport it to a forensic lab, process it for forensic DNA testing and print out a report several days later, you can now get a genomic scan of your suspect at the scene within an hour. For this, the advancement of population genetics is pivotal: cataloging DNA markers (SNPs, duplications, etc.) will be critical, both for geographical correlations and for morphological trait correlations. Of course, reality is going to be much more complex, but the prospects are glowing hot. Depending on how much starting material you need for a sequencing run, you could accurately discriminate between multiple donors in mixed biological samples. At the moment this is a hotly debated topic in forensic genetics, as different labs use different statistical analyses and subsequently come to different conclusions. If it were possible to sequence each cell individually, in other words, if the starting material only had to be a single cell containing a genome (human red blood cells would not be very useful), discriminating every donor would be trivial. If the biological sample is human blood, the investigator should even be able to discriminate a B cell genome from a T cell genome, based on the unique recombination events of either the immunoglobulin genes (B cell specific) or the T cell receptor genes (T cell specific). Of course, private investigators will see many uses for this new technology as well.
Insurance companies could see great advantages for their profits if they can very easily obtain your genome and estimate the likelihood of you getting a certain disease, forcing you to pay a much higher premium or denying you healthcare altogether. Linking your genomic information to other types of insurance (reduced impulse control for drivers, for instance) could save them a lot of money (or, more accurately, make them a lot more money). Regulations could be considered that limit or even prohibit companies or any other third party from requesting a DNA sequencing test (including prohibiting them from buying such machines), as well as a tight limit on how many DNA sequencing tests can be ordered per time period per address. Then again, I am not a legal expert by any measure. The consequences of easy personalized genome sequencing need to be addressed at the legal level.
Security by DNA sequence. If you want to work in a high-security environment, requiring a genome test to be stored in the system would create a highly personalized level of scrutiny. If your genome test at the gate doesn’t match the existing record, you are not given access. For large-scale security applications this would be too slow, but if speed increases it could even be applied to, for instance, airport security. Only people banned from flying would be in the system, and if you don’t match you are free to continue your journey. There is a potential problem, though: it is imaginable that family members are affected as well if, by random chance, only one set of alleles is sequenced. For this to work, you would need several allele markers on different chromosomes to get an accurate map. Experience with such systems is obviously essential. In short, because sequencing can be tracked in real time, you could continue sequencing for as long as needed until you have a match or a non-match, depending on what you are matching to (inclusion or exclusion).
Being able to sequence pathogens at the scene would allow for accurate identification and tracking. For this to become more efficient, it would greatly help to get rid of human DNA. By only focussing on the non-human DNA present in a sample, this should be possible. In other words, only record non-human DNA sequences. A good pan-human genome would be essential in this case. Of course, this feature can also be used in an academic environment, but its use for public health could be immense.
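Recording only non-human DNA could, in its simplest form, look something like the toy filter below. It drops any read that shares most of its k-mers (short substrings of length k) with a human reference. This is a minimal sketch on made-up sequences. The function names, the k-mer length and the threshold are all assumptions chosen for illustration. A real pipeline would rely on a proper alignment tool and on the pan-human genome mentioned above.

```python
def kmers(seq: str, k: int) -> set[str]:
    """All substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def human_fraction(read: str, human_index: set[str], k: int) -> float:
    """Fraction of the read's k-mers that also occur in the human reference."""
    read_kmers = kmers(read, k)
    if not read_kmers:  # read shorter than k: nothing to compare
        return 0.0
    return len(read_kmers & human_index) / len(read_kmers)


def keep_non_human(reads: list[str], human_index: set[str],
                   k: int = 8, threshold: float = 0.5) -> list[str]:
    """Keep only reads that look mostly non-human (threshold is an assumed cutoff)."""
    return [r for r in reads if human_fraction(r, human_index, k) < threshold]


if __name__ == "__main__":
    # Toy stand-in for a human reference; a real one is billions of bases long.
    human_reference = "ACGTACGTTAGCCGATAGGCTTAACG"
    index = kmers(human_reference, 8)

    reads = [
        human_reference[4:20],     # a read taken from the "human" sequence
        "TTTTTTTTGGGGGGGGCCCC",    # a read from something else entirely
    ]
    print(keep_non_human(reads, index))  # ['TTTTTTTTGGGGGGGGCCCC']
```

Because nanopore data can be followed in real time, a filter of this kind could in principle run while the sample is still being sequenced, although whether that is practical depends on details that have not been shown yet.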
Many more applications will most likely be thought of, and some of them will be far out there. One such application could be an internet company that offers to match individuals based on their genetic diversity; in other words, how to create the most genetically diverse child possible if a relationship works out. Whether this will happen remains to be seen, but it is certainly an option.
What do nanopore companies have to do to jump from the DNA era into the genome era?
- show you can sequence a genome anywhere, anytime, easily and reliably without much if any sample preparation
- show how much starting material is needed (one cell would be ideal of course)
- work out how to sequence more from less (increase the sequencing capacity of the sequencers)
- provide an easy-to-use interface, both hardware-wise and software-wise (new programs have to be written)
- promote open access and open science to facilitate collaboration with and between scientists
- have the capacity to produce high quantity and high quality nanopore sequencers soon
- find a way to recycle large quantities of used nanosequencers