Breaking Down The Genome
A billion-dollar budget, a decade of work, and multinational collaboration brought the first human genome sequence to the world at the turn of the century through the Human Genome Project. Today, using so-called “next-generation sequencing” (NGS), an individual human genome sequence costs a mere $1,000 and takes 24 hours. Even greater savings in cost and time are expected in the next few years.
Full genome sequencing is revolutionizing medicine. Current applications are focused on understanding the genetic basis of cancers, and using that information to develop better diagnostics and therapies to target genetic subtypes. As the technology matures, we can expect to see frequent applications in additional therapeutic areas—even its use as a routine point-of-care test.
In this WEEKLY, we’ll break down NGS technology and examine what is up next in the pipeline.
One DNA Fragment At A Time
Today’s most successful NGS platforms work by taking many copies of an individual’s genome—usually 30—and breaking them up into overlapping fragments. Fragmentation is accomplished by mechanical force or through the use of sonication (sound energy).
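To see why starting from many copies produces overlapping fragments, here is a minimal Python sketch. It is a toy model, not how a lab instrument works: the short genome string, the fragment length range, and the function name fragment_copies are all assumptions made for illustration. Each copy is cut at different random positions, so fragments from different copies overlap one another.

```python
import random

def fragment_copies(genome, copies=30, min_len=4, max_len=8, seed=0):
    """Cut each copy of the genome at random positions and pool the fragments.

    All parameters are illustrative; real fragments are hundreds of bases long.
    """
    rng = random.Random(seed)
    fragments = []
    for _ in range(copies):
        start = 0
        while start < len(genome):
            length = rng.randint(min_len, max_len)
            # The final piece of a copy may be shorter than min_len.
            fragments.append(genome[start:start + length])
            start += length
    return fragments

genome = "ATGCGTACGTTAGCCGATAGCTAAGGCT"  # made-up toy sequence
fragments = fragment_copies(genome)

# Every base of every copy ends up in exactly one fragment,
# so the average number of times each position is covered equals `copies`.
coverage = sum(len(f) for f in fragments) / len(genome)
print(len(fragments), "fragments, average coverage:", coverage)
```

Because every copy is cut differently, a position that falls at the edge of one fragment sits in the middle of another, which is what later makes it possible to stitch the fragments back together.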
Next, these individual fragments are attached to a solid support, such as a microscale bead, and amplified (copied) many times. By the end of this amplification step, multiple single-stranded copies of a particular DNA fragment are attached to each bead.
The sequencing machine is now ready to perform “massively parallel DNA sequencing.” The order of nucleotides—Adenine (A), Cytosine (C), Guanine (G), and Thymine (T)—that make up each fragment will be determined many times over. Once the individual fragments are sequenced, the sequencing machine determines the regions of fragment overlap, and from that information it pieces together the entire genome sequence.
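The idea of piecing a sequence together from overlaps can be sketched in a few lines of Python. This is a simplified greedy approach chosen for illustration, not the algorithm any particular sequencing platform uses; the function names, the minimum overlap of 3 bases, and the three toy fragments are assumptions.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b, or 0."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the pair of fragments with the largest overlap."""
    reads = list(dict.fromkeys(fragments))  # drop exact duplicates
    # Drop fragments that are fully contained in another fragment.
    reads = [r for r in reads if not any(r != o and r in o for o in reads)]
    while len(reads) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # no remaining overlaps; leave the pieces separate
        merged = reads[best_i] + reads[best_j][best_n:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads

# Three overlapping toy fragments of the sequence ATGCGTACGTTAGC
print(greedy_assemble(["ATGCGTA", "CGTACGTT", "CGTTAGC"]))
```

Real genomes contain long repeated stretches that can mislead a simple greedy merge like this one, which is one reason production assembly software is far more elaborate.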
Hundreds of millions of DNA fragments are sequenced simultaneously during NGS. The data from all of these fragments are processed to provide the final readout.
Sequencing By Synthesis: Hi-Seq
“Sequencing by synthesis” is a term applied to the most common methods of NGS, where the readout is established in an indirect fashion. Essentially, the sequence of each single-stranded DNA fragment is determined by using it as a template to produce a second strand of DNA and detecting which nucleotide base (A, C, G, or T) is incorporated at each position. Since we know that A always pairs with T, and C always pairs with G, by recording the order of the bases in the newly synthesized strand we have also determined the sequence of the template strand.
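The pairing rule translates directly into code. The short Python sketch below recovers a template from the bases recorded in the new strand; the function names are assumptions for illustration. One detail not covered above is that the two strands of DNA run in opposite directions, so when both are written in the conventional 5' to 3' direction, the template is the reverse complement of the new strand.

```python
# Watson-Crick base pairing: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand):
    """Base paired with each position, in the same left-to-right order."""
    return "".join(COMPLEMENT[base] for base in strand)

def template_from_synthesized(new_strand):
    """Template strand written 5' to 3', given the new strand written 5' to 3'."""
    return complement(new_strand[::-1])

new_strand = "ATGC"  # bases recorded as they were incorporated
print(complement(new_strand))                 # position-by-position partners
print(template_from_synthesized(new_strand))  # template in its own 5' to 3' order
```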
Illumina’s (San Diego, CA) Hi-Seq machine uses a sequencing by synthesis method known as reversible dye-terminator sequencing. Simply put, Hi-Seq uses color coding to get a readout. For each round of sequencing, all four nucleotides are washed over the template strand. Each nucleotide carries its own fluorescent tag: for example, each A may be fluorescent green, each C fluorescent yellow, and so on. When the enzyme that is synthesizing the new strand of DNA adds a nucleotide to the growing strand, the machine detects which color was incorporated and correlates it to a specific base.

Illustration: a color-coded tagging system is used to determine the final readout.
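The color-coding idea can be sketched in Python as a lookup from the brightest color in each sequencing round to a base. This is a toy illustration only: the green and yellow assignments follow the example in the text, while the blue and red assignments, the intensity values, and the function names are assumptions and do not describe Illumina's actual dyes or software.

```python
# Hypothetical dye colors; only green=A and yellow=C come from the example above.
CHANNEL_TO_BASE = {"green": "A", "yellow": "C", "blue": "G", "red": "T"}

def call_base(intensities):
    """Pick the base whose dye channel is brightest in one sequencing round."""
    brightest = max(intensities, key=intensities.get)
    return CHANNEL_TO_BASE[brightest]

def call_read(cycles):
    """Call one base per round and join the calls into a read."""
    return "".join(call_base(cycle) for cycle in cycles)

# Made-up fluorescence readings for four rounds of sequencing.
cycles = [
    {"green": 0.91, "yellow": 0.05, "blue": 0.02, "red": 0.02},
    {"green": 0.03, "yellow": 0.04, "blue": 0.88, "red": 0.05},
    {"green": 0.02, "yellow": 0.90, "blue": 0.03, "red": 0.05},
    {"green": 0.04, "yellow": 0.02, "blue": 0.06, "red": 0.88},
]
print(call_read(cycles))  # AGCT
```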
Sequencing By Synthesis: Ion Torrent
Another leading sequencing by synthesis platform is Thermo Fisher’s (Carlsbad, CA) Ion Torrent system. The name “Ion Torrent” signifies the system’s reliance on measuring the release of a hydrogen ion, which occurs whenever DNA polymerase (the enzyme that makes the DNA strand) adds a new nucleotide to a growing strand of DNA. This hydrogen ion release is detected as a slight lowering of pH.
How does this work in practice? To conduct Ion Torrent sequencing, whole genomes are fragmented and attached to microscale beads as described above. However, these beads are placed in wells that contain pH-sensitive microchips. Each nucleotide (A, C, G, and T) is washed over the well in turn. If a given nucleotide is incorporated, a hydrogen ion is released, and the microchip records a change in pH. Whichever nucleotide triggered a pH change is recorded as the next base in the sequence.
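The wash-and-detect cycle described above can be simulated in a few lines of Python. This is a toy model under stated assumptions: the flow order "TACG", the function names, and the example template are hypothetical, and the template is written in the order the polymerase reads it. The sketch also assumes that when the same base appears several times in a row, all of those nucleotides are added in a single wash and the signal is proportionally larger.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def simulate_flows(template, flow_order="TACG", max_flows=100):
    """Wash one nucleotide at a time over the template.

    Returns a list of (nucleotide, count) pairs, where count stands in for
    the size of the pH change (0 means no incorporation, so no signal).
    """
    signals = []
    pos = 0
    flow = 0
    while pos < len(template) and flow < max_flows:
        nucleotide = flow_order[flow % len(flow_order)]
        count = 0
        # The polymerase keeps adding this nucleotide while it pairs with the template.
        while pos < len(template) and COMPLEMENT[template[pos]] == nucleotide:
            count += 1
            pos += 1
        signals.append((nucleotide, count))
        flow += 1
    return signals

def call_from_signals(signals):
    """Turn the recorded signals back into the synthesized sequence."""
    return "".join(nucleotide * count for nucleotide, count in signals)

signals = simulate_flows("ATTGC")
print(signals)
print(call_from_signals(signals))  # TAACG
```

In this run the second wash (A) pairs with two T's in a row on the template, so it is recorded with a count of 2 rather than as a single base.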
In The Pipeline: 3rd-Generation
The advances made in whole genome sequencing have been truly impressive. However, 3rd-generation sequencing is poised to revolutionize the field yet again, with faster and less expensive sequencing on even smaller machines.
Sometimes referred to as “single molecule sequencing” because they read individual DNA molecules rather than amplified copies, 3rd-generation technologies work by directly “reading” the bases. One example is Oxford Nanopore’s (Oxford, UK) platform, which passes a single strand of DNA through a nanopore (or nano-scale hole) in a polymer membrane. The DNA strand can pass through the pore only one nucleotide at a time, while an electric current is passed through the membrane. Each nucleotide disrupts the current in a characteristic way as it moves through the pore. This distinct disruption is noted by a detector, and the corresponding base is recorded.
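As a rough illustration of matching a current disruption to a base, the Python sketch below assigns each measured current to the nearest of four reference levels. The reference levels, the trace values, and the function names are invented for this example; real nanopore signals are noisier and are interpreted with far more sophisticated models.

```python
# Hypothetical current level (arbitrary units) associated with each base.
LEVELS = {"A": 50.0, "C": 65.0, "G": 80.0, "T": 95.0}

def call_base(current):
    """Return the base whose reference level is closest to the measured current."""
    return min(LEVELS, key=lambda base: abs(LEVELS[base] - current))

def call_read(trace):
    """Call one base per current measurement."""
    return "".join(call_base(current) for current in trace)

trace = [51.2, 93.8, 79.1, 66.0, 49.5]  # made-up measurements, one per nucleotide
print(call_read(trace))  # ATGCA
```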
3rd-generation sequencing is not yet commercially available, although Oxford Nanopore is conducting pilot testing. Anticipated advantages of this technology include faster read times, less expensive whole genome sequencing, greater accuracy, and machines portable enough to make point-of-care sequencing a realistic possibility. Other companies working on 3rd-generation sequencing platforms include Pacific Biosciences (Menlo Park, CA) and Genapsys (Redwood City, CA).
The evolution of genome sequencing isn’t about to slow down any time soon. Our healthcare system is only as good as our technologies, and most believe that as our DNA sequencing technology gets faster, stronger, and smarter, so does our ability to diagnose and treat disease.