Polony sequencing

Polony sequencing is an inexpensive but highly accurate multiplex sequencing technique that can be used to “read” millions of immobilized DNA sequences in parallel. The technique was first developed by Dr. George Church's group at Harvard Medical School. Unlike other sequencing techniques, polony sequencing is an open platform with freely downloadable, open-source software and protocols. The hardware can be set up easily from a commonly available epifluorescence microscope and a computer-controlled flow cell/fluidics system. Polony sequencing is generally performed on a paired end-tag library in which each DNA template molecule is 135 bp long and carries two 17–18 bp paired genomic tags separated and flanked by common sequences. The current read length is 26 bases per amplicon, or 13 bases per tag, leaving a gap of 4–5 bases in each tag.

Workflow
The polony sequencing protocol can be broken into three main parts: paired end-tag library construction, template amplification and DNA sequencing.

Paired end-tag library construction

The protocol begins by randomly shearing the genomic DNA under test into fragments with a tight size distribution. The sheared DNA molecules are then end-repaired and A-tailed. End repair converts any damaged or incompatible protruding ends to 5’-phosphorylated, blunt ends, enabling immediate blunt-end ligation, while A-tailing adds an A to the 3’ end of the sheared DNA. DNA molecules about 1 kb long are selected on a 6% TBE PAGE gel. In the next step, the DNA molecules are circularized with a T-tailed, 30 bp synthetic oligonucleotide (T30) that contains two outward-facing MmeI recognition sites, and the resulting circularized DNA undergoes rolling circle replication. The amplified circular DNA is then digested with MmeI, a type IIs restriction endonuclease that cuts at a distance from its recognition site, releasing the T30 fragment flanked by 17–18 bp tags (≈70 bp in total). The paired-tag molecules must be end-repaired before the ePCR (emulsion PCR) primer oligonucleotides (FDV2 and RDV2) are ligated to both ends. The resulting 135 bp library molecules are size-selected and nick-translated. Finally, the 135 bp paired end-tag library molecules are amplified by PCR, which increases the amount of library material and eliminates extraneous ligation products in a single step. The resulting DNA template consists of a 44 bp FDV sequence, a 17–18 bp proximal tag, the T30 sequence, a 17–18 bp distal tag and a 25 bp RDV sequence.
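
The segment lengths quoted for the finished library molecule can be checked with a little arithmetic. The following is a minimal sketch only: the lengths come from the text, while the names `FIXED_SEGMENTS` and `library_length` are hypothetical and chosen for illustration.

```python
# Fixed segment lengths in bp, as stated in the text.
FIXED_SEGMENTS = {"FDV": 44, "T30": 30, "RDV": 25}


def library_length(proximal_tag: int, distal_tag: int) -> int:
    """Total length in bp of one paired end-tag library molecule."""
    for tag in (proximal_tag, distal_tag):
        if tag not in (17, 18):
            raise ValueError("tags are described as 17 or 18 bp long")
    return sum(FIXED_SEGMENTS.values()) + proximal_tag + distal_tag


print(library_length(17, 17))  # 133
print(library_length(18, 18))  # 135
```

With two 18 bp tags the total matches the nominal 135 bp; shorter tags give a molecule one or two bases shorter.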

Template amplification

•	Emulsion PCR: The mono-sized, paramagnetic, streptavidin-coated beads are pre-loaded with dual-biotin forward primer. Streptavidin has a very strong affinity for biotin, so the forward primer binds firmly to the surface of the beads. Next, an aqueous phase is prepared containing the pre-loaded beads, PCR mixture, forward and reverse primers, and the paired end-tag library. This is mixed and vortexed with an oil phase to create the emulsion. Ideally, each aqueous droplet in the emulsion contains one bead and one molecule of template DNA, permitting millions of non-interacting PCR amplifications within a milliliter-scale volume.

•	Emulsion breaking: After amplification, the emulsion is broken using isopropanol and detergent buffer (10 mM Tris pH 7.5, 1 mM EDTA pH 8.0, 100 mM NaCl, 1% (v/v) Triton X‐100, 1% (w/v) SDS), followed by a series of vortexing, centrifugation and magnetic separation steps. The result is a suspension of empty, clonal and non-clonal beads, which arise from emulsion droplets that initially contained zero, one or multiple DNA template molecules, respectively. The amplified beads can be enriched in the following step.
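
The mix of empty, clonal and non-clonal beads can be illustrated with a simple occupancy model. The sketch below assumes that template molecules are distributed among droplets according to a Poisson distribution and that every droplet holds exactly one bead. Neither assumption is stated in the protocol, and the function name `bead_fractions` is hypothetical.

```python
import math


def bead_fractions(mean_templates_per_droplet: float) -> dict:
    """Expected fractions of empty, clonal and non-clonal beads.

    Assumption (not from the protocol): Poisson-distributed template
    loading, with exactly one bead in every droplet.
    """
    lam = mean_templates_per_droplet
    if lam < 0:
        raise ValueError("mean number of templates must be non-negative")
    empty = math.exp(-lam)               # droplets with zero templates
    clonal = lam * math.exp(-lam)        # droplets with exactly one template
    non_clonal = 1.0 - empty - clonal    # droplets with two or more templates
    return {"empty": empty, "clonal": clonal, "non_clonal": non_clonal}


for lam in (0.1, 0.5, 1.0):
    fractions = bead_fractions(lam)
    print(lam, {name: round(value, 3) for name, value in fractions.items()})
```

Under this model, a dilute template keeps nearly all amplified beads clonal but leaves most beads empty, which is one way to see why an enrichment step is useful.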

•	Bead enrichment: Amplified beads are enriched by hybridization to larger, low-density, non-magnetic polystyrene beads that are pre-loaded with biotinylated capture oligonucleotides (DNA sequences complementary to the ePCR amplicon sequence). The mixture is then centrifuged to separate the complexes of amplified and capture beads from the unamplified beads. The amplified bead/capture bead complex has a lower density and therefore remains in the supernatant, while the unamplified beads form a pellet. The supernatant is recovered and treated with NaOH, which breaks the complex. The paramagnetic amplified beads are then separated from the non-magnetic capture beads by magnetic separation. This protocol enriches amplified beads about fivefold.

•	Bead capping: The purpose of bead capping is to attach a “capping” oligonucleotide to the 3’ end of both the unextended forward ePCR primers and the RDV segment of the template DNA. The cap used is an amino group, which prevents fluorescent probes from ligating to these ends and at the same time helps the subsequent coupling of template DNA to the aminosilanated flow cell coverslip.

•	Coverslip arraying: First, the coverslips are washed and aminosilane-treated, which enables the subsequent covalent coupling of template DNA and eliminates any fluorescent contamination. The amplified, enriched beads are mixed with acrylamide and poured into a shallow mold formed by a Teflon-masked microscope slide. The aminosilane-treated coverslip is immediately placed on top of the acrylamide gel, which is allowed to polymerize for 45 minutes. The slide/coverslip stack is then inverted and the microscope slide is removed from the gel. The silane-treated coverslip binds covalently to the gel, while the Teflon on the surface of the microscope slide eases removal of the slide from the gel. The coverslip is then bonded to the flow cell body and any unattached beads are removed.

DNA sequencing

The biochemistry of polony sequencing relies mainly on the discriminatory capacities of ligases and polymerases. First, a series of anchor primers is flowed through the flow cell and hybridizes to the synthetic oligonucleotide sequences immediately 3’ or 5’ of the 17–18 bp proximal or distal genomic DNA tags. Next, the anchor primer is enzymatically ligated to a population of degenerate nonamers labeled with fluorescent dyes.

Differentially labeled nonamers:

5' Cy5‐NNNNNNNNT

5' Cy3‐NNNNNNNNA

5' TexasRed‐NNNNNNNNC

5' 6FAM‐NNNNNNNNG

The fluorophore-tagged nonamers anneal with differential success to the tag sequences, following a strategy similar to that of degenerate primers; instead of being extended by polymerases, however, the nonamers are selectively ligated onto the adjoining DNA, the anchor primer. The fixation of the fluorophore molecule provides a fluorescent signal that indicates whether there is an A, C, G or T at the query position on the genomic DNA tag. After four-colour imaging, the anchor primer/nonamer complexes are stripped off and a new cycle begins by replacing the anchor primer. A new mixture of fluorescently tagged nonamers is introduced, in which the query position is shifted one base further into the genomic DNA tag.

5' Cy5‐NNNNNNNTN

5' Cy3‐NNNNNNNAN

5' TexasRed‐NNNNNNNCN

5' 6FAM‐NNNNNNNGN
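
The two probe pools listed above follow a simple pattern: the dye identifies the base at one fixed position of an otherwise degenerate nonamer. The sketch below reproduces that pattern and adds a deliberately simplistic base call. The dye-to-base assignment is taken from the lists; the 0-based `query_index` convention, the function names and the "brightest channel wins" rule are assumptions made for illustration, not the published base-calling method.

```python
# Dye-to-base assignment as given in the probe lists.
DYE_FOR_BASE = {"T": "Cy5", "A": "Cy3", "C": "TexasRed", "G": "6FAM"}
BASE_FOR_DYE = {dye: base for base, dye in DYE_FOR_BASE.items()}


def nonamer_pool(query_index: int, length: int = 9) -> list:
    """Four labeled degenerate probes for one query position.

    query_index is counted from 0 at the 5' end of the nonamer
    (a hypothetical convention chosen for this sketch).
    """
    if not 0 <= query_index < length:
        raise ValueError("query_index must lie within the nonamer")
    probes = []
    for base, dye in DYE_FOR_BASE.items():
        positions = ["N"] * length
        positions[query_index] = base      # the only non-degenerate position
        sequence = "".join(positions)
        probes.append(f"5' {dye}-{sequence}")
    return probes


def call_base(intensities: dict) -> str:
    """Naive base call: pick the base whose dye channel is brightest."""
    brightest_dye = max(intensities, key=intensities.get)
    return BASE_FOR_DYE[brightest_dye]


print(nonamer_pool(8))  # first pool: query base at the last position
print(nonamer_pool(7))  # second pool: query base shifted by one
print(call_base({"Cy5": 0.1, "Cy3": 0.9, "TexasRed": 0.2, "6FAM": 0.05}))  # A
```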

Seven bases in the 5’ to 3’ direction and six bases from the 3’ end can be queried in this fashion. The result is a read length of 26 bases per amplicon (13 bases from each of the paired tags), with a 4–5 base gap in the middle of each tag.
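
The gap size follows directly from the tag length and the number of positions queried from each side. The sketch below is only an illustration of that arithmetic: the function name `tag_read_layout` and its parameters are hypothetical, and which end of the tag carries the seven-base stretch is an arbitrary choice here.

```python
def tag_read_layout(tag_length: int, n_first: int = 7, n_second: int = 6) -> str:
    """Mark read ('R') and unread ('-') positions along one genomic tag.

    n_first and n_second are the numbers of bases queried from the two
    sides of the tag (7 and 6 in the text).
    """
    gap = tag_length - n_first - n_second
    if gap < 0:
        raise ValueError("tag is shorter than the number of bases read")
    return "R" * n_first + "-" * gap + "R" * n_second


for tag_length in (17, 18):
    layout = tag_read_layout(tag_length)
    print(tag_length, layout, "gap =", layout.count("-"))

bases_per_tag = 7 + 6
print(2 * bases_per_tag)  # 26 bases per amplicon
```

A 17 bp tag leaves 4 unread bases and an 18 bp tag leaves 5, which is the 4–5 base gap mentioned in the text.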

Analysis and software
Polony sequencing generates millions of 26-base reads per run, and this information needs to be normalized and converted to sequence. This can be done with software developed by the Church Lab. All of the software is free and can be downloaded from the lab's website.

Strengths and weaknesses
Polony sequencing provides high-throughput DNA sequencing with high consensus accuracy on a commonly available, inexpensive instrument. It is also a very flexible technique with a variety of applications, including BAC (bacterial artificial chromosome) and bacterial genome resequencing, as well as SAGE (serial analysis of gene expression) tag and barcode sequencing. Furthermore, polony sequencing is promoted as an open system in which everything is shared, including the software, protocols and reagents.

However, although raw data acquisition can reach 786 gigabits, only 1 bit out of every 10,000 bits collected is useful information. Another challenge is the uniformity of the relative amplification of individual targets. Non-uniform amplification can lower sequencing efficiency and is regarded as the biggest obstacle of the technique.
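
To put the ratio of collected to useful data in concrete terms, the figures above can be combined directly. This is a small illustrative calculation that assumes 1 gigabit equals 10^9 bits; the function name `useful_bits` is hypothetical.

```python
def useful_bits(raw_bits: float, collected_per_useful: int = 10_000) -> float:
    """Useful bits when only 1 bit in `collected_per_useful` carries information."""
    return raw_bits / collected_per_useful


raw = 786e9  # 786 gigabits, assuming 1 gigabit = 10**9 bits
print(f"{useful_bits(raw) / 1e6:.1f} megabits of useful information")  # 78.6
```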

Cost
The sequencing instrument can be assembled from a commonly available fluorescence microscope and a computer-controlled flow cell. According to a 2005 estimate, a complete set of instruments costs around US$130,000, although the cost could fall to US$100,000 in the near future. A biotech company, Dover, is collaborating with the Church Laboratory at Harvard Medical School to produce an automated sequencing machine, the Polonator G.007, based on the polony sequencing technique. The current selling price of this machine is around US$170,000. According to the 2005 estimate, each kilobase of raw sequence generated costs about $0.11; omitting the cost of paired end-tag library construction, the cost drops to $0.08 per kilobase.
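
The per-kilobase figures scale linearly with the amount of raw sequence, as the sketch below shows. The two rates are the 2005 estimates quoted above; the function name `raw_sequence_cost` and the example run size are hypothetical and serve only to illustrate the arithmetic.

```python
COST_PER_KB_USD = 0.11             # 2005 estimate, including library construction
COST_PER_KB_NO_LIBRARY_USD = 0.08  # same estimate, omitting library construction


def raw_sequence_cost(kilobases: float, include_library: bool = True) -> float:
    """Estimated cost in US dollars of generating `kilobases` of raw sequence."""
    if kilobases < 0:
        raise ValueError("amount of sequence must be non-negative")
    rate = COST_PER_KB_USD if include_library else COST_PER_KB_NO_LIBRARY_USD
    return kilobases * rate


raw_kb = 1_000_000  # hypothetical run producing one million kilobases of raw sequence
print(f"with library construction:    ${raw_sequence_cost(raw_kb):,.0f}")
print(f"without library construction: ${raw_sequence_cost(raw_kb, include_library=False):,.0f}")
```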

History
Polony sequencing is a “distant relative” of the classical polony technology, which was developed mainly by Dr. Rob Mitra. Together with an MD-PhD student, Jay Shendure, Dr. Mitra worked out ways to sequence polonies in situ using single-base extension, which could achieve reads of 5–6 bases. The existing polony sequencing technology, however, was developed mainly by Jay Shendure and Greg Porreca, who changed almost everything in order to make this multiplex sequencing technology work.

The highly parallel sequencing-by-ligation method of polony sequencing also helped form the basis for ABI SOLiD sequencing and other technologies.