high_throughput_sequencing_for_genetic_diversity [CSBQ-QCBS Wiki]

Dye-terminator sequencing has long been the main method for producing sequence data, but it is time-consuming and expensive when a massive amount of data needs to be analysed. A revolution in the field of sequencing began at the turn of the 21st century, with the introduction of sequencing-by-synthesis methods ¹⁾ ²⁾, and today (2011) many different platforms are available for high throughput sequencing. What these methods have in common is that they parallelize the sequencing process, typically producing thousands to millions of short sequencing reads at once. The Wikipedia page on DNA sequencing provides a rich historical review of the subject, and many scientific articles describe the differences among the technologies ³⁾ ⁴⁾ ⁵⁾ and compare the expected results ⁶⁾ ⁷⁾. Depending on the technology, high throughput sequencing methods are also called next generation sequencing, second generation sequencing, third generation sequencing or massively parallel sequencing.

High throughput sequencing for genetic diversity

Genetic diversity studies form the basis of many aspects of biodiversity science. High throughput sequencing has the potential to dramatically change how genetic diversity studies are planned and analysed. Still, although the cost per read is very low, the relatively high cost of a single run has prevented many academic laboratories from using these innovative technologies. To overcome this limitation, barcoding systems were developed ⁸⁾, in which a different short oligonucleotide (8 to 10 bp in length) is incorporated into each DNA sample to be sequenced. Once the samples are labelled with their barcodes, they can be pooled (multiplexed) and sequenced together in a single sequencing run. The reads are then sorted back to their sample of origin using bioinformatics methods, by recognition of the barcode. When coupled with laboratory methods for genome complexity reduction, high throughput sequencing can be a very efficient strategy for providing a massive amount of sequence data from different samples, in a short time and at reasonable cost.
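The sorting step described above can be illustrated with a minimal Python sketch. It is not the pipeline used by any particular platform or by the QCBS. It assumes that each barcode sits at the very start of the read, that barcodes are matched exactly (no sequencing errors tolerated), and that no barcode is a prefix of another. The barcode sequences, sample names and reads are hypothetical and serve only as an example.

```python
from collections import defaultdict


def demultiplex(reads, barcodes):
    """Sort reads into per-sample bins using their leading barcode.

    reads    -- iterable of read sequences (strings)
    barcodes -- dict mapping barcode sequence to sample name (hypothetical values)

    Returns a dict mapping each sample name to the list of its reads, with the
    barcode trimmed off. Reads that start with no known barcode are collected
    under the key "unassigned".
    """
    bins = defaultdict(list)
    for read in reads:
        for barcode, sample in barcodes.items():
            # Exact match at the start of the read (assumption: no mismatches)
            if read.startswith(barcode):
                bins[sample].append(read[len(barcode):])
                break
        else:
            # The inner loop ended without a match
            bins["unassigned"].append(read)
    return dict(bins)


# Hypothetical 8 bp barcodes and reads, for illustration only
barcodes = {"ACGTACGT": "sample_A", "TGCATGCA": "sample_B"}
reads = [
    "ACGTACGTTTGACC",
    "TGCATGCAGGATCC",
    "ACGTACGTAAGCTT",
    "GGGGGGGGCCATGG",
]

bins = demultiplex(reads, barcodes)
for sample, sample_reads in bins.items():
    print(sample, sample_reads)
```

In practice, barcode sets are usually designed so that barcodes differ from one another at several positions, which allows a read with a sequencing error in its barcode to be detected or discarded rather than assigned to the wrong sample. The exact matching used here is the simplest possible choice.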

High throughput sequencing at the QCBS

A few QCBS members have used one of the high throughput sequencing methods currently available (as of 2011). One example, combining an AFLP-like step with a pyrosequencing step on a Genome Sequencer FLX (GS-FLX) System, is detailed on the page High throughput sequencing at the QCBS of this wiki.

¹⁾

Brenner, S. et al. (2000). Gene expression analysis by massively parallel signature sequencing (MPSS) on microbead arrays. Nature Biotechnology 18, 630-634.

²⁾

Margulies, M. et al. (2005). Genome Sequencing in Open Microfabricated High Density Picoliter Reactors. Nature 437, 376-380.

³⁾

Myllykangas, S., Buenrostro, J., and Ji, H. P. (2012). Overview of Sequencing Technology Platforms. In Bioinformatics for High Throughput Sequencing, N. Rodríguez-Ezpeleta, M. Hackenberg, A. M. Aransay, eds. (New York, NY: Springer New York), pp. 11-25.

⁴⁾

Glenn, T. C. (2011). Field guide to next‐generation DNA sequencers. Molecular Ecology Resources 11, 759-769.

⁵⁾

Morozova, O., Hirst, M., and Marra, M. A. (2009). Applications of New Sequencing Technologies for Transcriptome Analysis. Annual Review of Genomics and Human Genetics 10, 135-151.

⁶⁾

Dames, S., Durtschi, J., Geiersbach, K., Stephens, J., and Voelkerding, K. V. (2010). Comparison of the Illumina Genome Analyzer and Roche 454 GS FLX for resequencing of hypertrophic cardiomyopathy-associated genes. Journal of Biomolecular Techniques 21, 73-80.

⁷⁾

Wall, P. K. et al. (2009). Comparison of next generation sequencing technologies for transcriptome characterization. BMC Genomics 10, 347.

⁸⁾

Binladen, J., Gilbert, M. T. P., Bollback, J. P., Panitz, F., Bendixen, C., Nielsen, R., and Willerslev, E. (2007). The Use of Coded PCR Primers Enables High-Throughput Sequencing of Multiple Homolog Amplification Products by 454 Parallel Sequencing. PLoS ONE 2, e197.