===== AFLP-like step with 454 sequencing for studying population structure =====

In 2011, Simon Joly, researcher at the Jardin Botanique de Montréal, and Annie Archambault, research professional at the QCBS, set up an experiment using a high-throughput (next-generation) sequencing method to study the population structure of ginseng (Panax quinquefolius), a rare plant species in southern Ontario and Quebec. Knowledge of this structure is needed to establish conservation criteria. The protocol used unidirectional amplicon sequencing on the Genome Sequencer FLX (GS-FLX) System with the Titanium chemistry, which was current at the time. Sequencing was performed at the Centre d’Innovation McGill et Génome Québec; the protocol for DNA library preparation is described in the following sections.

==== Methods ====

=== Sampling ===

Ten plants were collected from each of six populations of ginseng (Panax quinquefolius). Leaves were cut and dried in silica gel. Disclosing the precise localities of the populations could harm this rare species; this sensitive information is therefore kept confidential under the Agreement for the Protection and Recovery of Species at Risk between the governments of Canada and Quebec. All populations are at the northern limit of the Appalachians.

=== Molecular biology protocols ===

== DNA extraction ==

An amount of 10 micrograms of dried leaves was ground for one minute in a microcentrifuge tube with one tungsten bead in the TissueLyser (Qiagen). Total DNA was extracted using EZ-10 Spin Column Genomic DNA kits for Plant Samples (BioBasics catalog number BS425-50) as recommended by the manufacturer. The quality and quantity of total DNA were evaluated by gel electrophoresis and by optical density measurement.

== Genome complexity reduction ==

A modified AFLP strategy, inspired by the CRoPS technology 1) (AFLP and CRoPS are registered trademarks of Keygene N.V.) and by a published study 2), was applied to Panax quinquefolius total DNA in order to efficiently discover sequence polymorphism across a wide and random sample of the genome without sequencing the whole genome. One assumption of this AFLP-like method is that restriction sites within the genome are conserved among populations. The steps are as follows:

  • Digestion of a moderate amount of DNA from each sample with two different restriction enzymes: one 4 bp cutter (here MseI, T/TAA) and one 6 bp cutter (here EcoRI, G/AATTC). Neither enzyme produces blunt ends; they leave an overhang of 2 (MseI) or 4 (EcoRI) nucleotides.

Table 1 Reagents for digestion of plant genomic DNA, at 37 °C for 3 hours.

^ Reagent ^ Initial conc. ^ Qty added ^ Final conc. or final qty ^
| Template DNA | 20 ng/µl | 9 µl | 180 ng |
| NEB4 Buffer | 10X | 4 µl | 1X |
| EcoRI | 100,000 U/ml | 0.05 µl | 5 U |
| MseI | 10,000 U/ml | 0.30 µl | 3 U |
| BSA | 10 mg/ml | 0.4 µl | 100 µg/ml |
| H2O | - | 26.25 µl | - |
| Total volume | - | 40 µl | - |
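The quantities in Table 1 follow directly from the stock concentrations and the pipetted volumes. The short Python sketch below recomputes them as a sanity check; the variable and function names are illustrative only and are not part of the original protocol.

```python
# Recompute the final quantities of Table 1 from stock concentrations
# and pipetted volumes (all names here are illustrative).
TOTAL_UL = 40.0
volumes_ul = {"DNA": 9.0, "NEB4": 4.0, "EcoRI": 0.05, "MseI": 0.30, "BSA": 0.4}


def enzyme_units(stock_u_per_ml, volume_ul):
    """Units of enzyme delivered by a given volume of stock."""
    return stock_u_per_ml * volume_ul / 1000.0  # 1 ml = 1000 µl


dna_ng = 20.0 * volumes_ul["DNA"]                               # ng/µl * µl
ecori_u = enzyme_units(100_000, volumes_ul["EcoRI"])
msei_u = enzyme_units(10_000, volumes_ul["MseI"])
bsa_ug_per_ml = 10.0 * 1000.0 * volumes_ul["BSA"] / TOTAL_UL    # mg/ml -> µg/ml
water_ul = TOTAL_UL - sum(volumes_ul.values())                  # top up to 40 µl

print(f"DNA {dna_ng:.0f} ng, EcoRI {ecori_u:.1f} U, MseI {msei_u:.1f} U")
print(f"BSA {bsa_ug_per_ml:.0f} µg/ml, water {water_ul:.2f} µl")
```

The same arithmetic can be reused to scale the reaction to a master mix for several samples.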
  • Ligation of double-stranded adaptors to the digested DNA. Two different double-stranded adaptors were designed with the oligonucleotides listed in Table 2. Resuspended EcoRI_adapter1 and EcoRI_adapter2 oligonucleotides were mixed together, heated, and slowly cooled so that they annealed into a double-stranded adaptor. The same procedure was applied to the MseI_adapter1 and MseI_adapter2 oligonucleotides. EcoRI adaptors were diluted to a final concentration of 5 micromolar (5 µM), while MseI adaptors were diluted to a final concentration of 50 micromolar (50 µM).

Table 2 Oligonucleotides for preparation of double stranded adaptors.

^ Oligo name ^ Modification ^ Sequence, 5' to 3' ^
| EcoRI_adapter1 | - | CTCGTAGACTGCGTACC |
| EcoRI_adapter2 | 5' phosphorylated | AATTGGTACGCAGTCTAC |
| MseI_adapter1 | 5' phosphorylated | TACTCAGGACTCAT |
| MseI_adapter2 | - | GACGATGAGTCCTGAG |
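As an illustration of how the two oligonucleotides of each pair anneal, the following Python sketch checks that the strand carrying the sticky end pairs with the reverse complement of its partner once the overhang is set aside. The overhangs (AATT for EcoRI, TA for MseI) are inferred here from the cut sites given above, and the function names are hypothetical.

```python
# Check that each adaptor pair from Table 2 can anneal into a duplex
# with the expected single-stranded overhang (a sketch, not a lab tool).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


ADAPTORS = {
    # enzyme: (strand carrying the overhang, partner strand, assumed overhang)
    "EcoRI": ("AATTGGTACGCAGTCTAC", "CTCGTAGACTGCGTACC", "AATT"),
    "MseI": ("TACTCAGGACTCAT", "GACGATGAGTCCTGAG", "TA"),
}


def duplex_length(overhang_strand, partner, overhang):
    """Number of paired bases, or 0 if the strands do not anneal as expected."""
    if not overhang_strand.startswith(overhang):
        return 0
    paired = overhang_strand[len(overhang):]
    return len(paired) if revcomp(partner).startswith(paired) else 0


for enzyme, (strand, partner, overhang) in ADAPTORS.items():
    print(enzyme, overhang, duplex_length(strand, partner, overhang))
# EcoRI AATT 14
# MseI TA 12
```

Both overhangs are their own reverse complement, which is why they can pair with the sticky ends left on the genomic fragments by the corresponding enzyme.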

The reaction mix for ligating adaptors to the digested DNA is described in Table 3. Ligation is performed in NEB4 Buffer with the double-stranded adaptors, using T4 DNA ligase and additional ATP. Figure 1 illustrates the DNA fragments involved in the ligation step.

Figure 1 Pictogram of the DNA fragments involved in ligating double-stranded adaptors to DNA previously digested with the EcoRI and MseI restriction enzymes, in the context of a modified AFLP method for genome complexity reduction.

Table 3 Reagents for ligation of double stranded adaptors to previously digested DNA. A total volume of 10 µl of the ligation mix is added to the 40 µl volume of each digestion mix, and is incubated at 16 °C for 3 hours.

^ Reagent ^ Initial conc. ^ Qty added ^ Final conc. or final qty ^
| NEB4 Buffer | 10X | 1 µl | 1X |
| EcoRI double-stranded adaptor | 5 µM | 1.5 µl | 0.15 µM |
| MseI double-stranded adaptor | 50 µM | 1.5 µl | 1.5 µM |
| T4 DNA ligase | 2000 units/µl | 0.1 µl | 200 cohesive end units |
| ATP | 10 mM | 5 µl | 1 mM |
| ddH2O | - | 0.9 µl | - |
| Volume added to digestion mix | - | 10 µl | - |
| Total volume | - | 50 µl | - |
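The final concentrations in Table 3 refer to the combined reaction, that is the 10 µl ligation mix added to the 40 µl digestion mix. The Python sketch below applies the usual dilution relation (stock concentration × volume added / final volume) to the values in the table; the names are illustrative.

```python
# Final concentrations after adding the 10 µl ligation mix to the
# 40 µl digestion mix (illustrative names, values from Table 3).
DIGEST_UL = 40.0
LIGATION_MIX_UL = 10.0
FINAL_UL = DIGEST_UL + LIGATION_MIX_UL


def final_conc(stock, volume_ul, final_volume_ul=FINAL_UL):
    """Dilution: result is in the same unit as the stock concentration."""
    return stock * volume_ul / final_volume_ul


ecori_adaptor_um = final_conc(5.0, 1.5)    # µM
msei_adaptor_um = final_conc(50.0, 1.5)    # µM
atp_mm = final_conc(10.0, 5.0)             # mM
ligase_units = 2000.0 * 0.1                # units/µl * µl
water_ul = LIGATION_MIX_UL - (1.0 + 1.5 + 1.5 + 0.1 + 5.0)

print(f"EcoRI adaptor {ecori_adaptor_um:.2f} µM, MseI adaptor {msei_adaptor_um:.2f} µM")
print(f"ATP {atp_mm:.1f} mM, ligase {ligase_units:.0f} U, water {water_ul:.1f} µl")
```

With these inputs the MseI adaptor ends up at ten times the concentration of the EcoRI adaptor, in line with the tenfold difference between the two adaptor stocks.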
  • Amplification by PCR using primers specific to the adaptor sequences. The purpose of this step was to amplify only a small proportion of the total genome, thereby reducing the complexity of the pool of fragments to be sequenced. Only genomic fragments cut by EcoRI on one side and by MseI on the other were amplified, and among those only the fragments that carry a C immediately next to the restriction site on the EcoRI side and AC on the MseI side. This is why the amplification is termed selective. The MID (multiplex identifier) barcodes used for the pyrosequencing step (see next two paragraphs) are incorporated in the selective primers. Figure 2 illustrates the DNA fragments involved in the amplification step.
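The combined effect of the double digestion and the selective amplification can be sketched in silico. The Python example below is a simplified model under stated assumptions: it scans a made-up toy sequence, keeps fragments bounded by one EcoRI site and one MseI site with no other site in between, and then requires the selective bases (C next to the EcoRI site, AC next to the MseI site, read from the MseI end on the primer strand). It ignores fragment length limits and PCR efficiency, and all names are hypothetical.

```python
# In silico sketch of the AFLP-like selection (toy model, not the
# authors' pipeline): EcoRI-MseI fragments with selective bases.
ECORI_SITE = "GAATTC"  # EcoRI, cuts G/AATTC
MSEI_SITE = "TTAA"     # MseI, cuts T/TAA
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(seq):
    """Sorted list of (position, enzyme, site length) for both enzymes."""
    sites = []
    for enzyme, motif in (("EcoRI", ECORI_SITE), ("MseI", MSEI_SITE)):
        start = seq.find(motif)
        while start != -1:
            sites.append((start, enzyme, len(motif)))
            start = seq.find(motif, start + 1)
    return sorted(sites)


def selected_fragments(seq, eco_sel="C", mse_sel="AC"):
    """Spans (site to site, EcoRI end first) that pass both selections."""
    sites = find_sites(seq)
    kept = []
    for (p1, e1, _), (p2, e2, n2) in zip(sites, sites[1:]):
        if e1 == e2:
            continue  # EcoRI-EcoRI or MseI-MseI fragments are not amplified
        span = seq[p1:p2 + n2]
        if e1 == "MseI":
            span = revcomp(span)  # orient so the EcoRI site is on the left
        inner = span[len(ECORI_SITE):-len(MSEI_SITE)]
        if inner.startswith(eco_sel) and revcomp(inner).startswith(mse_sel):
            kept.append(span)
    return kept


toy = "GGGAATTCCATGCATGTTTAACC"  # made-up sequence for illustration
print(selected_fragments(toy))   # ['GAATTCCATGCATGTTTAA']
```

Each selective base is expected to retain roughly one quarter of the candidate fragments if bases are equally frequent, so three selective bases keep on the order of 1/64 of the EcoRI-MseI fragments.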
=== Pooling, multiplexing and barcoding samples for high throughput sequencing ===

One feature of high-throughput sequencing is the ability to multiplex different samples in a single sequencing run, which is made possible by MIDs (multiplex identifiers). These are 10 bp segments that were added here to the 5’ side of the EcoRI section of the selective primers. The barcodes are sequenced along with the organism's DNA, and the reads are then recognized and sorted using bioinformatics methods. The complete list of MIDs for the Genome Sequencer FLX system is available in TCB No. 005-2009 (April 2009), Using Multiplex Identifier (MID) Adaptors for the GS FLX Titanium Chemistry - Extended MID Set. Here, the 30 bp segment (LibL-A and key) required by the sequencing instrument was further added to the 5’ side of the MID segment, following the recommendations in APP No. 001-2009, Unidirectional Sequencing of Amplicon Libraries Using the GS FLX Titanium emPCR Kits (Lib-L). In the present study, the six ginseng populations were labeled with six different MID barcodes, and all samples of a population shared the same population-specific barcode (Table 4).
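Sorting reads by barcode can be sketched in a few lines of Python. The example assumes that each read has already had the key removed and therefore starts with its 10-base MID, and it uses exact matching only. The barcode sequences, population names, and reads below are placeholders invented for illustration; they are not the MIDs used in this study.

```python
# Minimal demultiplexing sketch: assign reads to populations by the
# 10-base MID at the start of each read (placeholder barcodes).
from collections import defaultdict

MID_LENGTH = 10
MIDS = {
    "pop1": "ACGTACGTAC",  # hypothetical barcode
    "pop2": "TGCATGCATG",  # hypothetical barcode
}


def demultiplex(reads, mids=MIDS):
    """Group reads by population and strip the matched barcode."""
    population_of = {barcode: pop for pop, barcode in mids.items()}
    bins = defaultdict(list)
    for read in reads:
        pop = population_of.get(read[:MID_LENGTH])
        if pop is None:
            bins["unassigned"].append(read)  # no exact barcode match
        else:
            bins[pop].append(read[MID_LENGTH:])
    return dict(bins)


reads = [
    "ACGTACGTACGACTGCGTACCAATTCC",
    "TGCATGCATGGACTGCGTACCAATTCC",
    "GGGGGGGGGGAAA",
]
for pop, seqs in demultiplex(reads).items():
    print(pop, seqs)
```

Real pipelines usually tolerate a small number of sequencing errors in the barcode; exact matching is used here only to keep the sketch short.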

Table 4 List of selective primers used for reducing the genomic complexity of the Panax genome and for making the amplified fragments suitable for multiplexing different samples in a single pyrosequencing run on a GS-FLX instrument. All oligonucleotides used as forward primers include the LibL-A and key segments required by the sequencing instrument and the population-specific MID, followed by a unique EcoRI segment for the selective amplification. The reverse primer is made of an MseI segment (for selective amplification) and a LibL-B segment for the instrument. All oligonucleotides were purified by HPLC.


1)
van Orsouw, N. J. et al. (2007). Complexity Reduction of Polymorphic Sequences (CRoPS™): A Novel Approach for Large-Scale Polymorphism Discovery in Complex Genomes. PLoS ONE, 2, e1172.
2)
Gompert, Z., Forister, M. L., Fordyce, J. A., Nice, C. C., Williamson, R. J., and Buerkle, C. A. (2010). Bayesian analysis of molecular variance in pyrosequences quantifies population genetic structure across the genome of Lycaeides butterflies. Molecular Ecology, 19, 2455-2473.