Schematic illustration of the breeding setup used to create a reciprocal backcross population (BC). One pair of autosomes is represented by vertical lines and mitochondria are represented by circles. A BC population is useful for mapping QTLs, while also mapping parent-of-origin effects (mitochondria, sex chromosomes, and QTLs that depend on parental origin). Different breeding setups will allow these factors to vary (mitochondria, sex chromosomes, and maternal/paternal genotype origin), depending on the study aim. In this example, we are backcrossing F1 hybrids to strain B (allowing parental origin of B and W to vary) while keeping mitochondria fixed to B
Example: To create the F1 generation, establish four breeding pairs. For reciprocal breeding, establish two pairs with B female founders (F1B) and two pairs with W female founders (F1W). The reciprocal N2 generation is created from B (n ≥ 4) and W (n ≥ 4) females bred to F1 males, and from F1 females bred to B (n ≥ 4) and W (n ≥ 4) males. If reciprocal founders are used, a minimum of 16 pairs with F1B and 16 pairs with F1W is set up. Several litters from each pair can be used for experimentation.
3.2.2 Intercross (Fig. 2)
Alternatively, all three genotypes can be directly achieved in a population by mating two F1 hybrid parents. This is called an intercross and the most commonly used generation is the second (F2) (28).
Schematic illustration of breeding design for intercross (F2) and advanced intercross line (AIL) construction. One pair of autosomes is represented by vertical lines and mitochondria are represented by circles. An intercross captures all three possible genotypes in one population, which is useful for mapping dominant, additive, and recessive QTLs and QTL interactions. To create an F2 population, F1 hybrids are intercrossed. To create an AIL, intercrossing continues for additional generations, avoiding sister-brother mating
Example: Create the F1 generation by the reciprocal breeding described for BC (under Sect. 3.2.1). The F2 generation is created from eight pairs with B female founders (F1B) and W male founders (F1W) and eight pairs with W female founders (F1W) and B male founders (F1B). Several litters from each pair can be used for experimentation.
3.2.3 Advanced Intercross Line (Fig. 2)
An AIL can provide a more precise QTL location and can reduce the interval it spans (3). The population is created much the same as F2 intercross populations, with the crucial difference that the breeding is continued in a pseudo-random fashion for several generations.
Example: Create the F1 and F2 generations as described for the intercross (under Sect. 3.2.2). The G3 generation is created from breeding pairs (n ≥ 50) representing both types of female founders (25 pairs each). All subsequent generations are created by random breeding of at least 50 males and 50 females (n ≥ 50 pairs), avoiding sister-brother mating.
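The random pairing step for the later AIL generations can be sketched in Python as follows. This is only an illustration under stated assumptions: each animal is recorded as a hypothetical (animal_id, family_id) tuple, where family_id identifies the breeding pair it was born to, and the function name pair_avoiding_sibs is ours. The sketch shuffles the females and retries until no male is paired with a female from his own family.

```python
import random


def pair_avoiding_sibs(males, females, rng, max_tries=1000):
    """Randomly pair males with females so that no pair shares a family.

    males, females: lists of (animal_id, family_id) tuples (hypothetical layout).
    rng: a random.Random instance, passed in so the pairing is reproducible.
    """
    if len(males) != len(females):
        raise ValueError("need equal numbers of males and females")
    shuffled = list(females)
    for _ in range(max_tries):
        rng.shuffle(shuffled)
        # Accept the shuffle only if no brother is paired with his sister.
        if all(m[1] != f[1] for m, f in zip(males, shuffled)):
            return list(zip(males, shuffled))
    raise RuntimeError("no sib-free pairing found; check family structure")


# Example: 50 families, each contributing one male and one female.
males = [(f"M{i}", i) for i in range(1, 51)]
females = [(f"F{i}", i) for i in range(1, 51)]
pairs = pair_avoiding_sibs(males, females, random.Random(1))
print(len(pairs), "pairs; first pair:", pairs[0])
```

Retrying the whole shuffle keeps every sib-free pairing equally likely, which is simpler to reason about than repairing individual pairs. With many small families a valid shuffle is usually found within a few tries.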
3.2.4 Heterogeneous Stock (Fig. 3)
The HS was established from eight inbred strains (4). The stock has been bred according to a standard pseudo-random outbreeding schedule for more than 50 generations, using 40 breeding pairs for each generation. The breeding scheme is designed to minimize inbreeding and maximize recombination density to reduce the size of inherited haplotypes (6).
Schematic illustration of heterogeneous stock (HS) construction. One pair of autosomes is represented by vertical lines. Eight inbred strains are intercrossed for more than 50 generations in a pseudo-random fashion to create genetic mosaics
Example: HS colonies are already established in both rat and mouse. The rat HS was established by Dr. Carl Hansen at the National Institutes of Health (NIH) in the 1980s from ACI/N, BN/SsN, BUF/N, F344/N, M520/N, MR/N, WKY/N, and WN/N (4). The MR, WN, and WKY strains trace their ancestry to the original Wistar stock, the ACI strain is a hybrid between the August and Copenhagen strains, the BN strain traces its ancestry to the Wistar Institute stock of wild rats, and the M520, F344, and BUF strains are of unknown origin. The European HS colony was established in 2004 by Dr. Alberto Fernandez Teruel at the Autonomous University of Barcelona from animals obtained from the Northwestern University colony (Dr. Eva Redei). The mouse HS was established from A/J, AKR/J, BALB/cJ, C3H/HeJ, C57BL/6J, CBA/2J, DBA/2J, and LP/J (29). The AKR, C57BL, C3H, and BALB strains were originally obtained from Charles River Laboratories (Wilmington, MA), and the A, CBA, DBA, and LP strains were obtained from The Jackson Laboratory (Bar Harbor, ME). The stock was created by Dr. Robert Hitzemann in the 1980s and is currently maintained in his laboratory at the Oregon Health & Science University. The HS colonies are maintained using a circular breeding design, i.e., a male from family one is bred to a female from family two, a male from family two is bred to a female from family three, etc., to maintain genetic heterogeneity while reducing allele fixation. New HS populations can also be created by eight-way intercrossing of inbred strains, but generating new stocks demands considerable time and resources, so we recommend using an existing HS if possible.
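The circular breeding design can be written down as a simple pairing rule. The Python sketch below is a minimal illustration, not a colony management tool: the function name circular_pairs is hypothetical, families are assumed to be numbered 1 to n, and we assume that the male from the last family is bred to the female from family one, which is how we read the "etc." in a circular scheme.

```python
def circular_pairs(n_families):
    """Return (male_family, female_family) pairs for one generation.

    The male from family i is paired with the female from family i + 1;
    the last family wraps around to family 1 (an assumption, see above).
    """
    if n_families < 2:
        raise ValueError("need at least two families")
    return [(i, i % n_families + 1) for i in range(1, n_families + 1)]


# Example with the 40 breeding pairs per generation used for the HS.
pairs = circular_pairs(40)
print(pairs[:3], pairs[-1])
```

Each family contributes exactly one male and one female per generation under this rule, so no family is over-represented among the parents.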
Measure your phenotype of interest in the entire experimental population. Measurements should be made as accurately as possible, and all factors that can affect the phenotype should be appropriately recorded (e.g., sex, age, set, season, experimenter, and reagent batch number). Collect relevant tissues at the end of the experiment for potential follow-up studies.
Select genetic markers that cover the genome at appropriate intervals. Marker information (position and primer sequences) can be found at www.ensembl.org and www.rgd.mcw.edu (which also provides information about strain differences). To be informative, a marker must be polymorphic between the parental strains, i.e., have a different number of repeats or a different nucleotide (A, T, C, G). The effect of the QTLs being mapped, together with the type and size of the population used, dictates the marker spacing needed to achieve sufficient power to detect the QTLs. In general, marker intervals of 10–25 cM are appropriate for intercross and BC populations, and marker intervals of approximately 1–5 cM are appropriate for comparable power in the AIL. The HS requires 100 times more markers than a cross between inbred strains to allow haplotype reconstruction (16). Extract genomic DNA from tissue biopsies, e.g., tail tips, ear clips, or any other tissue collected at the end of the experiment (see Note 1). Genotype markers using your selected assay and protocol (see Note 2).
Genotypes can be determined by PCR amplification of microsatellite markers. Microsatellites are highly polymorphic di-, tri-, or tetra-nucleotide repeats that differ between inbred strains in the number of repeats and thus in the size of the fragment amplified using primers that anneal to the unique DNA sequence flanking the repeat region. Fluorophore-conjugated primers are used, and PCR products are size-fractionated on a capillary sequencer (see Note 3). Genotypes are analyzed using software; however, we recommend manual confirmation of genotypes for quality assurance.
SNPs are bi-allelic base pair substitutions that occur at high density (approximately one every 800 bp), which enables ultrahigh-throughput genotyping and the development of dense genetic maps. Select SNP markers based on the sequences of the strains you are using and purchase or design SNP assays to use for PCR amplification. Allelic discrimination can be performed in the lab using fluorescence-based technology, e.g., TaqMan SNP genotyping assays (Applied Biosystems). More often, SNP genotyping is performed using a custom array (e.g., the Affymetrix RATDIV array for rat HS), either commercially or in specialized laboratories. Briefly, the array interrogates several hundred thousand SNPs chosen based on sequence data for your strains of interest.
3.5 QTL Identification
Once the phenotype and genotype data are compiled for all individuals in the population, the likelihood of existence, location, and significance of QTLs is statistically determined by applying a model to the data. For a quick or preliminary scan of your data for QTLs, we suggest using single-marker tests. This simple method is quick and requires neither special software nor a genetic map. More comprehensive analyses (given under Sections 3.5.2, 3.5.3, and 3.6) might require additional training or statistical and bioinformatics assistance.
3.5.1 Single-Marker Tests (Fig. 4a)
Let us consider an experiment in an F2 cross. Group animals into three groups according to their genotype (BB, BW, and WW) and compare phenotypes between the groups. Select the appropriate test for your data. ANOVA can be used if phenotypic values are normally distributed, while nonparametric tests are better suited for phenotypic values that deviate from a normal distribution. A significant difference between genotype groups indicates that the marker is linked to a QTL and warrants more in-depth analysis (described below). Repeat this for every marker to identify all potential QTLs. A threshold for significance has to be established with more detailed analysis that takes into account the population structure and the number of markers, individuals, and QTLs. For quick inspection, we would consider everything with p < 0.01 as potentially interesting, pending further follow-up.
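A single-marker scan of this kind can be sketched in a few lines of Python. The sketch below is an illustration only: the data layout (a dictionary of genotype calls per marker and a list of phenotype values in the same animal order), the function names, and the toy numbers are all our assumptions. It computes the one-way ANOVA F statistic for each marker; turning F into a p-value additionally requires the F distribution, which a statistics library can provide (for example scipy.stats.f_oneway).

```python
from collections import defaultdict


def anova_f(groups):
    """One-way ANOVA F statistic for {genotype: [phenotype values]}."""
    values = [y for ys in groups.values() for y in ys]
    n, k = len(values), len(groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more animals than groups")
    grand_mean = sum(values) / n
    means = {g: sum(ys) / len(ys) for g, ys in groups.items()}
    ss_between = sum(len(ys) * (means[g] - grand_mean) ** 2 for g, ys in groups.items())
    ss_within = sum((y - means[g]) ** 2 for g, ys in groups.items() for y in ys)
    # Note: raises ZeroDivisionError if there is no within-group variation.
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def single_marker_scan(genotypes, phenotypes):
    """Return {marker: F} for genotypes given as {marker: [call per animal]}."""
    results = {}
    for marker, calls in genotypes.items():
        groups = defaultdict(list)
        for call, y in zip(calls, phenotypes):
            if call is not None:  # skip missing genotype calls
                groups[call].append(y)
        results[marker] = anova_f(groups)
    return results


# Toy data for nine animals (made up for illustration).
phenotypes = [5.0, 4.0, 3.0, 4.0, 3.0, 2.0, 3.0, 2.0, 1.0]
genotypes = {
    "marker5": ["BB", "BB", "BB", "BW", "BW", "BW", "WW", "WW", "WW"],
    "marker9": ["BB", "BW", "WW", "BW", "WW", "BB", "WW", "BB", "BW"],
}
scan = single_marker_scan(genotypes, phenotypes)
print(scan)
```

In the toy data, the genotype groups at marker5 have different mean phenotypes (F = 3.0), whereas at marker9 all three groups share the same mean (F = 0.0). With only nine animals neither result would be conclusive; the example only shows the mechanics.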
Different approaches are used for QTL identification. (a) A single-marker test compares phenotype values between animals grouped according to their genotype (BB, BW, and WW). In this example, animals with genotype BB at marker 5 have higher phenotype values than animals with the BW or WW genotype, indicating that this marker is linked to a QTL. (b) Interval mapping scans for a putative QTL along the genetic map, thus adding information between markers. The genetic map for chromosome 12 is shown on the x-axis and the LOD scores, which measure the strength of evidence, are shown on the y-axis. The QTL is most likely to be located at marker 5 (highest LOD score), with the 95% confidence interval between marker 3 and marker 6. (c) QTL location and confidence interval are more precisely estimated in populations with higher genetic resolution (higher recombination frequencies). An F2 intercross (red) identifies broad QTLs that contain many genes, while AIL (blue) and HS (black) map narrower QTL intervals
3.5.2 Interval Mapping (Fig. 4b)
This method often entails heavy computations that require specialized software. A variety of software packages can be used; we base our description on R/qtl, which is freely available (15). Interval mapping (IM) requires a genetic map, i.e., chromosomes and locations of markers, either physical, based on the genomic sequence (Mb), or linkage-based, derived from recombination fractions in the population (cM). LOD scores are then generated in an iterative process of associating the phenotype with genomic locations along the map and then re-evaluating linkage considering the newly created information until a QTL is detected. QTLs are more precisely localized by this method, and missing genotypes and genotyping errors are accounted for to preserve power, while multiple-test corrections decrease the risk of false-positive QTLs. Select the appropriate interval between steps for the analysis. In general, a BC or F2 population rarely has enough recombination events to warrant steps smaller than 5 cM, while an AIL has accumulated recombinations and therefore warrants tighter mapping, usually 1–2 cM (depending on the size of the population). A good rule of thumb is that at least 1% of the population should have recombined between two tested positions, which equals 1 cM in distance. Select the model to be used for QTL analysis. Please see the instructions for your software package regarding the models included. In R/qtl, standard interval mapping can be performed using the em method, while the simplified Haley-Knott regression gives a very good approximation of em for normally distributed data. Other regression models include nonparametric regression (non-normally distributed data), a binary model (yes/no data), a two-part model (a combination of binary and nonparametric models for data containing a spike in the phenotype distribution), and imputation, in which missing genotypes are imputed based on surrounding marker genotypes.
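To make the LOD score concrete, the Python sketch below computes it at a single typed marker for a normally distributed phenotype. At a position where every genotype is known, the regression LOD reduces to (n/2)·log10(RSS0/RSS1), where RSS0 is the residual sum of squares around the overall mean (no QTL) and RSS1 is the residual sum of squares around the genotype-group means (QTL at this position). This is an illustration rather than R/qtl code: the function name and toy data are our assumptions, and positions between markers, where genotype probabilities conditional on the flanking markers are needed, are not covered.

```python
import math


def lod_at_marker(calls, phenotypes):
    """LOD score comparing genotype-group means against one overall mean."""
    data = [(c, y) for c, y in zip(calls, phenotypes) if c is not None]
    n = len(data)
    overall = sum(y for _, y in data) / n
    rss0 = sum((y - overall) ** 2 for _, y in data)  # null model: no QTL
    groups = {}
    for c, y in data:
        groups.setdefault(c, []).append(y)
    # Alternative model: one mean per genotype class at this position.
    rss1 = 0.0
    for ys in groups.values():
        mean = sum(ys) / len(ys)
        rss1 += sum((y - mean) ** 2 for y in ys)
    if rss0 == 0:
        return 0.0  # no phenotypic variation to explain
    if rss1 == 0:
        return math.inf  # genotype explains the phenotype perfectly
    return (n / 2) * math.log10(rss0 / rss1)


# Toy data for nine animals (made up for illustration).
phenotypes = [5.0, 4.0, 3.0, 4.0, 3.0, 2.0, 3.0, 2.0, 1.0]
calls = ["BB", "BB", "BB", "BW", "BW", "BW", "WW", "WW", "WW"]
lod = lod_at_marker(calls, phenotypes)
print(round(lod, 3))
```

Here RSS0 = 12 and RSS1 = 6, so the LOD is 4.5·log10(2), roughly 1.35. Animals with a missing genotype call are simply dropped in this sketch, whereas interval mapping software uses the surrounding markers to retain them.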
Once you have analyzed your data, select the most appropriate method for setting significance thresholds and confidence intervals. For BC and F2 intercross populations, standard methods of permutation and bootstrapping can be used. Permutation provides significance thresholds that are specific to the study. Essentially, the genotypes and phenotypes are randomly mismatched before QTL analysis, and the maximum LOD score is recorded for each of a series of such analyses (usually 1,000–2,000, but ideally 10,000) to estimate how often a certain LOD score occurs by chance in the population. The conventional significance threshold is the 95th percentile of these maximum LOD scores, but other levels of stringency can be used if desired. To account for the family structure in an AIL, family residual values can be used to calculate significance thresholds. The within-family variance (inheritance of the phenotype with the causal genotype, i.e., linkage) is removed to determine LOD scores for between-family variance (representing random effects, i.e., no linkage) (14).
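The permutation procedure for BC and F2 populations can be sketched in Python as follows. This is a simplified illustration, not the R/qtl implementation: the LOD score is computed only at typed markers using (n/2)·log10(RSS0/RSS1), the simulated genotypes are drawn independently per marker (so there is no linkage between markers), and all function names, parameter names, and data are our assumptions. It does not address the family-structure correction needed for an AIL.

```python
import math
import random


def lod_at_marker(calls, phenotypes):
    """LOD at one marker: genotype-group means versus one overall mean."""
    n = len(phenotypes)
    overall = sum(phenotypes) / n
    rss0 = sum((y - overall) ** 2 for y in phenotypes)
    groups = {}
    for c, y in zip(calls, phenotypes):
        groups.setdefault(c, []).append(y)
    rss1 = 0.0
    for ys in groups.values():
        mean = sum(ys) / len(ys)
        rss1 += sum((y - mean) ** 2 for y in ys)
    if rss0 == 0:
        return 0.0
    if rss1 == 0:
        return math.inf
    return (n / 2) * math.log10(rss0 / rss1)


def max_lod(genotypes, phenotypes):
    """Highest LOD over all markers, i.e., the genome-wide maximum."""
    return max(lod_at_marker(calls, phenotypes) for calls in genotypes.values())


def permutation_threshold(genotypes, phenotypes, n_perm=1000, alpha=0.05, seed=0):
    """LOD threshold exceeded by chance in about alpha of the permutations."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the genotype-phenotype link
        maxima.append(max_lod(genotypes, shuffled))
    maxima.sort()
    rank = math.ceil(round((1 - alpha) * n_perm, 9))  # e.g., 950 of 1,000
    return maxima[min(rank, n_perm) - 1]


# Simulated F2-like data: 100 animals, 20 unlinked markers, no true QTL.
rng = random.Random(42)
genotypes = {
    f"m{j}": [rng.choice(["BB", "BW", "BW", "WW"]) for _ in range(100)]
    for j in range(20)
}
phenotypes = [rng.gauss(0.0, 1.0) for _ in range(100)]
threshold = permutation_threshold(genotypes, phenotypes, n_perm=200)
print("95% LOD threshold:", round(threshold, 2))
```

Recording the maximum LOD across all markers in each permutation, rather than the LOD at one marker, is what makes the resulting threshold genome-wide. The example uses 200 permutations only to keep the run short.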