Chapter 16 Genetics of Sleep and Sleep Disorders in Humans

Abstract

Sleep is an evolutionarily old and vital process, and behavioral sleep has been observed in all vertebrates investigated, as well as some invertebrates. Although the precise physiologic functions remain elusive, the conservation of the behavior argues that sleep fulfils important biological needs. Despite the complexities, it is clear that genetic factors underlie the process of normal sleep, and they inevitably underlie sleep disorders. Recent work in animal models and in humans has begun to uncover the genetic underpinnings of various aspects of sleep, including circadian behavioral variation, homeostatic response, and numerous disorders affecting sleep. Many associations between observed human phenotypes and candidate loci have been published, but replications have been lacking. We present an overview of how genes regulate sleep and circadian processes, and we discuss what is known about how these relate to normal and disordered sleep in humans. We also provide a framework through which the numerous association studies published may be interpreted.

Behavioral sleep can be viewed as a complex phenotype. The process of sleep is initiated through interconnected drives responding to clock-dependent and sleep-debt–dependent cues. Two distinct forms of sleep, including five stages, are defined at the electrophysiologic level: non–rapid eye movement (NREM) sleep (stages 1 to 4) and rapid eye movement (REM) sleep. These stages are associated with distinctive physiologic changes affecting muscle tone, thermoregulation, endocrine function, gastrointestinal activity, and cardiorespiratory activity. Interaction with the environment at each of these stages adds further complexity, and each of these aspects is potentially under the control of a wide variety of genes. A large number of studies have shown that a specific set of neural systems, most notably aminergic and cholinergic systems in the brainstem and basal forebrain, as well as systems located in the posterior (hypocretin, histamine) and anterior hypothalamus (median and ventrolateral preoptic gamma-aminobutyric acid [GABA]-ergic systems), display changes across sleep stages and may be primarily important in orchestrating sleep stage and wake organization.¹,²

Circadian rhythms drive a “clock-dependent” variation in sleep propensity, and they are observed throughout the animal kingdom, from unicellular organisms to mammals. These serve to coordinate the timing of important physiologic processes, including sleep, with the alternating photoperiod of the external environment. In mammals, circadian rhythms are primarily coordinated by the master clock in the suprachiasmatic nuclei (SCN) of the hypothalamus, and they are entrained by light through the retinohypothalamic tract, although it is now evident that many other tissues, notably peripheral tissues, have their own autonomous circadian clocks. The basis of these rhythms is a largely conserved transcriptional–translational feedback system plus a set of complex posttranscriptional regulatory steps (phosphorylation, targeted degradation) involving Clock, Bmal1, the Period genes, and the Cryptochrome genes.³ The elaborate nature of these loops provides many distinctive points at which single gene mutations can manifest with clearly observable phenotypes such as altered period, phase angle, or loss of rhythm, which have been identified in models such as Drosophila and rodents, and in humans.

Sleep homeostasis, on the other hand, responds to accumulated sleep debt, increasing the intensity and propensity to initiate sleep according to cumulative time spent awake. Less is known about the genetic, neurochemical, and neuroanatomic bases of the homeostat than about the circadian process. The neuroanatomic basis is likely to be diffuse,¹ but slow-wave activity in the delta frequency range, quantified as delta power, varies in proportion to prior sleep and wakefulness and thus serves as one measure of homeostatic sleep need.⁴,⁵ Changes in glutamatergic transmission in the cortex, possibly in reaction to synaptic plasticity changes that have occurred in wakefulness, and intracellular metabolic changes may be critical.⁶ Studies in rodents have demonstrated that the rate of accumulation of this sleep need is also under genetic control.⁷

More generally, several electroencephalographic (EEG) features have been found to be highly heritable traits in humans, based on studies comparing monozygotic (MZ) and dizygotic (DZ) twins. Indeed, differences between MZ twins were not larger than differences between successive recordings from the same individual. Autosomal dominant inheritance for an array of EEG variants has also been demonstrated through numerous family studies.⁸ EEG frequency bands in the delta through sigma frequency ranges during wakefulness showed high heritabilities of 76% to 89% in a large study of MZ and DZ twins.⁹ More recently, a twin study has also shown strong genetic control of the spectral composition of NREM sleep, particularly in the delta, theta, alpha, and sigma frequencies.¹⁰
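As a rough illustration of how twin correlations translate into a heritability estimate, the sketch below applies Falconer's classical approximation, h² = 2(r_MZ − r_DZ). The formula is introduced here as an assumption for illustration only; the cited twin studies may have used other estimators (for example, structural equation models), and the correlation values are hypothetical rather than taken from those studies.

```python
def falconer_heritability(r_mz: float, r_dz: float) -> float:
    """Falconer's approximation: h^2 = 2 * (r_MZ - r_DZ).

    r_mz and r_dz are the trait correlations within monozygotic and
    dizygotic twin pairs. The estimate is crude and can fall outside
    the 0-to-1 range when its assumptions are violated.
    """
    return 2.0 * (r_mz - r_dz)


# Hypothetical twin correlations for an EEG spectral trait (illustrative only).
h2 = falconer_heritability(r_mz=0.85, r_dz=0.45)
print(f"Estimated heritability: {h2:.2f}")
```

With these made-up correlations the estimate is about 0.80, in the same general range as the heritabilities quoted above, but the numbers are chosen only to show the arithmetic.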

Thus, although the process of sleep is clearly under genetic control, the process is highly robust, making single gene mutations specifically abolishing sleep unlikely. The temporal dynamics of sleep are quite fast, indicating that gene expression per se is not likely to govern the state-to-state changes observed on the EEG. Rather, the roles of genes may manifest through differentially available stores of neurotransmitters, or variations in activity of transmembrane channels.

Genetic Studies of Human Sleep: Methodological Limitations

Identification of single gene mutations in animal models has proved quite successful in the areas of circadian rhythms and narcolepsy, demonstrating that single gene mutations can have dramatic effects on sleep patterns, although in relatively rare situations. Linkage studies of sleep phenotypes in human families have often resulted in unreproduced or controversial results, probably because of a combination of small family size (resulting in limited power), phenotypes that are not fully penetrant or susceptible to phenocopy, rare occurrence of families, and allelic or locus heterogeneity. Even in the best case, linkage studies result in broad peaks containing many potential candidate genes.

Numerous association studies have also been published, based either on candidate gene screens or hypothesis-free genome-wide association (GWA) scans. Association studies in general have had a very poor replication rate, and one review showed that for 166 putative associations that were studied three or more times, only six were consistently replicated.¹¹ Keeping in mind that spurious or nonreproducible findings outnumber replicated findings in the published literature, it is critical to understand the inherent limitations of these studies (Box 16-1).

Box 16-1 Common Issues in Published Association Studies

Population stratification: An association is observed because cases and controls do not have the same ethnic or sub-ethnic composition (it may not be apparent). This is most problematic in candidate gene associations.

Inclusion of phenotypes with poor test or retest reliability: In many cases, the phenotype under study is not robust or validated.

Studies of multiple phenotypes without correction for multiple testing: It is common for a study to start with a clear a priori hypothesis, but, despite nonsignificant results, to explore other phenotypes without properly stating that these were exploratory analyses. In a variation of this strategy (“phenotype dredging”), a borderline P value is improved by testing correlated sub-phenotypes (for example, studying amount of sleep after sleep deprivation rather than SWS power after sleep deprivation), or subdividing the groups (e.g., by sex). In general, these additional testing strategies are acceptable if they are clearly indicated as exploratory in reporting the study.

Pseudoreplication: Unfortunately, researchers often frown upon precisely replicating a protocol performed in a prior study, often using an “improved version.” This, together with phenotype dredging, leads to multiple studies “replicating” a similar effect, where in reality this was not the case.

Report of gene–gene interactions or testing of multiallele differences without adequate control for multiple testing: The study of interactions, a reasonable idea by itself, is problematic with respect to power. Most notably, the number of interactions that can be explored is effectively unlimited and difficult to control. Interactions should be studied only when there is a clear biological basis, and in general they should be presented as exploratory until replicated.

Testing of multiple genetic models without control for multiple testing: In the study of any genetic association, multiple models are frequently tested: allelic association, dominance, recessiveness, and haplotype analysis. The accepted standard to date is to conduct an allelic association test, and to explore further genotypic models only if the allelic test is significant.

Functional characterization: In the presence of weaker results, a standard in the field is to show functional effects of the associated polymorphism. This is especially important in cases with single occurrence of a mutation in a family when other families with a similar phenotype and a different mutation in the same gene are not available. Importantly, however, a high standard should be applied to these studies. It may, for example, be easy to find unrelated small changes in lymphocytic gene expression in relation to a polymorphism. Similarly, studies in animal models have limitations: A similar T44A mutation in CK1δ causes opposite circadian phenotypes in Drosophila and mice.

Case-control designs are popular and practical for association studies but are susceptible to population stratification, which causes problems when patients and controls are unknowingly drawn from different ethnic groups or subgroups. If the disease is more prevalent in one of these groups, that group can be overrepresented in patients and underrepresented in controls, and any polymorphism marking the higher-risk subgroup will appear to be associated with the disease. A typical example of this problem may be illustrated by previously reported studies of the dopamine D4 receptor gene (DRD4) in relation to personality traits and drug of abuse phenotypes, which may largely reflect differences in African-American admixture across samples.¹²,¹³ In this particular case, hundreds of studies have been published, and even a recent meta-analysis could not clearly conclude whether this association was genuine or artifactual.¹² Indeed, even when strict meta-analyses are performed, it is impossible to exclude bias toward publishing positive replications that could inflate the number of positive reports. As an example, in 1999, we found that the Clock 3111C polymorphism was associated with delayed phase in 410 subjects of the Wisconsin Sleep Cohort,¹⁴ a finding we could not replicate in 600 additional subjects and that was never published. When population stratification is a problem, alleles enriched in a sub-ethnicity will show differences for markers located across the entire genome, and they can thus be more easily detected in GWA-related designs (see later, notes on QQ plots, for example) than in candidate gene studies. For this reason, GWA studies are generally less susceptible to this problem.
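The stratification artifact described above can be made concrete with a small worked example. The sketch below uses entirely hypothetical allele counts for two subgroups in which the allele has no effect on disease (odds ratio of 1 within each subgroup), but in which cases are drawn mostly from the subgroup where the allele is common. The function names and counts are illustrative assumptions, not data from any cited study.

```python
def odds_ratio(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Allelic odds ratio from a 2x2 table of allele counts."""
    return (case_alt * ctrl_ref) / (case_ref * ctrl_alt)


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Hypothetical allele counts: (case_alt, case_ref, ctrl_alt, ctrl_ref).
# Subgroup A: allele frequency 0.2 in both cases and controls.
# Subgroup B: allele frequency 0.6 in both cases and controls.
strata = {
    "A": (40, 160, 160, 640),   # few cases, many controls
    "B": (480, 320, 120, 80),   # many cases, few controls
}

stratum_ors = {name: odds_ratio(*counts) for name, counts in strata.items()}

# Pool the subgroups as if ancestry were unknown.
pooled = tuple(sum(col) for col in zip(*strata.values()))
pooled_or = odds_ratio(*pooled)
pooled_chi2 = chi_square_2x2(*pooled)

print(stratum_ors)                    # within-subgroup odds ratios
print(round(pooled_or, 2), pooled_chi2)
```

Within each subgroup the odds ratio is exactly 1, yet the pooled table gives an odds ratio of about 2.8 and a very large chi-square statistic, purely because cases and controls differ in subgroup composition.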

False positives and nonreplications can also arise from a variety of other factors, including small study sizes, variable phenotype definition, insufficient correction for multiple testing, variable linkage disequilibrium (LD) between the polymorphism studied and the causal variant among different populations, and population-specific gene–gene or gene–environment interactions. Following the recent explosion of GWA studies, a white paper was published proposing best practices for conducting and publishing initial association reports and replication studies,¹⁵ and it focused on assessment of validity of association reports and criteria for establishing replication (Box 16-2). Although recommendations were directed toward GWA designs, the key points also apply to candidate gene association studies. First, not only should a biologically meaningful report provide evidence of an association supported by a substantial odds ratio (OR) and a statistically significant P value, but it should also report a systematic phenotype criterion, demonstration of adequate sample size and lack of stratification, demonstration of quality of assays, description of multiple testing correction, and a declaration a priori of any weighting schemes, including which markers warrant a reduced multiple testing threshold.

Box 16-2 Key Recommendations for Finding and Replicating Associations

Design and Methodology

Considerations Regarding Validity of Initial Report

Suitably large sample size

Description of the study’s power to detect an effect

Phenotypes assessed according to standard definitions, and specified in report

Testing for underlying population structure differences between cases and controls

Strength of observed effect

Sufficiently stringent criteria for significance (small P values)

Single-locus and multimarker haplotype analysis

Significance of effect not dependent on altering established quality controls or inclusion criteria, or on unusual sub-phenotype

Appropriate correction for multiple comparisons performed

Description of local linkage disequilibrium; typing of markers in strong linkage disequilibrium (LD) shows similar results

Biological or functional explanations firmly based on available data

If replication not included, preliminary nature of report should be emphasized

Replication Studies

Included in the initial association report

Description equivalent in detail to original sample

Sufficient size to distinguish between proposed effect and no effect

Uses independent sample, but similar population group and same phenotype as used for initial study

Strong rationale for selection of additional markers to be studied in replication (from initial study, implicated through LD or function, implicated in published literature)

Discussion of choice of threshold for significance

Statistical significance obtained with same genetic model as initial study

Joint analysis (if possible) yields smaller P value than seen in initial study

Replication uses same marker allele or haplotype, shows similar effect in same direction as original study

Summary of replication attempts by authors and summary of known replication attempts, including nonreplications

Clean and well-defined phenotypes are more likely to result in robust associations, and phenotypic criteria need to be clearly described. Altering phenotype definitions to achieve greater statistical significance has resulted in unreplicated findings. Similarly, it is not uncommon to see reports of “pseudoreplications,” where either the replication is in the opposite direction or it uses a more or less modified definition of the phenotype. Associations that are significant only after post hoc selection for unusual or highly specific sub-phenotypes, or phenotypes representing only a small proportion of a study sample, warrant cautious interpretation. Small studies pose problems because they lack power, are prone to large variation in risk estimates, and are especially susceptible to cryptic stratification effects.

Multiple methods are available to demonstrate lack of stratification in GWAs, but most commonly QQ plots of chi-square statistics are used to demonstrate that the distribution of values obtained, with the exception of the positive hits, is close to that expected by chance. Systematic deviation from expected values is a measure of general difference between patients and controls. These methods rely on the availability of vast numbers of genotypes and are not applicable to candidate gene association approaches. Stratification may thus underlie the high rate of nonreplications in candidate gene studies. This will become less of an impediment as genotyping costs decrease and sets of ancestry informative markers (AIMs) become increasingly well characterized. Alternatively, it may be necessary to form collaborations, as combining multiple independent samples overcomes obstacles of insufficient power and improves the generalizability of the findings.
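One common numeric companion to the QQ plot is the genomic inflation factor, often written λ: the median of the observed 1-df chi-square statistics divided by the median expected under the null (about 0.455). It is not named in the text above and is offered here only as an illustrative summary of "systematic deviation from expected values." The sketch uses a handful of hypothetical test statistics; a real analysis would use the statistics from all genotyped markers.

```python
from statistics import median

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(chi2_stats):
    """Ratio of the observed median chi-square to the null median (lambda)."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN


# Hypothetical statistics whose median matches the null expectation.
null_like = [0.02, 0.10, 0.4549364, 1.32, 3.84]
# The same statistics uniformly inflated, as cryptic stratification might do.
inflated = [1.5 * x for x in null_like]

lambda_null = genomic_inflation(null_like)
lambda_inflated = genomic_inflation(inflated)
print(round(lambda_null, 3), round(lambda_inflated, 3))
```

A value near 1 is consistent with little systematic difference between patients and controls, whereas values clearly above 1 suggest inflation across the genome, of which stratification is one possible cause.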

The enormous numbers of genotype–phenotype comparisons made in GWA studies lead to correspondingly large numbers of spurious hits. Without rigorous correction for multiple testing and filtering of artifacts, any real results can become obscured. Extremely small P values often result from technical artifacts, and it is important to examine Hardy-Weinberg equilibrium and genotype clustering quality for these genotypes. Methods for multiple testing correction are evolving. Although Bonferroni correction is accepted, it is overly conservative in GWAs because, as a result of LD, many markers are not independent. Genome-wide significance for a 900,000 single nucleotide polymorphism (SNP) chip is on the order of 10⁻⁸, a daunting hurdle. Lowering the threshold for selected markers that may be anticipated to have a functional role is acceptable, but these will be rare, and they must be declared before analysis has begun, because there is considerable temptation to create credible biological hypotheses post hoc.
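Two of the checks mentioned above lend themselves to a short worked sketch: the Bonferroni threshold for a 900,000-SNP chip, and a simple chi-square test of Hardy-Weinberg equilibrium for one marker. The function names and genotype counts are hypothetical, a nominal alpha of 0.05 is assumed, and the goodness-of-fit test shown is one basic option rather than the method used in any particular study.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Chi-square goodness-of-fit statistic (1 df) for Hardy-Weinberg proportions."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)      # frequency of allele A
    q = 1.0 - p
    if p in (0.0, 1.0):
        raise ValueError("marker is monomorphic; test is undefined")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


threshold = bonferroni_threshold(0.05, 900_000)
print(f"{threshold:.2e}")                # on the order of 10^-8

# Hypothetical genotype counts, both with allele frequency 0.6.
in_equilibrium = hwe_chi_square(360, 480, 160)   # matches expected proportions
out_of_equilibrium = hwe_chi_square(500, 200, 300)  # heterozygote deficit
print(round(in_equilibrium, 3), round(out_of_equilibrium, 1))
```

A marker showing a large departure from Hardy-Weinberg proportions, as in the second hypothetical example, would usually prompt inspection of genotype clustering before its association P value is taken seriously.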

It is now recommended that any initial association report include a replication study, and this should be performed with the same phenotypic criteria and on an independent sample from a comparable population (preferably a collaboration, to avoid the temptation of splitting a well-powered study into two smaller samples). It is essential to replicate the same markers, and findings should show similar magnitude of effect and in the same direction. Patterns of LD may vary considerably among different ethnic groups. Studying other populations in subsequent replications can add substantial credibility and significantly narrow the region, but failure to replicate in different ethnic groups does not necessarily negate the initial finding. Where replication is not possible, functional analysis can be used to support the validity of the association.

In light of these issues, we have elected to primarily describe only findings that are supported by replication studies, or that have substantial functional biological evidence from model systems.

Genetic Factors Underlying the Circadian Clock and Circadian Rhythm Disorders

A wealth of information is now known regarding the genetic basis of circadian rhythmicity, which is coordinated by a network of transcriptional–translational feedback loops that drive expression of a series of core clock components with approximately a 24-hour cycle. Analysis of circadian mutants has now led to the discovery of clock protein mutations in fungi, plants, Drosophila, and rodents.¹ There is an extensive body of work on the genetics, functional biology, and behavioral and metabolic phenotypic effects of circadian mutants in the mouse, which has been extensively reviewed (see reference 16). Animals carrying circadian clock mutations have phenotypes extending beyond alterations of rhythmic behavior (sleep homeostasis, response to sleep deprivation, metabolism, cancers), probably a reflection of the widespread distribution and activity of clock proteins and their targets.

Mutations in human clock-related genes are now well established in the etiology of familial advanced sleep phase syndrome (FASPS). This disorder was first described in a large pedigree from Utah that was segregating an autosomal dominant allele associated with a lifelong tendency to wake up and to go to sleep at very early times. Affected family members had normal sleep quality and quantity, but their preferred sleep and wake times, melatonin, and temperature rhythms were all advanced by 4 to 6 hours.¹⁷ The free-running period of the proband was approximately 1 hour shorter than that of matched controls. The underlying mutation was a serine to glycine substitution (S662G) in the human PER2 gene, and in vitro data suggested this to be a potential phosphorylation site for CK1ε.¹⁸ A second FASPS pedigree was found to carry a threonine to alanine (T44A) mutation in the CK1δ gene, which reduced activity of the enzyme in vitro.¹⁹ These findings in humans correspond well with identification of a CK1ε mutation in tau mutant Syrian hamsters²⁰ that leads to deficient phosphorylation of PER. Although these results demonstrate a key role for CK1δ/ε in the function of the clock, further studies in rodents have demonstrated complexity.²¹,²² Surprisingly, CK1ε does not phosphorylate PER2 at position 662 but instead acts to phosphorylate three serine residues nearby. Phosphorylation at 662 by an unknown enzyme acts as a priming event leading to CK1ε activity elsewhere. Furthermore, the S662G mutation does not decrease PER stability, as had been anticipated, but instead results in decreased transcription at PER2,²² although this is debated.²¹ The results support a model in which clock timing is regulated by expression, degradation, and nuclear entry and retention of PER2. In addition, there is modulation through multiple states of PER2 phosphorylation, some of which are not dependent on CK1δ/ε.²³ The search for this unknown kinase is ongoing.

Apart from the well-defined mendelian effect in FASPS, a study of 238 twin pairs found higher correlations for Horne-Ostberg (HO) diurnal preference scores among MZ twins, thus suggesting the presence of genetic factors influencing diurnal preference in the general population.²⁴ A number of association studies have recently examined a connection of the CLOCK gene with diurnal preference. The initial study of Katzenberg examined 410 white individuals of the Wisconsin Sleep Cohort.¹⁴ Individuals with the CLOCK 3111C allele in the 3′ untranslated region had lower HO scores, with a 10- to 44-minute delay in preferred timing of activity or sleep, suggesting that this SNP, or another SNP in tight LD, could underlie the effect. Further studies gave variable results, and the association was not found in a study of 105 normal subjects, 26 blind, or 16 delayed-sleep-phase patients,²⁵ but it was identified in a larger study of 421 Japanese subjects.²⁶ These results thus remain controversial, as indeed we could not replicate the association in an additional sample of the Wisconsin cohort (although overall results for the entire sample remain significant).

Similarly problematic results have been reported in the study of human PER gene polymorphisms. A purported association between the human PER3 locus and delayed sleep phase remains provisional. Although two groups have reported this general association,²⁷,²⁸ the small samples (16 and 48 patients with delayed sleep phase syndrome [DSPS]) were from different ethnic groups (Japan, United Kingdom, and the Netherlands), and they found similar but not equivalent associations. In one case, a rare five-marker haplotype containing the major variable number of tandem repeat (VNTR) four-repeat allele (G647, P864, 4-repeat, T1037, R1158) but not the VNTR four-repeat allele alone was present on seven predicted DSPS chromosomes total (15% carrier frequency) versus 2% of control Japanese.²⁸ In the other study, in England, the VNTR four-repeat allele was associated with HO scores and delayed sleep phase²⁷ in 484 subjects, 75% of whom were homozygous, an effect later suggested to be significant only in younger subjects through the study of HO extremes in a bigger sample.²⁹ Similarly, a T2434C polymorphism in HPER1 (rs2735611) was recently reported to be associated with extreme HO scores (80 individuals per group) drawn from 1590 British volunteers,³⁰ whereas in another, earlier 1999 study, G2548A (rs2253820) in PER1 was not associated with HO in 463 individuals drawn from the Wisconsin Sleep Cohort.³¹ The more recently published study, however, did not mention that these two PER1 polymorphisms are located within 114 base pairs of each other and are in almost complete LD (r² = 1), so that typing one is equivalent to typing the other. Considering the relatively small effect size and the broad spectrum of preferences reported in the general population, in contrast to the high penetrance and tight ranges of preferred activity in FASPS, a variety of combinations of different alleles at a number of circadian genes probably underlie diurnal preference in the general population.
Clearly, the next step is to greatly increase sample size (to several thousand subjects) to have power to exclude or confirm these prior studies. Resequencing of candidate genes in extremes, and finding familial clustering of the phenotype in relatives, to identify other rare strong effect alleles are also viable strategies. It is also likely that the HO, like any subjective assessment instrument, is less amenable to genetic analysis than more objective physiologic measures of circadian phase.
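The r² statistic quoted above for the two PER1 polymorphisms can be illustrated with the standard two-locus formula, r² = D² / [p_A(1 − p_A) p_B(1 − p_B)], where D = p_AB − p_A p_B. The sketch below is a minimal illustration with hypothetical allele and haplotype frequencies; it does not use the actual PER1 frequencies, which are not given in the text.

```python
def ld_r_squared(p_ab: float, p_a: float, p_b: float) -> float:
    """Squared correlation (r^2) between alleles A and B at two loci.

    p_ab is the frequency of the haplotype carrying both A and B;
    p_a and p_b are the frequencies of A and B individually.
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b                  # linkage disequilibrium coefficient D
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


# Hypothetical case 1: the two minor alleles always travel together.
r2_complete = ld_r_squared(p_ab=0.2, p_a=0.2, p_b=0.2)
# Hypothetical case 2: the two loci are statistically independent.
r2_independent = ld_r_squared(p_ab=0.04, p_a=0.2, p_b=0.2)
print(round(r2_complete, 3), round(r2_independent, 3))
```

When r² equals 1, as in the first case, the genotype at one marker fully predicts the genotype at the other, which is why testing both markers does not constitute two independent tests.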

Genetic Factors Regulating EEG and the Sleep Homeostat

The search for the genetic basis of selected EEG traits and the sleep homeostat is well underway using rodent models. Strong genetic effects are more clearly evident for spectral features of the EEG than for sleep architecture variability.³² Power in the delta and sigma frequency bands in slow-wave sleep (SWS) is associated with genetic background, and a deficiency in a single enzyme (acyl–coenzyme A dehydrogenase) results in slowed theta activity during sleep.³³ Quantitative trait locus (QTL) and mapping studies demonstrate that strong genetic effects on SWS-need (measured as rate of accumulation of delta power after extended wakefulness)⁷ may represent differences in synaptic plasticity mediated by the Homer1a gene through differential disruption of glutamatergic signaling complexes.³⁴,³⁵ These same studies show that other genes have greater effects on sleep need depending on genetic background, highlighting an issue of QTL models: one locus may represent a major gene in the context of two specific inbred strains, but the effect size in a more outbred population may be difficult to ascertain. This can complicate extrapolation to potential human phenotypes.

In humans, a PER3 genotype was recently reported to be associated with differences in EEG markers of sleep homeostasis after sleep deprivation (and behavioral consequences) (10 PER3^5/5 versus 14 PER3^4/4 individuals, respectively). This remains tentative, as the sample size was small and it has not yet been replicated by other groups.³⁶ It is also worth noting that multiple reports linking other phenotype differences to the PER3^5/5 genotype have all used the same initial sample³⁷,³⁸ and thus cannot be considered replications. The finding is interesting as it suggests that circadian genes may be involved in regulating not only circadian timing but also sleep homeostasis, consistent with studies in mice and Drosophila.¹ We expect that GWA studies will explore the genetic basis of the EEG in the near future.

Genetics of Narcolepsy

Heritability

Narcolepsy affects the control of sleep and wakefulness, and it is characterized by excessive daytime sleepiness, symptoms of dissociated REM sleep (sleep paralysis, hypnagogic hallucinations), disrupted nocturnal sleep, and cataplexy (brief episodes of muscle weakness triggered by emotions). Although most of these symptoms appear in the general population in the context of sleep deprivation or other sleep disorders, cataplexy is highly specific to narcolepsy.

Although narcolepsy is primarily sporadic, family and twin studies indicate a strong genetic basis for susceptibility to narcolepsy, as the prevalence of narcolepsy/cataplexy in first-degree relatives of probands is between 0.9% and 2.3%. Although this recurrence risk in siblings is low, it is considerably higher than the population prevalence and corresponds to a 20- to 40-fold increased risk.³⁹ MZ twins show a concordance rate of only 35%, indicating that narcolepsy/cataplexy results from an interaction of environmental factors on a susceptible genetic background.

Human Leukocyte Antigen in Narcolepsy

An association between narcolepsy and specific class II human leukocyte antigen (HLA) antigens (DR2 and DQ1) was first noted in the Japanese population.⁴⁰ HLA class II antigens are present on immune cells and function to present processed foreign peptides to T cells by engaging the T-cell receptor. The initial association was subsequently confirmed by many studies and further refined. DR2 and DQ1 are in complete linkage disequilibrium in Japanese, but substantially less so in African Americans. Using high-resolution mapping in different ethnic groups allowed refinement of the susceptibility region by examining the frequency of alternative haplotypes, and demonstrated that DQB1*0602 is the most specific marker for narcolepsy in all ethnic groups. Although 90% of narcolepsy cases are associated with DQB1*0602, this is a common allele across ethnic groups, ranging from 12% in Japanese to 38% in African Americans, and thus is not sufficient for the development of the disease. Other HLA alleles also influence susceptibility to narcolepsy. A study of 420 narcolepsy-cataplexy patients and 1087 controls⁴¹ identified additional predisposing alleles: DQB1*0301, DQA1*06, DRB1*04, DRB1*08, DRB1*11, and DRB1*12. Approximately 10% of narcolepsy-cataplexy patients are DQB1*0602 negative, but a large proportion of these carry the DQB1*0301 allele. Four protective alleles, DQB1*0601, DQB1*0501, DQB1*0603, and DQA1*01 (non-DQA1*0102), were also found. It is notable that whereas HLA DQB1*0602 confers susceptibility to narcolepsy, the very similar DQB1*0601 antigen is rather protective. Thus very minor changes in the peptide binding pockets of these molecules (where these differences localize) may determine disease risk.
Protective DQA1 alleles⁴¹⁻⁴¹ᵇ may form transdimers, reducing formation of the susceptibility heterodimer DQα1*0102/DQβ1*0602.⁴¹ᵃ It is clear that non-HLA genes also contribute to susceptibility, as the proportion of recurrence risk attributable to HLA is well below the relative risk observed in first-degree relatives.⁴¹

The tight association with DQB1*0602, the typical peripubertal onset, and the low concordance in MZ twins all suggest an autoimmune mechanism for narcolepsy. The association of MHC proteins, particularly class II antigens, is well recognized in a variety of autoimmune diseases, although narcolepsy shows the tightest such association (reviewed in reference 39). The interaction of HLA proteins with processed antigens determines the resulting immune response. However, some features of narcolepsy are not as consistent with a typical autoimmune mechanism, as females are not at increased risk. Surprisingly, there is little consistent direct evidence for humoral or cellular immunity in narcolepsy. The disease has not been transferred through injection of serum into mice, and the activity of T-cell subsets and natural killer cells was not altered in patients with narcolepsy.³⁹,⁴²,⁴³ Increased autoantibodies against Tribbles homolog 2 (TRIB2) were recently identified by three groups, and were more prevalent close to onset of cataplexy.⁴¹ᶜ⁻ᵉ
