Genetics of ALS



Fig. 17.1
Genomic locations of genes that have been studied in the biology of ALS. A complete account of every mutation that has been studied in the biology of ALS, regardless of quality of evidence for pathogenicity, is provided in the ALS online genetics database (ALSoD) [12]. Note that the mitochondrial genome (Chr M) is not drawn to scale with the remainder of the genome





Fig. 17.2
Cytopathological and genetic overlap between ALS and FTD (Adapted from Ling et al. [13] including data from the ALS online genetics database [12] and the Alzheimer Disease and Frontotemporal Dementia Mutation Database [14]). (a) The proportion of ALS and FTD cases whose cellular pathology is driven by inclusions positive for major ALS and FTD proteins is shown. A large proportion of ALS and FTD cases overlap in terms of cellular pathology, with TDP-43-positive and FUS-positive inclusions observed in both. (b) The percentage of the known mutations in an array of ALS and FTD genes that cause the two diseases (note that this does not denote the proportion of disease cases caused by particular mutations) is shown. Proportions are based on approximations from the ALS and FTD mutation databases and expert opinion derived from the extant literature [13]



Table 17.1
Genes associated with familial ALS

| Gene | Year | Mutations | Inheritance model | Gene function | Phenotype | Refs. |
|---|---|---|---|---|---|---|
| ALS2 | 2001 | 27 | AR | Guanine exchange factor | jALS, jPLS, iHSP | [16, 17] |
| ANG | 2004 | 29 | AD | Ribonuclease | ALS, PD | [18] |
| C9orf72 | 2011 | 1 | AD | Guanine exchange factor | ALS, ALS-FTD, PD | [19, 20] |
| DAO | 2010 | 2 | AD | Oxidation of d-amino acids | ALS, SCA2 | [21] |
| DCTN1 | 2003 | 6 | AD | Microtubule transport | DHMN-7B, PS, ALS, ALS-FTD | [22] |
| ERBB4 | 2013 | 2 | AD | Receptor tyrosine kinase | ALS | [23] |
| FUS | 2009 | 77 | AD, AR | RNA-binding protein | ALS, ALS-FTD, jALS, ETM4 | [24, 25] |
| HNRNPA1 | 2013 | 1 | AD | Heterogeneous nuclear ribonucleoprotein | ALS, MSP | [26] |
| MATR3 | 2014 | 4 | AD | RNA-binding protein | ALS, MPD2, MSP | [27] |
| OPTN | 2010 | 37 | AD, AR | TNFα signaling regulator | ALS, GLC1E | [28] |
| PFN1 | 2012 | 4 | AD | Actin-binding protein | ALS | [29] |
| SETX | 2004 | 8 | AD | DNA/RNA helicase | jALS, AOA2 | [30] |
| SIGMAR1 | 2011 | 1 | AR | Transmembrane receptor | jALS | [31] |
| SOD1 | 1993 | 177 | AD, AR | Superoxide dismutase | ALS | [32] |
| TARDBP | 2008 | 50 | AD | DNA/RNA-binding protein | ALS, ALS-FTD | [33] |
| UBQLN2 | 2011 | 6 | XD | Ubiquitin signaling | ALS, ALS-FTD | [34] |
| VAPB | 2004 | 2 | AD | Membrane protein | ALS, SMA | [35] |
| VCP | 2010 | 7 | AD | ATP-binding protein | ALS, IBMPFD | [36] |

Mutation counts are based on those reported in the ALS Online Genetics Database

AD autosomal dominant, AOA2 ataxia with oculomotor apraxia type 2, AR autosomal recessive, DHMN-7B distal hereditary motor neuropathy type 7b, ETM4 essential tremor type 4, GLC1E glaucoma type 1e, IBMPFD inclusion body myopathy with Paget’s disease and FTD, iHSP infantile hereditary spastic paraplegia, jALS juvenile ALS, jPLS juvenile primary lateral sclerosis, MPD2 distal myopathy 2, MSP multisystem proteinopathy, PD Parkinson’s disease, PS Perry syndrome, SCA2 spinocerebellar ataxia type 2, SMA spinal muscular atrophy, XD X-linked dominant



Inherited Risk Factors


Before large-scale population-based methods became feasible for the discovery of ALS genes, linkage mapping in fALS kindreds was the principal method of choice. In 1993, the first ALS-causing mutations were discovered in SOD1 (encoding superoxide dismutase 1) [32]. This was a major breakthrough as it permitted the generation of the first credible transgenic rodent model of ALS, the SOD1 G93A mouse, which in turn has driven most of the research concerning the translational biology of ALS over the past 15 years. However, more recent research has suggested that the pathology of SOD1-related ALS differs from that of the majority of sporadic disease, as the pathology of mutant SOD1 is not associated with deposition of TDP-43, a hallmark of classical ALS and most forms of FTD. Moreover, the epidemiology of SOD1 mutations varies across populations; while rare in Irish [37] and Dutch [38] populations, SOD1 mutations account for up to 23 % of familial ALS in the United States [39] and 18 % in Italy [40]. Since the initial discovery that SOD1 mutations cause ALS, 177 mutations in the gene have been associated with ALS etiology, with both autosomal dominant and autosomal recessive transmission. However, the evidence for pathogenicity varies for these mutations and some may represent incidental findings of benign polymorphisms [41]. Characteristic phenotypes have been described in association with some mutations of undoubted pathogenicity. For example, while homozygous carriers of the D90A mutation [42] and H46R carriers [43] exhibit a very long disease course (in excess of 10 years from symptom onset), A4V carriers typically die within a year of symptom onset [39].

For almost a decade, SOD1 remained the only gene in which mutations were known to cause ALS. From 2001, advances in linkage mapping and candidate gene resequencing led to the discovery of five new genes in quick succession (Table 17.1): ALS2 (encoding alsin) [16, 17], DCTN1 (dynactin subunit 1) [22], ANG (angiogenin) [18], SETX (senataxin) [30], and VAPB (vesicle-associated membrane protein-associated protein B) [35]. However, with the possible exception of ANG, none of these genes was associated with typical ALS, and both ANG and DCTN1 are considered by many to be susceptibility genes rather than single genes of major effect. Moreover, the population frequency of putative pathogenic variants for all five genes is less than 1 % of all fALS. Likewise, DAO (d-amino acid oxidase) [21] and SIGMAR1 (sigma non-opioid intracellular receptor 1) [31], discovered later, in 2010 and 2011, respectively, have received little attention since their initial publication as mutations in these genes are likely to represent a very small proportion of all ALS.

Two discoveries, one in 2008 (TARDBP, encoding TAR DNA-binding protein 43 or TDP-43) [33] and the second in 2009 (FUS, encoding fused in sarcoma) [24, 25], represented a major advance in the genetics of ALS. Jointly, they opened new approaches to understanding the neurobiology of ALS, as both genes are involved in RNA trafficking and processing [44]. The discovery of mutations in TARDBP was of particular importance, given that cytoplasmic accumulation of its protein product, TDP-43, had been shown 2 years earlier to be the major pathological hallmark of ALS and FTD [6]. With the exception of Sardinia [45], however, mutations in this gene are a relatively rare cause of ALS, indicating that the pathological signature of ubiquitinated TDP-43-positive inclusions is generally a consequence of a convergent pathological mechanism as opposed to direct perturbation of TDP-43 function through TARDBP mutation.

The following year, OPTN (optineurin) was discovered as a cause of an autosomal recessive form of ALS in Japanese families through a combination of homozygosity mapping and candidate gene resequencing [28]. Like many ALS genes, OPTN is pleiotropic: mutations also cause primary open-angle glaucoma, and the locus has been implicated in the etiology of Paget’s disease of bone [46]. A similar pleiotropic phenotype is observed for VCP (valosin-containing protein; Table 17.1), another gene discovered in 2010, and for SQSTM1, a gene that has been implicated in ALS but currently without supporting evidence of familial segregation. Taken together, these findings indicate a clinical overlap between some forms of Paget’s disease and ALS [47]. OPTN mutations have been shown not to be a common cause of ALS in European populations [37, 48], further underscoring the population specificity and ethnic differentiation of the frequencies of ALS-causing mutations.

In 2011, UBQLN2 (ubiquilin 2) mutations were discovered in X-linked dominant ALS [34]. In these cases, transmission cannot be male to male because males, who carry one X and one Y sex chromosome, necessarily inherit their X chromosome from the mother (females carry two copies of the X chromosome and no Y chromosome). UBQLN2 mutations in ALS are rare [49, 50], but the potential importance of the ubiquitin signaling pathway in motor neuron degeneration highlights the central role that the gene may come to play in our understanding of the pathogenic mechanisms underlying ALS.

The discovery of an ALS-causing mutation in C9orf72 (chromosome 9 open reading frame 72) in 2011 represented a major milestone in ALS genetics research [19, 20]. Its location on chromosome 9p21 had been known to be an important ALS locus for several years following evidence from familial linkage [51] and genome-wide association studies [52–54]. Extensive sequencing by many groups failed to identify credible pathogenic variants within the locus until segregation analysis coupled with extensive next-generation sequencing of the region eventually led to the discovery of a hexanucleotide repeat expansion in the first intron or promoter region (depending on transcript variant) of the gene. Typically, an unaffected individual would carry three repeats of the hexanucleotide sequence GGGGCC, but in repeat expansion carriers, more than 30 repeats appear to be sufficient to cause ALS, with most mutation-carrier patients harboring hundreds or thousands of repeats. The mutation appears to have derived from a single European founder [55], and allelic heterogeneity in terms of the length of the repeat expansion indicates inherent instability of the sequence.
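
The repeat-length criterion described in this paragraph can be illustrated with a short sketch. It is a minimal example only, assuming the sequence is already available as a plain string; the function names and the toy sequences are hypothetical, and treating 30 repeats as a hard cut-off is a simplification of the text's "appears to be sufficient". It does not address the practical difficulty of sizing long expansions.

```python
import re

REPEAT_UNIT = "GGGGCC"       # hexanucleotide named in the text
EXPANSION_THRESHOLD = 30     # illustrative cut-off taken from the text


def longest_repeat_run(sequence: str, unit: str = REPEAT_UNIT) -> int:
    """Return the largest number of back-to-back copies of `unit` in `sequence`."""
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    runs = [len(m.group(0)) // len(unit) for m in pattern.finditer(sequence.upper())]
    return max(runs, default=0)


def is_expanded(sequence: str) -> bool:
    """Flag a sequence whose longest tandem run exceeds the illustrative threshold."""
    return longest_repeat_run(sequence) > EXPANSION_THRESHOLD


# Toy sequences, made up for illustration (not real genomic data).
typical = "ACGT" + REPEAT_UNIT * 3 + "TTAG"
expanded = "ACGT" + REPEAT_UNIT * 45 + "TTAG"

print(longest_repeat_run(typical), is_expanded(typical))
print(longest_repeat_run(expanded), is_expanded(expanded))
```

The sketch counts only uninterrupted tandem copies, so a single sequencing error inside a run would split it into two shorter runs; a real pipeline would need to tolerate such interruptions.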

With an unstable expanded repeat sequence, there can be a propensity for the size of the repeat expansion to increase in subsequent generations. This is a principle known as anticipation, and there is some evidence to suggest that this may be the case in C9orf72-mediated ALS, with consecutive generations exhibiting younger age of onset [56]. However, inherent difficulties in accurately measuring the length of the repeat expansion, coupled with somatic mosaicism of the repeat expansion length (rendering convenient tissues like blood inappropriate for accurate genetic testing), complicate a straightforward interpretation of this finding. It has also been suggested that variation in the length of the repeat expansion may confer the clinical variation observed within repeat expansion carriers, as the mutation is also observed in FTD, Alzheimer’s disease, Parkinson’s disease, and Lewy body dementia [57]. A much broader ALS-FTD phenotype has also been associated with the repeat expansion in C9orf72, comprising a range of neuropsychiatric conditions including psychosis, bipolar affective disorder, and obsessive-compulsive disorder [9, 58, 59].

Since the discovery of the C9orf72 repeat expansion as a major cause of ALS, a substantial body of work has been generated, describing the clinical characteristics of mutation carriers, its population genetics, and its likely role in the cellular pathology of ALS. The C9orf72 repeat expansion currently represents the most common known cause of ALS in populations of European extraction, explaining up to 40 % of fALS and 7 % of sALS, depending on the population studied [60]. Conversely, the variant is rare in populations of non-European extraction. In European populations, the observation that a fALS gene is present in a high percentage of apparently sporadic ALS reinforces the indefinite nature of the distinction between the two forms of the disease. C9orf72 repeat expansions are also present in up to 25 % of familial FTD of European extraction [60], providing the strongest genetic basis to date for the clinical overlap between the two diseases. This is further reinforced by a significantly higher extent of comorbid FTD in ALS patients with the repeat expansion than in patients without it [61].

Genes more recently implicated in fALS etiology include PFN1 (profilin 1) [29], ERBB4 (v-erb-a erythroblastic leukemia viral oncogene homolog 4) [23], HNRNPA1 (heterogeneous nuclear ribonucleoprotein A1) [26], and MATR3 (matrin 3) [27]. Owing to their recent discovery as fALS genes, it remains to be determined exactly how important each of these genes is in ALS pathophysiology. PFN1 mutations have been shown to be a rare cause of ALS [62], but the discovery of ALS mutations in this gene may be an important event in our understanding of the biology of ALS as it implicates disruption of the cytoskeleton as a novel disease mechanism. The same is true for ERBB4 mutations, which implicate the neuregulin pathway. HNRNPA1 mutations are observed in multisystem proteinopathy, providing further evidence of overlap between ALS and other clinical syndromes. A similar pleiotropic phenotype is observed in carriers of mutations in VCP and MATR3 [27]. The MATR3 protein is also a TDP-43 interactor and an RNA- and DNA-binding protein, so an understanding of the role of this gene in ALS biology is likely to shed further light on convergent underlying disease mechanisms.


Non-inherited Risk Factors


ALS may not be an entirely genetic disease, and some level of environmental contribution, or gene × environment interaction, may play a role. Non-inherited risk can also derive from genetic factors. For most of the genes listed in Table 17.1, affected individuals inherit the ALS-causing mutations from their parents. However, mutations can also arise spontaneously (termed de novo mutations) and would therefore not be observed in the non-germ cells of parents of affected individuals. The parents would also not be expected to manifest the disease. Examples of de novo mutations have been observed in SOD1 [63] and FUS [64, 65], with the latter often exhibiting an aggressive, early-onset, and short disease course [66]. A de novo FUS mutation has even been observed in a case of apparently familial ALS [67], highlighting the potential for the coincidence of two independent causes of ALS in the same fALS pedigree.

A systematic search for undiscovered de novo mutations was published in 2013, using exome sequencing (high-throughput sequencing of the entire protein-coding portion of the genome) in ALS trios [68]. In this study, the unaffected parents of ALS patients were sequenced along with the patients themselves with the assumption that, in these cases, a de novo mutation is the cause of ALS in the affected offspring. In any given parent-offspring trio (affected or unaffected by ALS), the expected number of de novo mutations in the exome of the offspring is roughly 1 [69], and directly implicating these rare events in ALS is accordingly challenging. Nevertheless, the authors used functional evidence to indicate a role for de novo mutations in SS18L1 (synovial sarcoma translocation gene on chromosome 18-like 1; also known as CREST) in ALS etiology, and put forward suggestive evidence for a number of other de novo mutations.

Although the twin-based heritability of ALS is high (38–78 %) [10], heritability estimates describing the proportion of phenotypic variance of ALS attributable to common genetic variation are lower (two independent studies estimate this to be 11.0–12.7 % [70] and 17.1–24.9 % [71]). Therefore, although a large component of ALS risk is heritable, much of this heritability is unlikely to be due to common genetic variation. This suggests that the portion of ALS risk accounted for by undiscovered genetic risk factors may be explained, in part, by de novo mutations. Many of the undiscovered heritable risk factors are, however, likely to be due to rare mutations which are not well captured by common genetic variation. The discovery of the undetermined causes of ALS is therefore the focus of intensive research in the international research community.


Other Genes and Missing Heritability


Even without evidence of segregation of mutations within a pedigree, a gene can still be associated with disease. If the risk allele of a mutation (or a nearby benign polymorphism) is present at much higher frequency in a population-based cohort of sALS patients than a corresponding cohort of unaffected individuals or if there is functional evidence to support the role of a mutation in the pathological mechanisms underlying ALS, then the gene is considered to be involved in its etiology. Usually, these mutations would be associated with ALS in sporadic cohorts and they can be discovered through a variety of methodologies. Table 17.2 details some of the major genes to be associated with sALS by these methods.


Table 17.2
Genes associated with sporadic ALS

| Gene | Year | Method | Details | Refs. |
|---|---|---|---|---|
| ATXN2 | 2010 | CG association testing | Intermediate CAG repeat expansions associated with ALS risk; longer expansions cause SCA | [72] |
| CABIN1 | 2013 | GWAS | Association with disease susceptibility in single-marker analyses | [73] |
| CAMK1G | 2013 | GWAS | Association with disease susceptibility in single-marker analyses | [73] |
| CHMP2B | 2006 | CG resequencing | Rare heterozygous mutations identified among ALS patients | [74, 75] |
| CRMP4 | 2013 | CG resequencing | Rare missense mutation at higher frequency in ALS cases than in controls; specific to France | [76] |
| DPP6 | 2008 | GWAS | Association with disease susceptibility in single-marker and CNV analyses | [77–79] |
| ELP3 | 2009 | GWAS | Association with disease susceptibility in single-marker analyses. Supportive evidence from a Drosophila mutagenesis screen | [80] |
| FIG4 | 2009 | CG resequencing | Rare heterozygous mutations identified among ALS patients | [81] |
| FGGY | 2007 | GWAS | Association with disease susceptibility in single-marker analyses | [82] |
| ITPR2 | 2007 | GWAS | Association with disease susceptibility in single-marker analyses | [83] |
| KIFAP3 | 2009 | GWAS | Association with patient survival in single-marker analyses | [84] |
| MAPT | 2001 | CG association testing | Association with susceptibility to the Guam ALS-PDC | [85] |
| NIPA1 | 2010 | GWAS/CG resequencing | Association between disease susceptibility and deletions/polyalanine repeat expansions | [77, 86] |
| NEFH | 1999 | CG resequencing | Supported by additional reports of mutation carriers among ALS cases | [87] |
| PARK7 | 2005 | CG resequencing | Homozygous mutation carriers identified among patients with ALS-PDC | [88] |
| PON1–3 | 2006 | CG association testing | Associations with disease risk, but original studies did not account for multiple testing | [89–93] |
| SPG4 | 2005 | CG resequencing | Heterozygous mutation identified in an individual with atypical ALS | [94] |
| SPG11 | 2010 | CG resequencing | Mutations observed in homozygous configuration among patients with autosomal recessive juvenile ALS | [95] |
| SUSD2 | 2013 | GWAS | Association with disease susceptibility in single-marker analyses | [73] |
| SQSTM1 | 2011 | CG resequencing | Excess in the frequency of rare variants among cases. Supported by additional reports of mutation carriers among ALS cases | [96, 97] |
| TAF15 | 2011 | CG resequencing | Mutations observed among fALS cases | [98] |
| UNC13A | 2009 | GWAS | Associations with disease risk and patient survival | [52] |

ALS-PDC ALS-parkinsonism-dementia complex, CG candidate gene, CNV copy number variant, GWAS genome-wide association study, SCA spinocerebellar ataxia

One gene of particular importance in Table 17.2 is SQSTM1 [96, 97], which encodes the protein p62, a major component of pathological protein aggregates in ALS. Although evidence of segregation has not yet been observed in pedigrees, the finding that mutations in SQSTM1 account for around 1 % of ALS cases provides, as with TARDBP and TDP-43, data relating causative genetic lesions to underlying protein pathology. Another important gene is ATXN2 (ataxin 2). In this gene, a polyglutamine repeat expansion in exon 1 causes spinocerebellar ataxia (SCA), but intermediate-length repeat expansions (27–33 repeats) are associated with ALS [72]. ATXN2 repeat expansions of this length are, however, present in the general population at a frequency of around 2.4 %, implying that this mutation exerts its effect with incomplete penetrance. Around 5.5 % of ALS patients harbor repeat expansions of intermediate length, indicating that this allele size range increases disease risk around 2.3-fold [72].
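
The arithmetic behind the ATXN2 risk estimate can be checked with a short calculation. This is a sketch only, assuming that the general-population frequency (2.4 %) can stand in for the control frequency; the function names are hypothetical, and the cited study may have derived its figure differently.

```python
def frequency_ratio(p_cases: float, p_controls: float) -> float:
    """Ratio of carrier frequency in cases to carrier frequency in controls."""
    return p_cases / p_controls


def odds_ratio(p_cases: float, p_controls: float) -> float:
    """Odds of carrying the allele in cases divided by the odds in controls."""
    return (p_cases / (1 - p_cases)) / (p_controls / (1 - p_controls))


# Frequencies quoted in the text for intermediate-length ATXN2 expansions.
P_ALS = 0.055       # carriers among ALS patients
P_GENERAL = 0.024   # carriers in the general population (assumed control rate)

print(f"frequency ratio: {frequency_ratio(P_ALS, P_GENERAL):.2f}")
print(f"odds ratio:      {odds_ratio(P_ALS, P_GENERAL):.2f}")
```

Both measures land between roughly 2.3 and 2.4, in line with the fold increase quoted in the text.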

Taken together, Tables 17.1 and 17.2 present a large list of genes implicated in ALS, which may lead to the conclusion that the underlying genetic etiology of the condition is well understood. However, with the exception of C9orf72, and depending on population, each gene detailed in Tables 17.1 and 17.2 contributes very little to the overall percentage of cases of ALS. For example, in Ireland, only around 10 % of cases of ALS (sALS and fALS combined) can be explained by established genetic mutations in ALS-associated genes [37]. However, heritability estimates indicate that genetic risk factors play a role in a significant proportion of the unexplained cases – an observation termed missing heritability. One method that has dominated the search for the missing heritability of ALS (and a huge number of other diseases) in recent years is the genome-wide association study (GWAS).

A GWAS involves the simultaneous genotyping of hundreds of thousands of genetic markers (usually single nucleotide polymorphisms or SNPs) in a large cohort of individuals exhibiting a particular trait (for example, ALS), and, if it is a case-control study, a large cohort of control individuals not displaying the trait. The genetic markers act as proxies for nearby genetic variation (for example, disease-causing mutations) by virtue of the fact that the genome is inherited as a block-like mosaic of the haplotypes observed in an individual’s parents, and neighboring alleles are usually inherited together (they are linked). Neighboring alleles are linked because genetic recombination only occurs periodically on a chromosome, so two positions that are physically close on a chromosome are less likely to be separated by recombination than distant positions. Therefore, if an allele of a particular genetic marker is observed significantly more frequently in, for example, ALS cases, than in controls, the genetic locus surrounding the marker is implicated in the disease.
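
The single-marker comparison described above can be sketched as a test on a 2 × 2 table of allele counts. The example below is a minimal illustration, assuming a Pearson chi-square test with one degree of freedom; the function name and the allele counts are invented for illustration, and published studies typically use more elaborate models that adjust for covariates.

```python
import math


def allelic_chi_square(case_alt: int, case_ref: int, ctrl_alt: int, ctrl_ref: int):
    """Pearson chi-square test (1 df) on a 2x2 table of allele counts.

    Returns the test statistic and its p-value.
    """
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    row_totals = [sum(row) for row in table]
    col_totals = [case_alt + ctrl_alt, case_ref + ctrl_ref]
    total = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical marker: 1,000 cases and 1,000 controls (2,000 alleles each).
chi2, p = allelic_chi_square(case_alt=700, case_ref=1300, ctrl_alt=600, ctrl_ref=1400)
print(f"chi-square = {chi2:.2f}, p = {p:.2e}")
```

With these made-up counts the marker is significant at a conventional 0.05 level, yet its p-value is far larger than the thresholds a genome-wide scan requires once the number of markers tested is taken into account.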

Because a GWAS conducts many hundreds of thousands of independent statistical tests, the experiment must have an extremely low level for alpha (the p-value threshold below which a result is considered statistically significant) in order that truly significant results stand out from those that simply represent chance variation instead of systematic case-control effects. Because of this, unless effect sizes are large, the cohort sample sizes required to attain such extreme p-values number in the thousands. The extreme stringency required for alpha is a consequence of what is often referred to as the multiple testing problem, and it represented a major issue for the early GWAS efforts in ALS [78, 82, 83], whose sample sizes were too low to detect significant case-control associations. These studies did, however, indicate the extent of the expected heterogeneity within the undiscovered genetic causes of ALS, and the problem of small cohort sizes would come to be addressed later by much larger GWAS that were made possible through international collaboration [52, 54, 70, 99]. However, these larger studies have, to date, still only identified a small handful of significantly, and replicably, associated loci. Nevertheless, they proved invaluable in the identification of C9orf72 as a major risk locus [52–54], and ongoing efforts are likely to continue to contribute to our understanding of the etiology of ALS.

Apart from the C9orf72 locus, other notable loci implicated in ALS by GWAS include chromosomes 19p13.3 (representing UNC13A as a risk locus and as a modifier of disease duration) [52], 1p34.1 [99], 17q11.2 [70], 1q32.2 (CAMK1G), and 22q11.23 (CABIN1 and SUSD2). The latter three loci were recently implicated in a GWAS of Han Chinese ALS patients [73] and had not previously been implicated in studies involving patients of European descent, indicating the utility of extending GWAS to worldwide ethnic groups. Although many of the loci implicated by GWAS represent excellent candidate genes for ALS etiology with strong supporting evidence, determining the causative genetic lesion in each case has not been trivial and is the focus of ongoing research.

A potential confounder in the search for novel genetic loci involved in ALS is the population differentiation observed in the frequency of ALS-causing mutations. If an undiscovered locus contributes differently to disease risk in different populations, this can mask the size of the observed effect if case and control cohorts are imbalanced in terms of their representation from each population (termed population stratification). Although large-scale international GWAS usually carefully control for this possibility, parallel studies involving single populations can serve to detect population-specific risk loci, particularly when a founder effect may play a major role. Furthermore, there is evidence that genetic admixture protects against ALS, with lower mortality rates observed in populations of mixed ancestry, indicating that undiscovered causes of ALS may act through recessive or oligogenic mechanisms [100]. The oligogenic basis for ALS has been supported by the observation of co-inheritance of mutations in more than one ALS gene in some ALS patients [101], providing a possible explanation for much of the missing heritability and incomplete penetrance observed in ALS. However, formally searching for the co-inheritance of novel risk loci by methods such as GWAS is extremely difficult, due to the number of combinations of potential loci to test (known as the curse of dimensionality). This can be ameliorated by reducing the number of loci tested using prior knowledge; to this end, an understanding of the underlying biological mechanisms contributing to ALS pathophysiology is extremely useful.
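
The scale of the combinatorial problem mentioned above can be shown by counting the number of locus combinations that would need to be tested. This is a rough sketch; the marker counts (500,000 genome-wide and 1,000 after filtering on prior knowledge) are hypothetical figures chosen for illustration.

```python
import math


def n_combinations(n_loci: int, order: int) -> int:
    """Number of distinct sets of `order` loci that can be drawn from `n_loci`."""
    return math.comb(n_loci, order)


GENOME_WIDE = 500_000   # hypothetical number of markers in an unfiltered scan
PRIORITIZED = 1_000     # hypothetical number retained using prior biological knowledge

for label, n in (("genome-wide", GENOME_WIDE), ("prioritized", PRIORITIZED)):
    print(f"{label}: {n_combinations(n, 2):,} pairs, {n_combinations(n, 3):,} triples")
```

Restricting the search to a prioritized set reduces the number of pairwise tests from over a hundred billion to under half a million in this example, which in turn relaxes the multiple-testing burden.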
