Cognitive Function Clinic, Walton Centre for Neurology and Neurosurgery, Liverpool, UK
4.1.1 MMSE Ala Subscore
4.3.1 Backward Clock Test
4.5.1 ACE Subscores: VLOM Ratio, Standardized Verbal Fluency, Semantic Index, and Modified Ala Subscore
4.5.2 ACE Meta-analysis
4.6.1 ACE-R Meta-analysis
4.8 DemTect
4.12 Poppelreuter Figure
Abstract
This chapter examines the diagnostic utility of various cognitive screening instruments, both multidomain and single domain tests, in the diagnosis of cognitive disorders. Methods to compare these tests, in terms of various diagnostic parameters and weighted comparisons, are also examined.
Keywords
Dementia; Diagnosis; Cognitive screening instruments; Meta-analysis; Weighted comparison
In addition to the standard clinical methods of history taking and examination (see Chap. 3), a large number of cognitive screening instruments (CSI) or assessment tools has become available to assist in the diagnosis of patients with cognitive complaints (for compendia, see for example Burns et al. 2004; Tate 2010; Larner 2013a). These have superseded the qualitative methods of earlier times, for example fixing the year at a much earlier date than it actually is being taken as evidence for disorientation in time (Allison 1962:175).
The available screening instruments encompass not only cognitive but also behavioural, psychiatric, and functional scales (see Chap. 5). Neuroimaging and other investigation techniques may also be required for adequate patient assessment and diagnosis (see Chap. 6). Application of consensus diagnostic criteria for dementia or dementia subtype (see Chap. 2) usually presupposes the use of at least some of these investigations, and although the diagnostic utility of such criteria is generally found to be good, they may sometimes mislead if there are atypical clinical features which fall outwith the criteria or are deemed exclusionary, for example an apparently acute onset of neurodegenerative disease (Larner 2005a), or epileptic seizures early in the course of Alzheimer’s disease (Lozsadi and Larner 2006; see Sect. 7.2.3).
Formal neuropsychological assessment by a neuropsychologist may be desirable in some cases of cognitive disorder, but these resources are not universally available and such assessment, usually encompassing tests of intelligence such as the Wechsler Intelligence Scale and potentially many other tests (Mitrushina et al. 2005; Strauss et al. 2006; Lezak et al. 2012), is often time-consuming and fatiguing for patients, sometimes requiring multiple outpatient visits. Hence, tests applicable by clinicians within the clinic room, so called “bedside” neuropsychological tests or “near patient testing” (i.e. results available without reference to a laboratory and rapidly enough to affect immediate patient management; Delaney et al. 1999:824), are desirable (Burns et al. 2004; Tate 2010; Larner 2013a). These tests may be evaluated on theoretical (Cullen et al. 2007) or pragmatic (Woodford and George 2007) grounds, but the proof of the pudding is in the eating, meaning that empirical evaluation of these instruments in the clinical setting (i.e. pragmatic studies) must be undertaken (see Sect. 2.3). Clearly, only a small selection of the many CSI potentially available can be sampled in any one clinic. A number of desiderata for CSI have been formulated (Malloy et al. 1997; Larner 2013b; see Box 4.1).
Box 4.1: Desiderata for Cognitive Screening Instruments
Ideally should take <15 min to administer by a clinician at any level of training.
Ideally should sample all major cognitive domains, including memory, attention/concentration, executive function, visual-spatial skills, language, and orientation.
Should be reliable, with adequate test-retest and inter-rater reliability.
Should be able to detect commonly encountered cognitive disorders.
Should be easy to administer, i.e. not much equipment required beyond pencil and paper, or laptop computer.
Should be easy to interpret, i.e. clear test cutoffs, perhaps operationalised, e.g. a particular score on the test should lead to particular actions, such as patient reassurance, continued monitoring of cognitive function over specified time periods, or immediate initiation of further investigations and/or treatment.
As discussed in Chap. 2, CSI with high sensitivity may be particularly desirable, at the risk of false positives, in order to identify as many mild cases as possible (i.e. those with mild cognitive impairment [MCI] or prodromal Alzheimer’s disease [AD]) in order to initiate treatment and management strategies early in the disease course.
CSI may be broadly classified according to whether they test general (multidomain) or specific cognitive functions (Mitchell and Malladi 2010a, b; Tate 2010). CSI which attempt broad, multidomain, sampling include the Mini-Mental State Examination, Mini-Mental Parkinson, the Addenbrooke’s Cognitive Examination and its revision, DemTect, the Montreal Cognitive Assessment, and the Test Your Memory test (see Sects. 4.1, 4.2, 4.5, 4.6, 4.8, 4.9 and 4.10 respectively). Generally, the more comprehensive the neuropsychological coverage of a test, the longer it takes to administer, although the Clock Drawing Test (Sect. 4.3) may be an exception.
Although falling outwith some desiderata for CSI (Box 4.1), tests which are restricted to the examination of specific cognitive functions may nonetheless have a place in patient assessment (Mitchell and Malladi 2010b). For example, since episodic memory impairment is typically the earliest deficit manifest in AD patients, tests for anterograde amnesia may be appropriate, such as the Memory Impairment Screen (Buschke et al. 1999), the Free and Cued Selective Reminding Test (Grober and Buschke 1987), the Five Words Test (Dubois et al. 2002), and the Visual Association Test (Lindeboom et al. 2002). Similarly, there are tests specifically sensitive to executive function, such as the Frontal Assessment Battery (Sect. 4.11), and to visuoperceptual function, such as the Poppelreuter (overlapping) figure (Sect. 4.12).
It is important to emphasize from the outset that CSI are not stand-alone diagnostic measures. In patients whose performance falls below designated cutoffs consideration will need to be given to whether further investigations are required to ascertain a cause for their apparent cognitive impairment. Impaired performance on CSI may result from a number of variables besides disease state, including affective disorder (anxiety, depression), sleep disturbance, low premorbid abilities, medication use, and economy of effort (be that disease-related, subconscious, or wilful as in malingering). Some of these non-cognitive factors may also need to be assessed, formally or informally, during the clinical encounter. It is also important to emphasize that qualitative clinician-patient interaction during CSI administration may inform clinical judgements over and above any raw test scores, and it is for this reason that collaborative multi-agency judgements (“diagnosis by committee”), though advocated in some models of service (Banerjee et al. 2007), do, in this author’s opinion, present possible risks.
4.1 Mini-Mental State Examination (MMSE)
The Mini-Mental State Examination (MMSE; Folstein et al. 1975) has been the most commonly used bedside test of cognition, with almost 40 years of cumulative experience with its use, as a consequence of which it has been regarded as a benchmark against which newer CSI are measured (Mitchell 2013). The MMSE was originally designed to differentiate organic from functional disorders in psychiatric practice, and as a quantitative measure of cognitive impairment useful in monitoring change, but not primarily as a diagnostic tool (Folstein et al. 1975). However, it has proved acceptable and useful in the assessment of cognitive status in general medical and neurological patients (Dick et al. 1984; Tangalos et al. 1996; Ridha and Rossor 2005) and has become the most widely used brief cognitive assessment. Surely no other medical investigation can claim to have been memorialised, at least in part, in a sonnet (by Rafael Campo; see Levin 2001:334)! The enforcement of copyright restrictions on the use of the MMSE in recent years may adversely impact on its future use (Newman and Feldman 2011; Seshadri and Mazi-Kotwal 2012).
MMSE has good intra- and inter-rater reliability and internal consistency, although debate continues about interpretation and appropriate cutoff scores (Tombaugh and McIntyre 1992; Nieuwenhuis-Mark 2010). Patient age and years of education influence MMSE scores, norms for which may be factored into the cutoffs (Crum et al. 1993) although this is seldom done in practice. Meta-analysis of MMSE diagnostic validity studies in dementia indicates that it performs best in a rule-out (screening) capacity, consistent with its high negative predictive value, but is of more limited value for identification of MCI (Mitchell 2009, 2013).
MMSE may also be useful in tracking cognitive decline in AD (Han et al. 2000), falling on average by 3 points per year, although there is variability, with some untreated patients remaining stable or even improving (Holmes and Lovestone 2003). In the UK, the MMSE has been the required instrument for monitoring the efficacy of treatment with cholinesterase inhibitors for AD (National Institute for Clinical Excellence 2001), even though there is evidence to suggest that it is unsuitable for this purpose (Bowie et al. 1999; Holmes and Lovestone 2003; Davey and Jamieson 2004).
As the item content shows (Box 4.2), MMSE is dominated by language based tests and is perfunctory in its testing of memory, visuoperceptual and executive functions.
Analyses have shown that certain MMSE items are statistically significant predictors of the diagnosis of AD (especially recall memory and orientation to place, with, in decreasing order of significance, copying pentagons, failed serial 7s, and orientation to time) whilst other items (registration, naming, repetition, three-step verbal command, written command, writing a sentence) are only weak predictors (Galasko et al. 1990). An examination of the factorial structure of the MMSE found most of the variance to be accounted for by the orientation in time, delayed recall, attention/concentration, and copying pentagons tasks, with measures of comprehension (three-step command, written command) showing low sensitivity with performance often at ceiling (Brugnolo et al. 2009). The attention/concentration items (serial 7s or spelling WORLD backwards) differ in item difficulty (serial sevens more difficult) and scores are weakly correlated (Ganguli et al. 1990). The language repetition item is often failed by healthy adults, possibly related to poor hearing or attention (Valcour et al. 2002), and it is difficult to translate into other languages (Werner et al. 1999). A number of short MMSE variants have been developed which attempt to exploit these various observations by using only those MMSE elements with high predictive value (Davies and Larner 2013a).
Box 4.2: Item Content of Mini-Mental State Examination (MMSE)
Orientation | 10 |
Registration | 3 |
Attention/Concentration (serial 7s or DLROW) | 5 |
Memory recall | 3 |
Language naming | 2 |
Language comprehension: | |
“Close your eyes” | 1 |
3 stage command | 3 |
Language writing | 1 |
Language repetition | 1 |
Visuospatial abilities (intersecting pentagons) | 1 |
Total score | 30 |
The diagnostic utility of MMSE for the diagnosis of dementia in day-to-day clinical practice has been examined in several separate studies in CFC (Hancock and Larner 2009, 2011; Larner 2005b, 2009a, b, 2012a, b, 2013c).
In a study of the Addenbrooke’s Cognitive Examination (see Sect. 4.5), which incorporates the MMSE (Larner 2005b), MMSE diagnostic utility was investigated at cutoffs of 27/30 and 24/30 (Table 4.1), with results comparable to those found for the MMSE in other studies of the ACE (Mathuranath et al. 2000; Bier et al. 2004).
Table 4.1
Demographic and diagnostic parameters for MMSE for diagnosis of dementia (1) (Larner 2005b)
MMSE | ||
---|---|---|
N | 154 | |
Prevalence dementia | 51 % | |
M:F (% male) | 87:67 (56) | |
Age range (years) | 25–84 | |
Cutoff | 27/30 | 24/30 |
Accuracy | 0.81 (0.74–0.87) | 0.79 (0.73–0.86) |
Sensitivity (Se) | 0.91 (0.84–0.97) | 0.73 (0.63–0.83) |
Specificity (Sp) | 0.70 (0.60–0.80) | 0.86 (0.78–0.94) |
Y | 0.61 | 0.59 |
PPV | 0.75 (0.66–0.84) | 0.84 (0.75–0.92) |
NPV | 0.88 (0.81–0.97) | 0.76 (0.67–0.85) |
PSI | 0.63 | 0.60 |
LR+ | 3.04 (2.14–4.31) = small | 5.09 (2.90–8.95) = moderate |
LR− | 0.13 (0.09–0.18) = moderate | 0.32 (0.18–0.56) = small |
DOR | 23.5 (16.6–33.3) | 16.0 (9.10–28.1) |
CUI+ | 0.68 (good) | 0.61 (adequate) |
CUI− | 0.62 (adequate) | 0.65 (good) |
AUC ROC curve | 0.88 (0.83–0.94) |
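The summary indices in Table 4.1 can be recovered from the four basic parameters. The following is a minimal sketch, assuming the usual definitions (Youden index Y = Se + Sp − 1; predictive summary index PSI = PPV + NPV − 1; likelihood ratios LR+ = Se/(1 − Sp) and LR− = (1 − Se)/Sp; DOR = LR+/LR−; clinical utility indices CUI+ = Se × PPV and CUI− = Sp × NPV). The function name `derived_metrics` is illustrative only and is not taken from any of the cited studies.

```python
def derived_metrics(se, sp, ppv, npv):
    """Return summary indices derived from Se, Sp, PPV and NPV.

    Assumes the standard definitions of each index; inputs are proportions (0-1).
    """
    lr_plus = se / (1 - sp)
    lr_minus = (1 - se) / sp
    return {
        "Y": se + sp - 1,          # Youden index
        "PSI": ppv + npv - 1,      # predictive summary index
        "LR+": lr_plus,            # positive likelihood ratio
        "LR-": lr_minus,           # negative likelihood ratio
        "DOR": lr_plus / lr_minus, # diagnostic odds ratio
        "CUI+": se * ppv,          # positive clinical utility index
        "CUI-": sp * npv,          # negative clinical utility index
    }


# Rounded values from Table 4.1 at the 27/30 cutoff.
m = derived_metrics(se=0.91, sp=0.70, ppv=0.75, npv=0.88)
for name, value in m.items():
    print(f"{name}: {value:.2f}")
```

Because the inputs are themselves rounded to two decimal places, the recomputed values agree with the tabulated ones only to within rounding error.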
In a study of the Addenbrooke’s Cognitive Examination-Revised (ACE-R) which also incorporates the MMSE (see Sect. 4.6), the sensitivity and specificity of the MMSE for cross-sectional use was examined at all cutoff values, with the optimal cutoff being defined by maximal test accuracy for the differential diagnosis of dementia/not dementia. The optimal accuracy of MMSE was found to be 0.82 at a cutoff of 24/30 (this optimal cutoff was similar to that reported in other studies of MMSE, e.g. Feher et al. 1992, and as originally recommended by Folstein et al. 1975). The various parameters of diagnostic accuracy were then calculated at this cutoff (Table 4.2; Larner 2009a, b, 2013c), and proved to be similar to those found at the same cutoff in the ACE study (Larner 2005b), namely sensitivities and specificities around 0.7–0.9, PPV around 0.7–0.8, with LRs moderate to small, and CUIs good to adequate.
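Table 4.2
Demographic and diagnostic parameters for MMSE for diagnosis of dementia (2) (Larner 2009a, b, 2013c)
MMSE | |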
---|---|
N | 242 |
Prevalence dementia | 35 % |
M:F (% male) | 134:108 (55) |
Age range in years | 24–85 (mean 59.8 ± 10.9) |
Cutoff | 24/30 |
Accuracy | 0.82 (0.77–0.87) |
Sensitivity (Se) | 0.70 (0.60–0.80) |
Specificity (Sp) | 0.89 (0.84–0.94) |
Y | 0.59 |
PPV | 0.77 (0.67–0.86) |
NPV | 0.85 (0.79–0.90) |
PSI | 0.62 |
LR+ | 6.17 (3.91–9.73) = moderate |
LR− | 0.34 (0.21–0.53) = small |
DOR | 18.4 (11.6–29.0) |
CUI+ | 0.54 (adequate) |
CUI− | 0.76 (good) |
AUC ROC curve | 0.91 (0.88–0.95) |
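The procedure of examining every cutoff and selecting the one with maximal accuracy can be sketched as follows. This is an illustration only: the scores and diagnoses are hypothetical toy data, not study data, and the function names are assumptions. A score at or below the cutoff is treated as a positive test, and ties are resolved in favour of the lowest cutoff.

```python
def accuracy_at_cutoff(scores, has_dementia, cutoff):
    """Proportion correctly classified when score <= cutoff counts as test-positive."""
    correct = sum((s <= cutoff) == d for s, d in zip(scores, has_dementia))
    return correct / len(scores)


def optimal_cutoff(scores, has_dementia, max_score=30):
    """Return (cutoff, accuracy) with maximal accuracy; the lowest cutoff wins ties."""
    candidates = (
        (c, accuracy_at_cutoff(scores, has_dementia, c))
        for c in range(max_score + 1)
    )
    return max(candidates, key=lambda pair: pair[1])


# Hypothetical toy data (NOT from any study cited in this chapter).
scores = [14, 19, 20, 22, 25, 24, 26, 28, 29, 30]
has_dementia = [True, True, True, True, True, False, False, False, False, False]

best_cutoff, best_accuracy = optimal_cutoff(scores, has_dementia)
print(best_cutoff, best_accuracy)
```

Maximal accuracy is only one possible criterion; a cutoff chosen to maximise sensitivity or the Youden index may differ.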
In a study of the Test Your Memory (TYM) test (Hancock and Larner 2011; see Sect. 4.10) the results for concurrently administered MMSE (n = 210) showed sensitivity and specificity that were somewhat better than found in previous studies (Table 4.3), perhaps related to the casemix which was drawn from both CFC and an old age psychiatry memory clinic (the mean age for the whole study, n = 224, was 63.3 ± 12.6 years, a little higher than typically seen in CFC cohorts; see Sect. 1.3). For the group with dementia tested with the MMSE (n = 71), the mode, median, and mean scores were 19, 20, and 19.7 ± 4.8, respectively; for the non-demented group (n = 139) the mode, median, and mean MMSE scores were 30, 29, and 27.6 ± 2.8 (Fig. 4.1). The mean MMSE scores differed significantly between the two groups (t = 15.0, df = 208, p < 0.001). At the MMSE cutoff of ≤23/30, 81 % of the AD/mixed dementia cases (n = 52 tested with MMSE) were detected, as compared to 52 % in the index TYM paper (Brown et al. 2009).
Table 4.3
Demographic and diagnostic parameters for MMSE for diagnosis of dementia (3) (Hancock and Larner 2011)
MMSE | |
---|---|
N | 210 |
Prevalence dementia | 34 % |
Cutoff | ≤23/30 |
Accuracy | 0.90 (0.85–0.94) |
Sensitivity (Se) | 0.79 (0.69–0.88) |
Specificity (Sp) | 0.95 (0.91–0.99) |
Y | 0.74 |
PPV | 0.89 (0.81–0.97) |
NPV | 0.90 (0.85–0.95) |
PSI | 0.79 |
LR+ | 15.7 (7.53–32.6) = large |
LR− | 0.22 (0.11–0.46) = small |
DOR | 70.4 (33.9–146.4) |
CUI+ | 0.70 (good) |
CUI− | 0.85 (excellent) |
AUC ROC curve | 0.94 (0.91–0.97) |
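The group comparison reported for the TYM study (t = 15.0, df = 208) can be checked from the published summary statistics alone. The sketch below assumes an equal-variance (pooled) two-sample t-test, which is consistent with the reported degrees of freedom (n1 + n2 − 2) but is not stated explicitly in the text; the function name `pooled_t` is illustrative.

```python
from math import sqrt


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic with pooled variance, from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


# Non-demented group (27.6 +/- 2.8, n = 139) vs dementia group (19.7 +/- 4.8, n = 71).
t, df = pooled_t(27.6, 2.8, 139, 19.7, 4.8, 71)
print(f"t = {t:.1f}, df = {df}")
```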
In a study of the Montreal Cognitive Assessment (MoCA) undertaken at CFC (see Sect. 4.9; Larner 2012a), MMSE performance was examined for diagnosis of cognitive impairment, i.e. both dementia and MCI (Table 4.4). In the cognitively impaired group the mean MMSE score was 23.6 ± 3.8, and in the non-impaired group 27.7 ± 2.1. The mean MMSE scores differed significantly between the two groups (t = 6.62, df = 148, p < 0.001). Mean MMSE scores in the demented and MCI groups were 22.2 ± 3.9 and 25.3 ± 3.1 respectively and differed significantly between the two groups (t = 2.02, df = 63, p < 0.05; Fig. 4.2). Overall the results for MMSE sensitivity and specificity were more akin to those seen in earlier studies from CFC (Tables 4.1 and 4.2) than in the TYM study (Table 4.3).
Table 4.4
Demographic and diagnostic parameters for MMSE for diagnosis of cognitive impairment (both dementia and MCI) (Larner 2012a)
MMSE | |
---|---|
N | 150 |
Prevalence cognitive impairment (dementia and MCI) | 43 % (24 % and 19 %) |
M:F (% male) | 93:57 (62) |
Age range in years | 20–87 (median 61) |
Cutoff | ≥ 26/30 |
Accuracy | 0.79 (0.72–0.86) |
Sensitivity (Se) | 0.65 (0.53–0.77) |
Specificity (Sp) | 0.89 (0.83–0.96) |
Y | 0.54 |
PPV | 0.82 (0.71–0.93) |
NPV | 0.78 (0.69–0.86) |
PSI | 0.60 |
LR+ | 6.15 (3.23–11.7) = moderate |
LR− | 0.39 (0.21–0.74) = small |
DOR | 15.7 (8.3–30.0) |
CUI+ | 0.53 (adequate) |
CUI− | 0.69 (good) |
AUC ROC curve | 0.83 (0.77–0.90) |
Figure 4.2
MMSE scores vs. diagnosis (cognitive impairment/no cognitive impairment) in MoCA study (Larner 2012a)
The diagnostic utility of MMSE for the diagnosis of MCI in day-to-day clinical practice has also been examined, in the context of a study of the Mini-Mental Parkinson (Sect. 4.2; Larner 2012b). Sensitivity was inadequate (0.32) although specificity was good (0.90) (Table 4.5).
Table 4.5
Demographic and diagnostic parameters for MMSE for diagnosis of MCI (corrected from Larner 2012b)
MMSE | |
---|---|
N | 154 |
Prevalence MCI | 18 % |
M:F (% male) | 93:61 (60) |
Age range in years | 20–85 (median 60) |
Cutoff | ≤22/30 |
Accuracy | 0.80 (0.74–0.86) |
Sensitivity (Se) | 0.32 (0.15–0.49) |
Specificity (Sp) | 0.90 (0.85–0.96) |
Y | 0.22 |
PPV | 0.43 (0.22–0.64) |
NPV | 0.86 (0.80–0.92) |
PSI | 0.29 |
LR+ | 3.37 (1.58–7.22) = small |
LR− | 0.75 (0.35–1.61) = unimportant |
DOR | 4.50 (2.10–9.63) |
CUI+ | 0.14 (very poor) |
CUI− | 0.78 (good) |
AUC ROC curve | 0.72 (0.62–0.82) |
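The dependence of accuracy and predictive values on prevalence helps to explain why the positive predictive value in Table 4.5 is so low despite good specificity. A minimal sketch, assuming the standard relations between sensitivity, specificity, and prevalence (Bayes' theorem), is given below; the function name is illustrative.

```python
def predictive_values(se, sp, prevalence):
    """Accuracy, PPV and NPV implied by Se, Sp and prevalence (all proportions)."""
    tp = se * prevalence                # true positives per patient tested
    fn = (1 - se) * prevalence          # false negatives
    tn = sp * (1 - prevalence)          # true negatives
    fp = (1 - sp) * (1 - prevalence)    # false positives
    return {
        "accuracy": tp + tn,
        "PPV": tp / (tp + fp),
        "NPV": tn / (tn + fn),
    }


# Rounded values from Table 4.5 (MMSE for diagnosis of MCI).
pv = predictive_values(se=0.32, sp=0.90, prevalence=0.18)
for name, value in pv.items():
    print(f"{name}: {value:.2f}")
```

With rounded inputs the results approximate, rather than reproduce exactly, the tabulated accuracy (0.80), PPV (0.43), and NPV (0.86).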
4.1.1 MMSE Ala Subscore
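A number of variants and subscores derived from elements of the MMSE have been described (Davies and Larner 2013a). MMSE subscores have been suggested to help in the differential diagnosis of AD from multi-infarct dementia (Magni et al. 1996) and of AD from dementia with Lewy bodies (DLB) (Ala et al. 2002). The latter, the Ala subscore, is given by the formula:
Ala subscore = Attention − 5/3 (Memory) + 5 (Construction)
where Attention is the MMSE attention/concentration score (0–5), Memory is the recall score (0–3), and Construction is the intersecting pentagons score (0–1).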
Hence the Ala subscore may range from −5 to +10. In a small cohort of patients, an Ala subscore of <5 was associated with a pathologically confirmed diagnosis of DLB with sensitivity of 0.82 and specificity 0.81 in patients with an MMSE ≥13/30 (Ala et al. 2002).
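The calculation and its range can be illustrated with a short sketch, assuming the formula of Ala et al. (2002), namely attention − 5/3 × memory + 5 × construction, with the item ranges of the MMSE (attention 0–5, recall 0–3, pentagons 0–1). The function and parameter names are hypothetical.

```python
def ala_subscore(attention, memory, construction):
    """Ala subscore from three MMSE item scores (assumed formula of Ala et al. 2002)."""
    if not (0 <= attention <= 5 and 0 <= memory <= 3 and 0 <= construction <= 1):
        raise ValueError("item score outside the MMSE range")
    # memory * 5 / 3 keeps the arithmetic exact for whole-number recall scores of 0 and 3
    return attention - memory * 5 / 3 + 5 * construction


lowest = ala_subscore(attention=0, memory=3, construction=0)    # worst attention, best recall
highest = ala_subscore(attention=5, memory=0, construction=1)   # best attention, worst recall
print(lowest, highest)
```

The formula rewards preserved attention and construction and penalises preserved recall, so a low subscore (<5) reflects the pattern suggested to favour DLB over AD.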
The Ala subscore has been evaluated in a prospective study of clinically diagnosed patients seen in CFC (Larner 2003, 2004). Very few patients with DLB were seen (3/271), in keeping with prior experience in this clinic (Ferran et al. 1996), local epidemiological studies (Copeland et al. 1992, 1999), and within the range of population prevalence estimates of DLB (Zaccai et al. 2005). Hence, no meaningful statement about Ala subscore sensitivity, PPV, or NPV could be made since this might involve a type II statistical error (failure to detect an effect that does exist). However, specificity and false positive rates of the Ala subscore could be calculated, 0.51 (95 % CI = 0.45–0.57) and 0.49 (95 % CI = 0.43–0.55) respectively, with a diagnostic odds ratio of 0.52. These figures did not encourage the view that the Ala subscore might be useful prospectively for the clinical diagnosis of DLB, although individual pathologically confirmed cases of DLB with Ala subscore <5 have been encountered in CFC (Doran and Larner 2004, case 1).
4.2 Mini-Mental Parkinson (MMP)
The Mini-Mental Parkinson (MMP) test is a derivative of the MMSE which was specifically devised to detect cognitive impairments in Parkinson’s disease (PD; Mahieux et al. 1995). A review of studies of its use (Davies and Larner 2013a:54) identified few published to date, but these indicated the utility of MMP in detecting cognitive impairment comparing PD to PD with dementia or cognitive impairment short of dementia (Parrao-Diaz et al. 2005), or in comparison with normal controls (Caslake et al. 2009, 2013). As the item content shows (Box 4.3), MMP addresses many of the shortcomings of the MMSE (in a manner similar to the ACE and ACE-R; see Sects. 4.5 and 4.6 respectively).
Box 4.3: Item Content of Mini-Mental Parkinson (MMP)
Orientation | 10 (as for MMSE) |
Visual registration | 3 |
Attention | 5 (as for MMSE) |
Two set fluency | 3 |
Visual recall | 4 |
Shifting | 4 |
Concept processing | 3 |
Total score | 32 |
In a study of MMP in newly referred patients to CFC and in patients with established PD seen in general neurology clinics (Larner 2010, 2012b), MMP scores did not correlate with patient age (r = −0.26). For the PD patients, there was a moderate correlation between disease duration and the modified Hoehn and Yahr (MHY) score (Hoehn and Yahr 1967; r = 0.58; t = 3.39, df = 23, p < 0.01), but no correlation between MMP score and disease duration (r = 0.16; t = 0.80, p > 0.1), or between MMP score and MHY score (r = 0.02; t = 0.11, p > 0.5).
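The t statistics quoted alongside the correlation coefficients follow from r and the sample size. The sketch below assumes the standard test of a Pearson correlation against zero, t = r√(n − 2)/√(1 − r²); the sample size of 25 PD patients is inferred from the reported df = 23 rather than stated in the text.

```python
from math import sqrt


def t_from_r(r, n):
    """t statistic and degrees of freedom for testing a Pearson r against zero."""
    df = n - 2
    return r * sqrt(df) / sqrt(1 - r**2), df


# Disease duration vs MHY score: r = 0.58, n = 25 (inferred from df = 23).
t, df = t_from_r(0.58, 25)
print(f"t = {t:.2f}, df = {df}")
```

The small difference from the published value (t = 3.39) presumably reflects rounding of r to two decimal places.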
In a cohort of 201 patients seen in CFC over a 12-month period (August 2009-August 2010) and prospectively administered the MMP (Larner 2012b), the most accurate cutoff for the differentiation of dementia from no dementia was ≤17/32, at which cutoff MMP had excellent specificity, positive and negative predictive values but poor sensitivity (Table 4.6). The various parameters of diagnostic utility were comparable to the MMSE. The very high correlation between MMP and MMSE scores (r = 0.93; t = 35.7, df = 199, p < 0.001) suggested concurrent validity. Diagnostic agreement between tests was also high (κ = 0.85, 95 % CI 0.74–0.96).
Table 4.6
Demographic and diagnostic parameters for MMP for diagnosis of dementia (Larner 2012b)
MMP | |
---|---|
N | 201 |
Prevalence dementia | 23 % |
M:F (% male) | 115:86 (57) |
Age range in years | 20–86 (median 62) |
Cutoff | ≤17/32 |
Accuracy | 0.86 (0.81–0.91) |
Sensitivity (Se) | 0.51 (0.37–0.65) |
Specificity (Sp) | 0.97 (0.94–0.99) |
Y | 0.48 |
PPV | 0.83 (0.69–0.97) |
NPV | 0.87 (0.82–0.92) |
PSI | 0.70 |
LR+ | 15.7 (6.35–38.9) = large |
LR− | 0.51 (0.20–1.25) = unimportant |
DOR | 31.1 (12.6–77.0) |
CUI+ | 0.42 (poor) |
CUI− | 0.84 (excellent) |
AUC ROC curve | 0.89 (0.84–0.94) |
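The qualitative descriptors attached to likelihood ratios and clinical utility indices in these tables can be expressed as simple threshold rules. The thresholds in the sketch below are assumptions, inferred from the labels used in the tables of this chapter rather than quoted from a definition given here, and the function names are illustrative.

```python
def describe_lr_plus(lr):
    """Qualitative label for a positive likelihood ratio (assumed thresholds)."""
    if lr > 10:
        return "large"
    if lr > 5:
        return "moderate"
    if lr > 2:
        return "small"
    return "unimportant"


def describe_lr_minus(lr):
    """Qualitative label for a negative likelihood ratio (assumed thresholds)."""
    if lr < 0.1:
        return "large"
    if lr < 0.2:
        return "moderate"
    if lr < 0.5:
        return "small"
    return "unimportant"


def describe_cui(cui):
    """Qualitative label for a clinical utility index (assumed thresholds)."""
    if cui >= 0.81:
        return "excellent"
    if cui >= 0.64:
        return "good"
    if cui >= 0.49:
        return "adequate"
    if cui >= 0.36:
        return "poor"
    return "very poor"


# MMP for diagnosis of dementia at the cutoff of <=17/32.
labels = (
    describe_lr_plus(15.7),
    describe_lr_minus(0.51),
    describe_cui(0.42),
    describe_cui(0.84),
)
print(labels)
```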
For patients with dementia (n = 47), median and mean MMP scores were 17 and 17.1 ± 6.4, respectively; for the non-demented group (n = 154) the median and mean MMP scores were 27 and 26.5 ± 4.3. For single group comparisons, the mean MMP scores differed significantly between the demented and non-demented groups (t = 11.7, df = 199, p < 0.001).
The diagnostic utility of MMP for the diagnosis of MCI has also been examined (Larner 2012b). Of the 154 non-demented patients in this cohort, 28 fulfilled modified diagnostic criteria for MCI (as used in Petersen et al. 2005). In the non-demented group, the median and mean scores for the MCI patients were 24.5 and 24.0 ± 3.7. The mean MMP scores differed significantly between the demented and MCI groups (t = 5.2, df = 73, p < 0.001). For the non-demented and non MCI group (n = 126), median and mean MMP scores were 28 and 27.1 ± 4.2. For the intra-group comparison, the mean MMP scores differed significantly between MCI and the non-demented non MCI groups (t = 3.6, df = 152, p < 0.001). Mean MMSE scores of the MCI (24.9 ± 3.2) and non-demented non MCI groups (27.1 ± 3.2) also differed significantly (t = 3.3, df = 152, p < 0.01).
Examining all test cutoff scores for MMP, optimal test accuracy for a diagnosis of MCI versus no dementia (0.81) was at the cutoff of ≤20/32, at which cutoff MMP had excellent specificity and negative predictive value but poor sensitivity and positive predictive value (Table 4.7). The various parameters of diagnostic utility were again comparable to the MMSE (see Table 4.5; Larner 2012b).
Table 4.7
Demographic and diagnostic parameters for MMP for diagnosis of MCI (Larner 2012b)
MMP | |
---|---|
N | 154 |
Prevalence MCI | 18 % |
M:F (% male) | 93:61 (60) |
Age range in years | 20–85 (median 60) |
Cutoff | ≤20/32 |
Accuracy | 0.81 (0.74–0.87) |
Sensitivity (Se) | 0.29 (0.12–0.45) |
Specificity (Sp) | 0.92 (0.87–0.97) |
Y | 0.21 |
PPV | 0.44 (0.21–0.67) |
NPV | 0.85 (0.79–0.91) |
PSI | 0.29 |
LR+ | 3.59 (1.56–8.29) = small |
LR− | 0.78 (0.34–1.79) = unimportant |
DOR | 4.64 (2.01–10.7) |
CUI+ | 0.13 (very poor) |
CUI− | 0.79 (good) |
AUC ROC curve | 0.74 (0.65–0.83) |
MMP has also proved useful in individual cases to detect cognitive impairment not identified using the MMSE, for example due to non-dominant hemisphere pathology of traumatic (Aji et al. 2012) or neoplastic (Smithson and Larner 2013; Case Study 4.1) origin, and in a case of Perry syndrome (Aji et al. 2013).
Case Study 4.1: Clinical Utility of Cognitive Screening Instruments in Diagnosis of Cognitive Impairment: MMP
Two months after an episode of presumed herpes simplex encephalitis with oedematous change confined to the right (non-dominant) anterior and medial temporal lobes on MR imaging, and treated with aciclovir, a 48 year-old man declared himself back to normal, although his partner thought he was occasionally confused. Cognitive testing with the MMSE was unremarkable (29/30) but using the MMP he scored 27/32 with impairment on a test of visual recall. Subsequent re-imaging showed an intrinsic right temporal lobe mass lesion, not evident on review of the original MR images. Stereotactic biopsy of the lesion showed histological evidence of a glioblastoma multiforme.
4.3 Clock Drawing Test (CDT)
The Clock Drawing Test (CDT) is a quick and simple CSI which has been used for many years, with a large literature on its scoring and utility. It is thought to assess attentional mechanisms, auditory comprehension, verbal working memory, numerical knowledge, visuospatial skills, praxis, and executive function, hence a multidomain test. Many variants and scoring systems have been developed, and it has been used in a wide variety of cognitive disorders, partly due to its high acceptability to both patients and clinicians (Freedman et al. 1994; Mainland and Shulman 2013).
No specific examination of the CDT per se has been undertaken in CFC. However, some form of clock drawing test has been incorporated into other CSI which have been examined in CFC such as the Addenbrooke’s Cognitive Examination and its revision, the Montreal Cognitive Assessment, and the Test Your Memory test (see Sects. 4.5, 4.6, 4.9 and 4.10 respectively) as well as the Codex decision tree (Sect. 4.4).
4.3.1 Backward Clock Test
A variant on the theme of the CDT has been developed using a “Backward Clock” (Accoutrements, Seattle, USA), the mirror image of a normal analogue clock (Larner 2007a). In a convenience cohort (n = 17) recruited from CFC, patients were asked to read matched strings of times shown either backward (= Backward Clock, or normal analogue clock viewed in a mirror) or forward (= normal analogue clock, or Backward Clock viewed in a mirror). Patients with dementia (6 AD, 1 FTLD) failed to read backward times correctly, with most errors resulting from reading the long hand according to its position rather than the number to which it pointed. Patients with posterior cortical atrophy (4) could read neither forward nor backward times, and indeed could not discriminate any difference between the two clocks. Patients with focal lesions, namely isolated amnesia (3; amnestic MCI, post severe hypoglycaemia; Larner et al. 2003a) and agnosia (1; developmental prosopagnosia; Larner et al. 2003b), made only occasional errors on backward times, like a normal aged control (1). One patient with amnesia due to a fornix lesion with additional evidence of executive dysfunction (Ibrahim et al. 2009) performed at the level of the demented patients. The Backward Clock Test may therefore be useful in differentiating focal from global cognitive deficits, and hence in the diagnosis of dementia.
4.4 Cognitive Disorders Examination (Codex)
Belmin et al. (2007) developed a two-step decision tree incorporating the three-word recall and spatial orientation components from the MMSE along with a simplified clock drawing test (sCDT) which took around three minutes to perform. This cognitive disorders examination or Codex produced four diagnostic categories with differing probabilities of dementia (A = very low, B = low, C = high, D = very high). In a validation study in elderly people, taking categories C and D as indicators of dementia, Codex was found to have high sensitivity and specificity for the diagnosis of dementia (0.92 and 0.85 respectively), a better sensitivity than the MMSE (Belmin et al. 2007).
The diagnostic utility of Codex has been examined (Larner 2013d; Ziso and Larner 2013). In a cohort of 162 patients seen in CFC over a 9-month period (February-November 2012), all patients completed the MMSE and sCDT and could therefore be categorized according to the Codex decision tree (A = 42, B = 63, C = 5, D = 52). The probability of dementia in each Codex category was A = 0.05, B = 0.08, C = 0.2 and D = 0.67 (Fig. 4.3); the probability of cognitive impairment in each Codex category was A = 0.07, B = 0.32, C = 0.6 and D = 0.88. Codex diagnostic categories (A-D translated to 1–4 respectively) showed a moderate negative correlation with MMSE scores in a subgroup of patients (n = 57; r = −0.68; t = 6.83, df = 55, p < 0.001). Taking Codex categories C and D as indicators of dementia (as in Belmin et al. 2007), Codex was found to have good sensitivity and specificity for the diagnosis of dementia (0.84 and 0.82 respectively) (Table 4.8, left hand column).
Taking Codex categories C and D as indicators of cognitive impairment (cases of dementia and of MCI), Codex sensitivity declined (0.68; more false negatives) whilst specificity improved (0.91; fewer false positives) (Table 4.8, right hand column). It appeared from this study that Codex may not be equivalent to other instruments designed specifically to identify MCI, such as the Montreal Cognitive Assessment (Sect. 4.9).
Table 4.8
Demographic and diagnostic parameters for Codex for diagnosis of dementia and cognitive impairment (Ziso and Larner 2013)
CODEX | ||
---|---|---|
N | 162 | |
Prevalence cognitive impairment (dementia and MCI) | 44 % (26 % and 18 %) | |
M:F (% male) | 83:79 (51) | |
Age range in years | 20–89 (median 61) | |
Target diagnosis | Dementia | Cognitive impairment |
Accuracy | 0.83 (0.77–0.89) | 0.81 (0.75–0.87) |
Sensitivity (Se) | 0.84 (0.73–0.95) | 0.68 (0.57–0.79) |
Specificity (Sp) | 0.82 (0.76–0.89) | 0.91 (0.85–0.97) |
Y | 0.66 | 0.59 |
PPV | 0.63 (0.51–0.77) | 0.86 (0.77–0.95) |
NPV | 0.93 (0.89–0.98) | 0.78 (0.70–0.86) |
PSI | 0.56 | 0.64 |
LR+ | 4.74 (3.15–7.15) = small | 7.66 (3.88–15.1) = moderate |
LR− | 0.20 (0.13–0.30) = small | 0.35 (0.18–0.69) = small |
DOR | 24.0 (15.9–36.2) | 21.8 (11.1–43.1) |
CUI+ | 0.53 (adequate) | 0.59 (adequate) |
CUI− | 0.77 (good) | 0.71 (good) |
AUC ROC curve | 0.85 (0.78–0.92) | 0.85 (0.80–0.91) |
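The dementia column of Table 4.8 can be approximately reconstructed from the category counts and category-specific probabilities of dementia quoted in the text. In the sketch below, the number of dementia cases per category is an estimate obtained by rounding count × probability, which is an assumption since the raw counts are not given here; categories C and D are taken as test-positive, as in Belmin et al. (2007).

```python
# Codex category counts and probabilities of dementia, as quoted in the text.
counts = {"A": 42, "B": 63, "C": 5, "D": 52}
p_dementia = {"A": 0.05, "B": 0.08, "C": 0.2, "D": 0.67}

# Estimated dementia cases per category (assumption: rounded count x probability).
demented = {k: round(counts[k] * p_dementia[k]) for k in counts}

positive = ("C", "D")   # categories treated as indicating dementia
negative = ("A", "B")

tp = sum(demented[k] for k in positive)                 # demented, Codex positive
fp = sum(counts[k] - demented[k] for k in positive)     # not demented, Codex positive
fn = sum(demented[k] for k in negative)                 # demented, Codex negative
tn = sum(counts[k] - demented[k] for k in negative)     # not demented, Codex negative

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
ppv = tp / (tp + fp)
npv = tn / (tn + fn)
print(f"Se {sensitivity:.2f}, Sp {specificity:.2f}, PPV {ppv:.2f}, NPV {npv:.2f}")
```

The same approach, applied to the probabilities of cognitive impairment in each category, would give an approximation to the right hand column of the table.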
4.5 Addenbrooke’s Cognitive Examination (ACE)
The Addenbrooke’s Cognitive Examination (ACE; Mathuranath et al. 2000) and its revision, ACE-R (see Sect. 4.6; Mioshi et al. 2006), were theoretically motivated developments of the MMSE, which they both incorporated. Copyright issues concerning the MMSE (Newman and Feldman 2011) have prompted the removal of these elements from ACE-III which supersedes ACE and ACE-R (Hsieh et al. 2013; available at www.neura.edu.au/frontier/research/test-downloads/).
ACE may be used as a brief “bedside” cognitive screen, encompassing tests of attention/orientation, memory, language, visual perceptual and visuospatial skills, and executive function, with a total score out of 100 (Box 4.4). It attempts to address some of the recognised shortcomings of the MMSE (i.e. perfunctory memory and visuospatial testing, absence of executive function tests). ACE was initially reported to have good sensitivity and specificity for identifying dementia, was relatively quick to administer (ca. 15 min), and had good patient acceptability (Mathuranath et al. 2000).
Box 4.4: Item Content of Addenbrooke’s Cognitive Examination (ACE)
Orientation | 10 |
Registration | 3 |
Attention/Concentration (serial 7s, DLROW) | 5 (best performed task) |
Recall | 3 |
Memory: | |
Anterograde | 28 |
Retrograde | 4 |
Verbal fluency: | |
Letters | 7 |
Animals | 7 |
Language: | |
Naming | 12 |
Comprehension | 8 |
Repetition | 5 |
Reading | 2 |
Writing | 1 |
Visuospatial abilities: | |
Intersecting pentagons | 1 |
Wire (Necker) cube | 1 |
Clock drawing | 3 |
Total score | 100 |
The ACE has been widely adopted and translated into various languages (Davies and Larner 2013b). A systematic review concluded that ACE is useful in detecting cognitive impairment in AD and other causes of cognitive decline (Crawford et al. 2012). For example, ACE has been used in the diagnosis of cognitive impairment in Parkinson’s disease (Reyes et al. 2009; Robben et al. 2010), in atypical parkinsonian syndromes (multiple system atrophy, progressive supranuclear palsy, corticobasal degeneration; Bak et al. 2005), in vascular cognitive impairment (Alexopoulos et al. 2006), and in the setting of brain injury rehabilitation (Gaber 2008). ACE scores may help to predict conversion of amnestic MCI to dementia (Lonie et al. 2010). It has also been reported that ACE scores may discriminate cognitive decline due to depression (see Sect. 5.2.3) from that due to dementia (Dudas et al. 2005), although other evidence contradicts this claim (Stokholm et al. 2009). The same systematic review found that the evidence base for ACE distinguishing dementia subtypes and MCI was lacking (Crawford et al. 2012).
The diagnostic utility of the ACE in screening for dementia in day-to-day clinical practice has been assessed prospectively in new referrals to CFC over a 3½-year period (February 2002–August 2005; Larner 2005b, 2006, 2007b). ACE was used in 285 patients, a cohort in which dementia prevalence was 49 % (Table 4.9). ACE was easy to use but a few patients failed to complete the test, including three patients with frontotemporal lobar degeneration who had features of either profound apathy or marked motor restlessness. The correlation coefficient between ACE scores and MMSE scores (n = 154) was r = 0.92 (t = 28.9, df = 152, p < 0.001) (Larner 2005b).
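The t statistic quoted for the ACE/MMSE correlation follows from r and the degrees of freedom by the standard relation t = r·√(df / (1 − r²)), with df = n − 2. The short Python sketch below simply reproduces that arithmetic from the figures given above; the function name is illustrative and not taken from the source.

```python
import math

def t_from_r(r: float, n: int) -> tuple[float, int]:
    """Return (t, df) for testing a Pearson correlation r from n pairs against zero."""
    df = n - 2
    t = r * math.sqrt(df / (1.0 - r * r))
    return t, df

# Figures quoted in the text: r = 0.92 from n = 154 paired ACE and MMSE scores.
t, df = t_from_r(0.92, 154)
print(f"t = {t:.1f}, df = {df}")
```

With these inputs the result agrees with the reported t = 28.9 on 152 degrees of freedom.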
ACE | ||
---|---|---|
N | 285 | |
Prevalence dementia | 49 % | |
M:F (% male) | 147:138 (52) | |
Cutoff | 88/100 | 75/100 |
Accuracy | 0.71 (0.66–0.76) | 0.84 (0.80–0.88) |
Sensitivity (Se) | 1.00 | 0.85 (0.79–0.91) |
Specificity (Sp) | 0.43 (0.35–0.42) | 0.83 (0.77–0.89) |
Y | 0.43 | 0.68 |
PPV | 0.63 (0.57–0.69) | 0.83 (0.77–0.89) |
NPV | 1.00 | 0.85 (0.79–0.91) |
PSI | 0.63 | 0.65 |
LR+ | 1.77 (1.53–2.04) = unimportant | 5.14 (3.54–7.45) = moderate |
LR− | 0 = large | 0.18 (0.12–0.26) = moderate |
DOR | ∞ | 28.6 (19.7–41.4) |
CUI+ | 0.63 (adequate) | 0.71 (good) |
CUI− | 0.43 (poor) | 0.71 (good) |
AUC ROC curve | 0.93 (0.90–0.96) |
Using the ACE cutoffs of 88/100 and 83/100 as defined in the index paper (Mathuranath et al. 2000), test sensitivity was high but specificity less good (Table 4.9; Larner 2007b). A lower cutoff of 75/100 (Larner 2006) was justified on the basis that, unlike the index study, this pragmatic study did not include a normal control group and hence was more representative of day-to-day clinical practice. At this cutoff, sensitivity and specificity were both greater than 80 % and PPV approached 90 % (Table 4.9, right hand column), values approximating those recommended for biomarkers of AD (The Ronald and Nancy Reagan Research Institute of the Alzheimer’s Association and the National Institute on Aging Working Group 1998).
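The dependence of the predictive values and summary indices on sensitivity, specificity, and prevalence can be made explicit with a small calculation. The Python sketch below uses the standard definitions of the abbreviations in the tables (Y = Se + Sp − 1; PSI = PPV + NPV − 1; LR+ = Se/(1 − Sp); LR− = (1 − Se)/Sp; DOR = LR+/LR−; CUI+ = Se × PPV; CUI− = Sp × NPV), which this section does not itself spell out. The function name is hypothetical, and because the inputs are the rounded values from Table 4.9, some outputs differ from the tabulated figures in the second decimal place.

```python
import math

def derived_indices(se: float, sp: float, prevalence: float) -> dict[str, float]:
    """Derive predictive values and summary indices from Se, Sp and prevalence."""
    # Cell fractions of the 2x2 table, expressed as proportions of the whole cohort.
    tp = se * prevalence
    fn = (1 - se) * prevalence
    tn = sp * (1 - prevalence)
    fp = (1 - sp) * (1 - prevalence)
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    lr_pos = se / (1 - sp) if sp < 1 else math.inf
    lr_neg = (1 - se) / sp
    dor = lr_pos / lr_neg if lr_neg > 0 else math.inf
    return {
        "Y": se + sp - 1,
        "PPV": ppv,
        "NPV": npv,
        "PSI": ppv + npv - 1,
        "LR+": lr_pos,
        "LR-": lr_neg,
        "DOR": dor,
        "CUI+": se * ppv,
        "CUI-": sp * npv,
    }

# Rounded inputs taken from Table 4.9 (prevalence 49 %).
ace_75 = derived_indices(se=0.85, sp=0.83, prevalence=0.49)
ace_88 = derived_indices(se=1.00, sp=0.43, prevalence=0.49)
for name in ace_75:
    print(f"{name}: 75/100 -> {ace_75[name]:.2f}, 88/100 -> {ace_88[name]:.2f}")
```

At the 88/100 cutoff perfect sensitivity gives LR− = 0 and hence an infinite DOR, as in the table, while the low specificity pulls CUI− down to 0.43.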
Longitudinal use of the ACE has proven useful in individual cases (e.g. Larner et al. 2003a; Wilson et al. 2010; Wong et al. 2008, 2010), and has also been examined more systematically (Larner 2006). Over the 3½-year period that the ACE was in use in CFC, 23 of the 285 patients tested had more than one assessment with the ACE over periods of follow-up ranging from 7 to 36 months. At first assessment, six patients were suspected to have dementia and 17 were not demented. Based on patient and caregiver report and clinical judgement, 16 patients declined over follow-up, six remained static and one improved, with final clinical diagnoses of dementia in 16 and no dementia in seven. On the ACE, 17 patients had declined, four remained static (≤2 point change in ACE scores) and two improved. The diagnostic utility of longitudinal use of the ACE is summarised in Table 4.10.
ACE | ||
---|---|---|
N | 23 | |
Cutoff | 88/100 | 75/100 |
Accuracy | 0.74 (0.56–0.92) | 0.74 (0.56–0.92) |
Sensitivity (Se) | 1.00 | 0.88 (0.71–1.04) |
Specificity (Sp) | 0.14 (−0.12–0.40) | 0.43 (0.06–0.80) |
Y | 0.14 | 0.31 |
PPV | 0.73 (0.54–0.91) | 0.78 (0.59–0.97) |
NPV | 1.00 | 0.60 (0.17–1.03) |
PSI | 0.73 | 0.38 |
LR+ | 1.16 (0.86–1.57) = unimportant | 1.53 (0.78–2.98) = unimportant |
LR− | 0 = large | 0.29 (0.15–0.57) = small |
DOR | ∞ | 5.25 (2.69–10.2) |
CUI+ | 0.73 (good) | 0.69 (good) |
CUI− | 0.14 (very poor) | 0.26 (very poor) |
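Some confidence limits in Table 4.10 fall outside the range 0 to 1 (for example 1.04 for sensitivity and −0.12 for specificity). This is what a simple normal-approximation (Wald) interval produces in a small sample. The Python sketch below illustrates the point. The cell counts are not stated in the text: they are back-calculated here, as an assumption, from the reported proportions and the final diagnoses (16 dementia, 7 no dementia), and the function name is illustrative.

```python
import math

def wald_ci(successes: int, total: int, z: float = 1.96) -> tuple[float, float, float]:
    """Proportion with a normal-approximation (Wald) interval, not truncated to [0, 1]."""
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, p - half_width, p + half_width

# Assumed counts, back-calculated from Table 4.10; not taken directly from the source.
examples = {
    "Se at 75/100": (14, 16),
    "Sp at 75/100": (3, 7),
    "Sp at 88/100": (1, 7),
}
for label, (k, n) in examples.items():
    p, lower, upper = wald_ci(k, n)
    print(f"{label}: {p:.2f} ({lower:.2f} to {upper:.2f})")
```

Under these assumed counts the limits match those tabulated, which suggests that the wide and out-of-range intervals reflect the small number of patients with repeat assessments rather than any property of the test itself.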
4.5.1 ACE Subscores: VLOM Ratio, Standardized Verbal Fluency, Semantic Index, and Modified Ala Subscore
A number of subscores derived from the ACE have been described (Davies and Larner 2013b).
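Mathuranath et al. (2000) defined the VLOM ratio, which was reported to differentiate AD and frontotemporal lobar degenerations (FTLD), given by the formula:
$$\text{VLOM ratio} = \frac{\text{verbal fluency} + \text{language}}{\text{orientation} + \text{delayed recall}}$$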
with possible maxima of (verbal fluency + language) = 42 and (orientation + delayed recall) = 17. A VLOM ratio >3.2 showed sensitivity of 0.75 and specificity of 0.84 for the diagnosis of AD (Mathuranath et al. 2000), a finding later confirmed in an independent cohort (Bier et al. 2004). A VLOM ratio <2.2 showed sensitivity of 0.58 and specificity of 0.97 for the diagnosis of FTLD (Mathuranath et al. 2000). A later independent study confirmed the specificity figure, but reported a much lower sensitivity of VLOM ratio <2.2 for the diagnosis of FTLD (Bier et al. 2004).
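As the quoted maxima indicate, the VLOM ratio divides (verbal fluency + language) by (orientation + delayed recall). A minimal Python sketch of the calculation and of the two published cutoffs follows; the function names and the example subscores are hypothetical and serve only to illustrate the arithmetic.

```python
import math

MAX_NUMERATOR = 42    # verbal fluency + language
MAX_DENOMINATOR = 17  # orientation + delayed recall

def vlom_ratio(verbal_fluency: int, language: int,
               orientation: int, delayed_recall: int) -> float:
    """Ratio of (verbal fluency + language) to (orientation + delayed recall)."""
    numerator = verbal_fluency + language
    denominator = orientation + delayed_recall
    if not 0 <= numerator <= MAX_NUMERATOR or not 0 <= denominator <= MAX_DENOMINATOR:
        raise ValueError("subscores outside the possible ACE range")
    if denominator == 0:
        return math.inf  # ratio is undefined; treated here as unbounded
    return numerator / denominator

def vlom_suggests(ratio: float) -> str:
    """Apply the cutoffs of Mathuranath et al. (2000): >3.2 for AD, <2.2 for FTLD."""
    if ratio > 3.2:
        return "AD"
    if ratio < 2.2:
        return "FTLD"
    return "indeterminate"

# Hypothetical patients (invented subscores, for illustration only).
amnestic = vlom_ratio(verbal_fluency=10, language=26, orientation=6, delayed_recall=2)
dysexecutive = vlom_ratio(verbal_fluency=4, language=14, orientation=10, delayed_recall=7)
print(amnestic, vlom_suggests(amnestic))
print(round(dysexecutive, 2), vlom_suggests(dysexecutive))
```

Ratios between 2.2 and 3.2 fall between the two cutoffs and favour neither diagnosis, which is one reason the ratio cannot be used in isolation.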
In the cohort from CFC, the diagnostic utility of the VLOM ratio >3.2 for the diagnosis of AD was confirmed, whereas the diagnostic utility of the VLOM ratio <2.2 for the diagnosis of FTLD showed poor sensitivity but good specificity, with accordingly very poor and excellent positive and negative utility indices respectively (Table 4.11; Larner 2007b). Others have also questioned the utility of the VLOM ratio in identifying FTLD, particularly behavioural variant FTD (Bier et al. 2004).
ACE VLOM ratio | ||
---|---|---|
N | 130 (AD 114, FTLD 16) | |
Cutoff | >3.2 (for diagnosis of AD) | <2.2 (for diagnosis of FTLD) |
Accuracy | 0.76 (0.71–0.81) | 0.87 (0.83–0.91) |
Sensitivity (Se) | 0.76 (0.69–0.84) | 0.31 (0.09–0.54) |
Specificity (Sp) | 0.76 (0.69–0.84) | 0.90 (0.87–0.94) |
Y | 0.52 | 0.21 |
PPV | 0.69 (0.60–0.77) | 0.16 (0.03–0.29) |
NPV | 0.83 (0.77–0.89) | 0.96 (0.93–0.98) |
PSI | 0.52 | 0.12 |
LR+ | 3.21 (2.40–4.28) = small | 3.20 (1.42–7.21) = small |
LR− | 0.31 (0.23–0.42) = small | 0.76 (0.34–1.72) = unimportant |
DOR | 10.3 (7.72–13.8) | 4.19 (2.99–5.88) |
CUI+ | 0.52 (adequate) | 0.05 (very poor) |
CUI− | 0.63 (adequate) | 0.86 (excellent) |
AUC ROC (AD vs. FTD) | 0.80 (0.64–0.96) |
ACE includes verbal fluency (VF) tests for both letter (P) and category (animals) (Box 4.4). Scaled scoring systems for letter fluency (LF) and category fluency (CF) derived using a Gaussian distribution of raw scores from normal controls (n = 127) took account of the finding that CF is easier than LF for normals. This component of the ACE had good concordance with standard neuropsychological tests (κ = 0.60 against FAS test), indicating good construct validity (Mathuranath et al. 2000).
Verbal fluency has been described as the “ESR of cognition”, impairment being a nonspecific indicator of cognitive ill-health. VF tasks have very high sensitivity in detecting dementia (Duff Canning et al. 2004), although there may be differential impairments. One study comparing patients with dementia and pure affective disorder suggested that LF < CF was suggestive of affective disorder (Dudas et al. 2005; see Sect. 5.2.3). Since patients with AD generally show greater impairment in CF than LF, reflecting degradation in semantic knowledge stores and/or access to this knowledge (Henry et al. 2004), whilst LF is particularly sensitive to FTLD, especially the behavioural variant FTD (Hodges et al. 1999), differential impairment of CF and LF might possibly be useful in the differentiation of AD from FTLD.
Examining this in AD (n = 114) and FTLD (n = 16) patients in the CFC cohort who were administered the ACE, VF parameters showed similar patterns to VLOM ratios, i.e. VLOM ratio >3.2 and LF > CF favoured diagnosis of AD, whereas VLOM ratio <2.2 and CF > LF favoured diagnosis of FTLD, but overall the standardized verbal fluency offered no diagnostic advantage over the VLOM ratios (Table 4.12, compare with Table 4.11; Larner 2013e).
ACE Standardized Verbal Fluency scores | ||
---|---|---|
N | 130 (AD 114, FTLD 16) | |
| | LF > CF (for diagnosis of AD) | LF < CF (for diagnosis of FTD) |
Accuracy | 0.63 (0.55–0.71) | 0.91 (0.86–0.96) |
Sensitivity (Se) | 0.66 (0.57–0.75) | 0.25 (0.04–0.46) |
Specificity (Sp) | 0.44 (0.19–0.68) | 0.86 (0.80–0.92) |
Y | 0.10 | 0.11 |
PPV | 0.89 (0.83–0.96) | 0.20 (0.03–0.38) |
NPV | 0.15 (0.05–0.26) | 0.89 (0.83–0.95) |
PSI | 0.04 | 0.09 |
LR+ | 1.17 (0.74–1.84) = unimportant | 1.78 (0.68–4.66) = unimportant |
LR− | 0.78 (0.50–1.23) = unimportant | 0.87 (0.33–2.28) = unimportant |
DOR | 1.50 (0.95–2.35) | 2.04 (0.78–5.35) |
CUI+ | 0.59 (adequate) | 0.05 (very poor) |
CUI− | 0.07 (very poor) | 0.77 (good) |
AUC ROC (AD vs. FTD) | 0.56 (0.49–0.65) |
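Another ACE subscore is the Semantic Index (SI), which was reported to differentiate AD from semantic dementia (Davies et al. 2008), and is given by the formula:
$$\text{SI} = (\text{naming} + \text{reading}) - (\text{serial 7s} + \text{orientation in time} + \text{drawing})$$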
Hence SI ranges from +14 to −15, with a cutoff of zero said to differentiate AD cases (SI = 3.8 ± 3.6) from semantic dementia cases (SI = −6.7 ± 4.7). Few cases of semantic dementia have been identified in CFC but all those scored by this method (n = 4) had SI < 0 (range −7 to −15), suggesting that this probably is a useful score for differentiating AD and semantic dementia.
The Ala subscore derived from the MMSE (see Sect. 4.1.1), which was reported to differentiate AD and DLB (Ala et al. 2002), may also be derived, in a modified form, from the ACE (Larner 2003), namely:
Like the Ala subscore, this may range from −5 to +10. The modified Ala subscore was evaluated in a prospective study of clinically diagnosed patients seen in CFC (Larner 2003, 2004). Only specificity and false positive rates could be calculated because of the very small number of DLB cases seen. Results were similar to those found for the Ala subscore (see Sect. 4.1.1): specificity 0.47 (95 % CI = 0.41–0.53), false positive rate 0.53 (95 % CI = 0.47–0.59), and a diagnostic odds ratio of 0. These figures did not encourage the view that the modified Ala subscore might be useful prospectively for the clinical diagnosis of DLB.
4.5.2 ACE Meta-analysis
A systematic review of ACE/ACE-R studies identified 9 of 45 studies as suitable for review, excluding translated versions, and concluded that ACE/ACE-R was capable of differentiating between those with and without cognitive impairment, but the evidence base on distinguishing dementia subtypes and MCI was lacking (Crawford et al. 2012). A meta-analysis of ACE studies has been undertaken to better understand ACE utility (Larner and Mitchell 2014), using methods similar to those applied in previous meta-analyses of MMSE diagnostic accuracy (Mitchell 2009, 2013).
A literature search to the end of May 2013 identified 29 reports of studies of the ACE, 13 using the English version and 16 using non-English versions. All the studies identified were from high prevalence specialist secondary care settings. After application of exclusion criteria, 5 ACE studies were deemed suitable for meta-analysis (Mathuranath et al. 2000; Garcia-Caballero et al. 2006; Larner 2007b; Stokholm et al. 2009; Yoshida et al. 2011). Across the 5 included studies there were 529 cases of dementia out of a population of 1,090, a prevalence of 49 %. There was no evidence of publication bias (Harbord bias = −8.23, 95 % CI = −29.1 to 12.6, p = 0.37; Harbord et al. 2006).