Measure | Number of items | Response format | Features: can be mapped to DSM criteria | Features: includes item on suicidal ideation | Possible cut points for MDD or clinically significant depressive symptoms^a | Average completion times |
---|---|---|---|---|---|---|
PHQ-9 | 9 | 4-Point Likert | Yes | Yes | ≥10, or algorithm | <5 min |
GDS-15 | 15 | Yes/no | No | No | ≥5 | <5 min |
CESD-10 | 10 | Yes/no, or 4-point Likert | No | No | ≥4 for yes/no format; ≥10 for Likert format | <5 min |
BDI-II | 21 | 4-Point Likert | Yes | Yes | Variable; Beck et al. [28] suggest score ranges: minimal (0–13), mild (14–19), moderate (20–28), and severe (29–63) | 5–10 min |
CES-D | 20 | 4-Point Likert | No | No | ≥16 | 5–10 min |
CES-D-revised | 20 | 4-Point Likert | Yes | Yes | ≥16 | 5–10 min |
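The cut points listed above can be applied mechanically once a total score is in hand. The following is a minimal sketch only: the dictionary keys and the `screens_positive` function are hypothetical names chosen for illustration, and the thresholds are simply copied from the table. Because reported optimal cut points vary by population (see Table 6.2), these values are best read as conventional starting points rather than fixed rules.

```python
# Conventional cut points copied from the table of measures
# (a screen is positive when total score >= threshold).
# Names are illustrative; the BDI-II is omitted because its cut points are variable.
CUT_POINTS = {
    "PHQ-9": 10,
    "GDS-15": 5,
    "CESD-10 (yes/no)": 4,
    "CESD-10 (Likert)": 10,
    "CES-D": 16,
    "CES-D-revised": 16,
}


def screens_positive(measure: str, total_score: int) -> bool:
    """Return True when the total score meets or exceeds the listed cut point."""
    if measure not in CUT_POINTS:
        raise KeyError(f"No cut point listed for {measure!r}")
    return total_score >= CUT_POINTS[measure]
```

For instance, `screens_positive("GDS-15", 5)` is true, whereas `screens_positive("CES-D", 15)` is false.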
The Center for Epidemiologic Studies Depression Scale, or CES-D [1], is among the most popular depression screening measures in research. The CES-D can also be used to screen for depression in very old populations [2]. Long forms (20 items) and short forms (e.g., 10 items) [3, 4] are available; the abbreviated ten-item version of the CES-D [4] takes <5 min to complete on average. Furthermore, both Likert-style [3] and binary (yes/no) [4] formats are available for the ten-item version, further increasing ease of administration. Reliability of the CES-D, as assessed using measures of internal consistency, for example, is reported to be high [4, 5]. Additionally, cut points for clinically significant depressive symptoms have been created for the long (≥16 points) and short versions of the CES-D (≥10 points is suggested for the 10-item Likert-style version [3], and ≥4 points for the binary format version [4, 6]); however, as illustrated in Table 6.2 of this chapter, variation has been observed in the optimal cut points for identifying depression among older adults [7–9]. A key aspect of the CES-D is that its items focus heavily on core affective aspects of depression, such as feelings of being sad, depressed, fearful, or hopeless. Thus, a potential limitation is that depressed individuals who do not endorse items with a strong affective focus may not be detected by the CES-D. This is a particular challenge among older adults, among whom anhedonia, as opposed to expressed, subjective sadness, may be the dominant feature of depression [4, 6]. A related limitation is that of item bias, or differential item functioning.
For example, in an analysis [10] of a large-scale Australian cohort of women, there was notable response variation on the CES-D: older women, compared to younger women, were substantially more likely to be missing responses to some items (e.g., “I felt hopeful about the future”). Although the precise explanation for this pattern of missing responses is not clear, it may have been more challenging for the oldest women than for younger women to answer a question that requires speculating about the future. Other investigators have found evidence of item-response bias among older adults by race/ethnicity on the CES-D [11, 12]. Furthermore, the development of the CES-D pre-dated the advent of the modern criteria for major depression (which emerged with the publication of the Diagnostic and Statistical Manual of Mental Disorders III, or DSM-III, in 1980); thus, the items do not explicitly map to all of the DSM criteria for major depressive disorder (MDD). Consequently, a revised version of the CES-D was recently developed by Gallo [13]. The CES-D-revised is a 20-item scale with a similar Likert response format; it places increased emphasis on symptoms of depression as they may be experienced by older persons (e.g., more questions related to anhedonia and less reliance on subjective sadness) and also includes questions about thoughts of death/self-harm. In addition, unlike the original CES-D, the CES-D-revised can be mapped directly to the criteria of the DSM-IV [14].
Table 6.2
Comparisons of common brief depression scales with diagnostic gold standards in late-life depression^a
Study authors (location) | Year | Study population | N | Brief scale | Gold standard | Author findings |
---|---|---|---|---|---|---|
Haringsma et al. (Leiden, The Netherlands) | 2004 | Dutch community residents aged 55–85 years who referred themselves to depression prevention program | 318 | CES-D | MINI | ROC analysis found cut-off score of 25 to be optimal for MDD (sensitivity = 0.85, specificity = 0.64) and 22 for clinically relevant depression (sensitivity = 0.84, specificity = 0.6) |
Beekman et al. (Amsterdam, The Netherlands) | 2002 | Dutch adults between 55 and 85 years old, enrolled in the Longitudinal Aging Study Amsterdam | 277 | CES-D | DIS | Using a cutoff score of ≥16 for MDD, sensitivity was 1 and specificity = 0.88 |
Lyness et al. (Rochester, NY) | 1997 | Patients ≥60 of 3 primary care practices in Rochester, NY | 130 | GDS, GDS-15, CES-D | SCID | ROC analyses found that using a cutoff point of ≥21 on the CES-D for MDD, sensitivity = 0.92 and specificity = 0.87. A cutoff point for MDD of ≥10 on the GDS resulted in a sensitivity of 1 and specificity of 0.84. A cutoff of ≥5 on the GDS-15 had a sensitivity of 0.92 and a specificity of 0.81. All scales lost accuracy for detection of minor depression |
Irwin et al. (San Diego, CA) | 1999 | Patients ≥60 enrolled in university-affiliated primary care clinics | 68 | CES-D-10 | SCID | An optimal cutoff point of ≥4 was found for major depression (sensitivity = 1, specificity = 0.92). The ten-item CES-D appeared as effective at screening for depression as the original CES-D |
Blank et al. (Hartford, CT) | 2004 | Patients ≥60 from urban primary care practices, nursing homes and a hospital | 360 | CES-D-10, GDS-15 | DIS | The GDS-15 had the highest sensitivity/specificity when used to screen for major depression in nursing homes (sensitivity = 0.86, specificity = 0.82 using cutoff of ≥6). The CES-D-10 performed better than other instruments in clinics (sensitivity = 0.79, specificity = 0.81 using cutoff of ≥4) and hospital settings (sensitivity = 0.92, specificity = 0.77) |
Nyunt et al. (Singapore) | 2009 | Adults ≥60 living in communities or nursing homes | 4,253 | GDS-15 | SCID | Using cutoff of ≥5 for MDD, sensitivity was 0.97 and specificity was 0.95 |
Chen et al. (Hangzhou, China) | 2010 | Patients (aged ≥60) of primary care clinics in Hangzhou | 77 | PHQ-9, PHQ-2 | SCID | When using a cutoff point of ≥9 for major depression, the PHQ-9 had a sensitivity of 0.86 and specificity of 0.77. The PHQ-2 was also found to be an effective screening tool, with sensitivity = 0.84 and specificity = 0.9 using the cutoff of ≥3 |
Marc et al. (Westchester, NY) | 2008 | Adults aged ≥65 receiving home care from a visiting nurse service agency | 526 | GDS-15 | SCID | The optimal cutoff point for major depression was found to be ≥5, which yielded sensitivity = 0.72 and specificity = 0.78 |
Lamers et al. (Maastricht, The Netherlands) | 2008 | Patients ≥60 of general practices in The Netherlands | 105 | PHQ-9 | MINI | For ADD (any depression), a cutoff of ≥6 for the summed PHQ-9 score was used, and corresponded to sensitivity = 0.96 and specificity = 0.81. The cutoff point ≥7 was used for MDD, with sensitivity = 0.92 and specificity = 0.78. When using the algorithm-based score of the PHQ-9 instead of the summed score, the test for both ADD and MDD had low sensitivity |
Ros et al. (Albacete, Spain) | 2011 | Older adults with or without cognitive impairment | 623 | CES-D | CIDI | Among participants without cognitive impairment, the optimal cut-off point was found to be 28, with sensitivity = 0.82 and specificity = 0.94. The optimal cut-off point among those with cognitive impairment was just 13, however, which resulted in sensitivity = 0.86 and specificity = 0.72 |
Mitchell et al. (Leicester, UK) | 2010 | Older adults in medical inpatient/outpatient settings and in nursing homes | 4,969 | GDS-30, GDS-15, GDS-4 | Semi-structured psychiatric interviews | Meta-analysis found the GDS-30 to have a sensitivity of 0.82 and a specificity of 0.78. The GDS-15 had sensitivity = 0.84 and specificity = 0.74, and the GDS-4 was found to be the most effective at screening for depression in medical settings, with sensitivity = 0.93 and specificity = 0.77 |
Watson et al. (Chapel Hill, NC) | 2009 | Adults ≥65 in residential care/assisted living communities | 112 | GDS-15, PHQ-2 | SCID | Using the cutoff point of one positive answer as criterion for depression, the PHQ-2 had sensitivity = 0.80 and specificity = 0.71. The GDS-15 had sensitivity = 0.60 and specificity = 0.75 with a cutoff point of ≥5, and yielded sensitivity = 0.73 and specificity = 0.65 when using a cutoff of ≥4 |
Bunevicius et al. (Palanga, Lithuania) | 2012 | Mid-life and older adults with coronary disease (mean age = 58 years; range 32–77 years) | 522 | BDI-II | MINI | A cutoff point of ≥14 produced a sensitivity of 0.89 and specificity of 0.74 for MDD as diagnosed by the MINI |
Frasure-Smith and Lespérance (Montreal, QC, Canada) | 2008 | Mid-life and older adults with coronary disease (mean age = 60 years; SD = 10 years) | 811 | BDI-II | SCID | A cutoff point of ≥14 produced a sensitivity of 0.91 and specificity of 0.78 for DSM-IV MDD |
Huffman et al. (Boston, MA) | 2010 | Mid-life and older adults with coronary disease (mean age = 62 years; SD = 12.5 years) | 131 | BDI-II | SCID | A cutoff point of ≥16 produced a sensitivity of 0.88 and specificity of 0.92 for DSM-IV MDD |
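Sensitivity and specificity alone do not indicate how often a positive screen reflects true MDD; that depends on how common depression is in the screened population. The sketch below applies Bayes' rule to convert one row of Table 6.2 into positive and negative predictive values. The function name is hypothetical, and the 10% prevalence is an assumed figure chosen purely for illustration; it is not taken from any of the studies above.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# GDS-15 at a cutoff of >=5 in Marc et al.: sensitivity 0.72, specificity 0.78.
# The 10% prevalence is an assumed value for illustration only.
ppv, npv = predictive_values(0.72, 0.78, 0.10)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

Under these assumptions, roughly one in four positive screens would correspond to MDD (PPV about 0.27), while a negative screen would be correct about 96% of the time. This is one reason a positive brief screen is usually followed by a fuller diagnostic assessment.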
Another scale that, like the CES-D, has achieved widespread use is the Geriatric Depression Scale (GDS) [15], a self-report instrument measuring depressive symptoms in older adults [15, 16]. The scale is designed to identify symptoms that distinguish depression in the elderly from dementia. The GDS comprises the affective (e.g., sadness, apathy, crying) and cognitive (e.g., thoughts of hopelessness, helplessness, guilt, worthlessness) symptoms of depression in the elderly [17]. It downplays the somatic items (e.g., disturbances in appetite, sleep, energy level, and sexual interest) that many suggest are important confounds among older adults [18, 19]. For example, compared with younger depressed adults, depressed older persons tend to present more often with somatic symptoms (e.g., general somatic and gastrointestinal symptoms) and hypochondriasis [20].
The GDS is available in both long (30-item) and shorter versions; the commonly used 15-item version (GDS-15) [15], with a yes/no response format, was developed to decrease the fatigue or lack of focus that may be experienced by older persons [21]. The GDS has several advantages, and its 15-item format is primarily used in the context of brief screening and assessment. First, the yes/no response format allows for brevity in administration, and scoring is very straightforward for the busy clinician [7, 22]. Second, validated cut points have been published both for older adults in primary care settings and for older adults with specific co-morbidities (see Table 6.2 for details) [6, 7, 17]. Moreover, as its name implies, the GDS-15 has particular value for detecting late-life depression because its items emphasize symptoms that may be most relevant to the experience of depression in older persons, especially several questions homing in on anhedonia (e.g., items on dropping activities, not doing new activities, boredom, emptiness) [6, 17]. Potential limitations include less dimensionality (i.e., ability for symptom scores to be expressed on a broader continuum) than is possible with the Likert-style CES-D and other measures, as well as the inability to map directly to DSM diagnostic criteria for MDD; the latter is not surprising, however, because, like the CES-D, the GDS was originally developed before the era of the modern DSM criteria. Another potential limitation of the GDS-15 is that its items do not emphasize appetite/weight or sleep disturbance; such symptoms are not only potentially relevant to detecting depression but also allow for discrimination of subtypes of depression, for example, typical (reduced appetite/weight, decreased sleep, etc.) versus atypical (increased appetite/weight, hypersomnia, etc.) [23–25]. Nevertheless, there is some evidence that dimensionality in the GDS-15 is likely adequate.
For example, in a recent analysis of long-term depression patterns in a large cohort of older female adults, Byers et al. were able to employ repeated measures of the GDS-15 in order to derive distinct latent subgroups of depression trajectories [26].
The Beck Depression Inventory, or BDI, is another widely used measure [27]. Although the original BDI was developed by Aaron Beck in the 1960s, the most commonly used form is the 21-item BDI-II [28]. Importantly, the BDI-II, unlike its predecessor, was developed in light of the newly available diagnostic criteria for MDD in the DSM nosology. The BDI is a valid instrument for the diagnosis of depression in older adults [29]; there is also a seven-item BDI-PC, an abbreviated primary care version of the BDI-II, but it has not been specifically tested in elderly medical patients [30]. Validation studies have found the BDI-II to be effective at screening for depression in middle-aged and older adults with coronary disease, even though it contains several questions regarding somatic symptoms that could act as confounders [27, 31, 32]. Like the GDS, the BDI-II incorporates more aspects relevant to the behaviors and cognitions in depression, as opposed to the relatively greater emphasis on the affective elements seen in the CES-D [32]. The BDI-II enjoys advantages of reliability, validity, dimensionality, sensitivity to change (useful in the context of symptom monitoring), and applicability in assessing a broad range of aspects of depression, including the elements of typicality versus atypicality [33, 34]; however, its 21-item length renders it less practical than other screeners, especially in the primary care context [29]. Also, less is known about the BDI regarding differential item functioning and any concerns regarding its use in culturally diverse contexts. Finally, there have been concerns about overlapping symptoms between medical conditions and the depressive somatic symptoms on the BDI [33], and some elderly individuals may not be able to complete the scale due to reading difficulty, physical debility, or compromised cognitive functioning [34].
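Unlike the single cut points used for most of the other screeners, BDI-II totals are commonly interpreted against the score ranges suggested by Beck et al. [28]: minimal (0–13), mild (14–19), moderate (20–28), and severe (29–63). A minimal sketch of that mapping follows; the function name `bdi_ii_severity` is hypothetical, and the bands are descriptive ranges, not diagnoses.

```python
def bdi_ii_severity(total_score: int) -> str:
    """Map a BDI-II total (0-63) to the score ranges suggested by Beck et al. [28]."""
    if not 0 <= total_score <= 63:
        raise ValueError("BDI-II totals range from 0 to 63")
    if total_score <= 13:
        return "minimal"
    if total_score <= 19:
        return "mild"
    if total_score <= 28:
        return "moderate"
    return "severe"
```

Note that the validation studies in Table 6.2 reported cut points of ≥14 or ≥16 for MDD in cardiac samples, which fall at the lower boundary of the "mild" range.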
A measure that has rapidly emerged as a screener of choice among many experts [35] is the Patient Health Questionnaire-9 (PHQ-9). Originally developed in the 1990s as part of the PRIME-MD (Primary Care Evaluation of Mental Disorders) by Spitzer, Williams and Kroenke [36], the PHQ-9 is now publicly available and features a Likert-style response format for easy self-administration and rapid scoring by the clinician. The PHQ-9 has been shown to be effective and efficient as a depression screening tool for older adults in primary care settings [37] and has been successfully used to identify depression in homebound older persons [38]. It has high reliability and has been extensively validated in a variety of contexts [39–45]; both a diagnostic algorithm and a cut point (of 10 points or above) have been validated for the PHQ-9. Furthermore, the measure has dimensionality and demonstrated sensitivity to change [40, 41], both of which are useful in the context of tracking symptoms among persons under treatment or monitoring persons at risk for depression. Among commonly used measures, the PHQ-9 was designed to map most directly to the modern criteria for major depression in the DSM [46]. To increase applicability, an eight-item version (PHQ-8) is also available; it removes the item pertaining to thoughts of death/suicide, as this item may be inappropriate in certain large-scale screening contexts where rapid clarification and follow-up are not possible. Current evidence shows that the validated cutoff of the PHQ-8 has sensitivity and specificity for depression comparable to those of the original nine-item version [47]. A two-item version of the PHQ has also performed well in validation studies and has been recommended for initial depression screening due to its brevity [48, 49]. Unlike most other screeners, the PHQ has not shown evidence of the differential item functioning observed for other measures [42, 43].
In part because of the attributes noted above, the PHQ has gained traction as a screener and has been recommended for the purpose by a variety of agencies/organizations, and Medicare has integrated the PHQ-9 into the revised Minimum Data Set assessments (MDS 3.0) for nursing home residents [50]. (For more re: policy, see Table 9.1 in Chap. 9.)
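Scoring the PHQ-9 by its summed score is simple enough to sketch in a few lines. The example below assumes the standard coding of each of the nine items from 0 to 3 on the 4-point scale (a detail not spelled out in this section) and applies the cut point of 10 noted above. The function names are hypothetical, and the diagnostic algorithm is deliberately not implemented here.

```python
def phq9_total(item_scores: list[int]) -> int:
    """Sum nine PHQ-9 item responses, each assumed to be coded 0-3."""
    if len(item_scores) != 9:
        raise ValueError("PHQ-9 requires exactly nine item responses")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("Each item must be scored 0, 1, 2, or 3")
    return sum(item_scores)


def phq9_positive(item_scores: list[int], cut_point: int = 10) -> bool:
    """Return True when the summed score meets or exceeds the cut point."""
    return phq9_total(item_scores) >= cut_point
```

The `cut_point` parameter is left adjustable because some studies in older adults (e.g., Lamers et al. and Chen et al. in Table 6.2) reported lower optimal thresholds than 10.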


