PRINCIPLES OF SCALE DEVELOPMENT AND USAGE
Over the past two decades, extensive research has focused on the natural history, genetics, and treatment of movement disorders, including Parkinson’s disease (PD), Huntington’s disease (HD), dystonia, and Gilles de la Tourette’s syndrome (GTS). Both the care of patients and research associated with movement disorders demand the availability of reliable and valid clinical rating instruments. The ability to measure movement disorder characteristics over time allows for comparison of outcomes, helps the interpretation of results, and minimizes errors of measurement. Clearly, a study with a reliable and valid outcome measure is more powerful in detecting meaningful differences among groups compared to a study without such measures.
Measurement is the assignment of numerals to symptoms, behaviors, or clinical signs according to agreed-upon rules. Scales are the means used in clinical medicine for assigning numbers to observable qualities (1). The scales used in clinical medicine can be classified as nominal, ordinal, interval, or ratio scales. In nominal or categorical scales, the basic relationships described are equality or difference, and therefore, signs are present or absent. In an ordinal scale, an attribute is classified according to its rank order. However, the steps in the scale are not necessarily presumed to be equal. The basic relationship described in an ordinal scale is greater or less. Interval scales, like ordinal scales, order behaviors, but all steps in the scale are presumed to be equal, so the difference between 30 and 35 is the same as the difference between 60 and 65. The basic relationship described in an interval scale is equality or difference of intervals. True interval scales are rare in neurology. Ratio scales have equal intervals, but in addition, they have a true (nonarbitrary) zero; for example, the Kelvin temperature scale is a ratio scale because it is based on absolute zero.
The ability of scales to produce consistent results and, thereby, to be useful instruments in monitoring patients and conducting research depends upon their psychometric properties. Reliability is the extent to which a measure yields the same results on repeated trials. This is the proportion of variation in any given measurement that is due to true variation and not error. Reliability can be classified as intrarater and interrater reliabilities. The former refers to the reproducibility of a single examiner’s rating on separate occasions (test–retest), the latter to the reproducibility of ratings when multiple examiners independently examine the same patient. Dimensionality refers to the ability of a rating instrument to measure several subdomains of a particular construct. Dimensionality and internal consistency of a scale are important in forming composite measures of tic severity or of subscale scores. Internal consistency is the extent to which the items making up a composite score measure the same latent factor. Validity is the extent to which an instrument measures what it is designed to measure (1).
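Internal consistency is most often summarized with Cronbach's α, the statistic quoted for several scales in this chapter. The following is a minimal sketch of the standard formula, assuming a complete matrix of ratings with one row per patient and one column per item. The function name and the ratings are hypothetical and are not drawn from any cited study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a patients-by-items matrix of ratings.

    scores: list of rows, one per patient; each row holds one rating per item.
    Assumes no missing ratings and at least two items and two patients.
    """
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two patients")
    if any(len(row) != n_items for row in scores):
        raise ValueError("every patient needs a rating for every item")

    # Variance of each item (column) across patients.
    item_variance_sum = sum(variance(column) for column in zip(*scores))
    # Variance of the composite (row total) across patients.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")

    return (n_items / (n_items - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical 0-4 ratings: five patients, three items.
example_ratings = [
    [2, 3, 2],
    [1, 1, 2],
    [3, 4, 3],
    [0, 1, 1],
    [4, 3, 4],
]
alpha = cronbach_alpha(example_ratings)
```

Values near 1 indicate that the items vary together and plausibly reflect one latent factor. The statistic does not by itself establish that a scale is unidimensional, which is why factor analyses are reported separately.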
Rating scales in movement disorders focus on two primary concepts of dysfunction: impairment and disability. Whereas impairment relates to objective deficits, disability refers to the impact of disease on patient function. As such, items usually rated by the investigator and based on the neurologic examination assess impairment, whereas interviews that involve the patient’s, caregiver’s, and investigator’s assessments of activities of daily living or quality of life rate disability. Some scales are uniquely impairment ratings, others are uniquely disability assessments, and many combine the two in separate sections, allowing a total score to estimate a global level of disease severity. Global scales that combine the two concepts also exist.
For all scales, movement disorders pose a number of implicit challenges. First, even within a single diagnosis, there are a variety of movements, and multiple variables, such as frequency of movements, number of different types of movements, intensity, complexity, body distribution, suppressibility, and interference, can be elements that affect severity ratings. Second, symptoms vary spontaneously and under certain environmental conditions, such as stress or excitement. Third, patients with some movement disorders are able to suppress voluntarily some of their symptoms for minutes to hours. These factors complicate the evaluation of movement disorders, and rating scales that are accurate measures of movement disorders must accommodate these factors to the maximum. In some instances, strategies such as videotaping or controlling the time of day or the environment of evaluation have helped to control these confounding influences. Finally, rater and subject bias are important limiting issues in rating scales. In scales that rely on the subject’s impression of the movement disorder, personal expectations, affective illness, and educational level can strongly impact the ratings, and even in objective evaluations, the raters may be influenced by their own expectations of change or stability. Efforts to remove investigator bias from ratings through computer-generated testing or examinations by raters who otherwise have no contact with the patient are useful in research paradigms but have little application to clinical practice. In this chapter, rating scales for PD, other parkinsonian syndromes, dyskinesia, tremor, dystonia, HD, GTS, and functional movement disorders are discussed.
PARKINSON’S DISEASE
UNIFIED PARKINSON’S DISEASE RATING SCALE
The Unified Parkinson’s Disease Rating Scale (UPDRS) is the international gold standard of clinical rating scales for PD. It combines assessments of impairment and disability derived from several earlier scales and is currently the most widely applied scale in clinical trials (2). The UPDRS is a four-part scale, but the predominant areas of clinical focus are Part II (activities of daily living) and Part III (objective motor ratings). These sections can be assessed for both the on and the off periods. Each item is rated using a 5-point system, in which 0 is normal and 4 represents severe abnormality. In Part II, 13 activities are assessed, ranging from speech, salivation, and swallowing to handwriting, eating, dressing, turning in bed, and walking. Part III has 14 components, but several are rated separately for more than one body area (tremor, rigidity, finger taps, and hand movements). As such, the range for Part II is 0 to 52, whereas the range for Part III is 0 to 108.
In addition to its wide usage, the scale has passed a number of psychometric tests. Martínez-Martín and colleagues (3) showed its high internal consistency (Cronbach’s α = 0.96), and interrater reliability was satisfactory for all items (κ values always at least 0.40). There was a statistically significant correlation between UPDRS score and Hoehn and Yahr stage (rs = 0.71, p < 0.001) and between UPDRS score and timed finger tapping. Richards et al. (4) also examined interrater reliability among three neurologists experienced with the scale and found that intraclass correlation coefficients for the total motor section of the UPDRS, as well as for several of the individual items, were high (0.40–0.92). Only the speech and facial expression items showed poor agreement.
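The κ statistic corrects the raw agreement between two raters for the agreement expected by chance. As an illustration only, the sketch below computes unweighted Cohen's κ for two raters scoring the same patients on one item. The cited studies may have used a weighted or multi-rater variant, and the function name and data here are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same patients."""
    if not rater_a or len(rater_a) != len(rater_b):
        raise ValueError("both raters must score the same non-empty patient list")
    n = len(rater_a)

    # Proportion of patients on whom the two raters gave the same score.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Agreement expected by chance from each rater's marginal frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[score] * counts_b[score] for score in counts_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is complete")

    return (observed - expected) / (1 - expected)


# Hypothetical 0-4 ratings of one item by two examiners in eight patients.
examiner_1 = [0, 1, 1, 2, 2, 3, 4, 2]
examiner_2 = [0, 1, 2, 2, 2, 3, 4, 1]
kappa = cohen_kappa(examiner_1, examiner_2)
```

A κ of 1 indicates perfect agreement and a κ of 0 indicates agreement no better than chance. Unweighted κ treats a disagreement of one point the same as a disagreement of four points, which is one reason weighted variants are often preferred for ordinal items.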
In spite of these excellent psychometric dimensions, the length of the UPDRS has disturbed several investigators and prompted attempts to eliminate duplicative items. To this end, factor analyses have examined the actual domains assessed by the items of the motor section of the UPDRS. Among 294 patients assessed in the on state, Stebbins et al. (5) found that the scale had six clinically distinct factors: three bradykinesia factors (axial/gait, right, and left), one rigidity measure, and two tremor measures (rest and postural). These same six factors retain their independence in both the on and off states.
Van Hilten et al. (6) examined Parts II and III (total of 27 items) and suggested that the UPDRS could be reduced to 16 items (8 items in each part) without loss of reliability or validity. The authors suggested dropping salivation, falling, freezing, tremor, and sensory symptoms from Part II and dropping hand movements, pronation/supination, leg agility, facial expression, posture, and action tremor from Part III. Their factor analysis of the motor section suggested only three factors, the first dealing predominantly with extremity bradykinesia, the second with midline functions, and the third with tremor. However, because they did not apply their ratings using the prescribed UPDRS directions and did not assess left and right sides independently, these results are difficult to apply to the actual UPDRS. An important element of any scale’s success is its uniformity of usage and communication. To enhance the standardized application of the scale, a teaching tape of the motor subsection of the UPDRS is available (7). This tape provides examples of PD patients fitting each rating possibility as reviewed by a panel of three movement disorder specialists with extensive experience in the use of the scale. Furthermore, the tape includes a series of complete UPDRS motor examinations for investigators and trainees to self-administer and report their internal concordance and agreement with the rating panel when they use the UPDRS for publishable studies. The rate of agreement among the three experts was always statistically significant for the selected samples with Kendall’s coefficient of concordance ranging between 0.97 and 0.62.
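Kendall's coefficient of concordance (W) summarizes how closely several raters agree in ordering the same set of patients. The sketch below uses the basic formula without a correction for ties and assumes each rater supplies a complete ranking from 1 to n. Ratings on a 0 to 4 scale tie heavily and would need the tie-corrected form, which is omitted here. The function name and rankings are hypothetical.

```python
def kendalls_w(rankings):
    """Kendall's W for m raters who each rank the same n patients (no ties).

    rankings: list of m lists; each inner list holds the ranks 1..n that one
    rater assigned to patients 1..n.
    """
    n_raters = len(rankings)
    n_patients = len(rankings[0])
    if n_raters < 2 or n_patients < 2:
        raise ValueError("need at least two raters and two patients")
    if any(sorted(r) != list(range(1, n_patients + 1)) for r in rankings):
        raise ValueError("each rater must supply the ranks 1..n exactly once")

    # Sum of ranks received by each patient across raters.
    rank_sums = [sum(r[i] for r in rankings) for i in range(n_patients)]
    mean_rank_sum = n_raters * (n_patients + 1) / 2
    # Squared deviations of the rank sums from their mean.
    s = sum((total - mean_rank_sum) ** 2 for total in rank_sums)

    return 12 * s / (n_raters ** 2 * (n_patients ** 3 - n_patients))


# Hypothetical severity rankings of four video segments by three experts.
expert_rankings = [
    [1, 2, 3, 4],
    [1, 3, 2, 4],
    [2, 1, 3, 4],
]
w = kendalls_w(expert_rankings)
```

W ranges from 0 (no agreement in ordering) to 1 (identical rankings from every rater).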
MOVEMENT DISORDER SOCIETY–UNIFIED PARKINSON’S DISEASE RATING SCALE
Based on a 2002 critique of the UPDRS (2), the Movement Disorder Society sponsored a revision of the UPDRS (MDS-UPDRS) (8). It can be administered in approximately 30 minutes and is a four-part scale with Part I (nonmotor experiences of daily living), Part II (motor experiences of daily living), Part III (objective motor ratings), and Part IV (motor complications) sections. These sections can be assessed for both the on and the off periods. Each item is rated using a 5-point system in which 0 is normal and 4 represents severe abnormality. It contains questions across Part I (13 items), Part II (13 items), Part III (33 scores based on 18 items, several with right, left, or other body distribution scores), and Part IV (6 items). The MDS-UPDRS rates 65 items in comparison to 55 on the original UPDRS.
The advantages of MDS-UPDRS over the UPDRS are that the former has added emphasis on nonmotor components of PD, a greater emphasis on mild impairments and disabilities, greater cultural sensitivity, clearer wording, and resolution of ambiguities (8). In addition to its wide usage, the scale has passed a number of psychometric tests. Goetz and colleagues (8) showed its high internal consistency (Cronbach’s α = 0.79–0.93 across parts), high correlation with the original UPDRS (ρ = 0.96) and all intraclass correlation coefficient values higher than 0.90 (9). Gallagher and colleagues (10) showed high internal consistency of Part I (Cronbach’s α = 0.85), small floor and ceiling effects (2% floor and 0% ceiling effect) and good concurrent validity (correlation with original UPDRS Part I: r = 0.81, p < 0.001).
An important element of any scale’s success is its uniformity of usage and communication. To enhance the standardized application of the scale, a teaching tape of the motor subsection of the MDS-UPDRS is available (11). This tape provides examples of PD patients fitting each rating possibility as reviewed by a panel of three movement disorder specialists. The rate of agreement among the three experts was always statistically significant for the selected samples with Kendall’s coefficient of concordance ranging between 0.99 and 0.78 (11).
HOEHN AND YAHR STAGE SCALE
The Hoehn and Yahr (HY) scale is a widely used clinical rating measure that combines impairment and disability assessment to divide patients into five broad stages, based on two primary issues: (a) unilateral versus bilateral signs and (b) the absence, presence, and severity of balance/gait difficulties (12). Among the scale’s advantages is the fact that it is simple and quick to complete, involving a single one-to-five integer designation. It has been successfully applied by raters without movement disorder expertise, as well as by specialists (13). Further, HY captures typical patterns of PD progression with and without dopaminergic therapy (14). Progression in HY stage correlates with motor decline on the UPDRS and deterioration in quality of life, as measured by several different instruments (15). HY stage also correlates with neuroimaging measures of PD-related pathology, such as β-CIT SPECT scanning and 18F-fluorodopa positron emission tomography (PET) scanning (16,17). The limited clinimetric analyses conducted to date support its scientific and clinical credibility (14,18).
On the other hand, because of its simplicity, the scale is not comprehensive, and by focusing on the issues of unilateral versus bilateral disease and the presence or absence of postural reflex impairment, it leaves other aspects of PD unassessed. Because the scale combines disability and impairment, ambiguities exist, and not all clinical presentations of PD are covered. Because the scale has only 5 scores, effective pharmacologic or surgical interventions, even when they induce significant changes in UPDRS, often fail to show HY changes (16). Attempts to rectify weaknesses have included the introduction of widely used 1.5 and 2.5 increments to the scale, but this adaptation has not been clinimetrically tested and introduces unresolved analytic problems (19). Though still frequently used as an outcome measure in clinical trials, the HY scale has been largely replaced by the UPDRS as a primary outcome measure of treatment efficacy. Instead, it is currently utilized to generally describe populations of patients and as an inclusion/exclusion criterion in baseline assessments during clinical trials. Because it is a categorical scale, populational descriptions must be reported as medians and ranges, not as means with standard deviations, and comparisons must utilize nonparametric statistical tools. The time to development of a given HY stage has been used successfully to distinguish patients with PD from other parkinsonism-plus syndromes, and this measure potentially could be incorporated into interventional studies designed to test delay in clinical progression (20).
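The reporting rule above can be illustrated briefly. The sketch below describes a group's HY stages by median and range and computes a Mann-Whitney U statistic as one example of a rank-based comparison. It assumes integer stages, stops short of a p-value, and uses hypothetical function names and data.

```python
from statistics import median


def summarize_stages(stages):
    """Describe HY stages by median and range rather than mean and SD."""
    if not stages:
        raise ValueError("need at least one stage")
    return {"median": median(stages), "min": min(stages), "max": max(stages)}


def mann_whitney_u(group_a, group_b):
    """U statistic for group_a: pairs in which a exceeds b, ties counted as 0.5."""
    return sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in group_a
        for b in group_b
    )


# Hypothetical baseline HY stages in two trial arms.
arm_1 = [1, 2, 2, 3, 5]
arm_2 = [2, 2, 3, 3, 4]
summary_arm_1 = summarize_stages(arm_1)
u_arm_1 = mann_whitney_u(arm_1, arm_2)
```

Only the ordering of the stages enters either calculation, so neither depends on the untested assumption that the step from stage 1 to 2 equals the step from stage 4 to 5. Converting U to a p-value requires a reference distribution or a statistics package, which this sketch leaves out.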
OTHER SCALES
Parkinsonism
The Core Assessment Program for Intracerebral Transplantation (CAPIT) and its modifications are conglomerations of different scales, including the UPDRS and HY, developed by international investigators to capture the essential elements to test efficacy of neurotransplantation surgery in PD (21). The CAPIT introduced the widely adopted operational examination of patients in “practically defined off,” meaning early morning function after no medications for 12 hours and at least 1 hour after initial rising from bed. Clearly, this off is not the absolute nadir of function for most patients, but these restrictions allow a standard protocol for off assessment that is clear and easy to standardize across multiple centers. To contrast with “practically defined off,” the CAPIT calls for a second UPDRS to be performed during the “best on” state. This level of function refers to the point at which clinician and patient agree that function is optimized. In most instances, patients are very clear of the time when they arrive at this state. A few other tasks are attached to the CAPIT evaluation: timed motor tasks and a dyskinesia assessment developed by Obeso (see the following). Finally, a levodopa challenge test is administered in the “practically defined off” state, with regularly repeated timed motor tasks and dyskinesia assessments every 20 minutes throughout the time of levodopa activity. This provides an “area under the curve” assessment of function that demonstrates both duration and intensity of improvements in motor disability.
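The “area under the curve” summary can be sketched with the trapezoidal rule applied to assessments repeated at fixed intervals. The example assumes the caller supplies improvement from the “practically defined off” baseline at each time point. The CAPIT publication may define the calculation differently, and the function name and values are hypothetical.

```python
def area_under_curve(times_min, improvements):
    """Trapezoidal area under a response curve.

    times_min: assessment times in minutes, strictly increasing.
    improvements: improvement from baseline at each assessment time.
    """
    if len(times_min) != len(improvements) or len(times_min) < 2:
        raise ValueError("need matching lists with at least two assessments")
    if any(later <= earlier for earlier, later in zip(times_min, times_min[1:])):
        raise ValueError("assessment times must be strictly increasing")

    points = list(zip(times_min, improvements))
    # Each interval contributes its width times the mean of its two endpoints.
    return sum(
        (t1 - t0) * (y0 + y1) / 2
        for (t0, y0), (t1, y1) in zip(points, points[1:])
    )


# Hypothetical levodopa challenge: assessments every 20 minutes,
# improvement expressed in motor score points relative to baseline.
times = [0, 20, 40, 60, 80, 100, 120]
improvement = [0, 8, 15, 18, 14, 6, 0]
auc = area_under_curve(times, improvement)
```

A response that is large but brief and one that is modest but prolonged can yield similar areas, so the peak improvement and the duration of response are usually reported alongside the area.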
The Core Assessment Program for Surgical Interventional Therapies in Parkinson’s Disease (CAPSIT) is a more recent modification that added additional cognitive and behavioral assessments along with some modifications in the CAPIT protocol aimed at simplifying elements of the motor examination (22).
Global Disability and Quality of Life
The Schwab and England Activities of Daily Living Scale is a questionnaire that measures the patient’s perception of functional independence (23). It is not specific to PD but has been widely applied to this condition. The scale includes general statements intended as anchors and ranging from normal (100%) to completely independent, able to do all chores but with some degree of slowness (90%), and so forth until 0% (completely bedridden and vegetative functions are not functioning). The scale is easy to apply, and all numeric values are possible so that parametric analyses can be applied to differences among groups. Its scores also correlate well with UPDRS ratings.
The Parkinson’s Disease Questionnaire-39 (PDQ-39) is a 39-item questionnaire aimed at assessing numerous areas of health-related quality of life (24). It is self-completed by patients and is designed as a series of statements that patients endorse with 1 of 5 options (0 = never, 1 = occasional… 4 = always). Its scores correlate with other measures of parkinsonian impairments, and the scale has been incorporated into several pharmaceutical trials. The Parkinson’s Disease Quality of Life Questionnaire has 37 questions aggregated into four divisions: parkinsonism, systemic symptoms, emotional functioning, and social functioning (25). Constructed in a fashion similar to the PDQ-39, patients respond to a series of statements with endorsements that range from never to always. This scale has been less widely used internationally than the PDQ-39.
Dyskinesias
Patients with moderate and advanced PD often develop involuntary movements, termed “dyskinesias,” and an erratic or uneven response to medication, termed “motor fluctuations.” Because the problems are intermittent and may not be witnessed by the investigator, many scales rely on patient-derived data.
The Unified Dyskinesia Rating Scale (UDysRS) is the newest rating scale for dyskinesias, and contains a subjective (patient- and caregiver-based) questionnaire as well as objective physician ratings (standard 0–4 severity). It is designed specifically for PD-associated dyskinesia. The two subjective parts evaluate historical disability (patient perceptions) of on dyskinesia and off dystonia. The two objective parts evaluate impairment: dyskinesia severity, anatomical distribution over seven body regions, and type (choreic or dystonic) based on four activities observed or video-recorded. The scale has demonstrated high internal consistency (Cronbach’s α: 0.915, 0.971). Interrater reliability for the objective sections was acceptable for all items and likewise for intrarater reliability except for the items concerning the right leg (26).
MDS-UPDRS Part IV (Motor Complications) section has a series of questions on duration of dyskinesia, patient perception of disability, and the presence of pain and dystonia. The new version of the MDS-UPDRS has clear definitions of dyskinesia for patients and incorporates information from patient, caregiver, and rater, with all questions on dyskinesia rated with the standard 0 to 4 options that characterize the rest of the scale (27). Although the full MDS-UPDRS scale demonstrates good inter- and intrarater reliability, the clinimetric dimensions of individual items covering dyskinesia have not yet been established.
In contrast, the modified version of the Abnormal Involuntary Movement Scale (AIMS) is a purely objective scale that rates dyskinesia across body parts by severity. It was originally developed for the evaluation of tardive dyskinesia but has been widely used in PD research efforts (28). This scale rates seven body areas at rest with 0 to 4 severity rating, and it also has three global assessments: overall severity, incapacitation for the patient, and patient awareness of dyskinesias. It is a simple and quick scale to complete but does not distinguish between choreic and dystonic dyskinesias and does not involve activation maneuvers that often increase dyskinesia and provide information on severity that is likely more relevant to activities of daily living.
To deal with functional disability, the modified Rush Dyskinesia Rating Scale (RDRS) rates patients during activities of daily living, such as drinking, dressing, and walking (29). This scale has a teaching tape that accompanies the scale with several patient examples of the ratings. It is an adaptation of the Obeso Dyskinesia Scale (30), which combines examiner-based ratings with patient-based historical ratings on function. The RDRS is based on objective observation alone and rates the interference by dyskinesia with the patient’s ability to carry out activities. The scale’s clinimetric properties have been examined, and both physicians and nurse coordinators use the scale with high inter- and intrarater reliability.
The Movement Disorder Society commissioned a task force to assess and critique available clinical rating scales and make recommendations regarding their clinical utility (31). The scales reviewed included the AIMS, the UPDRS Part IV, the Obeso Dyskinesia Rating Scale, the RDRS, the Clinical Dyskinesia Rating Scale (CDRS), the Lang-Fahn Activities of Daily Living Dyskinesia Scale, the Parkinson Disease Dyskinesia Scale (PDYS-26), and the UDysRS. The task force evaluated the existing scales’ previous use, performance parameters, and quality of validation data. A scale was designated “recommended” if it had been used in clinical studies beyond the group that developed it, had been specifically used in PD reports, and had been established by clinimetric studies as valid, reliable, and sensitive; “suggested” scales met two of the above criteria, and those meeting one were “listed.” Based on this review, two of the reviewed dyskinesia scales (AIMS and the RDRS) fulfilled criteria for “recommended” for use in PD populations, albeit weakly so; all of the remaining scales met criteria to be “suggested.” However, the two most recent scales (PDYS-26 and UDysRS) were found to have excellent clinimetric properties (26


