Important Research Principles in the Field of Spinal Osteotomy

Fig. 21.1

Assessment of thoracic kyphosis, lumbar lordosis, and C7 sagittal vertical axis (SVA)

Fig. 21.2

Assessment of spinopelvic parameters

Measurements of spinal deformity similar to those used in the thoracic and lumbar spine are increasingly used to assess cervical deformity. Cervical lordosis is measured from the inferior endplate of C2 to the inferior endplate of C7 (Fig. 21.3). The C2 SVA is analogous to the C7 SVA and is specifically relevant for cervical deformity. It is measured by dropping a vertical line from the tip of the odontoid and measuring the horizontal distance from this line to the center of the C7 vertebral body. Other measures of cervical SVA have been described that use reference points such as the center of the occiput or the anterior ring of C1, but the C2 SVA is the most commonly used in the literature. A study at our institution showed that choosing a reference point in the upper cervical spine yields a more reliable measurement than choosing a point in the cranium.

Fig. 21.3

Assessment of cervical kyphosis and C2 sagittal vertical axis

In the coronal plane, deformity is characterized using Cobb angles for each curve in a scoliotic deformity. Measurement begins with the major curve of the deformity (Fig. 21.4). Curves of less than 10° are not considered significant for the purposes of measurement. The major Cobb angle is measured from the superior endplate of the upper vertebra to the inferior endplate of the lower vertebra in the curve. The upper and lower vertebrae are chosen to maximize the magnitude of the measured curve. Adjacent upper or lower curves are then measured from the same lines to either the upper or lower endplate of the adjacent curve. In cases where the endplates are not visible, the lower edge of the pedicles can be used to define vertebral angulation. Bending films are also used to determine whether a curve bends out to less than 25° in magnitude. This determination of which curves are structural is an essential component of the Lenke classification;[4] however, a detailed discussion of the implications of this classification is beyond the scope of this chapter.

Fig. 21.4

Assessment of coronal plane deformity using Cobb angles
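The Cobb measurement described above reduces to the angle between two endplate lines. The following is a minimal sketch of that geometry, assuming each endplate has been digitized as two (x, y) points on the radiograph; the function names and the sample coordinates are hypothetical and are not taken from the chapter.

```python
import math

def line_angle_deg(p1, p2):
    """Inclination of the line through p1 and p2, in degrees, folded into [0, 180)."""
    (x1, y1), (x2, y2) = p1, p2
    return math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0

def cobb_angle_deg(upper_endplate, lower_endplate):
    """Angle between two endplate lines, each given as a pair of (x, y) points."""
    diff = abs(line_angle_deg(*upper_endplate) - line_angle_deg(*lower_endplate))
    # Two lines always meet at an angle of 90 degrees or less on one side.
    return min(diff, 180.0 - diff)

def is_significant(angle_deg, threshold_deg=10.0):
    """Curves below 10 degrees are not considered significant in the text."""
    return angle_deg >= threshold_deg

# Hypothetical digitized endplates (arbitrary image units).
upper = ((0.0, 0.0), (10.0, 2.0))    # superior endplate of the upper end vertebra
lower = ((0.0, 50.0), (10.0, 47.0))  # inferior endplate of the lower end vertebra
angle = cobb_angle_deg(upper, lower)
print(round(angle, 1), is_significant(angle))
```

Because only the angle between the lines matters, the result does not depend on whether the image y-axis points up or down. Selecting the end vertebrae that maximize the angle is left to the reader of the film.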

21.2.2 Patient-Reported Outcome Measures

Patient-reported outcome measures can be broadly grouped into three categories: (1) Pain measures, (2) General Health measures, and (3) Disease-Specific measures. Although there are many such instruments in the literature, in this section we will focus on those measures most relevant to spinal deformity surgery.

21.2.2.1 Pain Measures

The visual analog scale (VAS) and the numerical rating scale are the most commonly used instruments for assessing pain. The numerical rating scale asks the patient to rate their pain on a scale from 0 to 10, where 0 represents no pain and 10 the worst imaginable pain [5, 6]. The VAS is a horizontal line 10 cm in length with no graduations. It is labeled “no pain” at the left end and “worst imaginable pain” at the right end. The distance in centimeters from the left end to the patient’s mark is the score. Not all numerical changes in the value of an outcome instrument are considered clinically relevant. The minimum clinically important difference (MCID) is the smallest change measured with an instrument such as the VAS that reliably indicates a clinically meaningful change in the status of the patient. Parker et al. demonstrated that the MCID for VAS change in a population of patients undergoing transforaminal lumbar interbody fusion (TLIF) for degenerative spondylolisthesis ranged from 2.1 to 5.3 cm for back pain and from 2.1 to 4.7 cm for leg pain [7]. Childs et al. followed patients undergoing physical therapy for low-back pain over a period of 4 weeks [8]. The authors concluded that a two-point change on the numerical rating scale is the threshold beyond which a change represents a clinically meaningful difference.
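As an illustration of how an MCID is applied, the sketch below checks whether a patient's improvement reaches a given threshold. It assumes that lower scores are better, as with the pain scales above; the function name and the sample scores are hypothetical, and the two-point value is the numerical rating scale threshold reported by Childs et al.

```python
NRS_MCID = 2.0  # two-point threshold on the numerical rating scale (Childs et al.)

def meets_mcid(baseline, follow_up, mcid):
    """Return True when the improvement (baseline minus follow-up) reaches the MCID.

    Assumes a scale on which lower scores mean less pain or disability.
    """
    return (baseline - follow_up) >= mcid

# Hypothetical patients: pain falls from 7 to 4 in one and from 7 to 6 in another.
print(meets_mcid(7, 4, NRS_MCID))  # True
print(meets_mcid(7, 6, NRS_MCID))  # False
```

The same check can be reused with any of the MCID values quoted in this section, provided the direction of the scale is accounted for.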

21.2.2.2 General Health Measures

The Medical Outcomes Study Short Form questionnaires are the most commonly utilized instruments to assess general health status in the United States. They exist in several forms depending on the number of questions (SF-36, SF-12, and SF-6). SF-12 is the most commonly used in spinal surgery research. It has separate components for both physical (PCS) and mental (MCS) parameters. Short Form questionnaires have been extensively validated across many different disease states. Consequently, they can be used to compare health outcomes between patients with very different conditions such as spondylolisthesis or diabetes. Short Form questionnaires have also been shown to correlate well with outcome measures specific to the spine [9–11]. PCS and MCS are computed using the scores of 12 questions and range from 0 to 100, where a zero score indicates the lowest level of health measured by the scales and 100 indicates the highest level of health. Both PCS and MCS combine the 12 items in such a way that they compare to national norms with a mean score of 50.0 and a standard deviation of 10.0. The MCID for SF-12 PCS ranges from 6.1 to 12.6 and for MCS 2.4–10.8 [12].
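The norm-based presentation of PCS and MCS (mean 50, standard deviation 10) follows the general form of a T-score transformation. The sketch below shows only that standardization step and is not the SF-12 scoring algorithm itself, which relies on published item weights that are not reproduced here; the function name and the sample reference values are hypothetical.

```python
def norm_based_score(raw, population_mean, population_sd):
    """Rescale a raw score so the reference population has mean 50 and SD 10."""
    z = (raw - population_mean) / population_sd
    return 50.0 + 10.0 * z

# Hypothetical reference population with mean 60 and SD 15 on some raw scale.
print(norm_based_score(60, 60, 15))  # 50.0, exactly at the population mean
print(norm_based_score(75, 60, 15))  # 60.0, one standard deviation above it
```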

A second commonly used general health measure is the EuroQol Five Dimension (EQ-5D) questionnaire [13]. The EQ-5D describes health states along five dimensions. In the most recent modification of the form, each dimension has five levels based on problem severity (no problems, slight problems, moderate problems, severe problems, and extreme problems). The dimensions address mobility, self-care, usual activities, pain/discomfort, and anxiety/depression [5]. The questionnaire is designed to be easily completed by patients, allowing for indirect data collection such as by mail. It has been translated into more than 150 languages [14–16]. The relative value of the different dimensions can be adjusted based on normative data for different countries and populations, so that results reflect societal norms. The MCID for the EQ-5D is 0.05 [17, 18].

21.2.2.3 Disease-Specific Measures

Although numerous validated outcome measures exist to assess the relative functional impairment experienced by patients with spinal pathology such as the Roland Morris Disability Questionnaire, Cervical Spine Outcomes Questionnaire, and the Japanese Orthopaedic Association myelopathy questionnaire, we will focus on the Oswestry Disability Index (ODI) and Neck Disability Index (NDI), as these are the most relevant to research focused on patients undergoing corrective spinal osteotomies.

The ODI is one of the most widely utilized outcome measures in spinal surgery. It was developed by John O’Brien and published in 1980 [19]. Several versions of the form have been described, but the one most commonly used by orthopedic and spinal surgeons is the version described by Fairbank et al. [20]. The ODI contains ten elements, each with six statements describing successively greater levels of disability. The elements focus on pain intensity, personal care, lifting, walking, sitting, standing, sleeping, sex life, social life, and travelling. Each element is scored from 0 to 5, and the total score is doubled and expressed as a percentage, with higher scores indicating greater disability. The reliability and validity of the ODI for assessing lumbar pathology have been extensively studied [20–22]. Like other common outcome questionnaires, the ODI has been designed with a focus on ease of use. It can generally be completed in about 5 min and scored manually in less than 2 min, although modern computerized interfaces make the process even more efficient [23]. The MCID of the ODI is 12.8 [24].

The NDI was originally published in 1991 [25, 26]. The NDI has ten elements that rate neck disability related to recreation, sleeping, driving, work, concentration, headaches, reading, lifting, self-care, and pain intensity, each on a scale from 0 to 5, with higher scores representing higher levels of disability. The total score can then be multiplied by two to express it as a percentage. The NDI has been translated into 19 languages [27]. It has been shown to have acceptable validity and responsiveness in patients with cervical radiculopathy, as well as acute and chronic neck pain [28]. However, it may be limited by floor and ceiling effects [29, 30]. The MCID for the NDI is 7.5 [31].
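The ODI and NDI share the same arithmetic: ten elements scored 0 to 5, summed, and doubled to give a value out of 100. A minimal sketch of that rule follows. The function name is hypothetical, and the handling of unanswered items is not covered by the text, so the sketch simply requires all ten responses.

```python
def disability_index_percent(item_scores):
    """Score a ten-element index (ODI or NDI) as a percentage.

    Each element is scored 0 to 5; the total (maximum 50) is doubled.
    Unanswered items are not handled here and raise an error instead.
    """
    if len(item_scores) != 10:
        raise ValueError("expected exactly ten element scores")
    if any(not 0 <= score <= 5 for score in item_scores):
        raise ValueError("each element must be scored from 0 to 5")
    return 2 * sum(item_scores)

# Hypothetical patient answering 2 on every element.
print(disability_index_percent([2] * 10))  # 40
```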

21.2.2.4 Cost-Utility Analysis

The economic costs of spine care to the overall health care system are significant [32, 33]. Annual expenditures on spine care in the United States exceed 100 billion dollars. Costs rose 65% between 1997 and 2005, and current data suggest that they will continue to rise at a significant rate. Projections suggest that Medicare could become insolvent as soon as 2017 [34]. Given these economic realities, it is unsurprising that cost containment is an important element of the Affordable Care Act. Cost-Utility Analysis seeks to assess the “value” of health care interventions. In this context, value is defined not as the pursuit of outcomes at any cost but as attaining the best possible outcome per unit of cost. The determination of value, therefore, requires a calculation of the effect of the intervention and a calculation of its cost.

The central concept when calculating the effect of a health care intervention for this type of analysis is the determination of “Utility.” Utility is a preference-based measure of health-related quality of life expressed numerically within a range from 0 (death) to 1 (perfect health). The utility measure can then be extrapolated over time to generate a Quality Adjusted Life Year (QALY). This is a core measure that allows for comparison of interventions not just within the field of spine care but between fields as diverse as diabetes care, total joint arthroplasty, or interventional cardiology. Within spine care research, utility measures are derived from the patient-reported outcome measures described previously. SF-12, EQ-5D, ODI, and NDI outcome scores all have validated conversions into utility measures [35, 36].
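One common way to extrapolate utility over time is to take the area under the utility curve across the follow-up visits. The sketch below does this with the trapezoidal rule, which is an assumption made here for illustration rather than a method specified in the text; the function name and the sample utilities are hypothetical.

```python
def qalys(times_years, utilities):
    """Quality-adjusted life years as the area under a utility-time curve.

    times_years: visit times in years, in increasing order.
    utilities:   utility at each visit, from 0 (death) to 1 (perfect health).
    Utility is assumed to change linearly between visits (trapezoidal rule).
    """
    if len(times_years) != len(utilities):
        raise ValueError("times and utilities must have the same length")
    total = 0.0
    for i in range(1, len(times_years)):
        interval = times_years[i] - times_years[i - 1]
        total += interval * (utilities[i] + utilities[i - 1]) / 2.0
    return total

# Hypothetical patient: utility 0.5 at baseline, 0.7 at 1 year and at 2 years.
print(qalys([0, 1, 2], [0.5, 0.7, 0.7]))
```

A patient in perfect health for two years would accrue 2 QALYs under this calculation, and the gain attributable to an intervention is the difference between the treated and comparison curves.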

The calculation of cost for an intervention could be as simple as checking the records of a hospital’s billing department or the payments of an insurance company. However, this approach lacks broader applicability and may fail to capture the whole economic effect of an illness. If one looks at the effect of illness from a societal perspective, the calculation of cost, especially for nonoperative interventions, becomes more complex. One must try to incorporate the effects of lost wages and productivity, child care, and numerous other potential factors. The US Panel on Cost-Effectiveness in Health and Medicine published consensus-based recommendations for performing a Cost-Utility Analysis [37]. According to these recommendations, the four key components of a Cost-Utility Analysis are (1) the use of the societal perspective, (2) appropriate incremental comparisons between treatments, (3) appropriate discounting of both the cost and health effect of the treatment, and (4) the use of a community preference-based utility measure [37, 38]. An additional element of vital importance for Cost-Utility Analysis is the choice of an appropriate duration of follow-up. This is especially true for surgical interventions, where there may be a large up-front cost: an intervention that provides lasting benefit allows that cost to be spread across multiple years, so an adequate duration of clinical follow-up is essential to produce meaningful studies.
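Discounting converts costs and health effects that accrue in later years into present-day terms. The following is a minimal sketch of the standard present-value formula; the 3% annual rate and the yearly figures are assumptions chosen for illustration, and the function name is hypothetical.

```python
def present_value(yearly_values, annual_rate):
    """Discount a stream of yearly costs or QALYs to present value.

    yearly_values[t] is the amount accruing in year t, where year 0 is
    undiscounted and each later year is divided by (1 + rate) ** t.
    """
    return sum(value / (1.0 + annual_rate) ** year
               for year, value in enumerate(yearly_values))

# Hypothetical example: a large up-front cost followed by smaller yearly costs.
costs = [40000.0, 2000.0, 2000.0, 2000.0]
print(round(present_value(costs, 0.03), 2))
```

Applying the same function to the yearly QALY gains keeps costs and health effects on a consistent footing, as the recommendations call for.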

There is no consensus regarding a threshold beyond which an intervention ceases to be cost-effective. In the United Kingdom, the National Institute for Health and Clinical Excellence uses a threshold of approximately £30,000 per QALY, which is roughly equivalent to 50,000 US dollars per QALY. In the United States, the current literature often uses a cutoff of 100,000 US dollars per QALY, but this value is fairly arbitrary.
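Comparing an intervention against such a threshold comes down to dividing the incremental cost by the incremental QALYs gained. The sketch below uses entirely hypothetical numbers (not drawn from any study cited in this chapter) and omits discounting for simplicity, to show how the same up-front cost can fall on either side of a 100,000 dollar per QALY cutoff depending on how long the benefit is followed.

```python
def cost_per_qaly(incremental_cost, incremental_qalys):
    """Incremental cost divided by incremental QALYs gained."""
    if incremental_qalys <= 0:
        raise ValueError("the intervention must yield a positive QALY gain")
    return incremental_cost / incremental_qalys

THRESHOLD_USD = 100000.0  # commonly quoted US cutoff; described as arbitrary

# Hypothetical surgery: 60,000 dollars more than nonoperative care up front,
# with a sustained gain of 0.25 QALYs per year of follow-up.
extra_cost = 60000.0
gain_per_year = 0.25

for years in (2, 4):
    ratio = cost_per_qaly(extra_cost, gain_per_year * years)
    print(years, ratio, ratio <= THRESHOLD_USD)
```

With these made-up inputs, the ratio is 120,000 dollars per QALY at 2 years and 60,000 dollars per QALY at 4 years, which is why the duration of follow-up can change the conclusion.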

In light of this discussion, it should come as no surprise that performing methodologically rigorous, high-quality Cost-Utility Analysis is a difficult and time-consuming undertaking. Consequently, the number of such studies across the entire field of spine care is quite limited [32]. Even for a relatively straightforward procedure such as one-level lumbar decompression and fusion for degenerative spondylolisthesis, the reported values can range from 67,000 to over 130,000 dollars per QALY [39]. Numerous factors contribute to this variability, including the quality of the underlying clinical data, the calculation of indirect costs, and the length of follow-up. The Spine Outcomes Research Trial (SPORT) represents some of the highest quality data currently available in this field (Fig. 21.5). Decompressive surgeries for both spinal stenosis and disc herniation were cost-effective at the 2- and 4-year time points. The analysis of fusion for degenerative spondylolisthesis, however, illustrates the importance of adequate follow-up for this type of analysis. Though the cost per QALY at the 2-year analysis was greater than 100,000 US dollars [40, 41], the clinical benefit of surgery was maintained at the 4-year analysis, at which time the intervention could be deemed cost-effective. Unfortunately, there is an extreme paucity of high-level Cost-Utility Analysis related to adult deformity surgery in general and the use of spinal osteotomy specifically. In part, this may be due to the relative heterogeneity of this patient population and the complexity of the surgical interventions. Additionally, most surgeries of this type require a large up-front cost, and any valid study would consequently require a significant duration of follow-up. In the current health care environment there is a pressing need for these studies, and they represent a valuable direction for future research.