Blood Oxygen Level Dependent (BOLD) functional magnetic resonance imaging (fMRI) depicts changes in deoxyhemoglobin concentration consequent to task-induced or spontaneous modulation of neural metabolism. Since its inception in 1990, this method has been employed in thousands of studies of cognition, in clinical applications such as surgical planning, for monitoring treatment outcomes, and as a biomarker in pharmacologic and training programs. More recently, attention has turned to the use of pattern classification and other statistical methods to draw increasingly complex inferences about cognitive brain states from fMRI data. This article reviews the methods, challenges, and future of fMRI.
Functional magnetic resonance imaging (fMRI) is a class of imaging methods developed to demonstrate regional, time-varying changes in brain metabolism. These metabolic changes can be consequent to task-induced cognitive state changes or the result of unregulated processes in the resting brain. Since its inception in 1990, fMRI has been used in an exceptionally large number of studies in the cognitive neurosciences, clinical psychiatry/psychology, and presurgical planning (between 100,000 and 250,000 entries in PubMed, depending on keywords). The popularity of fMRI derives from its widespread availability (can be performed on a clinical 1.5 T scanner), noninvasive nature (does not require injection of a radioisotope or other pharmacologic agent), relatively low cost, and good spatial resolution. Increasingly, fMRI is being used as a biomarker for disease, to monitor therapy, or for studying pharmacologic efficacy. Thus, it is of interest to review the contrast mechanisms, the strengths and weaknesses, and evolutionary trends of this important tool.
Basis for fMRI
fMRI is of course based on MRI, which in turn uses nuclear magnetic resonance coupled with gradients in magnetic field to create images that can incorporate many different types of contrast such as T1 weighting, T2 weighting, susceptibility, flow, and so forth. To understand the particular contrast mechanism predominantly used in fMRI it is necessary to first discuss brain metabolism.
All the processes of neural signaling in the brain, including formation and propagation of action potentials, binding of vesicles to the presynaptic junction, the release of neurotransmitters across the synaptic gap, their reception and regeneration of action potentials in the postsynaptic structures, scavenging of excess neurotransmitters, and so forth, require energy in the form of adenosine triphosphate (ATP). This nucleotide is produced principally by the mitochondria from glycolytic oxygenation of glucose, and its production results in carbon dioxide as a by-product. When a region of the brain is upregulated (ie, activated) by a cognitive task such as finger tapping, the additional neural firing and other increased signaling processes result in a locally increased energy requirement, in turn resulting in upregulated cerebral metabolic rate of oxygen (CMRO₂) in the affected brain region. As the local stores of oxygen in tissues adjacent to capillaries are transiently consumed by glycolysis and waste products build up, various chemical signals (CO₂, NO, H⁺) cause a vasomotor reaction in arterial sphincters upstream of the capillary bed, causing dilation of these vessels. The increased blood flow acts to restore the local [O₂] level required to overcome the transient deficit; however, for reasons that are still not fully understood more oxygen is delivered than is needed to offset the increase in CMRO₂. As a result, neural upregulation results initially in a buildup of deoxygenated hemoglobin (Hb) and a decrease in oxygenated hemoglobin (HbO₂) in the intra- and extravascular spaces, followed within a second or two by a vasodilatory response that reverses the situation to result in an increase in [HbO₂] and decrease in [Hb] over that in the resting condition (Fig. 1). This sequence of processes is described as the hemodynamic response to the neural event.

Thus, there are 2 primary consequences of increased neural activity, and both can be detected by MRI: increased local cerebral blood flow (CBF) and changes in oxygenation concentration (Blood Oxygen Level Dependent, or BOLD, contrast). The change in CBF can be observed using an injected contrast agent and perfusion-weighted MRI, first demonstrated by Belliveau and colleagues, or noninvasively by arterial spin labeling (ASL). However, ASL suffers from reduced sensitivity, increased acquisition time, and increased sensitivity to motion compared with the BOLD contrast method, and its use has therefore centered on obtaining quantitative measurements of baseline CBF for studies modeling the neurobiological mechanisms of activation or calibration of vasoreactivity, rather than in routine mapping of brain function.
The second mechanism, termed BOLD contrast, was first demonstrated in rats and later in humans, and is the contrast that is used in virtually all conventional fMRI experiments. BOLD contrast results from the change in magnetic field surrounding the red blood cells depending on the oxygen state of the hemoglobin. When fully oxygenated, HbO₂ is diamagnetic and is magnetically indistinguishable from brain tissue. However, fully deoxygenated Hb has 4 unpaired electrons and is highly paramagnetic. This paramagnetism results in local gradients in magnetic field whose strength depends on the [Hb] concentration. These endogenous gradients in turn modulate the intra- and extravascular blood’s T2 and T2* relaxation times through diffusion and intravoxel dephasing, respectively. Using a gradient refocused echo (GRE) MRI pulse sequence, the acquisition is made sensitive to T2* and T2. At 1.5 T and 3 T, the T2* contrast is predominant and is largest in venules, whereas at higher field strength the diffusion-weighted contrast of T2 relaxation becomes more important and, because signals are generated preferentially in capillaries and tissue with spin-echo acquisitions, provides greater spatial specificity. Because most fMRI is currently performed at 3 T or below, BOLD fMRI uses primarily GRE methods because of the increased T2* contrast.
Task activation fMRI studies seek to induce different neural states in the brain as the visual, auditory, or other stimulus is manipulated during the scan, and activation maps are obtained by comparing the signals recorded during the different states. Therefore, it is important to collect each image in a snapshot mode to avoid head motion, and to prevent the physiologic processes of respiration and cardiovascular function from injecting noise signals unrelated to the neural processing being interrogated. In general, most fMRI is performed using an echo planar imaging (EPI) method, which can collect data for a 2-dimensional image in approximately 60 milliseconds at typical resolutions (3.4 × 3.4 × 4 mm³ voxel size). Whole brain scans with approximately 32 2-dimensional slices typically are acquired with a repetition time (TR) of 2 seconds per volume. Each voxel in the resulting scan produces a time series that is subsequently analyzed in accordance with the task design.
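As a rough check on these acquisition numbers, the arithmetic can be sketched in a few lines of Python. The function names are illustrative only, and the sketch assumes that slices are acquired one after another with no gaps and no acceleration.

```python
def volume_acquisition_time_s(n_slices: int, slice_time_ms: float) -> float:
    """Time to acquire all slices back to back, in seconds (assumes no gaps)."""
    return n_slices * slice_time_ms / 1000.0


def voxel_volume_mm3(dx_mm: float, dy_mm: float, dz_mm: float) -> float:
    """Volume of one voxel in cubic millimetres."""
    return dx_mm * dy_mm * dz_mm


# Values quoted in the text: 32 slices, about 60 ms per slice, 3.4 x 3.4 x 4 mm voxels.
acq_time_s = volume_acquisition_time_s(32, 60.0)
voxel_mm3 = voxel_volume_mm3(3.4, 3.4, 4.0)

print(f"Acquisition time per volume: {acq_time_s:.2f} s")
print(f"Voxel volume: {voxel_mm3:.2f} mm^3")
```

Thirty-two slices at 60 ms each take about 1.92 s, which fits within the 2-second TR quoted above.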
The fMRI experiment
The typical fMRI task activation experiment uses visual, auditory, or other stimuli to alternately induce 2 or more different cognitive states in the subject, while collecting MRI volumes continuously as already described. With a 2-condition design, one state is called the experimental condition and the other is denoted the control condition; the goal is to test the hypothesis that the signals differ between the 2 states. Using a block design, the trials are arranged to alternate between the experimental and control conditions, as shown in Fig. 2, with each block typically being a few tens of seconds long. The block design is optimal for detecting activation, but a jittered event-related (ER) design is superior when characterization of the amplitude or timing of the hemodynamic response is desired. In the ER design, task events are relatively brief and occur at nonconstant intertrial intervals with longer periods of control condition, which allows the hemodynamic response to return more fully to baseline. Jittering the timing serves to sample the hemodynamic response with higher temporal frequency in the overall time series, but may also be used to induce a desired cognitive strategy, for example, to avoid an anticipatory response or maintain attention.

The degree to which valid inferences can be drawn from the measured time series data depends in large part on careful design of the task. The investigator must take care that only the effect of interest changes between experimental and control conditions, while confounding effects such as attention and valence are maintained constant or irrelevant. In some studies this is straightforward, such as in the use of a sensory task for presurgical mapping, where the goal is only to localize activation so that important brain functions can be maintained after surgery. In this case the signal intensity is of minor interest as long as it is adequate to characterize the functional substrates to be preserved during surgical intervention. In many other cases, however, comparative inferences are desired, such as in parametric studies of the influence of task difficulty on a cognitive process, and thus control of such factors as learning, adaptation, and salience must be considered.
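The block design described in this section can be sketched as a predicted BOLD time course: a boxcar that alternates between control and experimental blocks, convolved with a model of the hemodynamic response. The sketch below is illustrative only. The single-gamma response shape, its parameters, the 20-second block length, and all function names are assumptions chosen for the example, not values taken from this article.

```python
import math


def boxcar(n_vols, tr_s, block_s):
    """Alternate control (0) and experimental (1) blocks, starting with control."""
    return [1.0 if int(i * tr_s // block_s) % 2 == 1 else 0.0 for i in range(n_vols)]


def gamma_hrf(tr_s, duration_s=30.0, shape=6.0, scale_s=1.0):
    """Single-gamma response sampled every TR and normalised to unit sum.

    Assumed model: h(t) ~ t**(shape - 1) * exp(-t / scale_s),
    which peaks at (shape - 1) * scale_s seconds.
    """
    n = int(duration_s / tr_s)
    raw = [(i * tr_s) ** (shape - 1) * math.exp(-(i * tr_s) / scale_s) for i in range(n)]
    total = sum(raw)
    return [v / total for v in raw]


def convolve_causal(signal, kernel):
    """Discrete causal convolution, truncated to the length of the signal."""
    out = [0.0] * len(signal)
    for i in range(len(signal)):
        for j in range(min(i + 1, len(kernel))):
            out[i] += signal[i - j] * kernel[j]
    return out


TR_S = 2.0      # repetition time quoted in the text
N_VOLS = 60     # assumed run length: 60 volumes, i.e. 2 minutes
BLOCK_S = 20.0  # assumed block length ("a few tens of seconds")

design = boxcar(N_VOLS, TR_S, BLOCK_S)
hrf = gamma_hrf(TR_S)
regressor = convolve_causal(design, hrf)

for i in range(8, 16):
    print(f"t = {i * TR_S:4.0f} s  task = {design[i]:.0f}  predicted = {regressor[i]:.3f}")
```

The predicted signal lags the task boxcar and rises smoothly, which is why the analysis compares the data with a hemodynamically filtered model rather than with the raw block timing.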
Analysis methods
Once the images have been acquired, the time series data must be processed to obtain maps of brain activation. Because the BOLD contrast is small (<1% in many studies of higher cognitive processes), simply averaging images over the experimental and control conditions and then subtracting (as sketched in Fig. 2) is inadequate to reliably determine differences, because noise can produce both false positives and false negatives. The noise results from thermal sources in the subject and electronics, bulk motion of the head, cardiac and respiratory-induced noise, and variations in baseline neural metabolism. Because the noise can sometimes be larger than the signal of interest, fMRI analyses compare the signal difference between the states using a statistical test. These tests result in an activation map that is a function of the probability that the brain states differ. The statistical test for activation can use a general linear model, cross-correlation with a modeled regressor, or one of several data-driven approaches such as independent components analysis. The models against which the acquired data are tested include the experimental design of interest as well as “nuisance regressors” of no interest such as signal drift, motion, and noise reflected in global or white-matter signals. In all cases, the activation testing is preceded by a series of preprocessing steps.
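Of the tests listed above, cross-correlation with a modeled regressor is the simplest to illustrate. The following minimal sketch computes the Pearson correlation between one voxel's time series and a block regressor, then converts it to a t statistic with n - 2 degrees of freedom. The synthetic voxel data, the 0.8% signal change, and the sinusoid standing in for noise are all assumptions made for the example; a real analysis would also include nuisance regressors and account for temporal autocorrelation.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_from_r(r, n):
    """t statistic (n - 2 degrees of freedom) for a correlation r over n samples."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Hypothetical block regressor: 10 control volumes, then 10 task volumes, repeated 3 times.
regressor = ([0.0] * 10 + [1.0] * 10) * 3

# Hypothetical voxel: baseline of 100, a 0.8% task effect, and a deterministic
# fluctuation standing in for noise (an assumption made for reproducibility).
voxel = [100.0 + 0.8 * m + 0.5 * math.sin(1.3 * i) for i, m in enumerate(regressor)]

r = pearson_r(regressor, voxel)
t = t_from_r(r, len(voxel))
print(f"r = {r:.3f}, t = {t:.2f}")
```

Repeating this calculation for every voxel and thresholding the resulting statistic yields an activation map of the kind described above.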
The steps in preprocessing can include all or some of the following: (1) time-slice correction, to eliminate differences between the time of acquisition of each slice in the volume; (2) motion coregistration, in which affine head motion is detected and the time series of volumes is resampled to register each time frame to a reference frame, such as the first or middle time series point; (3) correction for physiologic noise from breathing and cardiovascular function, together with low-pass and/or high-pass temporal filtering to improve the statistics while removing spectral components of no interest; (4) spatial smoothing to improve the signal-to-noise ratio (SNR) and the normality of the noise distribution; (5) prewhitening to correct for autocorrelation in the time series. The analysis of fMRI data continues to be a subject of intense research, and numerous books have been written about it, to which the reader is referred for further information (eg, Sarty).
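As one small illustration of the temporal filtering step, the sketch below removes the mean and a least-squares linear trend from a single voxel time series. This is a deliberately crude stand-in for high-pass filtering, chosen because it needs no external libraries; it is not the method of any particular analysis package, and the drifting test series is invented for the example.

```python
def remove_linear_drift(series):
    """Subtract the mean and the least-squares linear trend from a time series.

    Requires at least 2 samples. Returns residuals with zero mean.
    """
    n = len(series)
    t_mean = (n - 1) / 2.0
    y_mean = sum(series) / n
    stt = sum((t - t_mean) ** 2 for t in range(n))
    sty = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    slope = sty / stt
    return [y - y_mean - slope * (t - t_mean) for t, y in enumerate(series)]


# Hypothetical voxel: baseline 100, slow scanner drift, and a block task effect.
drifting = [100.0 + 0.05 * t + (1.0 if (t // 10) % 2 else 0.0) for t in range(60)]
cleaned = remove_linear_drift(drifting)

print(f"range before: {max(drifting) - min(drifting):.2f}")
print(f"range after:  {max(cleaned) - min(cleaned):.2f}")
```

In practice the same effect is often obtained by including drift terms as nuisance regressors in the statistical model rather than by filtering the data beforehand.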
Comparisons with other functional imaging modalities
fMRI can be compared with other imaging methods used to obtain functional assessments of brain metabolism in terms of spatial and temporal resolution and availability. The primary alternatives are positron emission tomography (PET), near-infrared spectroscopy (NIRS), electroencephalography (EEG), and magnetoencephalography (MEG).
Spatial Resolution
Resolution in fMRI is limited primarily by SNR because of the necessity for rapid acquisition of time series information. For MRI, $\mathrm{SNR} \propto p^{2} w \sqrt{T_{\mathrm{acq}} N}$, where $p$ is the pixel size, $w$ the slice thickness, $T_{\mathrm{acq}}$ is the k-space readout time, and $N$ the number of time frames. Thus, as $T_{\mathrm{acq}}$ is reduced for single-shot imaging (typically 20–30 milliseconds) the pixel size must be increased over that for conventional anatomic imaging to maintain an acceptable SNR. Accordingly, the typical fMRI pixel size is 3 to 4 mm, although with higher field magnets (7 T) a pixel size of 500 μm or less may be readily achieved. The resolution of PET is limited by the size of the gamma-ray detectors as well as the positron-electron annihilation range, and is typically 5 to 10 mm or more. NIRS resolution is low (10–20 mm) and is limited predominantly by the strong scatter and attenuation of infrared photons (which also limits the depth of cortex that can be imaged within a banana-shaped region connecting optodes), the modest density of optodes, and the ill-conditioned inverse problem of reconstructing 3-dimensional maps of [Hb] from scalp recordings. The resolution in EEG and MEG is similarly limited to greater than 10 to 20 mm by the fact that a unique reconstruction of dipoles is not possible from scalp-based measurements of electrical or magnetic distributions and models, and regularization must be employed for model estimation. Unlike EEG, MEG does not have the confounding factor that scalp recordings may be spatially distorted by heterogeneous electrical conduction paths within the brain/skull.
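The SNR proportionality given in this section (pixel size squared, times slice thickness, times the square root of readout time multiplied by the number of frames) can be explored numerically. The sketch below is only a relative comparison: the proportionality constant is unknown, so the absolute values are meaningless and only ratios should be read. The 25-millisecond readout and the 100-frame run are assumed values for illustration.

```python
import math


def relative_snr(pixel_mm, slice_mm, t_acq_ms, n_frames):
    """Relative SNR: proportional to p**2 * w * sqrt(T_acq * N). Arbitrary units."""
    return pixel_mm ** 2 * slice_mm * math.sqrt(t_acq_ms * n_frames)


# Assumed baseline protocol: 3.4 mm pixels, 4 mm slices, 25 ms readout, 100 frames.
baseline = relative_snr(3.4, 4.0, 25.0, 100)

# Halving the in-plane pixel size with everything else unchanged.
half_pixel_ratio = relative_snr(1.7, 4.0, 25.0, 100) / baseline

# Number of frames needed to recover the baseline SNR at the smaller pixel size.
frames_needed = 100 * (1.0 / half_pixel_ratio) ** 2
recovered_ratio = relative_snr(1.7, 4.0, 25.0, frames_needed) / baseline

print(f"SNR ratio after halving pixel size: {half_pixel_ratio:.2f}")
print(f"Frames needed to recover baseline SNR: {frames_needed:.0f}")
```

Halving the pixel size cuts the SNR to one quarter, and recovering it through averaging alone would take 16 times as many time frames, which shows why fMRI voxels are kept comparatively large.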
Temporal Resolution
Temporal resolution of fMRI is limited by hemodynamic response time; typically the BOLD response has a width of ∼3 seconds and a peak occurring approximately 5 to 6 seconds after the onset of a brief neural stimulus. This rate is much slower than the underlying neural processes, and temporal information is thereby heavily blurred. Nevertheless, by jittering ER stimuli and using appropriate analysis methods, temporal inferences in the 100-millisecond resolution range can be achieved. PET scans require minutes to complete because of the low count rates of injected radionuclides, so changes in neural processes can only be studied by repeated scanning. Like BOLD, NIRS reports changes in blood oxygenation and, exacerbated by low SNR of near-infrared photons in the brain, has temporal limitations similar to those of fMRI. EEG and MEG, on the other hand, have millisecond temporal resolution and can easily capture the dynamics of evoked responses that last a few milliseconds to several hundred milliseconds. Multimodal approaches combining fMRI and EEG use fMRI maps as spatial priors to reconstruct high temporal resolution electrophysiology, thereby gaining resolution in both dimensions.
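The benefit of jittering event-related stimuli can be made concrete by listing the delays at which the response is sampled. In the sketch below, volumes are assumed to be acquired at exact multiples of the TR, and the onset times are invented for the example. With onsets locked to the TR, every event is sampled at the same post-stimulus delays; with jittered onsets, the samples interleave and the response is traced on a finer time grid.

```python
def sampled_delays(onsets_s, tr_s, decimals=3):
    """Distinct delays (within one TR) from each onset to the next acquired volume.

    Assumes volumes are acquired at t = 0, TR, 2*TR, ...
    """
    return sorted({round((tr_s - onset % tr_s) % tr_s, decimals) for onset in onsets_s})


TR_S = 2.0
fixed_onsets = [0.0, 12.0, 24.0, 36.0]     # hypothetical onsets locked to the TR
jittered_onsets = [0.0, 12.5, 25.0, 37.5]  # hypothetical jittered onsets

fixed_delays = sampled_delays(fixed_onsets, TR_S)
jittered_delays = sampled_delays(jittered_onsets, TR_S)

print("fixed:   ", fixed_delays)
print("jittered:", jittered_delays)
```

In this example the jittered design samples the response every 0.5 seconds across events, although each individual event is still sampled only once per 2-second TR.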


