Quantitative comparison of three brain extraction algorithms

Kristi Boesen*, Kelly Rehm, Kirt Schaper*, Sarah Stoltzner, Roger Woods, David Rottenberg*

*University of Minnesota, Department of Neurology
†University of Minnesota, Department of Radiology
‡UCLA, Department of Neurology

Modeling & Analysis

Abstract

Segmentation of brain/non-brain tissue is traditionally one of the more time-consuming preprocessing steps performed in neuroimaging laboratories. Several brain extraction algorithms (BEAs) have been developed recently to perform this step automatically. While automated BEAs speed up overall image processing, their output can greatly affect the results of image analysis. We therefore compared the performance of three BEAs against manual brain extraction using a high-resolution set of T1-weighted MRI brain volumes.

Methods

Sixteen T1-weighted MRI scans of normal subjects were acquired during an fMRI static force experiment [1]; voxel dimensions were 0.86 × 0.86 × 1 mm. Three algorithms for brain/non-brain segmentation were evaluated: (i) Brain Surface Extractor (BSE), v. 2.99.8 [2]; (ii) Brain Extraction Tool (BET), v. 1.2 [3]; and (iii) Minneapolis Consensus Strip (MCS) [4]. Manual brain extraction was performed by one of the authors (KR). BSE and BET are software packages with user-adjustable parameters; for each, parameters were tuned on two training volumes, and the set yielding the "best" strip (removal of skull, CSF and dura with preservation of brain tissue) was applied to all 16 brain volumes. To perform adequately, BSE required manual cropping of the brain with a bounding cube. MCS was initialized with a warp mask and incorporated both intensity thresholding and BSE. MCS masks were created in a separate experiment and were optimized for the entire 16-volume dataset. Two performance metrics were calculated: (i) processing time and (ii) number of misclassified voxels relative to the manually stripped "gold standard." To assess the influence of edge effects on the misclassification metrics, the manual mask was dilated and eroded by 1 (thin) and 2 (thick) voxels.
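The dilation and erosion of the manual mask can be illustrated with a minimal sketch. The abstract does not state which structuring element was used or exactly how the dilated and eroded masks entered the error counts, so the 6-connected neighbourhood, the set-of-coordinates mask representation, and the function names `dilate` and `erode` below are all assumptions made for illustration only.

```python
# Hypothetical sketch: binary dilation/erosion of a brain mask stored as a
# set of (x, y, z) voxel coordinates. The 6-connected (face-neighbour)
# structuring element is an assumption; the abstract does not specify one.
NEIGHBOURS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def dilate(mask, steps=1):
    """Grow the mask by `steps` voxels (1 = "thin", 2 = "thick")."""
    out = set(mask)
    for _ in range(steps):
        # Add every face neighbour of every voxel currently in the mask.
        out |= {(x + dx, y + dy, z + dz)
                for (x, y, z) in out
                for (dx, dy, dz) in NEIGHBOURS}
    return out


def erode(mask, steps=1):
    """Shrink the mask by `steps` voxels (1 = "thin", 2 = "thick")."""
    out = set(mask)
    for _ in range(steps):
        # Keep only voxels whose six face neighbours are all inside the mask.
        out = {(x, y, z)
               for (x, y, z) in out
               if all((x + dx, y + dy, z + dz) in out
                      for (dx, dy, dz) in NEIGHBOURS)}
    return out


# Toy example: a 5 x 5 x 5 cube standing in for a manual mask.
cube = {(x, y, z) for x in range(5) for y in range(5) for z in range(5)}
thin_eroded = erode(cube, steps=1)    # 3 x 3 x 3 core remains
thick_eroded = erode(cube, steps=2)   # only the centre voxel remains
```

One plausible use, again an assumption, is to count a voxel as misclassified only if it lies outside the band between the eroded and dilated manual masks, so that disagreements confined to the brain boundary are discounted.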

Results and Conclusions

The average time required to process a single brain volume was 1 minute for BSE (exclusive of manual cropping), 40 seconds for BET, and 75 minutes for MCS on a 500 MHz Linux workstation. The performance of each algorithm with respect to the gold standard is summarized in Table 1. "Missed" voxels are voxels classified as brain by the manual strip and non-brain by the candidate algorithm, whereas "extra" voxels are voxels classified as non-brain by the manual strip and brain by the candidate algorithm. Misclassified voxels are expressed as a percentage of total brain voxels. One volume that could not be satisfactorily stripped by any of the BEAs was excluded from the averages reported in Table 1. MCS, though slower, consistently outperformed BSE and BET (see Table 1 and Figure 1). In the future, we will develop additional metrics, including the effect of masking on subsequent data analysis, and will extend our evaluation to additional algorithms.
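The definitions of "missed" and "extra" voxels translate directly into set differences between the two masks. The sketch below is illustrative rather than the authors' implementation: the function name `misclassification` and the set-of-coordinates mask representation are hypothetical, and "total brain voxels" is assumed to mean the number of brain voxels in the manual mask.

```python
# Hypothetical sketch of the misclassification metrics. Each mask is a set
# of (x, y, z) coordinates of voxels labelled as brain.
def misclassification(manual, candidate):
    """Return (missed_pct, extra_pct).

    Percentages are relative to the manual brain voxel count, which is an
    assumed reading of "total brain voxels".
    """
    if not manual:
        raise ValueError("manual mask contains no brain voxels")
    missed = manual - candidate   # brain in manual strip, non-brain in candidate
    extra = candidate - manual    # non-brain in manual strip, brain in candidate
    total = len(manual)
    return 100.0 * len(missed) / total, 100.0 * len(extra) / total


# Toy example: a 10-voxel "brain" and a candidate mask shifted by two voxels
# at one end and extended by one voxel at the other.
manual_mask = {(x, 0, 0) for x in range(10)}
candidate_mask = {(x, 0, 0) for x in range(2, 11)}
missed_pct, extra_pct = misclassification(manual_mask, candidate_mask)
```

In this toy case two of the ten manual voxels are missed and one voxel outside the manual mask is added, which gives 20% missed and 10% extra.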

Table 1. Average segmentation error (misclassified voxels as a percentage of total brain voxels)

Method   Missed, thin   Extra, thin   Missed, thick   Extra, thick
BSE      3.6%           0.4%          1.9%            0.3%
BET      0.5%           11.7%         0.1%            10.7%
MCS      0.2%           1.3%          0.1%            0.9%


References

1. Muley, et al. Neuroimage 13:185-195.
2. Shattuck, et al. Neuroimage 13:856-876.
3. Smith. Human Brain Mapping 17(3):143-155.
4. Rehm, et al. Neuroimage 9:S86.

This work was supported in part by NIH grant MH57180.