Quantitative Evaluation of MRI Brain Tissue Segmentation Algorithms
Kirt A Schaper1, Timothy R Jarvis1, Kristi Boesen1, Kelly Rehm2, Joseph Gati3, Ravi Menon3, David A Rottenberg1,2
1Department of Neurology, University of Minnesota, MPLS, USA, 2Department of Radiology, University of Minnesota, MPLS, USA, 3Robarts Research Institute, London, Canada

Modeling & Analysis

Abstract
Segmentation of MRI brain volumes to determine the tissue composition at each voxel location is a critical step in many medical imaging applications. Numerous algorithms have been developed to perform this task, employing a variety of strategies, requiring different data inputs, and producing different types of tissue segmentations -- "hard", "soft" and "semi-soft". The performance of three popular segmentation algorithms which operate on T1-weighted MRI brain volumes -- FAST [1], SPM [2] and PVS [3] -- was assessed in terms of both accuracy and precision. Accuracy was assessed by comparing the segmentation output to manual segmentations produced by three expert raters and precision by segmenting repeat MRI brain volumes of the same subject. Additionally, we assessed the impact of varying the SNR and voxel size of the MRI volume on algorithmic performance.

Methods
Four MRI brain volumes of the same normal volunteer were acquired: one at 1.5 mm isotropic resolution and three repeat scans at 1 mm isotropic resolution. All volumes were corrected for intensity non-uniformity using N3 [4] and stripped using a brain mask produced by McStrip [5]. The stripped, intensity-corrected volumes were then input to the three segmentation algorithms. Manual segmentation was performed by three expert raters using a global dual-threshold approach (CSF/GM and GM/WM thresholds) on the intensity-corrected volumes, where CSF, GM and WM denote cerebrospinal fluid, gray matter and white matter. All raters underwent an initial training period on a brain volume acquired from a different subject. Raters adjusted the brightness/contrast of the display and then interactively adjusted the two thresholds to create a CSF-GM-WM (black-gray-white) cartoon that most closely corresponded to the input volume. For every algorithmically segmented volume, the average fractional content of each tissue was computed within the corresponding label (CSF, GM, WM) of each rater-segmented volume and then averaged across raters (Table 1). Additionally, we reconstructed the input volume to the segmentation algorithms from the fractional content volumes, using pure-tissue intensity estimates based on the average of voxels containing at least 98% of the given tissue type (i.e., v' = g*G + w*W + c*C, where v' is the reconstructed voxel intensity; g, w and c are the fractional GM, WM and CSF content at that voxel location; and G, W and C are the corresponding pure-tissue estimates). These reconstructed volumes were correlated with the non-uniformity-corrected volumes to assess algorithmic consistency (Table 1, right column).
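The two measures described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the synthetic data, the "true" tissue intensities (100, 150, 30) and the noise level are all assumptions made for illustration, and the pure-tissue estimates are assumed to be means of the input intensity over voxels whose fractional content is at least 0.98.

```python
import numpy as np

def mean_fraction_in_label(fraction, label_mask):
    """Average fractional content of one tissue over the voxels that a
    rater labelled as that tissue (label_mask is a boolean array)."""
    return float(fraction[label_mask].mean())

def pure_tissue_estimate(volume, fraction, threshold=0.98):
    """Mean input intensity over voxels containing at least `threshold`
    of the given tissue (assumed reading of the 98% criterion)."""
    pure = fraction >= threshold
    return float(volume[pure].mean())

def reconstruct(g, w, c, G, W, C):
    """v' = g*G + w*W + c*C, applied voxel-wise."""
    return g * G + w * W + c * C

def consistency(volume, g, w, c):
    """Pearson correlation between the input volume and its reconstruction."""
    G = pure_tissue_estimate(volume, g)
    W = pure_tissue_estimate(volume, w)
    C = pure_tissue_estimate(volume, c)
    v_rec = reconstruct(g, w, c, G, W, C)
    return float(np.corrcoef(volume.ravel(), v_rec.ravel())[0, 1])

# Synthetic stand-in for a segmented volume (hypothetical data):
# 300 pure voxels (100 per tissue) plus 700 partial-volume voxels.
rng = np.random.default_rng(0)
pure = np.repeat(np.eye(3), 100, axis=0)
mixed = rng.dirichlet([1.0, 1.0, 1.0], size=700)
g, w, c = np.vstack([pure, mixed]).T

# Hypothetical input intensities: linear mixture plus Gaussian noise.
volume = reconstruct(g, w, c, 100.0, 150.0, 30.0) + rng.normal(0.0, 2.0, size=g.shape)

r = consistency(volume, g, w, c)

# Stand-in for a rater's GM label: voxels that are mostly gray matter.
gm_label = g > 0.5
gm_score = mean_fraction_in_label(g, gm_label)
```

In this toy setting the fractions are correct by construction, so `r` is close to 1; a segmentation whose fractional volumes are inconsistent with the input intensities would yield a lower correlation.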

Results/Discussion
The average fractional content for the "correct" tissue label was minimally affected by partial voluming (1.5 mm vs. 1 mm voxel size). [We were unable to generate an SPM result for the 1.5 mm volume for technical reasons.] Whereas FAST performed better overall with respect to GM and WM content, PVS was aided most by increased SNR, with an average increase of about 0.30 in the GM and WM fractions within the corresponding reference labels. With regard to internal consistency, SPM performed least well of the three algorithms evaluated (Table 1, right column).
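The SNR comparison can be checked with a few lines of arithmetic on the Table 1 values. The sketch below assumes that the "3 NEX" rows represent the higher-SNR condition and the "1 mm" rows the single-acquisition baseline; that mapping is an interpretation, not something the text states explicitly, and the variable and function names are hypothetical.

```python
# (GM, WM) average fractional content copied from Table 1.
table = {
    "PVS":  {"1 mm": (0.531, 0.558), "3 NEX": (0.838, 0.883)},
    "FAST": {"1 mm": (0.710, 0.840), "3 NEX": (0.767, 0.817)},
    "SPM":  {"1 mm": (0.730, 0.636), "3 NEX": (0.817, 0.666)},
}

def mean_gain(rows, low="1 mm", high="3 NEX"):
    """Mean change in GM and WM fractional content from `low` to `high`.
    Treating "3 NEX" as the higher-SNR condition is an assumption."""
    gains = [h - l for l, h in zip(rows[low], rows[high])]
    return sum(gains) / len(gains)

gains = {method: mean_gain(rows) for method, rows in table.items()}
best = max(gains, key=gains.get)
```

Under this reading, the mean GM/WM gain for PVS is about 0.32, compared with roughly 0.02 for FAST and 0.06 for SPM, which agrees with the statement that PVS benefited most from increased SNR.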

This work was supported in part by NIH grant P20 EB02013.

References
1. Zhang Y, et al. IEEE TMI 20(1):45-57, 2001.
2. Ashburner J, Friston KJ. NeuroImage 11:805-821, 2000.
3. Shattuck DW, et al. NeuroImage 13(5):856-876, 2001.
4. Sled JG, et al. IEEE TMI 17:87-97, 1998.
5. Rehm K, et al. NeuroImage (submitted), 2004.


Table 1. Average fractional content and correlation.
Method  Volume  CSF    GM     WM     r*
PVS     1.5 mm  0.838  0.626  0.534  0.9680
PVS     1 mm    0.755  0.531  0.558  0.9510
PVS     3 NEX   0.685  0.838  0.883  0.9785
FAST    1.5 mm  0.618  0.744  0.767  0.9364
FAST    1 mm    0.559  0.710  0.840  0.9215
FAST    3 NEX   0.549  0.767  0.817  0.9375
SPM     1 mm    0.312  0.730  0.636  0.6875
SPM     3 NEX   0.348  0.817  0.666  0.8045
*Correlations are between the reconstructed and input volumes.