Evaluating functional neuroimaging results: A comparison of ROC methods and data-driven performance metrics
*††‡
*Brain Sciences Institute, Melbourne, Australia
†Brain Research Institute, Melbourne, Australia
‡VA Medical Center and University of Minnesota, Minneapolis, USA
Modeling & Analysis
Abstract
This study was motivated by a need to objectively evaluate the quality and validity of functional neuroimaging results. Such evaluation is important to ensure that neuroimaging results are suitable for neuroscientific interpretation. Receiver-operating characteristic (ROC) curves (i.e. summaries of true/false positive rates) are often used to evaluate results, but they can generally be applied only to simulated data, where the ground truth is known. Performance metrics, introduced recently by Strother et al., are an alternative to ROC methods [1,2]. The metrics are generated using cross-validation resampling and are thought to reflect the validity and quality of results. This study is the first to compare the performance metrics with standard ROC methods.
Methods
We investigated the relative evaluation of functional neuroimaging results with (a) ROC methods and (b) data-driven performance metrics. The evaluation methods were compared by applying them to a simulated data set smoothed with four different spatial filters (3D Gaussian, 4/8/12/20 mm FWHM). Multivariate analyses were carried out using NPAIRS, an analysis framework developed recently by Strother et al. [1]. For each analysis, NPAIRS was used to generate spatial pattern reproducibility (SPR) and prediction probability (PP) metrics. The 'optimal' analysis was defined as that with the two metrics closest to (SPR, PP) = (1, 1), i.e. the location with perfect prediction ability and infinite signal-to-noise. True and false positive rates were also measured for each analysis and ROC curves were plotted. For the ROC curves, the 'optimal' analysis was defined as that with the greatest area under the ROC curve.
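The two selection criteria above can be illustrated with a minimal sketch. It is not the NPAIRS implementation: the function names `distance_to_ideal` and `roc_auc` are hypothetical, the use of Euclidean distance for "closest to (1, 1)" is an assumption (the abstract does not name a distance measure), and the area under the curve is computed from an empirical ROC curve by the trapezoidal rule.

```python
import numpy as np


def distance_to_ideal(spr, pp):
    """Distance from an analysis's (SPR, PP) point to the ideal point (1, 1).

    Assumption: Euclidean distance; the source only says "closest to".
    """
    return float(np.hypot(1.0 - spr, 1.0 - pp))


def roc_auc(scores, truth):
    """Area under the empirical ROC curve (trapezoidal rule).

    scores: per-voxel statistic, larger means "more likely active".
    truth:  per-voxel ground truth (True = truly active), known only
            for simulated data.
    Assumption: scores contain no ties and both classes are present.
    """
    scores = np.asarray(scores, dtype=float)
    truth = np.asarray(truth, dtype=bool)
    # Sweep the threshold from the highest score downwards.
    order = np.argsort(-scores, kind="stable")
    hits = truth[order]
    tpr = np.concatenate(([0.0], np.cumsum(hits) / hits.sum()))
    fpr = np.concatenate(([0.0], np.cumsum(~hits) / (~hits).sum()))
    # Trapezoidal integration of TPR over FPR.
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))
```

Under these assumptions, the 'optimal' analysis would be the one with the smallest `distance_to_ideal` for the performance metrics, and the one with the largest `roc_auc` for the ROC method.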
Results
Figure 1 shows the results obtained with the ROC method (left) and the performance metrics (right). Both methods give a similar relative evaluation of the four analyses tested. In particular, both evaluation methods identified the 8 mm smoothing filter as optimal according to the criteria defined above.
Discussion
The performance metrics appear to be a reliable tool for the evaluation of neuroimaging results. In particular, they offer the advantage that they can be used on any data, whereas ROC measures are generally limited to simulation studies. Given the limited extent to which simulated data represent real fMRI data, the performance metrics may be a valuable alternative.
References
[1] Strother, S.C. et al. (2002). Neuroimage. 15:747-771
[2] LaConte, S. et al. (2003). Neuroimage. 18:10-27