The preprocessed connectomes project repository of manually corrected skull-stripped T1-weighted anatomical MRI data

Background Skull-stripping is the procedure of removing non-brain tissue from anatomical MRI data. This procedure can be useful for calculating brain volume and for improving the quality of other image processing steps. Developing new skull-stripping algorithms and evaluating their performance requires gold standard data from a variety of different scanners and acquisition methods. We complement existing repositories with manually corrected brain masks for 125 T1-weighted anatomical scans from the Nathan Kline Institute Enhanced Rockland Sample Neurofeedback Study. Findings Skull-stripped images were obtained using a semi-automated procedure that involved skull-stripping the data using the brain extraction based on nonlocal segmentation technique (BEaST) software, and manually correcting the worst results. Corrected brain masks were added into the BEaST library and the procedure was repeated until acceptable brain masks were available for all images. In total, 85 of the skull-stripped images were hand-edited and 40 were deemed to not need editing. The results are brain masks for the 125 images along with a BEaST library for automatically skull-stripping other data. Conclusion Skull-stripped anatomical images from the Neurofeedback sample are available for download from the Preprocessed Connectomes Project. The resulting brain masks can be used by researchers to improve preprocessing of the Neurofeedback data, as training and testing data for developing new skull-stripping algorithms, and for evaluating the impact on other aspects of MRI preprocessing. We have illustrated the utility of these data as a reference for comparing various automatic methods and evaluated the performance of the newly created library on independent data.

the process of brain extraction. The manual creation and correction of brain masks is tedious, time-consuming, and susceptible to experimenter bias. On the other hand, fully automated brain extraction is not a simple image segmentation problem. Brains differ in orientation and morphology, especially pediatric, geriatric, and pathological brains. In addition, non-brain tissue may resemble brain in terms of voxel intensity. Differences in MRI scanner, acquisition sequence, and scan parameters can also have an effect on automated algorithms due to differences in image contrast, quality, and orientation. Image segmentation techniques with low computational time, high accuracy, and high flexibility are extremely desirable.
Developing new automated skull-stripping methods, and comparing these with existing methods, requires large quantities of gold standard skull-stripped data acquired from a variety of scanners using a variety of sequences and parameters. This is due to the variation in performance of algorithms using different MRI data. Repositories containing gold standard skull-stripped data already exist: the Alzheimer's Disease Neuroimaging Initiative (ADNI) [1]; BrainWeb: Simulated Brain Database (SBD) [2]; the Internet Brain Segmentation Repository (IBSR) at the Center for Morphometric Analysis [3]; the LONI Probabilistic Brain Atlas (LPBA40) at the UCLA Laboratory of Neuro Imaging [4]; and the Open Access Series of Imaging Studies (OASIS) [5], the last of which is not manually delineated but has been used as gold standard data [6,7]. We extend and complement these existing repositories by releasing manually corrected skull strips for 125 individuals from the Nathan Kline Institute (NKI) Enhanced Rockland Sample Neurofeedback Study (NFB). These are the first 125 participants who finished the entire 3-day protocol, consented to have their data shared, and were not excluded from data sharing for having an incidental finding during neuroradiological review.

Data acquisition
The repository was constructed from defaced and anonymized anatomical data downloaded from the NFB [8]. The NFB is a 3-visit study that involves a deep phenotypic assessment on the first and second visits, a 1-h connectomic MRI scan on the second visit, and a 1-h neurofeedback scan on the last visit. Up to 3 months may have passed between the first and last visits. The 125 participants included 77 females and 48 males in the 21-45 age range (average: 31, standard deviation: 6.6).
Consistent with the the Research Domain Criteria (RDoC) [9], the goal of the NFB study is to examine default network regulation across a range of clinical and subclinical psychiatric symptoms. To preserve this variance, while being representative of the general population, a community-ascertained sample was recruited with minimally restrictive psychiatric exclusion criteria [8]. Only the most severe illnesses were screened out, excluding those who were unable to comply with instructions, tolerate the MRI, and participate in the extensive phenotyping protocol. As a result, 66 of the participants had one or more current or past psychiatric diagnosis as determined by the structured clinical interview for the DSM-IV (SCID) [10] (see Table 1). No brain abnormalities or incidental findings were present in the images, as determined by a board-certified neuroradiologist. None of the participants had any other major medical condition such as cancer or AIDS.
Anatomical MRI data from the third visit of the NFB protocol were used to build the Neurofeedback Skullstripped (NFBS) repository. MRI data were collected on Anatomical data were acquired immediately after a fast localizer scan and preceded the collection of a variety of other scans [13], whose description is beyond the scope of this report.

Brain mask definition
Many researchers differ on the standard for what to include and exclude from the brain. Some brain extraction methods, such as brainwash, include the dura mater in the brain mask to use as a reference for measurements [14]. The standard we used was adapted from Eskildsen et al. (2012) [15]. Non-brain tissue is defined as skin, skull, eyes, dura mater, external blood vessels and nerves (e.g., optic chiasm, superior sagittal sinus, and transverse sinus). Cerebrum, cerebellum, brainstem, and internal vessels and arteries are included in the brain, along with cerebrospinal fluid (CSF) in ventricles, internal cisterns, and deep sulci.

NFBS repository construction
The BEaST method (brain extraction based on nonlocal segmentation technique) was used to initially skull-strip the 125 anatomical T1-weighted images [15]. This software uses a patch-based label fusion method that labels each voxel in the brain boundary volume by comparing it to similar locations in a library of segmented priors. The segmentation technique also incorporates a multiresolution framework in order to reduce computational time. The version of BEaST used was 1.15.00 and our implementation was based on a shell script written by Qingyang Li [16]. The standard parameters were used in the configuration files and beast-library-1.1 (which contains data from 10 young individuals) was used for the initial skull-strip of the data. Before running mincbeast, the main segmentation script of BEaST, the anatomical images were normalized using the beast_normalize script. mincbeast was run using the probability filter setting, which smoothed the manual edits, and the fill setting, which filled any holes in the masks. The failure rate for masks using BEaST was similar to that of the published rate of approximately 29 % [15]. Visual inspection of these initial skull-stripped images indicated whether additional edits were necessary. Manual edits were performed using the Freeview visualization tool from the FreeSurfer software package [17]. The anatomical image was loaded as a track volume and the brain mask was loaded as a volume. The voxel edit mode was then used to include or exclude voxels in the mask. As previously mentioned, all exterior non-brain tissue was removed from the head image, specifically the skull, scalp, fat, muscle, dura mater, and external blood vessels and nerves (see Fig. 1). Time spent editing each mask ranged from 1-8 h, depending on the quality of the anatomical image and the BEaST mask. Afterwards, Fig. 1 Manual Editing. Axial and coronal slices in the AFNI viewer of the brain mask and image pair, before and after manual editing in Freeview. The anatomical image was loaded into the viewer as a grayscale image. The mask, which can be seen in a transparent red, was loaded as an overlay image manually edited masks were used create a NFB specific prior library for BEaST. This iterative bootstrapping technique was repeated until approximately 85 of the datasets were manually edited and all skull-strips were considered acceptable.
For each of the 125 subjects, the repository contains the de-faced and anonymized anatomical T1-weighted image, skull-stripped brain image, and brain mask. Each of these are in compressed NIfTI file format (.nii.gz). The size of the entire data set is around 1.9 GB. The BEaST library created using these images is also available.

Data validation
The semi-automated skull-stripping procedure was repeated until all brain masks were determined to be acceptable by two raters (BP and ET). Once this was completed, the brain masks were used as gold standard data for comparing different automated skull-stripping algorithms. Additionally, we evaluated the performance of the newly created BEaST library by comparing it to other skull-stripping methods on data from the IBSR [3] and the LPBA40 [4].
• BET is an algorithm incorporated in the FSL software that is based on a deformable model of the surface of the brain [23]. First, an intensity histogram is used to find the center of gravity of the head. Then a tessellated sphere is initialized around the center of gravity and expanded by locally adaptive forces. The method can also incorporate T2-weighted images to isolate the inner and outer skull and scalp. The bias field and neck setting (bet -B) was used since the anatomical images contained the subjects' necks. The version of FSL used was 5.0.7. • 3dSkullStrip is a modified version of BET that is incorporated in the AFNI toolkit [24]. The algorithm begins by preprocessing the image to correct for spatial variations in image intensity and repositioning the brain to roughly the center of the image. Then a modified algorithm based on BET is used to expand a mesh sphere until it envelops the entire brain surface. Among the modifications are procedures to avoid the eyes and ventricles and operations to avoid cutting into the brain. The version of the AFNI toolkit used was AFNI_2011_12_21_1014. • HWA is a hybrid technique that uses a watershed algorithm in combination with a deformable surface algorithm [25]. The watershed algorithm is first used to create an initial mask under the assumption of the connectivity of white matter. Then a deformable surface model is used to incorporate geometric constraints into the mask. The version of FreeSurfer used was 5.3.0.

Data analysis
To illustrate the use of the NFBS as testing data, it was used to compare the performance of BET, 3dSkullStrip and HWA for automatically skull-stripping the original NFB data. In a second analysis we compared the performance of the NFBS BEaST library to the default BEaST library and the three aforementioned methods. Each of the methods was used to skull-strip data from the IBSR (version 2.0) and LPBA40 [3,4]. To ensure consistent image orientation across methods and datasets, they were all converted to LPI orientation 1 using AFNI's 3dresample program [24]. Additionally, a step function was applied to all of the outputs using AFNI's 3dcalc tool to binarize all of the generated masks. The performance of the various methods was compared using the Dice similarity [26] between the mask generated for an image and its corresponding reference ('gold standard') mask. Dice was calculated using: D = 2 · |A ∩ B|/(|A| + |B|), where A is the set of voxels in the test mask, B is the set of voxels in the gold standard data mask, A∩B is the intersection of A and B, and |·| is the number of voxels in a set. Dice was implemented in custom Python scripts that used the NiBabel neuroimaging package [27] for data input. Dice coefficients were subsequently graphed as box plots using the ggplot2 package [28] for the R statistical computing language [29]. Figure 2 displays box plots of the Dice coefficients that result from using NFBS as gold standard data. The results indicate that 3dSkullStrip performed significantly better than the two alternative methods, with HWA coming in second. In particular, average Dice similarity coefficients were 0.893 ± 0.027 for BET, 0.949 ± 0.009 for 3dSkull-Strip, and 0.900 ± 0.011 for HWA. It is perhaps worth noting that BET, the method that performed worst on the NFBS library, took substantially more time to run (25 min) compared to 3dSkullStrip (2 min) and HWA (1 min).

Results
Switching now from using NFBS as the repository of gold standard skull-stripped images to using the IBSR and LPBA40 repositories as the source of gold standard images, Fig. 3 shows box plots of the Dice similarity coefficients for BET, 3dSkullStrip, HWA, BEaST using beast-library-1.1, and BEaST using NFBS as the library of priors. For IBSR, 3dSkullStrip performs better than BET and HWA, similarly to NFBS. However, for LPBA40, BET performs much better than the other two algorithms. The Fig. 2 Comparison of methods on NFBS. Boxplots of Dice coefficients measuring the similarity between masks generated from each image using BET, 3dSkullStrip, HWA, and the image's corresponding reference brain masks BEaST method was also applied to the anatomical data in these repositories using two different methods: first with the original beast-library-1.1 set as the prior library, and second with the entire NFBS set as the prior library.
For the BEaST method, using NFBS as the prior library resulted in higher average Dice similarity coefficients and smaller standard deviations 2 . Differences in Dice coefficients between datasets may be due the size and quality of the NFB study, as well as the pathology and age of the participants. In particular, the NFBS library of priors reflects a much wider range of individuals than does beast-library-1.1, which only contains 10 young individuals. There also may be differences in the standard of the masks, such as length of brainstem and inclusion of exterior nerves and sinuses.
Placing our results in the context of other skull-stripping comparisons, differences between the Dice coefficients reported here and values already published in the literature may be due to the version and implementation of the skull stripping algorithms, a possibility that has received support in the literature [6]. These differences may also result from our application of AFNI's 3dcalc step function to the skull-stripped images in order to get a value determined more by brain tissue and less influences by CSF. As the NFBS dataset is freely accessible by members of the neuroimaging community, these possibilities may be investigated by the interested researcher.

Importance for the neuroimaging community
In summary, we have created and shared the NFBS repository of high quality, skull-stripped T1-weighted anatomical images that is notable for its quality, its heterogeneity, and its ease of access. The procedure used to populate the repository combined the automated, stateof-the-art BEaST algorithm with meticulous hand editing to correct any residual brain extraction errors noticed on visual inspection. The manually corrected brain masks will be a valuable resource for improving the quality of preprocessing obtainable on the NFB data. The corresponding BEaST library will improve skull-stripping of future NFB releases and may outperform the default beast-library-1.1 on other datasets (see Fig. 3). Additionally, the corrected brain masks may be used as gold standards for comparing alternative brain extraction algorithms, as was illustrated in our preliminary analysis (see Fig. 2).
The NFBS repository is larger and more heterogeneous than many comparable datasets. It contains 125 skullstripped images, is composed of images from individuals with ages ranging from 21-45, and represents individuals diagnosed with a wide range of psychiatric disorders (see Table 1). This variation is a crucial feature of NFBS, as it accounts for more than the average brain. Ultimately, this variation may prove useful for researchers interested in developing and evaluating predictive machine learning algorithms on both normal populations and those with brain disorders [30].
Finally, the repository is completely open to the neuroscience community. NFBS contains no sensitive personal health information, so researchers interested in using it may do so without submitting an application or signing a data usage agreement. This is in contrast to datasets such as the one collected by the Alzheimer's Disease Neuroimaging Initiative (ADNI) [1]. Researchers can use ADNI to develop and test skull-stripping algorithms [21], but in order to do so must first apply and sign a data usage agreement, which bars them from distributing the results of their efforts. Thus, we feel that NFBS has the potential to accelerate the pace of discovery in the field, a view that resonates with perspectives on the importance of making neuroimaging repositories easy to access and easy to use [31]. 1 This refers to the manner in which the 3D image data are saved in the file. With LPI orientation, the voxel at memory location (0,0,0) is located at the leftmost, posterior, inferior voxel in the image. As the indices increase, they scan the voxels from left-to-right, along lines that advance from posterior-to-anterior, and planes that advance from inferior-to-superior. Additional details concerning the orientation of MRI images are available online [32]. 2 BEaST was unable to segment 1 subject, IBSR_11, in IBSR, only when using beast-library-1.1. For LPBA40, BEaST was also unable to segment 1 subject, S35, when using beast-library-1.1 and NFBS. These subjects were left out of the Dice calculations.