Neuroconductor: an R platform for medical imaging analysis

Summary

Neuroconductor (https://neuroconductor.org) is an open-source platform for rapid testing and dissemination of reproducible computational imaging software. The goals of the project are to: (i) provide a centralized repository of R software dedicated to image analysis, (ii) disseminate software updates quickly, (iii) train a large, diverse community of scientists using detailed tutorials and short courses, (iv) increase software quality via automatic and manual quality controls, and (v) promote reproducibility of image data analysis. Based on the programming language R (https://www.r-project.org/), Neuroconductor starts with 51 inter-operable packages that cover multiple areas of imaging including visualization, data processing and storage, and statistical inference. Neuroconductor accepts new R package submissions, which are subject to a formal review and continuous automated testing. We provide a description of the purpose of Neuroconductor and the user and developer experience.

Introduction

Medical imaging analysis software is heterogeneous, complex, and difficult to use in fully reproducible analysis pipelines. These problems have been accentuated by the diversity of new imaging data sets and associated scientific problems. Indeed, many studies now collect data on thousands of subjects, at multiple visits, and using different modalities. Storing, understanding, and analyzing such data is daunting. Neuroconductor provides the infrastructure for using, improving, and designing open-source, scripted software that depends on a minimum number of software platforms and is dedicated to improving the correctness, reproducibility, and speed of medical image data analysis. To achieve this, Neuroconductor interweaves pre- and post-processing image analysis and provides integrated data-analytic approaches.
Neuroconductor provides data, methods, and software packages designed to support the analysis of populations of images in R (R Core Team, 2017), a programming language with state-of-the-art statistical analysis tools and a vectorized data paradigm that is well suited for neuroimaging (Tabelow and others, 2011). Neuroconductor supports many types of imaging data including magnetic resonance imaging (MRI: structural, functional, and dynamic), computed tomography (CT), single-photon and positron emission computed tomography (SPECT and PET), electroencephalography (EEG), and magnetoencephalography (MEG). It is currently focused on human imaging, especially of the brain, but it also supports other biological imaging, such as lung imaging and MR spectroscopy. Neuroconductor is able to interface with mature R distribution platforms, such as Bioconductor (Gentleman and others, 2004; Huber and others, 2015) and the Comprehensive R Archive Network (CRAN) (Hornik, 2016; R Core Team, 2017), as well as other R-based imaging projects such as TractoR (Clayden and others, 2011). Although these other distribution platforms are well established, we believe that Neuroconductor enables better support and testing for imaging-specific packages, and provides a focal point for imaging-specific training materials, similar to how Bioconductor supports bioinformatics packages.

NiPype (http://nipy.org/packages/nipype/index.html) is a Python project with a goal of standardizing neuroimaging software syntax, similar to Neuroconductor. NiPype is incorporated in the larger NiPy project (http://nipy.org/), which provides tools for neuroimaging analysis in Python on multiple neuroimaging modalities. We believe Neuroconductor is in many ways the R analog to NiPy, with the hope of including additional imaging problems and platforms.
Although the utility of each of these projects may be similar, we do not believe Neuroconductor competes with NiPy; rather, it gives R users capabilities similar to those available to Python users. Moreover, with R-Python interfaces such as rPython (Bellosta, 2015), PythonInR (Schwendinger, 2015), and reticulate (Allaire and others, 2017) in R, and the rpy2 (https://pypi.python.org/pypi/rpy2) and pyRserve (https://pypi.python.org/pypi/pyRserve/) modules in Python, users can combine these two efforts.

Neuroconductor may be used seamlessly for data fusion analyses where imaging, genomics, and other types of high-throughput data are analyzed together with traditional measurements and health outcomes. In R, these analyses are integrated with well-designed and thoroughly tested analytic packages for visualization, statistical inference, longitudinal and survival analysis, regression, network analysis, and machine learning.

The user perspective

Neuroconductor users interact with the platform via the Neuroconductor website (https://neuroconductor.org). Users may install an R package, explore packages, or identify a workflow designed for their specific problem. If the workflow does not exist in Neuroconductor, then users are able to create and submit their own. To download a package, a user needs to know the package name, which is obtained from the list of packages on the Neuroconductor webpage (https://neuroconductor.org/list-packages/all). For example, to download the fslr and neurohcp packages in R, the command line instructions are:

Many Neuroconductor packages contain help manuals and documentation as vignettes, which developers are strongly encouraged to provide. Large areas of research covering multiple package combinations are described under Help. Some of these areas contain tutorials and Massive Open Online Courses (MOOCs) that provide a rapid introduction to more complex concepts and workflows.
Examples of such courses are Principles of fMRI (https://www.coursera.org/learn/functional-mri) and Introduction to Neurohacking in R (https://www.coursera.org/learn/neurohacking). If the user does not find what they are looking for, they can request a tutorial on a specific topic and developers can also submit their own tutorial or course to Neuroconductor. Furthermore, the user can turn to the Support forum (https://neuroconductor.org/forum), can attend short courses, take free online MOOCs, or read tutorials (https://neuroconductor.org/neuroc-help). All package developers are users and some users become package developers. The user may have their own data, use a data package from Neuroconductor, or use a Neuroconductor package to download data from an internet repository. After obtaining data, a first step is to manipulate, visualize, and ensure data quality. Most quality control procedures can be performed at the image or subject level, though the distribution of quality metrics in the population can be used as well; see Mejia and others (2015) for an example. Once data quality has been assessed, an analytic database of subjects for subsequent analyses may be created. A major advantage of the R environment is that analyses can take advantage of a large collection of well-tested, state-of-the-art statistical packages. For example, incorporating demographic information, regression, mixed-effects models, or survival analysis is straightforward in R. Moreover, the voxel package implements most of these modeling methods at the voxel level directly on images (Garcia de la Garza and others, 2016). Results containing spatial information are mapped back to an image, while other results may be displayed using plotting packages such as ggplot2 (Wickham, 2009). All of these steps can be wrapped in a series of reproducible reports using knitr and rmarkdown (Allaire and others, 2015; Xie, 2013). 
Another advantage of R is that complex processing steps and complete analyses can also be wrapped into R packages. For example, the ichseg (Muschelli, 2016b; Muschelli and others, 2016) R package is designed for intra-cranial hemorrhage segmentation and uses multiple functions from other Neuroconductor packages. It uses a single CT scan as input, processes the data using FSL (Smith and others, 2004), called through fslr in R (Muschelli and others, 2015), and creates a segmentation of the hemorrhagic stroke based on a pre-trained supervised machine learning algorithm.

Last, data products resulting from analysis and package development may be deployed as Shiny applications (Chang and others, 2015), which are web applications built in R and accessed via a browser. Shiny applications are hosted on a user’s custom-launched or commercial Shiny server (such as https://www.shinyapps.io). As an example, the hemorrhagic stroke segmentation method described above has been implemented in a Shiny application (http://johnmuschelli.com/ich_segment.html). Below we provide several case studies to build up the intuition for interfacing with Neuroconductor.

Case study 1: Processing of structural MRI

Figure 1 provides an example of a typical processing pipeline for multi-sequence structural MRI data. Analysis of such data often begins with a conversion of Digital Imaging and Communications in Medicine (DICOM) files, which are essentially binary formatted pictures with header information. DICOM is one of the most common formats for imaging data that come directly from the imaging device. DICOM files are commonly converted to the much simpler Neuroimaging Informatics Technology Initiative (NIfTI) format, which represents imaging data as a multi-dimensional array instead of a collection of single two-dimensional (2D) slices (Li and others, 2016).
This process may be performed in R using the dcm2niir package (Muschelli, 2016a), which calls dcm2niix (https://github.com/rordenlab/dcm2niix) from the command line; the divest package (Clayden and Rorden, 2017), which incorporates dcm2niix and provides an in-memory bridge to it; or the oro.dicom (Whitcher and others, 2011b) package, which uses completely native R code. The resulting NIfTI files may be converted to array-based S4 objects using the oro.nifti (Whitcher and others, 2011b) package or objects based on C++ pointers using the ANTsR (Avants and others, 2011a) or RNifti (Clayden and others, 2016) packages. These packages also provide the ability to read other medical imaging formats, such as ANALYZE, AFNI, NRRD or any other format supported by the Insight Toolkit (ITK) (Yoo and others, 2002).

Fig. 1. Typical processing pipeline for multi-sequence structural MRI data.

After conversion, image intensity inhomogeneity correction is applied to ensure that each tissue has similar intensity distributions across locations in the brain (e.g. top versus bottom of the brain). Image inhomogeneities are typically caused by magnetic field inhomogeneities and may be handled by multiple methods. Neuroconductor currently contains four such methods implemented in FSL via fslr (Zhang and others, 2001), freesurfer (Sled and others, 1998), and ANTsR (N3 and N4 correction) (Sled and others, 1998; Tustison and others, 2010).

The next step is co-registration, which spatially aligns all images in the sequence to one of the images in the sequence, usually a T1-weighted (T1w) image. Co-registration may be performed using flirt (fslr), antsRegistration (ANTsR), niftyreg (RNiftyReg) (Clayden and others, 2017), or dramms (drammsr) (Ou and others, 2011). Co-registration is followed by brain extraction, also known as skull stripping.
This procedure is commonly performed on the T1w image and may be implemented using spm12r (Penny and others, 2011; Muschelli, 2016), fslr, freesurfer, or ANTsR. At this point, the image sequences for the subject are in the same space, and extracranial tissues have been removed.

Brain extraction is followed by intensity normalization, which transforms the arbitrary MRI units into interpretable units across subjects. This is not always thought of as a standard pre-processing step, but it is important in many applications. For example, one could be interested in subtracting two images that were collected longitudinally in order to identify changes or one may want to investigate changes in voxel or region-of-interest (ROI) intensities over time. This is typically achieved using z-scoring with respect to a particular tissue class as implemented in the WhiteStripe method (WhiteStripe) (Shinohara and others, 2014), standard and robust z-scoring relative to whole brain (neurobase) (Muschelli, 2016c), histogram matching (implemented in RAVEL), or removal of unwanted variation (RAVEL) (Fortin and others, 2016c).

Further subject-specific and population analyses may require registration to a population-level template and/or tissue class segmentation. Both of these approaches are possible using Neuroconductor packages including spm12r, fslr, and ANTsR. While the flowchart in Figure 1 provides a conceptual pre-processing pipeline, deploying an explicit and reproducible pipeline requires specific choices at every step that may depend on multiple tuning parameters. In R one can be explicit about these choices, provide a software suite of packages and tuning parameters, and quickly compare results based on different combinations of software and platforms.

Case study 2: A cross-package workflow for diffusion tensor imaging

In this section, we discuss an example of a complete analysis (preprocessing and statistical analysis) performed entirely within Neuroconductor.
We start with a simple question: in a population of healthy subjects, is there any difference in the white matter (WM) microstructure between males and females? Diffusion tensor imaging (DTI) has been used extensively to study WM fiber structure by taking advantage of the differential water diffusivity in the WM tracts relative to other brain structures (Basser and others, 1994; Le Bihan and others, 2001). Fractional anisotropy (FA) and mean diffusivity (MD) are two scalar maps commonly derived from DTI images to study the diffusivity properties of the brain. FA measures the degree of directional diffusivity in a voxel and MD measures the total diffusion within a voxel (Koay and others, 2006).

To investigate potential gender differences in WM fiber tracts, we use DTI data from healthy young adults in the Human Connectome Project (HCP), available at http://db.humanconnectome.org. HCP includes a large cohort of individuals ($$N\,{>}\,1200$$) with a vast amount of neuroimaging data, including structural magnetic resonance imaging (sMRI), task and resting-state functional MRI (fMRI), and diffusion tensor imaging (DTI).

Step 1: Downloading data using the neurohcp package

The first step is to download the minimally preprocessed DTI data (Glasser and others, 2013) and structural T1w MRI images, available for 781 subjects (436 females and 345 males), using the neurohcp (Van Essen and others, 2013) package. The neurohcp package is an R interface for downloading data from the HCP database, which is publicly available through Amazon Web Services (AWS). A description of how to connect to the HCP database via AWS is available at the Human Connectome Wiki. After accepting the data use terms, one needs to obtain AWS credentials, which include an access key identifier and a secret key. The neurohcp package in R accesses the data from an Amazon Simple Storage Service (S3) bucket using these two AWS access keys.
We can set these keys using set_aws_api_key:

After the access keys are set, neurohcp can download data from the S3 bucket. For instance, the complete diffusion data directory for subject 100307 may be downloaded using the download_hcp_dir function in the neurohcp package:

The result is an R list containing the file names of all downloaded files, the directory where the files were downloaded, and the http request that was sent to the Amazon S3 bucket. Data from other subjects are downloaded similarly. The demographic data, which include age and gender, are not located on the S3 bucket and must be downloaded directly from the website.

Step 2: Processing DTI data with rcamino

After the minimally processed DTI data are downloaded, processing continues using the Neuroconductor package rcamino (Muschelli, 2016), an R interface for the open-source DTI software Camino (Cook and others, 2006). The package creates and fits the diffusion tensor models, and generates the FA and MD maps. We illustrate below the associated R code.

Note that Camino requires the b-values and b-vectors of the DTI to conduct the DTI model fit. The b-values are the amount of diffusion weighting used for each volume. The b-vectors are the gradients of the magnetic field that imply direction of flow; for more details see O’Donnell and Westin (2011). The process is then repeated for all subjects with available DTI data and may be sped up via parallel computing. Now, we have an FA and MD image for each subject.

Step 3: Nonlinear registration to template

The next step is to prepare the data for voxel-wise analysis, which requires the FA and MD maps to be spatially registered to a common template. For each subject, we use the download_hcp_file function from the neurohcp package to download the T1w image with extra-cranial voxels removed.
We then use the symmetric diffeomorphic non-linear registration implemented in ANTsR, wrapped in extrantsr (Avants and others), to register FA and MD maps to the 1 mm isotropic Eve template T1w image (Oishi and others, 2009). The Eve template is a single-subject template created by the Laboratory of Brain Anatomical MRI led by Professor Susumu Mori at Johns Hopkins University (Oishi and others, 2009). The Eve template is made available in the Neuroconductor data package EveTemplate (Fortin and others, 2016a). Alternatively, one could register images to the MNI template (Grabner and others, 2006), which is available in the Neuroconductor data package MNITemplate (Fortin and others, 2016b). Each registered DTI map is saved as a standard NIfTI file for further analyses. The R code is presented below.

The results of these processing steps are images containing the FA and MD maps registered to the Eve atlas for every subject. While at this point we do not conduct ROI analyses, the template may be used to extract subject-specific ROIs. An ROI may be an anatomical region, a region obtained from other analyses, or a manually delineated region. Here, we focus on voxel-wise analysis across subjects, which can be performed because images are registered to a common template space. Although we do not present any measures of quality control or registration accuracy, users should inspect registration and image quality using either automated methods or visual inspection.

Step 4: Statistical analysis

The next step is the analysis of the population of images using statistical inference. Neuroimaging convention refers to this step as “statistical analysis,” a convention we adopt here, despite the fact that earlier steps involve a substantial amount of statistical operations. We first read the registered NIfTI files into R and create a matrix of voxel intensities with voxels as rows, and subjects as columns. For the analysis, we only consider voxels in the WM and GM.
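The matrix construction just described can be sketched in base R. This is only an illustration under stated assumptions: the simulated arrays stand in for registered NIfTI images, `mask` stands in for a GM and WM mask, and the helper `images_to_matrix` is a hypothetical name, not the neurobase implementation.

```r
# Hypothetical stand-ins for registered images: one small 3D array per subject
set.seed(1)
dims <- c(4, 4, 3)
n_subjects <- 5
images <- lapply(seq_len(n_subjects), function(i) {
  array(rnorm(prod(dims)), dim = dims)
})

# Hypothetical tissue mask: TRUE marks the voxels to keep (e.g. GM and WM)
mask <- array(FALSE, dim = dims)
mask[2:3, 2:3, ] <- TRUE

# Stack the masked voxels: one row per voxel, one column per subject
images_to_matrix <- function(images, mask) {
  vapply(images, function(img) img[mask], numeric(sum(mask)))
}

Y <- images_to_matrix(images, mask)
dim(Y)  # masked voxels by subjects

# A voxel-level summary can be written back into image space via the same mask
stat_img <- array(NA_real_, dim = dims)
stat_img[mask] <- rowMeans(Y)
```

Because the same logical mask is used to extract and to re-insert values, any statistic computed per row of `Y` lands at its original voxel location.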
We create one matrix for the FA maps, and one for the MD maps, using the function images2matrix from the package neurobase. The registered-to-Eve images for all subjects are located in the lists files.fa and files.md for FA and MD maps, respectively. For brevity, we present only the analysis for the MD maps. In this example, the matrix (Y.md) has $$1\,372\,619$$ rows (number of GM and WM voxels in the Eve template) and $$781$$ columns (number of subjects in the data set):

There are many options in R to quantify the association of the FA and MD intensities with gender. The simplest approach is to calculate mass-univariate two-sample t-statistics (t-statistics computed at each voxel separately), which can be quickly computed using limma (Smyth, 2004, 2005; Ritchie and others, 2015), a popular R package from the Bioconductor project (https://www.bioconductor.org/). The package was originally developed for the analysis of high-throughput genomics data, but much of its functionality may be used in neuroimaging applications without additional effort. Among other methods, the limma package implements empirical Bayes (EB) methods to estimate t-statistics based on variance shrinkage (called moderated t-statistics). Below, we present the code to compute the moderated t-statistics for the association between gender and MD, adjusted for age.

The code produces t-statistics comparing males versus females in the Eve-template space. The gender value is coded so that negative values of the t-statistic correspond to higher values of MD in females. Below, we investigate where the largest differences are located.

Step 5: Visualization and localization of the results

A considerable advantage of the close integration between pre- and post-processing tools is that results can be easily mapped back into the native or template space. This helps localize significant associations using template labels.
For example, the voxels that exhibit the largest differences between males and females are identified using the Eve white matter parcellation map (WMPM) (Oishi and others, 2009), included in the package EveTemplate. Below, we provide the R code for localization of these results.

A quick inspection of the top six voxels reveals that the most pronounced differences are located in the hippocampus and thalamus, where females have higher mean diffusivity compared to males. However, it may be useful to locate these findings on the template and visually study the degree of spatial clustering. The following commands produce the three panels of Figure 2, using the function ortho2 from the neurobase package:

Fig. 2. Visualization of the diffusion tensor imaging (DTI) analysis results in R. (a) Visualization of the anatomical structures in template space (Eve template, 1-mm isotropic T1-weighted modality) in coronal, sagittal and axial planes (coordinates $$x=77$$, $$y=114$$ and $$z=84$$). (b) T-statistics characterizing the differences between males and females in the mean diffusivity (MD) values for grey matter (GM) and white matter (WM) voxels, in template space. The DTI analysis was performed using data from the Human Connectome Project (HCP). Blue and red regions represent higher and lower values of MD in males, respectively. Central areas of the WM show lower MD values in males, such as the area marked by the cross-hairs, whereas areas of higher MD values are located near the cortical GM and occipital horns of the lateral ventricles. (c) Visualization of the Eve atlas white matter parcellation map (WMPM) with selected structures.

Figure 2 displays the T1w Eve template in panel (a), the map of t-statistics for gender differences in MD values in this population in panel (b), and the annotated neuroanatomical structures of the hippocampus, thalamus, and caudate nucleus in panel (c). For panel (b), blue represents higher values of MD in males and red represents higher values of MD in females. Much more detailed information can be obtained and quantified from these results, including the percentage of voxels in the thalamus with a t-statistic passing a particular threshold (e.g. the Bonferroni correction) or the difference in the number of voxels with higher MD for females versus males in the right hippocampus, adjusting for age.

Advanced data analysis and visualization

An advantage of the R environment is that more sophisticated voxel-level analyses may be easily implemented. For example, one may want to investigate whether there are additional confounders for the association between FA and gender, or whether image intensities predict health outcomes. The lm and glm functions in R are designed specifically to address such questions.
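As a minimal sketch of such a voxel-level analysis, the base R code below fits an ordinary least squares model at each voxel with gender as the variable of interest and age as a confounder. All data are simulated and the object names (`Y`, `gender`, `age`, `voxel_t`) are illustrative assumptions. These are ordinary t-statistics from lm, not the moderated t-statistics computed by limma.

```r
# Simulated data: rows are voxels, columns are subjects (illustrative only)
set.seed(42)
n_voxels <- 20
n_subjects <- 60
gender <- factor(rep(c("F", "M"), each = n_subjects / 2))
age <- runif(n_subjects, min = 22, max = 36)

Y <- matrix(rnorm(n_voxels * n_subjects), nrow = n_voxels)
# Give voxel 1 an artificial gender effect so there is something to detect
Y[1, gender == "M"] <- Y[1, gender == "M"] + 3

# Fit one linear model per voxel and keep the t-statistic for the gender term
voxel_t <- apply(Y, 1, function(y) {
  fit <- lm(y ~ gender + age)
  coef(summary(fit))["genderM", "t value"]
})

# Index of the voxel with the largest absolute t-statistic
which.max(abs(voxel_t))
```

Looping with apply is slow for millions of voxels but keeps the model explicit. Additional confounders enter simply as extra terms in the formula, and replacing lm with glm gives the corresponding generalized linear model.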
If images are observed longitudinally or they are used as baseline predictors in longitudinal studies, one could use the gee (Carey, 2015) and lme4 (Bates and others, 2015) packages for generalized estimating equations and mixed effects models, respectively. If one is interested in modeling survival time based on baseline images, then the survival package (Therneau, 2015; Therneau and Grambsch, 2000) may be used.

More advanced visualization of data and statistical results can be done using the package papayar, which is an R wrapper around Papaya (https://www.nitrc.org/projects/papaya/), a JavaScript medical research image viewer. The brainR package in Neuroconductor is an R package for 3D and 4D visualization based on rgl (Adler and others, 2016).

Harmonization of multi-site neuroimaging data

An increasingly common strategy in neuroimaging is to combine multi-site imaging data across scanners and protocols. This approach pools results from different sources and may improve statistical power to detect small effects that would be undetectable in separate studies. However, the combination of multi-site data can introduce substantial unwanted variation in the data, due to differences in the scanner characteristics, acquisition parameters, and preprocessing pipelines. This is highly problematic when the level of technical variability is larger than the biological variation of the phenotype of interest. We refer to the process of removing the technical variation in multi-site studies as “harmonization.”

The harmonization problem also exists in genomics, where data across batches often exhibit large technical variability. In genomics, this technical variability is referred to as batch effects. When there is confounding between batch and phenotype, failing to account for batch effects can lead to spurious associations (Leek and others, 2010). Batch effect correction has been under intense methodological development, which has produced widely used R software packages for the removal of batch effects.
Some popular examples of such packages deployed in Bioconductor are ComBat (Johnson and others, 2007), SVA (Leek and Storey, 2007, 2008), RUV (Gagnon-Bartsch and Speed, 2012), and limma (Smyth, 2004, 2005). The Neuroconductor platform allows users to integrate, adapt, and extend these methods to imaging studies. For example, the RUV approach was adapted to structural MRI images and was shown to successfully remove technical variation in multi-site data from the Alzheimer’s Disease Neuroimaging Initiative (ADNI). The method is implemented in the Neuroconductor package RAVEL (Fortin and others, 2016c).

ANTsR: a Neuroconductor package highlight

Neuroconductor packages provide scripting access to large toolkits constructed primarily in other languages. For example, ANTsR (Avants and others, 2011a) is a general purpose biomedical image analysis package built on top of the C++ based Insight Toolkit (ITK) (Yoo and others, 2002), wrapped in the ITKR package, and Advanced Normalization Tools (ANTs) (Avants and others, 2011a). ANTsR leverages Rcpp (Eddelbuettel and others, 2011) to wrap well-validated ANTs methods such as joint label fusion for image labeling based on a library of anatomical templates (Wang and Yushkevich, 2013), Atropos for multi-channel segmentation (Avants and others, 2011b), as well as core methods of computational anatomy (Avants and others, 2014). ANTsR also contributes R-level access to ITK image iterators, multi-channel image classes and transformation objects as well as specialized methods for arterial spin labeling and BOLD-based network analysis via igraph interfaces (Weber and others, 2013).

ANTsR is an example of how Neuroconductor can support innovative applications of machine learning to medical imaging.
ANTsR provides the first freely available implementation of Rotation Invariant Patch-based Multi-Modality Analysis aRChitecture (RIPMMARC) (Kandel and others, 2015), which uses dictionary learning to extract modality-specific information from a multi-modal data set. ANTsR also provides prior-constrained, sparse dimensionality reduction methods, including implementations for both eigenanatomy (Cook and others, 2014) and sparse canonical correlation analysis for neuroimaging (Avants and others, 2014a). Furthermore, ANTsR serves as a platform for disseminating potentially clinically useful tools such as LINDA (Lesion Identification with Neighborhood Data Analysis, https://github.com/dorianps/LINDA) (Pustina and others, 2016), which implements a “convolutional” random forest for T1-based lesion segmentation. We anticipate that such downstream benefits of Neuroconductor will become more common in the future.

Because ANTsR leverages most of the functionality of the ITK library, this package is large on disk. It also requires additional build tools, such as CMake (https://cmake.org/), which are not allowed in other repositories, such as CRAN. However, this is a very important package and provides an example of how Neuroconductor can handle large and complex packages.

External software dependencies

Many standalone pieces of software exist for medical imaging analysis. For example, fslr requires a working installation of FSL (Smith and others, 2004). The commands from FSL are called from the command line through R after checks on the data inputs. The rationale for this approach is that FSL is an updated, well-maintained, and well-documented piece of software that will change with time. Without working extensively with the authors of FSL (or any other external software), porting all of the functionality would be redundant work and would need to be updated with each new release of the software.
Therefore, we believe that requiring an additional system requirement (in the SystemRequirements field of the DESCRIPTION file) for user in some instances is necessary. Other packages, such as ANTsR, dcm2niir, and rcamino, external libraries are either bundled in the package or downloaded and installed at package building. If possible, installation at run-time or build-time is desired to reduce the necessary installation steps for the user. Data packages Neuroconductor hosts data packages that allow users to test software or contain highly relevant data, such as templates. To be posted, data need to be de-identified, the author/maintainer needs to have approval to make the data public, and all user agreements must be respected. Neuroconductor also hosts packages that can access data from public repositories, while respecting the data user agreements. Neuroconductor starts with a series of packages based on a multi-parametric neuroimaging study, which we refer to as Kirby21. Kirby21 contains data on 21 subjects scanned a day apart (Landman and others, 2011) and includes the following modalities: T1w, T2, FLAIR, proton density (PD), DTI, and fMRI. Neuroconductor also contains templates for T1w images: EveTemplate and MNITemplate; for a complete list of data packages see Table 1. Although Neuroconductor interacts with many neuroimaging data platforms, including the Neuroimaging Informatics Tools and Resources Clearinghouse (NITRC), many studies have restrictions on redistribution of data. Table 1. 
Referenced R packages available on Neuroconductor

Package  Description  References

Software packages
neurobase  Base functions for Neuroconductor
oro.nifti  Working with the DICOM and NIfTI Data Standards in R  Whitcher and others (2011b)
oro.dicom  Working with the DICOM and NIfTI Data Standards in R  Whitcher and others (2011b)
oro.asl  oro.asl: Rigorous - Arterial Spin Labelling
oro.pet  oro.pet: Rigorous - Positron Emission Tomography
ANTsR  Advanced Normalization Tools  Avants and others (2011a)
extrantsr  Extensions for ANTsR
fslr  R package for FSL  Muschelli and others (2015); Smith and others (2004)
freesurfer  R package for FreeSurfer  Fischl (2012)
oasis  OASIS lesion segmentation  Sweeney and others (2013)
WhiteStripe  White Stripe intensity normalization  Shinohara and others (2014)
RAVEL  Statistical analysis of structural MRIs  Fortin and others (2016c)
neurohcp  R interface for the Human Connectome Project database

Data packages
kirby21.t1  Example T1 Structural Data from the Kirby21 Dataset  Landman and others (2011)
kirby21.t2  Example T2 Structural Data from the Kirby21 Dataset  Landman and others (2011)
kirby21.flair  Example FLAIR Structural Data from the Kirby21 Dataset  Landman and others (2011)
kirby21.dti  Example DTI Data from the Kirby21 Dataset  Landman and others (2011)
kirby21.fmri  Example fMRI Data from the Kirby21 Dataset  Landman and others (2011)
kirby21.mt  Example MT Structural Data from the Kirby21 Dataset  Landman and others (2011)
kirby21.vaso  Example VASO Structural Data from the Kirby21 Dataset  Landman and others (2011)
kirby21.asl  Example ASL Structural Data from the Kirby21 Dataset  Landman and others (2011)

Template packages
MNITemplate  MNI152 template  Fortin and others (2016b)
EveTemplate  Eve Atlas and White Matter parcellation map  Fortin and others (2016a); Oishi and others (2009)

Medical imaging and neuroimaging analyses

The analysis of medical imaging data is not restricted to a specific region or organ in the body.
The acquisition of medical imaging data may occur anywhere from head to toe, and the choice of anatomical coverage is driven by the clinical need. While some packages in Neuroconductor have been designed specifically to analyze neuroimaging data, none of the packages that manipulate file formats (DICOM, NIfTI, etc.) or provide visualization is specific to the brain. In addition, some of the analysis packages have already been applied below the neck. For example, the dcemriS4 package provides estimates for the key parameters in T1w dynamic contrast-enhanced MRI (DCE-MRI) experiments for perfusion imaging, which is commonly used to assess tissue affected by cancer or inflammatory processes, for example in the breast or neck (Schmid and others, 2006; Whitcher and others, 2011a). The oro.pet package provides estimates of standard uptake values (SUVs) in PET experiments, where anatomical coverage may vary from the whole body (to assess the primary cancer and/or metastatic disease) to specific anatomical regions, such as the breast, liver, or prostate (Wahl and others, 2009).

The developer perspective

The comprehensive R archive network (CRAN) is the most standardized, popular, and common way to distribute R packages. Neuroconductor is a complementary platform dedicated to developers of R packages for image analysis. Neuroconductor contains extensive imaging-specific training materials and includes packages that do not integrate directly with CRAN or may require additional checks with external dependencies. Although Bioconductor is a framework that provides additional checks and has similar goals to Neuroconductor, Bioconductor packages were developed for bioinformatics. Because one goal of Neuroconductor is to centralize imaging work in R, a separate platform is necessary. Neuroconductor is based on Git, GitHub, and continuous integration (CI) services via Travis CI and AppVeyor. Many developers use Git, a version control system, for their projects.
To distribute, host, and collaborate on projects, many use online tools, with GitHub (https://github.com/) being one of the most popular. Because GitHub is an online service, a package is distributed by downloading or cloning its repository. Users can easily install R packages directly from GitHub using functions from add-on R packages, most notably devtools (Wickham and Chang, 2017). Along with a detailed and revertible timeline for each package, GitHub provides a page that allows users to flag issues for developers, tag stable releases of the package, and obtain information on package activity. Along with GitHub, Travis CI (https://travis-ci.org/) and AppVeyor (https://www.appveyor.com) provide a system to check packages as they are updated. Together, these tools provide a cloud-based infrastructure that can: (i) host and distribute packages; (ii) check packages for specific requirements; (iii) provide a platform for bug reporting and feature requests; and (iv) show how frequently issues are addressed. Travis CI checks packages on Linux and Mac OS X distributions. AppVeyor checks packages on Windows platforms; a small percentage of packages will not be applicable to Windows machines due to the intrinsic nature of the non-R components of the software. Windows 10 currently has a Linux subsystem that may be used in these few exceptions. Therefore, a package submitted to Neuroconductor does not need to pass checks for Windows to be incorporated into the platform. As some software in medical imaging has only been implemented for *nix-based systems, we will allow users to submit Unix-only R packages, but will encourage them to refactor their code if possible to enable all of Neuroconductor to be cross-platform.

To submit a package to Neuroconductor, the author/maintainer of the package provides the GitHub link for the package. Once the package is submitted, several initial checks are conducted (Figure 3). These checks ensure that the package has been created correctly.
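As a rough illustration of what such an initial check could involve, the base R sketch below parses a DESCRIPTION file and compares the submitted version with versions already on the platform, following the steps listed in the caption of Figure 3. It is not Neuroconductor's actual code: the function name check_submission and the `existing` lookup (a named character vector of package versions) are hypothetical, and only the base R functions read.dcf() and package_version() are relied upon.

```r
# Hypothetical sketch of an initial submission check; not Neuroconductor's
# implementation. `existing` maps package names to versions already hosted.
check_submission <- function(pkg_dir, existing = character(0)) {
  desc_file <- file.path(pkg_dir, "DESCRIPTION")
  if (!file.exists(desc_file)) {
    return(list(ok = FALSE, reason = "DESCRIPTION file is missing"))
  }
  # read.dcf() parses the "Field: value" format used by DESCRIPTION files;
  # fields that are absent come back as NA
  desc <- read.dcf(desc_file, fields = c("Package", "Version"))
  pkg <- unname(desc[1, "Package"])
  ver <- unname(desc[1, "Version"])
  if (is.na(pkg) || is.na(ver)) {
    return(list(ok = FALSE, reason = "Package or Version field is missing"))
  }
  # Reject a version that is already present on the platform
  if (pkg %in% names(existing) &&
      package_version(ver) == package_version(existing[[pkg]])) {
    return(list(ok = FALSE, reason = "This version is already present"))
  }
  list(ok = TRUE, package = pkg, version = ver)
}

# Usage with a throwaway package directory
pkg_dir <- tempfile("demo_pkg")
dir.create(pkg_dir)
missing_result <- check_submission(pkg_dir)
writeLines(c("Package: demo", "Version: 0.1.0"),
           file.path(pkg_dir, "DESCRIPTION"))
new_result <- check_submission(pkg_dir)
dup_result <- check_submission(pkg_dir, existing = c(demo = "0.1.0"))
```

Returning a list with a reason, rather than stopping with an error, is a design choice made here so that a submission system could report the outcome back to the maintainer.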
After initial checks are complete, the package must be verified by email. This verification is designed to prevent spam and to allow the developer to withdraw a package if they would like to revise it before re-submitting.

Fig. 3. Neuroconductor initial package submission. If the DESCRIPTION file is present, it is parsed and the submitted version of the package is checked against existing Neuroconductor packages. If this version is not already in Neuroconductor, the email-based verification process is started. Once the maintainer verifies the submission, the package is ready to be tested.

Once the verification is complete, the package is processed according to the workflow described in Figure 4. First, the package is copied/cloned to a remote server. Standardized Travis CI and AppVeyor configuration files, specific to Neuroconductor, are added (see https://neuroconductor.org/package-changes for links to examples). These ensure that the checks performed on these services are consistent for each package. For example, any warning raised when checking the package (using R CMD check) is treated as an error. Some parameters of the package DESCRIPTION file are also changed. These changes ensure that when a package is downloaded from Neuroconductor, the correct versions of the dependent packages are used.

Fig. 4. Neuroconductor package code testing. The package is cloned/updated on the Neuroconductor server and Travis CI/AppVeyor checks are initiated for the original version of the package together with a stable and current version of this package. The DESCRIPTION, travis.yml and appveyor.yml files are updated to reflect the stable/current package status. Once the Travis CI/AppVeyor checks are done, the package listing is updated to reflect the continuous integration test results.

Next, the package is pushed to the central Neuroconductor GitHub (https://github.com/neuroconductor) and submitted to Travis CI and AppVeyor to be built and checked on multiple systems. Parameters are set to ensure that Travis CI and AppVeyor use the correct versions of Neuroconductor packages for checking and that external dependencies are installed. The author of the package receives an automatic email indicating whether the package was built successfully and is integrated with Neuroconductor, together with a description file containing pertinent information about the process. The code coverage, the percentage of the code in the package that is run when the package is checked, is computed using the covr package (Hester, 2017) and the Coveralls.io platform. This coverage is displayed on the Neuroconductor package page.

Stable and current package versions

We use the terms “Stable” and “Current” to distinguish the development status of a Neuroconductor package.
On the initial submission, after all checks are passed, the package is incorporated into Neuroconductor and deemed the Stable version. The Current version of the package is the result of nightly pulls and mirrors the latest package version from the developer’s GitHub repository. This provides Neuroconductor users with a way to use the latest version of a package and, at the same time, provides the Neuroconductor platform with a safe way of checking new versions of a package against the existing set of Current Neuroconductor packages. If a Current version of a package passes all the required Neuroconductor tests, we contact the developer of the package and suggest an official re-submission to Neuroconductor. If the newly re-submitted version of the package passes the checks against the Stable Neuroconductor packages, this version is incorporated into the Stable version of Neuroconductor. The package author can ask for help from the maintainers of Neuroconductor (https://neuroconductor.org/contact-us or email neuroconductor@gmail.com) both for compatibility issues and for R-specific questions. All of these steps are automated after the verification has been completed.

Conclusion

Neuroconductor is a platform for developing and distributing R packages for medical image analysis. Neuroconductor aims to provide similar resources and content to Bioconductor, but is primarily focused on imaging. Neuroconductor can leverage all tools available in CRAN, Bioconductor, and R to provide complete image analytic pipelines. This framework is likely to increase the reliability and reproducibility of medical image analysis software and enable a larger number of users to perform image analyses using state-of-the-art tools. Neuroconductor is currently accepting any packages that fit within this framework, even if they are hosted on other platforms such as CRAN and Bioconductor.
As the platform matures, we hope to centralize these packages in only one repository to reduce inconsistencies that may occur.

Funding

Neuroconductor and its maintainers have been partially supported by the R01 grant NS060910 from the National Institute of Neurological Disorders and Stroke at the National Institutes of Health (NINDS/NIH). We are also grateful to the Department of Biostatistics at Johns Hopkins University for providing support during the early stages of development.

References

Adler D., Murdoch D., Nenadic O., Urbanek S., Chen M., Gebhardt A., Bolker B., Csardi G., Strzelecki A. and Senger A. (2016). rgl: 3D Visualization Using OpenGL. R package version 0.96.0.
Allaire J., Cheng J., Xie Y., McPherson J., Chang W., Allen J., Wickham H. and Hyndman R. (2015). rmarkdown: Dynamic Documents for R. R package version 0.5.
Allaire J. J., Ushey K., Tang Y. and Eddelbuettel D. (2017). reticulate: R Interface to Python.
Avants B. B., Epstein C. L., Grossman M. and Gee J. C. Symmetric diffeomorphic image registration with cross-correlation: evaluating automated labeling of elderly and neurodegenerative brain. 12, 26–41.
Avants B. B., Libon D. J., Rascovsky K., Boller A., McMillan C. T., Massimo L., Coslett H. B., Chatterjee A., Gross R. G. and Grossman M. (2014a). Sparse canonical correlation analysis relates network-level atrophy to multivariate cognitive measures in a neurodegenerative population. Neuroimage 84, 698–711.
Avants B. B., Tustison N. J., Stauffer M., Song G., Wu B. and Gee J. C. (2014b). The insight toolkit image registration framework. Frontiers in Neuroinformatics 8, 44.
Avants B. B., Tustison N. J., Song G., Cook P. A., Klein A. and Gee J. C. (2011a). A reproducible evaluation of ants similarity metric performance in brain image registration. NeuroImage 54, 2033–2044.
Avants B. B., Tustison N. J., Wu J., Cook P. A. and Gee J. C. (2011b). An open source multivariate framework for n-tissue segmentation with evaluation on public data. Neuroinformatics 9, 381–400.
Basser P. J, Mattiello J. and LeBihan D. (1994). MR diffusion tensor spectroscopy and imaging. Biophysical Journal 66, 259–267.
Bates D., Mächler M., Bolker B. and Walker S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software 67, 1–48.
Bellosta C. J. G. (2015). rPython: Package Allowing R to Call Python. R package version 0.0-6.
Chang W., Cheng J., Allaire J., Xie Y. and McPherson J. (2015). shiny: Web application framework for R. URL http://CRAN.R-project.org/package=shiny. R version 0.11.
Clayden J. D., Cox B., Jenkinson M., Reynolds R., Fissell K., Gailly J. and Adler M. (2016). RNifti: Fast R and C++ Access to NIfTI Images. R package version 0.3.0.
Clayden J. D., Modat M., Daga P., Presles B., Anthopoulos T. and Dag P. (2017). RNiftyReg: Image Registration Using the NiftyReg Library. R package version 2.6.0.
Clayden J. D., Muñoz M., Susana S., Amos J., King M. D., Bastin M. E. and Clark C. A. (2011). TractoR: magnetic resonance imaging and tractography with R. Journal of Statistical Software 44, 1–18.
Clayden J. D. and Rorden C. (2017). divest: Get Images Out of DICOM Format Quickly. R package version 0.2.0.
Cook P. A., Bai Y., Nedjati-Gilani S. K. K. S., Seunarine K. K., Hall M. G., Parker G. J. and Alexander D. C. (2006). Camino: open-source diffusion-MRI reconstruction and processing. In: 14th scientific meeting of the international society for magnetic resonance in medicine, vol. 2759. Seattle WA, USA.
Cook P. A., McMillan C. T., Avants B. B., Peelle J. E., Gee J. C. and Grossman M. (2014). Relating brain anatomy and cognitive ability using a multivariate multimodal framework. NeuroImage 99, 477–486.
Eddelbuettel D., François R., Allaire J., Chambers J., Bates D. and Ushey K. (2011). Rcpp: seamless R and C++ integration. Journal of Statistical Software 40, 1–18.
Fischl B. (2012). Freesurfer. Neuroimage 62, 774–781.
Fortin J.-P., Muschelli J. and Shinohara R. T. (2016a). EveTemplate: JHU-MNI-ss (Eve) Template. R package version 0.99.14.
Fortin J.-P., Muschelli J. and Shinohara R. T. (2016b). MNITemplate: MNI152 Template. R package version 0.99.4.
Fortin J.-P., Sweeney E. M., Muschelli J., Crainiceanu C. M., Shinohara R. T., Alzheimer’s Disease Neuroimaging Initiative and others. (2016c). Removing inter-subject technical variability in magnetic resonance imaging studies. NeuroImage 132, 198–212.
Gagnon-Bartsch J. A. and Speed T. P. (2012). Using control genes to correct for unwanted variation in microarray data. Biostatistics 13, 539–552.
Garcia de la Garza A., Vandekar S., Roalf D., Ruparel K., Gur R., Gur R., Satterthwaite T. and Shinohara R. T. (2016). voxel: Mass-Univariate Voxelwise Analysis of Medical Imaging Data. R package version 1.2.1.
Gentleman R. C., Carey V. J., Bates D. M., Bolstad B., Dettling M., Dudoit S., Ellis B., Gautier L., Ge Y., Gentry J. and others. (2004). Bioconductor: open software development for computational biology and bioinformatics. Genome Biology 5, 1.
Glasser M. F., Sotiropoulos S. N., Wilson J. A., Coalson T. S., Fischl B., Andersson J. L., Xu J., Jbabdi S., Webster M., Polimeni J. R and others. (2013). The minimal preprocessing pipelines for the human connectome project. Neuroimage 80, 105–124.
Grabner G., Janke A. L., Budge M. M., Smith D., Pruessner J. and Collins D. L. (2006). Symmetric atlasing and model based segmentation: an application to the hippocampus in older adults. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Berlin, Heidelberg: Springer, pp. 58–66.
Hester J. (2017). covr: Test Coverage for Packages. R package version 3.0.0.
Hornik K. (2016). R FAQ.
Huber W., Carey V. J., Gentleman R., Anders S., Carlson M., Carvalho B. S., Bravo H. C., Davis S., Gatto L., Girke T. and others. (2015). Orchestrating high-throughput genomic analysis with bioconductor. Nature Methods 12, 115–121.
Johnson W. E., Li C. and Rabinovic A. (2007). Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics 8, 118–127.
Kandel B. M., Wang D. J. J., Detre J. A., Gee J. C. and Avants B. B. (2015). Decomposing cerebral blood flow mri into functional and structural components: a non-local approach based on prediction. NeuroImage 105, 156–170.
Koay C. G., Chang L.-C., Carew J. D., Pierpaoli C. and Basser P. J. (2006). A unifying theoretical and algorithmic framework for least squares methods of estimation in diffusion tensor imaging. Journal of Magnetic Resonance 182, 115–125.
Landman B. A., Huang A. J., Gifford A., Vikram D. S., Lim I. A. L, Farrell J. A. D., Bogovic J. A., Hua J., Chen M., Jarso S. and others. (2011). Multi-parametric neuroimaging reproducibility: a 3-t resource study. Neuroimage 54, 2854–2866.
Le Bihan D., Mangin J.-F., Poupon C., Clark C. A., Pappata S., Molko N. and Chabriat H. (2001). Diffusion tensor imaging: concepts and applications. Journal of Magnetic Resonance Imaging 13, 534–546.
Leek J. T., Scharpf R. B., Bravo H. C., Simcha D., Langmead B., Johnson W. E., Geman D., Baggerly K. and Irizarry R. A. (2010). Tackling the widespread and critical impact of batch effects in high-throughput data. Nature Reviews Genetics 11, 733–739.
Leek J. T. and Storey J. D. (2007). Capturing heterogeneity in gene expression studies by surrogate variable analysis. PLoS Genetics 3, 1724–1735.
Leek J. T. and Storey J. D. (2008). A general framework for multiple testing dependence. Proceedings of the National Academy of Sciences 105, 18718–18723.
Li X., Morgan P. S., Ashburner J., Smith J. and Rorden C. (2016). The first step for neuroimaging data analysis: DICOM to NIfTI conversion. Journal of Neuroscience Methods 264, 47–56.
Mejia A. F, Nebel M. B., Eloyan A., Caffo B. and Lindquist M. A. (2017). PCA leverage: outlier detection for high-dimensional functional magnetic resonance imaging data. Biostatistics 13, 521–536.
Muschelli J. (2016a). dcm2niir: Conversion of ‘DICOM’ to ‘NIfTI’ Imaging Files Through R. R package version 0.3.3.
Muschelli J. (2016b). ichseg: Intracerebral Hemorrhage Segmentation of X-Ray Computed Tomography (CT) Images. R package version 0.5.1.
Muschelli J. (2016c). neurobase: ‘Neuroconductor’ Base Package with Helper Functions for ‘nifti’ Objects. R package version 1.9.1.
Muschelli J. (2016d). rcamino: Port of the Camino Software. R package version 0.3.1.
Muschelli J. (2016e). spm12r: Wrapper Functions for ‘SPM’ (Statistical Parametric Mapping) Version 12 from the ‘Wellcome’ Trust Centre for ‘Neuroimaging’. R package version 2.1.
Muschelli J., Sweeney E., Lindquist M. and Crainiceanu C. (2015). fslr: Connecting the FSL software with R. The R Journal 7, 163–175.
Muschelli J., Sweeney E. M., Ullman N. L., Vespa P., Hanley D. F. and Crainiceanu C. M. (2017). PItcHPERFeCT: Primary intracranial hemorrhage probability estimation using random forests on CT. NeuroImage: Clinical 14, 379–390.
O’Donnell L. J. and Westin C.-F. (2011). An introduction to diffusion tensor image analysis. Neurosurgery Clinics of North America 22, 185–196.
Oishi K., Faria A., Jiang H., Li X., Akhter K., Zhang J., Hsu J. T., Miller M. I., van Zijl P. C. M., Albert M. and others. (2009). Atlas-based whole brain white matter analysis using large deformation diffeomorphic metric mapping: application to normal elderly and alzheimer’s disease participants. Neuroimage 46, 486–499.
Ou Y., Sotiras A., Paragios N. and Davatzikos C. (2011). Dramms: deformable registration via attribute matching and mutual-saliency weighting. Medical Image Analysis 15, 622–639.
Penny W. D., Friston K. J., Ashburner J. T., Kiebel S. J and Nichols T. E. (2011). Statistical Parametric Mapping: The Analysis of Functional Brain Images. London, UK: Academic Press.
Pustina D., Coslett H., Turkeltaub P. E., Tustison N., Schwartz M. F. and Avants B. (2016). Automated segmentation of chronic stroke lesions using LINDA: lesion identification with neighborhood data analysis. Human Brain Mapping 37, 1405–1421.
R Core Team. (2017). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.
Ritchie M. E., Phipson B., Wu D., Hu Y., Law C. W., Shi W. and Smyth G. K. (2015). Limma powers differential expression analyses for rna-sequencing and microarray studies. Nucleic Acids Research 43, e47.
Schmid V. J., Whitcher B., Padhani A. R., Taylor N. J. and Yang G.-Z. (2006). Bayesian methods for pharmacokinetic models in dynamic contrast-enhanced magnetic resonance imaging. IEEE Transactions on Medical Imaging 25, 1627–1636.
Schwendinger F. (2015). PythonInR: Use Python from Within R. R package version 0.1-3.
Shinohara R. T., Sweeney E. M., Goldsmith J., Shiee N., Mateen F. J., Calabresi P. A., Jarso S., Pham D. L., Reich D. S., Crainiceanu C. M. and others. (2014). Statistical normalization techniques for magnetic resonance imaging. NeuroImage: Clinical 6, 9–19.
Sled J. G., Zijdenbos A. P. and Evans A.C. (1998). A nonparametric method for automatic correction of intensity nonuniformity in MRI data. IEEE Transactions on Medical Imaging 17, 87–97.
Smith S. M, Jenkinson M., Woolrich M. W., Beckmann C. F., Behrens T. E. J., Johansen-Berg H., Bannister P. R., De Luca M., Drobnjak I., Flitney D. E. and others. (2004). Advances in functional and structural mr image analysis and implementation as FSL. Neuroimage 23, S208–S219.
Smyth G. K. (2004). Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. Statistical Applications in Genetics and Molecular Biology 3, 1–25.
Smyth G. K. (2005). Limma: linear models for microarray data. In: Bioinformatics and Computational Biology Solutions Using R and Bioconductor. New York, NY: Springer, pp. 397–420.
Sweeney E. M., Shinohara R. T., Shiee N., Mateen F. J., Chudgar A. A., Cuzzocreo J. L., Calabresi P. A., Pham D. L., Reich D. S. and Crainiceanu C. M. (2013). Oasis is automated statistical inference for segmentation, with applications to multiple sclerosis lesion segmentation in MRI. NeuroImage: Clinical 2, 402–413.
Tabelow K., Clayden J. D., Lafaye de Micheaux P., Polzehl J., Schmid V. J. and Whitcher B. J. (2011). Image analysis and statistical inference in neuroimaging with R. NeuroImage 55, 1686–1693.
Therneau T. M. and Grambsch P. M. (2000). Modeling Survival Data: Extending the Cox Model. New York: Springer.
Therneau T. M. (2015). A Package for Survival Analysis in S. version 2.38.
Carey V. J. Ported to R by Thomas Lumley and Brian Ripley. (2015). gee: Generalized Estimation Equation Solver. R package version 4.13-19.
Tustison N. J., Avants B. B., Cook P. A., Zheng Y., Egan A., Yushkevich P. A. and Gee J. C. (2010). N4itk: improved N3 bias correction. IEEE Transactions on Medical Imaging 29, 1310–1320.
Van Essen D. C., Smith S. M., Barch D. M., Behrens T. E. J., Yacoub E., Ugurbil K., WU-Minn HCP Consortium, and others. (2013). The WU-Minn human connectome project: an overview. Neuroimage 80, 62–79.
Wahl R. L., Jacene H., Kasamon Y. and Lodge M. A. (2009). From RECIST to PERCIST: evolving considerations for PET response criteria in solid tumors. Journal of Nuclear Medicine 50, 122S–150S.
Wang H. and Yushkevich P. (2013). Multi-atlas segmentation with joint label fusion and corrective learning an open source implementation. Frontiers in Neuroinformatics 7, 27.
Weber M. J., Detre J. A., Thompson-Schill S. L. and Avants B. B. (2013). Reproducibility of functional network metrics and network structure: a comparison of task-related BOLD, resting ASL with BOLD contrast, and resting cerebral blood flow. Cognitive, Affective, & Behavioral Neuroscience 13, 627–640.
Whitcher B., Schmid V. J., Collins D. J., Orton M. R., Koh D.-M., de Corcuera I. D., Parera M., del Campo J. M., Leach M. O., Harrington K. and others. (2011a). A Bayesian hierarchical model for DCE-MRI to evaluate treatment response in a phase II study in advanced squamous cell carcinoma of the head and neck. Magnetic Resonance Materials in Physics, Biology and Medicine 24, 85–96.
Whitcher B., Schmid V. J. and Thornton A. (2011b). Working with the DICOM and NIfTI data standards in R. Journal of Statistical Software 44, 1–28.
Wickham H. (2009). ggplot2: Elegant Graphics for Data Analysis. New York, NY: Springer Science & Business Media.
Wickham H. and Chang W. (2017). devtools: Tools to Make Developing R Packages Easier. R package version 1.12.0.9000.
Xie Y. (2013). knitr: A General-purpose Package for Dynamic Report Generation in R. R package version, vol. 1(7), 1.
Yoo T. S., Ackerman M. J., Lorensen W. E., Schroeder W., Chalana V., Aylward S., Metaxas D. and Whitaker R. (2002). Engineering and algorithm design for an image processing API: a technical report on ITK-the insight toolkit. Studies in Health Technology and Informatics 85, 586–592.
Zhang Y., Brady M. and Smith S. (2001). Segmentation of brain MR images through a hidden Markov random field model and the expectation-maximization algorithm. IEEE Transactions on Medical Imaging 20, 45–57.

© The Author 2018. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

Biostatistics, Oxford University Press

Publisher
Oxford University Press
ISSN
1465-4644
eISSN
1468-4357
D.O.I.
10.1093/biostatistics/kxx068

Abstract

Summary

Neuroconductor (https://neuroconductor.org) is an open-source platform for rapid testing and dissemination of reproducible computational imaging software. The goals of the project are to: (i) provide a centralized repository of R software dedicated to image analysis, (ii) disseminate software updates quickly, (iii) train a large, diverse community of scientists using detailed tutorials and short courses, (iv) increase software quality via automatic and manual quality controls, and (v) promote reproducibility of image data analysis. Based on the programming language R (https://www.r-project.org/), Neuroconductor starts with 51 inter-operable packages that cover multiple areas of imaging including visualization, data processing and storage, and statistical inference. Neuroconductor accepts new R package submissions, which are subject to a formal review and continuous automated testing. We provide a description of the purpose of Neuroconductor and the user and developer experience.

Introduction

Medical imaging analysis software is heterogeneous, complex, and difficult to use in fully reproducible analysis pipelines. These problems have been accentuated by the diversity of new imaging data sets and associated scientific problems. Indeed, many studies now collect data on thousands of subjects, at multiple visits, and using different modalities. Storing, understanding, and analyzing such data is daunting. Neuroconductor provides the infrastructure for using, improving, and designing open-source, scripted software that depends on a minimum number of software platforms and is dedicated to improving the correctness, reproducibility, and speed of medical image data analysis. To achieve this, Neuroconductor interweaves pre- and post-processing image analysis and provides integrated data-analytic approaches.
Neuroconductor provides data, methods, and software packages designed to support the analysis of populations of images in R (R Core Team, 2017), a programming language with state-of-the-art statistical analysis tools and a vectorized data paradigm that is well suited for neuroimaging (Tabelow and others, 2011). Neuroconductor supports many types of imaging data including magnetic resonance imaging (MRI: structural, functional, and dynamic), computed tomography (CT), single-photon and positron emission computed tomography (SPECT and PET), electroencephalography (EEG), and magnetoencephalography (MEG). It is currently focused on human imaging, especially of the brain, but it also supports other biological imaging, such as the lungs and MR spectroscopy. Neuroconductor is able to interface with mature R distribution platforms, such as Bioconductor (Gentleman and others, 2004; Huber and others, 2015) and the Comprehensive R Archive Network (CRAN) (Hornik, 2016; R Core Team, 2017), as well as other R-based imaging projects such as TractoR (Clayden and others, 2011). Although these other distribution platforms are well established, we believe that Neuroconductor enables better support and testing for imaging-specific packages, and provides a focal point for imaging-specific training materials, similar to how Bioconductor supports bioinformatics packages.

NiPype (http://nipy.org/packages/nipype/index.html) is a Python project with a goal of standardizing neuroimaging software syntax, similar to Neuroconductor. NiPype is incorporated in the larger NiPy project (http://nipy.org/), which provides tools for neuroimaging analysis in Python on multiple neuroimaging modalities. We believe Neuroconductor is in many ways the R analog to NiPy, with the hope of including additional imaging problems and platforms.
Although the utility of each of these projects may be similar, we do not believe Neuroconductor competes with NiPy; rather, it gives R users capabilities similar to those available to Python users. Moreover, with R-Python interfaces such as rPython (Bellosta, 2015), PythonInR (Schwendinger, 2015), and reticulate (Allaire and others, 2017) in R, and the rpy2 (https://pypi.python.org/pypi/rpy2) and pyRserve (https://pypi.python.org/pypi/pyRserve/) modules in Python, users can combine these two efforts.

Neuroconductor may be used seamlessly for data fusion analyses where imaging, genomics, and other types of high-throughput data are analyzed together with traditional measurements and health outcomes. In R, these analyses are integrated with well-designed and thoroughly tested analytic packages for visualization, statistical inference, longitudinal and survival analysis, regression, network analysis, and machine learning.

The user perspective

Neuroconductor users interact with the platform via the Neuroconductor website (https://neuroconductor.org). Users may install an R package, explore packages, or identify a workflow designed for their specific problem. If the workflow does not exist in Neuroconductor, then users are able to create and submit their own. To download a package, a user needs to know the package name, which is obtained from the list of packages on the Neuroconductor webpage (https://neuroconductor.org/list-packages/all). For example, to download the fslr and neurohcp packages in R, the command line instructions are:

Many Neuroconductor packages contain help manuals and documentation as vignettes, which developers are strongly encouraged to provide. Large areas of research covering multiple package combinations are described under Help. Some of these areas contain tutorials and Massive Open Online Courses (MOOCs) that provide a rapid introduction to more complex concepts and workflows.
Examples of such courses are Principles of fMRI (https://www.coursera.org/learn/functional-mri) and Introduction to Neurohacking in R (https://www.coursera.org/learn/neurohacking). If users do not find what they are looking for, they can request a tutorial on a specific topic, and developers can submit their own tutorial or course to Neuroconductor. Furthermore, users can turn to the Support forum (https://neuroconductor.org/forum), attend short courses, take free online MOOCs, or read tutorials (https://neuroconductor.org/neuroc-help). All package developers are users and some users become package developers.

The user may have their own data, use a data package from Neuroconductor, or use a Neuroconductor package to download data from an internet repository. After obtaining data, a first step is to manipulate and visualize the data and to ensure data quality. Most quality control procedures can be performed at the image or subject level, though the distribution of quality metrics in the population can be used as well; see Mejia and others (2015) for an example. Once data quality has been assessed, an analytic database of subjects for subsequent analyses may be created.

A major advantage of the R environment is that analyses can draw on a large collection of well-tested, state-of-the-art statistical packages. For example, incorporating demographic information and fitting regression, mixed-effects, or survival models is straightforward in R. Moreover, the voxel package implements most of these modeling methods at the voxel level directly on images (Garcia de la Garza and others, 2016). Results containing spatial information are mapped back to an image, while other results may be displayed using plotting packages such as ggplot2 (Wickham, 2009). All of these steps can be wrapped in a series of reproducible reports using knitr and rmarkdown (Allaire and others, 2015; Xie, 2013).
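As a rough illustration of voxel-level modeling with results mapped back to image space, the sketch below uses only base R and simulated data. It is not the voxel package's interface; the image dimensions, the variable names, and the simulated age effect are all assumptions made for the example.

```r
# Simulated data: 20 subjects, each with a small 4 x 4 x 3 "image".
set.seed(1)
dims <- c(4, 4, 3)
n_subj <- 20
n_vox <- prod(dims)
age <- round(runif(n_subj, 22, 36))

# Rows are subjects, columns are voxels (column order follows R's array layout).
intensities <- matrix(rnorm(n_subj * n_vox), nrow = n_subj, ncol = n_vox)
# Add an age effect to the first voxel so there is something to detect.
intensities[, 1] <- intensities[, 1] + 0.5 * (age - mean(age))

# lm() accepts a matrix response and fits one regression per column (voxel).
fit <- lm(intensities ~ age)
age_slopes <- coef(fit)["age", ]

# Map the per-voxel slopes back into image space.
slope_map <- array(age_slopes, dim = dims)
```

Because the voxel order is preserved, `slope_map[1, 1, 1]` holds the slope for the voxel that received the simulated effect, and the array could be written out or plotted like any other image.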
Another advantage of R is that complex processing steps and complete analyses can also be wrapped into R packages. For example, the ichseg (Muschelli, 2016b; Muschelli and others, 2016) R package is designed for intra-cranial hemorrhage segmentation and uses multiple functions from other Neuroconductor packages. It uses a single CT scan as input, processes the data using FSL (Smith and others, 2004), called through fslr in R (Muschelli and others, 2015), and creates a segmentation of the hemorrhagic stroke based on a pre-trained supervised machine learning algorithm. Last, data products resulting from analysis and package development may be deployed as Shiny applications (Chang and others, 2015), which are web applications built in R and accessed via a browser. Shiny applications are hosted on a server launched by the user or on a commercial Shiny server (such as https://www.shinyapps.io). As an example, the hemorrhagic stroke segmentation method described above has been implemented in a Shiny application (http://johnmuschelli.com/ich_segment.html). Below we provide several case studies to build intuition for interfacing with Neuroconductor.

Case study 1: Processing of structural MRI

Figure 1 provides an example of a typical processing pipeline for multi-sequence structural MRI data. Analysis of such data often begins with a conversion of Digital Imaging and Communications in Medicine (DICOM) files, which are essentially binary formatted pictures with header information. DICOM is one of the most common formats for imaging data that come directly from the imaging device. DICOM files are commonly converted to the much simpler Neuroimaging Informatics Technology Initiative (NIfTI) format, which represents imaging data as a multi-dimensional array instead of a collection of single two-dimensional (2D) slices (Li and others, 2016).
This process may be performed in R using the dcm2niir package (Muschelli, 2016a), which calls dcm2niix (https://github.com/rordenlab/dcm2niix) from the command line; the divest package (Clayden and Rorden, 2017), which incorporates dcm2niix and provides an in-memory bridge to it; or the oro.dicom (Whitcher and others, 2011b) package, which uses completely native R code. The resulting NIfTI files may be converted to array-based S4 objects using the oro.nifti (Whitcher and others, 2011b) package or objects based on C++ pointers using the ANTsR (Avants and others, 2011a) or RNifti (Clayden and others, 2016) packages. These packages also provide the ability to read other medical imaging formats, such as ANALYZE, AFNI, NRRD or any other format supported by the Insight Toolkit (ITK) (Yoo and others, 2002).

Fig. 1. Typical processing pipeline for multi-sequence structural MRI data.

After conversion, image intensity inhomogeneity correction is applied to ensure that each tissue has similar intensity distributions across locations in the brain (e.g. top versus bottom of the brain). Image inhomogeneities are typically caused by magnetic field inhomogeneities and may be handled by multiple methods. Neuroconductor currently contains four such methods, implemented in FSL via fslr (Zhang and others, 2001), in freesurfer (Sled and others, 1998), and in ANTsR (N3 and N4 correction) (Sled and others, 1998; Tustison and others, 2010). The next step is co-registration, which spatially aligns all images in the sequence to one of the images in the sequence, usually a T1-weighted (T1w) image. Co-registration may be performed using flirt (fslr), antsRegistration (ANTsR), niftyreg (RNiftyReg) (Clayden and others, 2017), or dramms (drammsr) (Ou and others, 2011). Co-registration is followed by brain extraction, also known as skull stripping.
This procedure is commonly performed on the T1w image and may be implemented using spm12r (Penny and others, 2011; Muschelli, 2016), fslr, freesurfer, or ANTsR. At this point, the image sequences for the subject are in the same space, and extracranial tissues have been removed. Brain extraction is followed by intensity normalization, which transforms the arbitrary MRI units into units that are comparable across subjects. This is not always thought of as a standard pre-processing step, but it is important in many applications. For example, one could be interested in subtracting two images that were collected longitudinally in order to identify changes, or one may want to investigate changes in voxel or region-of-interest (ROI) intensities over time. Normalization is typically achieved using z-scoring with respect to a particular tissue class as implemented in the WhiteStripe method (WhiteStripe) (Shinohara and others, 2014), standard and robust z-scoring relative to whole brain (neurobase) (Muschelli, 2016c), histogram matching (implemented in RAVEL), or removal of unwanted variation (RAVEL) (Fortin and others, 2016c). Further subject-specific and population analyses may require registration to a population-level template and/or tissue class segmentation. Both of these approaches are possible using Neuroconductor packages including spm12r, fslr, and ANTsR.

While the flowchart in Figure 1 provides a conceptual pre-processing pipeline, deploying an explicit and reproducible pipeline requires specific choices at every step that may depend on multiple tuning parameters. In R one can be explicit about these choices, provide a software suite of packages and tuning parameters, and quickly compare results based on different combinations of software and platforms.

Case study 2: A cross-package workflow for diffusion tensor imaging

In this section, we discuss an example of a complete analysis (preprocessing and statistical analysis) performed entirely within Neuroconductor.
We start with a simple question: in a population of healthy subjects, is there any difference in the white matter (WM) microstructure between males and females? Diffusion tensor imaging (DTI) has been used extensively to study WM fiber structure by taking advantage of the differential water diffusivity in the WM tracts relative to other brain structures (Basser and others, 1994; Le Bihan and others, 2001). Fractional anisotropy (FA) and mean diffusivity (MD) are two scalar maps commonly derived from DTI images to study the diffusivity properties of the brain. FA measures the degree of directional diffusivity in a voxel and MD measures the total diffusion within a voxel (Koay and others, 2006). To investigate potential gender differences in WM fiber tracts, we use DTI data from healthy young adults in the Human Connectome Project (HCP), available at http://db.humanconnectome.org. HCP includes a large cohort of individuals ($$N\,{>}\,1200$$) with a vast amount of neuroimaging data, including structural magnetic resonance imaging (sMRI), task and resting-state functional MRI (fMRI), and DTI.

Step 1: Downloading data using the neurohcp package

The first step is to download the minimally preprocessed DTI data (Glasser and others, 2013) and structural T1w MRI images, available for 781 subjects (436 females and 345 males), using the neurohcp (Van Essen and others, 2013) package. The neurohcp package is an R interface for downloading data from the HCP database, which is publicly available through Amazon Web Services (AWS). A description of how to connect to the HCP database via AWS is available at the Human Connectome Wiki. After accepting the data use terms, one needs to obtain AWS credentials, which include an access key identifier and a secret key. The neurohcp package in R accesses the data from an Amazon Simple Storage Service (S3) bucket using these two AWS access keys.
We can set these keys using set_aws_api_key:

After the access keys are set, neurohcp can download data from the S3 bucket. For instance, the complete diffusion data directory for subject 100307 may be downloaded using the download_hcp_dir function in the neurohcp package:

The result is an R list containing the file names of all downloaded files, the directory where the files were downloaded, and the HTTP request that was sent to the Amazon S3 bucket. Data from other subjects are downloaded similarly. The demographic data, which include age and gender, are not located on the S3 bucket and must be downloaded directly from the website.

Step 2: Processing DTI data with rcamino

After the minimally processed DTI data are downloaded, processing continues using the Neuroconductor package rcamino (Muschelli, 2016), an R interface for the open-source DTI software Camino (Cook and others, 2006). The package creates and fits the diffusion tensor models, and generates the FA and MD maps. We illustrate below the associated R code.

Note that Camino requires the b-values and b-vectors of the DTI acquisition to fit the DTI model. The b-values are the amount of diffusion weighting used for each volume. The b-vectors give the direction of the diffusion-weighting gradient applied for each volume; for more details see O’Donnell and Westin (2011). The process is then repeated for all subjects with available DTI data and may be sped up via parallel computing. We now have an FA and an MD image for each subject.

Step 3: Nonlinear registration to template

The next step is to prepare the data for voxel-wise analysis, which requires the FA and MD maps to be spatially registered to a common template. For each subject, we use the download_hcp_file function from the neurohcp package to download the T1w image with extra-cranial voxels removed.
We then use the symmetric diffeomorphic non-linear registration implemented in ANTsR, wrapped in extrantsr (Avants and others), to register FA and MD maps to the 1 mm isotropic Eve template T1w image (Oishi and others, 2009). The Eve template is a single-subject template created by the Laboratory of Brain Anatomical MRI led by Professor Susumu Mori at Johns Hopkins University (Oishi and others, 2009). The Eve template is made available in the Neuroconductor data package EveTemplate (Fortin and others, 2016a). Alternatively, one could register images to the MNI template (Grabner and others, 2006), which is available in the Neuroconductor data package MNITemplate (Fortin and others, 2016b). Each registered DTI map is saved as a standard NIfTI file for further analyses. The R code is presented below.

The results of these processing steps are images containing the FA and MD maps registered to the Eve atlas for every subject. While at this point we do not conduct ROI analyses, the template may be used to extract subject-specific ROIs. An ROI may be an anatomical region, a region obtained from other analyses, or a region that is manually delineated. Here, we focus on voxel-wise analysis across subjects, which can be performed because images are registered to a common template space. Although we do not present any measures of quality control or registration accuracy, users should inspect registration and image quality using either automated methods or visual inspection.

Step 4: Statistical analysis

The next step is the analysis of the population of images using statistical inference. Neuroimaging convention refers to this step as “statistical analysis,” a convention we adopt here, despite the fact that earlier steps involve a substantial amount of statistical operations. We first read the registered NIfTI files into R and create a matrix of voxel intensities with voxels as rows and subjects as columns. For the analysis, we only consider voxels in the WM and GM.
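The voxels-by-subjects layout restricted to a tissue mask can be sketched in base R as follows. This is an illustration with simulated arrays rather than the neurobase implementation; the mask, the image dimensions, and the object names are assumptions made for the example.

```r
set.seed(2)
dims <- c(5, 5, 4)
n_subj <- 6

# Hypothetical tissue mask: TRUE for voxels treated as GM or WM.
mask <- array(runif(prod(dims)) > 0.4, dim = dims)

# One simulated registered map per subject, all in the same template space.
subject_maps <- lapply(seq_len(n_subj), function(i) {
  array(rnorm(prod(dims)), dim = dims)
})

# Keep only in-mask voxels; each subject becomes one column of the matrix.
Y <- vapply(subject_maps, function(img) img[mask], numeric(sum(mask)))
dim(Y)  # number of in-mask voxels by number of subjects
```

Indexing every image with the same logical mask guarantees that row i of `Y` refers to the same template voxel for all subjects, which is what makes the later voxel-wise tests meaningful.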
We create one matrix for the FA maps, and one for the MD maps, using the function images2matrix from the package neurobase. The registered-to-Eve images for all subjects are located in the lists files.fa and files.md for FA and MD maps, respectively. For brevity, we present only the analysis for the MD maps. In this example, the matrix (Y.md) has $$1\,372\,619$$ rows (number of GM and WM voxels in the Eve template) and $$781$$ columns (number of subjects in the data set):

There are many options in R to quantify the association of the FA and MD intensities with gender. The simplest approach is to calculate mass-univariate two-sample t-statistics (t-statistics computed at each voxel separately), which can be quickly computed using limma (Smyth, 2004, 2005; Ritchie and others, 2015), a popular R package from the Bioconductor project (https://www.bioconductor.org/). The package was originally developed for the analysis of high-throughput genomics data, but much of its functionality may be used in neuroimaging applications without additional effort. Among other methods, the limma package implements Empirical-Bayes (EB) methods to estimate t-statistics based on variance shrinkage (called moderated t-statistics). Below, we present the code to compute the moderated t-statistics for the association between gender and MD, adjusted for age.

The code produces t-statistics comparing males versus females in the Eve-template space. The gender value is coded so that negative values of the t-statistic correspond to higher values of MD in females. Below, we investigate where the largest differences are located.

Step 5: Visualization and localization of the results

A considerable advantage of the close integration between pre- and post-processing tools is that results can be easily mapped back into the native or template space. This helps localize significant associations using template labels.
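To make the mass-univariate idea concrete, the following base R sketch computes ordinary least-squares t-statistics for a group effect adjusted for age at every voxel at once. It deliberately omits the Empirical-Bayes variance shrinkage that limma adds, so these are not moderated t-statistics; the simulated data and the 0/1 coding of the group indicator are assumptions made for the example.

```r
set.seed(3)
n_vox <- 200
n_subj <- 40
age <- runif(n_subj, 22, 36)
male <- rep(c(0, 1), each = n_subj / 2)  # 0 = female, 1 = male (assumed coding)

# Voxels in rows, subjects in columns, as in the text.
Y <- matrix(rnorm(n_vox * n_subj), nrow = n_vox)
Y[1, ] <- Y[1, ] - 2 * male              # voxel 1: lower values in males

# Design matrix columns: intercept, group indicator, age.
X <- cbind(1, male, age)
XtX_inv <- solve(crossprod(X))

# Least-squares coefficients for all voxels in one matrix product (3 x n_vox).
beta <- XtX_inv %*% t(X) %*% t(Y)

# Residual variance per voxel, then the standard error of the group effect.
resid <- t(Y) - X %*% beta
df <- n_subj - ncol(X)
sigma2 <- colSums(resid^2) / df
se_male <- sqrt(sigma2 * XtX_inv[2, 2])

# One t-statistic per voxel; negative values mean lower intensity in males.
t_male <- beta[2, ] / se_male
```

With this coding the sign convention matches the one described in the text: a negative statistic at a voxel corresponds to higher values in females.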
For example, the voxels that exhibit the largest differences between males and females are identified using the Eve white matter parcellation map (WMPM) (Oishi and others, 2009), included in the package EveTemplate. Below, we provide the R code for localization of these results.

A quick inspection of the top six voxels reveals that the most pronounced differences are located in the hippocampus and thalamus, where females have higher mean diffusivity compared to males. However, it may be useful to locate these findings on the template and visually study the degree of spatial clustering. The following commands produce the three panels of Figure 2, using the function ortho2 from the neurobase package:

Fig. 2. Visualization of the diffusion tensor imaging (DTI) analysis results in R. (a) Visualization of the anatomical structures in template space (Eve template, 1-mm isotropic T1-weighted modality) in coronal, sagittal and axial planes (coordinates $$x=77$$, $$y=114$$ and $$z=84$$). (b) T-statistics characterizing the differences between males and females in the mean diffusivity (MD) values for grey matter (GM) and white matter (WM) voxels, in template space. The DTI analysis was performed using data from the Human Connectome Project (HCP). Blue and red regions represent higher and lower values of MD in males, respectively. Central areas of the WM, such as the area marked by the cross-hairs, show lower MD in males, whereas areas of higher MD in males are located near the cortical GM and occipital horns of the lateral ventricles. (c) Visualization of the Eve atlas white matter parcellation map (WMPM) with selected structures.

Figure 2 displays the T1w Eve template in panel (a), the map of t-statistics for gender differences in MD values in this population in panel (b), and the annotated neuroanatomical structures of the hippocampus, thalamus, and caudate nucleus in panel (c). For panel (b), blue represents higher values of MD in males and red represents higher values of MD in females. Much more detailed information can be obtained and quantified from these results, including the percentage of voxels in the thalamus with a t-statistic passing a particular threshold (e.g. the Bonferroni correction) or the difference in the number of voxels with higher MD for females versus males in the right hippocampus, adjusting for age.

Advanced data analysis and visualization

An advantage of the R environment is that more sophisticated voxel-level analyses may be easily implemented. For example, one may want to investigate whether there are additional confounders for the association between FA and gender, or whether image intensities predict health outcomes. The lm and glm functions in R are well suited to address such questions.
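The localization and thresholding summaries mentioned for Step 5 can be sketched in base R with simulated values. The label table below is a hypothetical stand-in for a parcellation map such as the WMPM, the planted effect is artificial, and the residual degrees of freedom (781 subjects minus three model coefficients) are an assumption made for the example.

```r
set.seed(4)
n_vox <- 1000

# Hypothetical parcellation: an integer label per voxel and a lookup of names.
label_names <- c("1" = "hippocampus", "2" = "thalamus", "3" = "caudate nucleus")
labels <- sample(1:3, n_vox, replace = TRUE)

# Simulated t-statistics with a strong negative effect planted in 20 voxels.
t_stat <- rnorm(n_vox)
planted <- which(labels == 2)[1:20]
t_stat[planted] <- t_stat[planted] - 8

# Two-sided Bonferroni threshold across all tested voxels.
df <- 778  # assumed: 781 subjects minus intercept, gender, and age
alpha <- 0.05
t_crit <- qt(1 - alpha / (2 * n_vox), df = df)

# Region names for the six voxels with the largest absolute t-statistics.
top6 <- order(abs(t_stat), decreasing = TRUE)[1:6]
top_regions <- unname(label_names[as.character(labels[top6])])

# Percentage of voxels in each region that pass the threshold.
region <- label_names[as.character(labels)]
pct_passing <- 100 * tapply(abs(t_stat) > t_crit, region, mean)
```

The same pattern applies to a real analysis once the vector of voxel labels is extracted with the mask used to build the data matrix, so that labels and statistics share one voxel ordering.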
If images are observed longitudinally or they are used as baseline predictors in longitudinal studies, one could fit generalized estimating equations with the gee package (Carey, 2015) or mixed-effects models with the lme4 package (Bates and others, 2015). If one is interested in modeling survival time based on baseline images, then the survival package (Therneau, 2015; Therneau and Grambsch, 2000) may be used. More advanced visualization of data and statistical results can be done using the package papayar, which is an R wrapper around Papaya (https://www.nitrc.org/projects/papaya/), a JavaScript medical research image viewer. The brainR package in Neuroconductor is an R package for 3D and 4D visualization based on rgl (Adler and others, 2016).

Harmonization of multi-site neuroimaging data

An increasingly common strategy in neuroimaging is to combine multi-site imaging data across scanners and protocols. This approach pools results from different sources and may improve statistical power to detect small effects that would be undetectable in separate studies. However, the combination of multi-site data can introduce substantial unwanted variation in the data, due to differences in the scanner characteristics, acquisition parameters, and preprocessing pipelines. This is highly problematic when the level of technical variability is larger than the biological variation of the phenotype of interest. We refer to the process of removing the technical variation in multi-site studies as “harmonization.” The harmonization problem also exists in genomics, where data across batches often exhibit large technical variability. In genomics, this technical variability is referred to as batch effects. When there is confounding between batch and phenotype, failing to account for batch effects can lead to spurious associations (Leek and others, 2010). Batch-effect correction has been under intense methodological development, which has produced widely used R software packages for the removal of batch effects.
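As a deliberately simplified picture of what removing a site effect means, the base R sketch below centers each voxel within site and restores the overall voxel mean. This naive location adjustment is only an illustration and is not one of the published harmonization methods; in particular, it would also remove biological signal whenever site is confounded with the phenotype. The two-site design and the additive offset are assumptions made for the example.

```r
set.seed(5)
n_vox <- 50
site <- rep(c("A", "B"), each = 15)

# Voxels in rows, subjects in columns; site B carries an additive offset.
Y <- matrix(rnorm(n_vox * length(site)), nrow = n_vox)
Y[, site == "B"] <- Y[, site == "B"] + 3

# Remove the site-specific mean at each voxel, then restore the overall mean.
harmonized <- Y
for (s in unique(site)) {
  cols <- site == s
  harmonized[, cols] <- Y[, cols] - rowMeans(Y[, cols]) + rowMeans(Y)
}

# Largest remaining between-site difference in voxel means.
max(abs(rowMeans(harmonized[, site == "A"]) - rowMeans(harmonized[, site == "B"])))
```

After the adjustment the per-voxel means of the two sites coincide, while the within-site variation of each subject around its site mean is left unchanged.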
Some popular examples of such packages deployed in Bioconductor are ComBat (Johnson and others, 2007), SVA (Leek and Storey, 2007, 2008), RUV (Gagnon-Bartsch and Speed, 2012), and limma (Smyth, 2004, 2005). The Neuroconductor platform allows users to integrate, adapt, and extend these methods to imaging studies. For example, the RUV approach was adapted to structural MRI images and was shown to successfully remove technical variation in multi-site data from the Alzheimer’s Disease Neuroimaging Initiative (ADNI). The method is implemented in the Neuroconductor package RAVEL (Fortin and others, 2016c).

ANTsR: a Neuroconductor package highlight

Neuroconductor packages provide scripting access to large toolkits constructed primarily in other languages. For example, ANTsR (Avants and others, 2011a) is a general purpose biomedical image analysis package built on top of the C++ based Insight ToolKit (ITK) (Yoo and others, 2002), wrapped in the ITKR package, and Advanced Normalization Tools (ANTs) (Avants and others, 2011a). ANTsR leverages Rcpp (Eddelbuettel and others, 2011) to wrap well-validated ANTs methods such as joint label fusion for image labeling based on a library of anatomical templates (Wang and Yushkevich, 2013), Atropos for multi-channel segmentation (Avants and others, 2011b), as well as core methods of computational anatomy (Avants and others, 2014). ANTsR also contributes R-level access to ITK image iterators, multi-channel image classes and transformation objects as well as specialized methods for arterial spin labeling and BOLD-based network analysis via igraph interfaces (Weber and others, 2013). ANTsR is an example of how Neuroconductor can support innovative applications of machine learning to medical imaging.
ANTsR provides the first freely available implementation of Rotation Invariant Patch-based Multi-Modality Analysis aRChitecture (RIPMMARC) (Kandel and others, 2015), which uses dictionary learning to extract modality-specific information from a multi-modal data set. ANTsR also provides prior-constrained, sparse dimensionality reduction methods, including implementations for both eigenanatomy (Cook and others, 2014) and sparse canonical correlation analysis for neuroimaging (Avants and others, 2014a). Furthermore, ANTsR serves as a platform for disseminating potentially clinically useful tools such as LINDA (Lesion Identification with Neighborhood Data Analysis, https://github.com/dorianps/LINDA) (Pustina and others, 2016), which implements a “convolutional” random forest for T1-based lesion segmentation. We anticipate that such downstream benefits of Neuroconductor will become more common in the future.

Because ANTsR leverages most of the functionality of the ITK library, this package is large on disk. It also requires additional build tools, such as CMake (https://cmake.org/), which are not allowed in other repositories, such as CRAN. However, this is a very important package and provides an example of how Neuroconductor can handle large and complex packages.

External software dependencies

Many standalone pieces of software exist for medical imaging analyses, and some Neuroconductor packages depend on them. For example, fslr requires a working installation of FSL (Smith and others, 2004). The commands from FSL are called from the command line through R after checks on the data inputs. The rationale for this approach is that FSL is an updated, well-maintained, and well-documented piece of software that will change with time. Without working extensively with the authors of FSL (or any other external software), porting all of the functionality would be redundant work and would need to be updated with a new release of the software.
Therefore, we believe that, in some instances, it is necessary to impose an additional system requirement on the user (declared in the SystemRequirements field of the DESCRIPTION file). For other packages, such as ANTsR, dcm2niir, and rcamino, external libraries are either bundled in the package or downloaded and installed when the package is built. If possible, installation at run-time or build-time is desired to reduce the necessary installation steps for the user.

Data packages

Neuroconductor hosts data packages that allow users to test software or that contain highly relevant data, such as templates. To be posted, data need to be de-identified, the author/maintainer needs to have approval to make the data public, and all user agreements must be respected. Neuroconductor also hosts packages that can access data from public repositories, while respecting the data user agreements. Neuroconductor starts with a series of packages based on a multi-parametric neuroimaging study, which we refer to as Kirby21. Kirby21 contains data on 21 subjects scanned a day apart (Landman and others, 2011) and includes the following modalities: T1w, T2, FLAIR, proton density (PD), DTI, and fMRI. Neuroconductor also contains templates for T1w images: EveTemplate and MNITemplate; for a complete list of data packages see Table 1. Although Neuroconductor interacts with many neuroimaging data platforms, including the Neuroimaging Informatics Tools and Resources Clearinghouse (NITRC), many studies have restrictions on redistribution of data.

Table 1.
Referenced R packages available on Neuroconductor

Package | Description | References
Software packages
neurobase | Base functions for Neuroconductor |
oro.nifti | Working with the DICOM and NIfTI Data Standards in R | Whitcher and others (2011b)
oro.dicom | Working with the DICOM and NIfTI Data Standards in R | Whitcher and others (2011b)
oro.asl | oro.asl: Rigorous - Arterial Spin Labelling |
oro.pet | oro.pet: Rigorous - Positron Emission Tomography |
ANTsR | Advanced Normalization Tools | Avants and others (2011a)
extrantsr | Extensions for ANTsR |
fslr | R package for FSL | Muschelli and others (2015); Smith and others (2004)
freesurfer | R package for FreeSurfer | Fischl (2012)
oasis | OASIS lesion segmentation | Sweeney and others (2013)
WhiteStripe | White Stripe intensity normalization | Shinohara and others (2014)
RAVEL | Statistical analysis of structural MRIs | Fortin and others (2016c)
neurohcp | R interface for the Human Connectome Project database |
Data packages
kirby21.t1 | Example T1 Structural Data from the Kirby21 Dataset | Landman and others (2011)
kirby21.t2 | Example T2 Structural Data from the Kirby21 Dataset | Landman and others (2011)
kirby21.flair | Example FLAIR Structural Data from the Kirby21 Dataset | Landman and others (2011)
kirby21.dti | Example DTI Data from the Kirby21 Dataset | Landman and others (2011)
kirby21.fmri | Example fMRI Data from the Kirby21 Dataset | Landman and others (2011)
kirby21.mt | Example MT Structural Data from the Kirby21 Dataset | Landman and others (2011)
kirby21.vaso | Example VASO Structural Data from the Kirby21 Dataset | Landman and others (2011)
kirby21.asl | Example ASL Structural Data from the Kirby21 Dataset | Landman and others (2011)
Template packages
MNITemplate | MNI152 template | Fortin and others (2016b)
EveTemplate | Eve Atlas and White Matter parcellation map | Fortin and others (2016a); Oishi and others (2009)

Medical imaging and neuroimaging analyses

The analysis of medical imaging data is not restricted to a specific region or organ in the body.
The acquisition of medical imaging data may occur anywhere from head to toe, and the choice of anatomical coverage is driven by the clinical need. While some packages in Neuroconductor have been designed specifically to analyze neuroimaging data, the packages that manipulate file formats (DICOM, NIfTI, etc.) and provide visualization are not specific to the brain. In addition, some of the analysis packages have already been applied below the neck. For example, the dcemriS4 package provides estimates for the key parameters in T1w dynamic contrast-enhanced MRI (DCE-MRI) experiments for perfusion imaging, which is commonly used to assess tissue affected by cancer or inflammatory processes in regions such as the breast or neck (Schmid and others, 2006; Whitcher and others, 2011a). The oro.pet package provides estimates of standard uptake values (SUVs) in PET experiments, where anatomical coverage may vary from the whole body (to assess the primary cancer and/or metastatic disease) to specific anatomical regions, such as the breast, liver, or prostate (Wahl and others, 2009).

The developer perspective

The comprehensive R archive network (CRAN) is the most standardized, popular, and common way to distribute R packages. Neuroconductor is a complementary platform dedicated to developers of R packages for image analysis. Neuroconductor contains extensive imaging-specific training materials and includes packages that do not integrate directly with CRAN or may require additional checks with external dependencies. Although Bioconductor is a framework that provides additional checks and has similar goals to Neuroconductor, Bioconductor packages were developed for bioinformatics. As one goal for Neuroconductor is to centralize imaging work in R, a separate platform is necessary. Neuroconductor is based on Git, GitHub, and continuous integration (CI) services via Travis CI and AppVeyor. Many developers use Git, a version control system, for their projects.
To distribute, host, and collaborate on projects, many use online tools, with GitHub (https://github.com/) being one of the most popular. As GitHub is an online site, distribution of a package is performed by downloading or cloning a repository. Users can easily install R packages directly from GitHub using functions from add-on R packages, most notably devtools (Wickham and Chang, 2017). Along with a detailed and revertible timeline for each package, GitHub provides a page that allows users to flag issues for developers, tag stable releases of the package, and obtain information on package activity. Along with GitHub, Travis CI (https://travis-ci.org/) and AppVeyor (https://www.appveyor.com) provide a system to check packages as they are updated. Together, these tools provide a cloud-based infrastructure that can: (i) host and distribute packages; (ii) check packages for specific requirements; (iii) provide a platform for bug reporting and feature requests; and (iv) track how frequently issues are addressed. Travis CI checks packages on Linux and Mac OS X distributions. AppVeyor checks packages on Windows platforms; a small percentage of packages cannot run on Windows machines because of the intrinsic nature of the non-R components of the software. Windows 10 currently has a Linux subsystem that may be used in these few exceptions. Therefore, a package submitted to Neuroconductor does not need to pass checks for Windows to be incorporated into the platform. As some software in medical imaging has only been implemented for *nix-based systems, we will allow users to submit Unix-only R packages, but will encourage them to refactor their code if possible to enable all of Neuroconductor to be cross-platform. To submit a package to Neuroconductor, the author/maintainer of the package provides the GitHub link for the package. Once the package is submitted, several initial checks are conducted (Figure 3). These checks ensure that the package has been created correctly.
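As a rough illustration of what such initial checks could look like, the base R sketch below parses a DESCRIPTION file with `read.dcf()` and compares the submitted version with the versions already on record. The function name `check_submission()`, its arguments, and its return strings are assumptions made for this example; they are not taken from the Neuroconductor code base.

```r
# Hypothetical sketch of an initial submission check; the function name,
# arguments, and return values are illustrative, not the Neuroconductor code.
# existing_versions: named character vector (names are package names).
check_submission <- function(pkg_dir, existing_versions = character()) {
  desc_file <- file.path(pkg_dir, "DESCRIPTION")
  if (!file.exists(desc_file)) {
    return("rejected: no DESCRIPTION file")
  }
  # read.dcf() parses the "Field: value" format used by DESCRIPTION files;
  # requested fields that are absent come back as NA.
  desc <- read.dcf(desc_file, fields = c("Package", "Version"))
  pkg <- unname(desc[1, "Package"])
  ver <- unname(desc[1, "Version"])
  if (is.na(pkg) || is.na(ver)) {
    return("rejected: DESCRIPTION lacks Package or Version")
  }
  # Look up the version on record, if any (NA when the package is new)
  known <- existing_versions[pkg]
  if (!is.na(known) && package_version(ver) == package_version(known)) {
    return("rejected: this version is already present")
  }
  "accepted: start email verification"
}
```

Comparing with `package_version()` rather than comparing raw strings means that version strings such as "2.1" and "2.1.0" are parsed as versions before the equality test is made.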
After initial checks are complete, the package must be verified by email. This verification is designed to prevent spam and to allow the developer to stop a submission if they would like to revise the package before re-submitting.

Fig. 3. Neuroconductor initial package submission. If the DESCRIPTION file is present, it is parsed and the submitted version of the package is checked against existing Neuroconductor packages. If this version is not already in Neuroconductor, the email-based verification process is started. Once the maintainer verifies the submission, the package is ready to be tested.

Once the verification is complete, the package is processed according to the workflow described in Figure 4. First, the package is copied/cloned to a remote server. Standardized Travis CI and AppVeyor configuration files, specific to Neuroconductor, are added (see https://neuroconductor.org/package-changes for links to examples). These files ensure that the checks performed on these services are consistent for each package. For example, any warning raised when checking the package (using R CMD check) is treated as an error. Some parameters of the package DESCRIPTION file are also changed. These parameters ensure that when a package is downloaded from Neuroconductor, the correct versions of the dependent packages are used.

Fig. 4. Neuroconductor package code testing.
The package is cloned/updated on the Neuroconductor server and Travis CI/AppVeyor checks are initiated for the original version of the package together with a stable and current version of this package. The DESCRIPTION, .travis.yml and appveyor.yml files are updated to reflect the stable/current package status. Once the Travis CI/AppVeyor checks are done, the package listing is updated to reflect the continuous integration test results.

Next, the package is pushed to the central Neuroconductor GitHub (https://github.com/neuroconductor) and submitted to Travis CI and AppVeyor to be built and checked on multiple systems. Parameters are set to ensure that Travis CI and AppVeyor use the correct versions of Neuroconductor packages for checking and that external dependencies are installed. The author of the package receives an automatic email indicating whether the package was built successfully and is integrated with Neuroconductor, together with a description file containing pertinent information about the process. The code coverage, that is, the percentage of the code in the package that is run when the package is checked, is computed using the covr package (Hester, 2017) and the Coveralls.io platform. This coverage is displayed on the Neuroconductor package page.

Stable and current package versions

We use the terms “Stable” and “Current” to distinguish the two development statuses of a Neuroconductor package.
On the initial submission, after all checks are passed, the package is incorporated into Neuroconductor and deemed the Stable version. The Current version of the package is the result of nightly pulls and mirrors the latest package version from the developer’s GitHub repository. This provides Neuroconductor users with a way to use the latest versions of a package and, at the same time, provides the Neuroconductor platform with a safe way of checking new versions of a package against the existing set of Current Neuroconductor packages. If a Current version of a package passes all the required Neuroconductor tests, we contact the developer of the package and suggest an official re-submission to Neuroconductor. If the newly re-submitted version of the package passes the checks against the Stable Neuroconductor packages, this version is incorporated into the Stable version of Neuroconductor. The package author can ask for help from the maintainers of Neuroconductor (https://neuroconductor.org/contact-us or email neuroconductor@gmail.com) both for compatibility issues and for R-specific questions. All of these steps are automated after the verification has been completed.

Conclusion

Neuroconductor is a platform for developing and distributing R packages for medical image analysis. Neuroconductor aims to provide similar resources and content to Bioconductor, but is primarily focused on imaging. Neuroconductor can leverage all tools available in CRAN, Bioconductor, and R to provide complete image analytic pipelines. This framework is likely to increase the reliability and reproducibility of medical image analysis software and enable a larger number of users to perform image analyses using state-of-the-art tools. Neuroconductor is currently accepting any packages that fit within this framework, even if they are hosted on other platforms such as CRAN and Bioconductor.
As the platform matures, we hope to centralize these packages in only one repository to reduce inconsistencies that may occur.

Funding

Neuroconductor and its maintainers have been partially supported by the R01 grant NS060910 from the National Institute of Neurological Disorders and Stroke at the National Institutes of Health (NINDS/NIH). We are also grateful to the Department of Biostatistics at Johns Hopkins University for providing support during the early stages of development.

References

Adler D., Murdoch D., Nenadic O., Urbanek S., Chen M., Gebhardt A., Bolker B., Csardi G., Strzelecki A. and Senger A. (2016). rgl: 3D Visualization Using OpenGL. R package version 0.96.0.
Allaire J., Cheng J., Xie Y., McPherson J., Chang W., Allen J., Wickham H. and Hyndman R. (2015). rmarkdown: Dynamic Documents for R. R package version 0.5.
Allaire J. J., Ushey K., Tang Y. and Eddelbuettel D. (2017). reticulate: R Interface to Python.
Avants B. B., Epstein C. L., Grossman M. and Gee J. C. Symmetric diffeomorphic image registration with cross-correlation: evaluating automated labeling of elderly and neurodegenerative brain. 12, 26–41.
Avants B. B., Libon D. J., Rascovsky K., Boller A., McMillan C. T., Massimo L., Coslett H. B., Chatterjee A., Gross R. G. and Grossman M. (2014a). Sparse canonical correlation analysis relates network-level atrophy to multivariate cognitive measures in a neurodegenerative population. Neuroimage 84, 698–711.
Avants B. B., Tustison N. J., Stauffer M., Song G., Wu B. and Gee J. C. (2014b). The insight toolkit image registration framework. Frontiers in Neuroinformatics 8, 44.
Avants B. B., Tustison N. J., Song G., Cook P. A., Klein A. and Gee J. C. (2011a). A reproducible evaluation of ants similarity metric performance in brain image registration. NeuroImage 54, 2033–2044.
Avants B. B., Tustison N. J., Wu J., Cook P. A. and Gee J. C. (2011b). An open source multivariate framework for n-tissue segmentation with evaluation on public data. Neuroinformatics 9, 381–400.
Basser P. J, Mattiello J. and LeBihan D. (1994). MR diffusion tensor spectroscopy and imaging. Biophysical Journal 66, 259–267.
Bates D., Mächler M., Bolker B. and Walker S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software 67, 1–48.
Bellosta C. J. G. (2015). rPython: Package Allowing R to Call Python. R package version 0.0-6.
Chang W., Cheng J., Allaire J., Xie Y. and McPherson J. (2015). shiny: Web application framework for R. URL http://CRAN.R-project.org/package=shiny. R version 0.11.
Clayden J. D., Cox B., Jenkinson M., Reynolds R., Fissell K., Gailly J. and Adler M. (2016). RNifti: Fast R and C++ Access to NIfTI Images. R package version 0.3.0.
Clayden J. D., Modat M., Daga P., Presles B., Anthopoulos T. and Dag P. (2017). RNiftyReg: Image Registration Using the NiftyReg Library. R package version 2.6.0.
Clayden J. D., Muñoz M., Susana S., Amos J., King M. D., Bastin M. E. and Clark C. A. (2011). TractoR: magnetic resonance imaging and tractography with R. Journal of Statistical Software 44, 1–18.
Clayden J. D. and Rorden C. (2017). divest: Get Images Out of DICOM Format Quickly. R package version 0.2.0.
Cook P. A., Bai Y., Nedjati-Gilani S. K. K. S., Seunarine K. K., Hall M. G., Parker G. J. and Alexander D. C. (2006). Camino: open-source diffusion-MRI reconstruction and processing. In: 14th scientific meeting of the international society for magnetic resonance in medicine, vol. 2759. Seattle WA, USA.
Cook P. A., McMillan C. T., Avants B. B., Peelle J. E., Gee J. C. and Grossman M. (2014). Relating brain anatomy and cognitive ability using a multivariate multimodal framework. NeuroImage 99, 477–486.
Eddelbuettel D., François R., Allaire J., Chambers J., Bates D. and Ushey K. (2011). Rcpp: seamless R and C++ integration. Journal of Statistical Software 40, 1–18.
Fischl B. (2012). Freesurfer. Neuroimage 62, 774–781.
Fortin J.-P., Muschelli J. and Shinohara R. T. (2016a). EveTemplate: JHU-MNI-ss (Eve) Template. R package version 0.99.14.
Fortin J.-P., Muschelli J. and Shinohara R. T. (2016b). MNITemplate: MNI152 Template. R package version 0.99.4.
Fortin J.-P., Sweeney E. M., Muschelli J., Crainiceanu C. M., Shinohara R. T., Alzheimer’s Disease Neuroimaging Initiative and others. (2016c). Removing inter-subject technical variability in magnetic resonance imaging studies. NeuroImage 132, 198–212.
Gagnon-Bartsch J. A. and Speed T. P. (2012). Using control genes to correct for unwanted variation in microarray data. Biostatistics 13, 539–552.
Garcia de la Garza A., Vandekar S., Roalf D., Ruparel K., Gur R., Gur R., Satterthwaite T. and Shinohara R. T. (2016). voxel: Mass-Univariate Voxelwise Analysis of Medical Imaging Data. R package version 1.2.1.
Gentleman R. C., Carey V. J., Bates D. M., Bolstad B., Dettling M., Dudoit S., Ellis B., Gautier L., Ge Y., Gentry J. and others. (2004). Bioconductor: open software development for computational biology and bioinformatics. Genome Biology 5, 1.
Glasser M. F., Sotiropoulos S. N., Wilson J. A., Coalson T. S., Fischl B., Andersson J. L., Xu J., Jbabdi S., Webster M., Polimeni J. R and others. (2013). The minimal preprocessing pipelines for the human connectome project. Neuroimage 80, 105–124.
Grabner G., Janke A. L., Budge M. M., Smith D., Pruessner J. and Collins D. L. (2006). Symmetric atlasing and model based segmentation: an application to the hippocampus in older adults. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Berlin, Heidelberg: Springer, pp. 58–66.
Hester J. (2017). covr: Test Coverage for Packages. R package version 3.0.0.
Hornik K. (2016). R FAQ.
Huber W., Carey V. J., Gentleman R., Anders S., Carlson M., Carvalho B. S., Bravo H. C., Davis S., Gatto L., Girke T. and others. (2015). Orchestrating high-throughput genomic analysis with bioconductor. Nature Methods 12, 115–121.
Johnson W. E., Li C. and Rabinovic A. (2007). Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics 8, 118–127.
Kandel B. M., Wang D. J. J., Detre J. A., Gee J. C. and Avants B. B. (2015). Decomposing cerebral blood flow mri into functional and structural components: a non-local approach based on prediction. NeuroImage 105, 156–170.
Koay C. G., Chang L.-C., Carew J. D., Pierpaoli C. and Basser P. J. (2006). A unifying theoretical and algorithmic framework for least squares methods of estimation in diffusion tensor imaging. Journal of Magnetic Resonance 182, 115–125.
Landman B. A., Huang A. J., Gifford A., Vikram D. S., Lim I. A. L, Farrell J. A. D., Bogovic J. A., Hua J., Chen M., Jarso S. and others. (2011). Multi-parametric neuroimaging reproducibility: a 3-t resource study. Neuroimage 54, 2854–2866.
Le Bihan D., Mangin J.-F., Poupon C., Clark C. A., Pappata S., Molko N. and Chabriat H. (2001). Diffusion tensor imaging: concepts and applications. Journal of Magnetic Resonance Imaging 13, 534–546.
Leek J. T., Scharpf R. B., Bravo H. C., Simcha D., Langmead B., Johnson W. E., Geman D., Baggerly K. and Irizarry R. A. (2010). Tackling the widespread and critical impact of batch effects in high-throughput data. Nature Reviews Genetics 11, 733–739.
Leek J. T. and Storey J. D. (2007). Capturing heterogeneity in gene expression studies by surrogate variable analysis. PLoS Genetics 3, 1724–1735.
Leek J. T. and Storey J. D. (2008). A general framework for multiple testing dependence. Proceedings of the National Academy of Sciences 105, 18718–18723.
Li X., Morgan P. S., Ashburner J., Smith J. and Rorden C. (2016). The first step for neuroimaging data analysis: DICOM to NIfTI conversion. Journal of Neuroscience Methods 264, 47–56.
Mejia A. F, Nebel M. B., Eloyan A., Caffo B. and Lindquist M. A. (2017). PCA leverage: outlier detection for high-dimensional functional magnetic resonance imaging data. Biostatistics 13, 521–536.
Muschelli J. (2016a). dcm2niir: Conversion of ‘DICOM’ to ‘NIfTI’ Imaging Files Through R. R package version 0.3.3.
Muschelli J. (2016b). ichseg: Intracerebral Hemorrhage Segmentation of X-Ray Computed Tomography (CT) Images. R package version 0.5.1.
Muschelli J. (2016c). neurobase: ‘Neuroconductor’ Base Package with Helper Functions for ‘nifti’ Objects. R package version 1.9.1.
Muschelli J. (2016d). rcamino: Port of the Camino Software. R package version 0.3.1.
Muschelli J. (2016e). spm12r: Wrapper Functions for ‘SPM’ (Statistical Parametric Mapping) Version 12 from the ‘Wellcome’ Trust Centre for ‘Neuroimaging’. R package version 2.1.
Muschelli J., Sweeney E., Lindquist M. and Crainiceanu C. (2015). fslr: Connecting the FSL software with R. The R Journal 7, 163–175.
Muschelli J., Sweeney E. M., Ullman N. L., Vespa P., Hanley D. F. and Crainiceanu C. M. (2017). PItcHPERFeCT: Primary intracranial hemorrhage probability estimation using random forests on CT. NeuroImage: Clinical 14, 379–390.
O’Donnell L. J. and Westin C.-F. (2011). An introduction to diffusion tensor image analysis. Neurosurgery Clinics of North America 22, 185–196.
Oishi K., Faria A., Jiang H., Li X., Akhter K., Zhang J., Hsu J. T., Miller M. I., van Zijl P. C. M., Albert M. and others. (2009). Atlas-based whole brain white matter analysis using large deformation diffeomorphic metric mapping: application to normal elderly and alzheimer’s disease participants. Neuroimage 46, 486–499.
Ou Y., Sotiras A., Paragios N. and Davatzikos C. (2011). Dramms: deformable registration via attribute matching and mutual-saliency weighting. Medical Image Analysis 15, 622–639.
Penny W. D., Friston K. J., Ashburner J. T., Kiebel S. J and Nichols T. E. (2011). Statistical Parametric Mapping: The Analysis of Functional Brain Images. London, UK: Academic Press.
Pustina D., Coslett H., Turkeltaub P. E., Tustison N., Schwartz M. F. and Avants B. (2016). Automated segmentation of chronic stroke lesions using LINDA: lesion identification with neighborhood data analysis. Human Brain Mapping 37, 1405–1421.
R Core Team. (2017). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.
Ritchie M. E., Phipson B., Wu D., Hu Y., Law C. W., Shi W. and Smyth G. K. (2015). Limma powers differential expression analyses for rna-sequencing and microarray studies. Nucleic Acids Research 43, e47.
Schmid V. J., Whitcher B., Padhani A. R., Taylor N. J. and Yang G.-Z. (2006). Bayesian methods for pharmacokinetic models in dynamic contrast-enhanced magnetic resonance imaging. IEEE Transactions on Medical Imaging 25, 1627–1636.
Schwendinger F. (2015). PythonInR: Use Python from Within R. R package version 0.1-3.
Shinohara R. T., Sweeney E. M., Goldsmith J., Shiee N., Mateen F. J., Calabresi P. A., Jarso S., Pham D. L., Reich D. S., Crainiceanu C. M. and others. (2014). Statistical normalization techniques for magnetic resonance imaging. NeuroImage: Clinical 6, 9–19.
Sled J. G., Zijdenbos A. P. and Evans A.C. (1998). A nonparametric method for automatic correction of intensity nonuniformity in MRI data. IEEE Transactions on Medical Imaging 17, 87–97.
Smith S. M, Jenkinson M., Woolrich M. W., Beckmann C. F., Behrens T. E. J., Johansen-Berg H., Bannister P. R., De Luca M., Drobnjak I., Flitney D. E. and others. (2004). Advances in functional and structural mr image analysis and implementation as FSL. Neuroimage 23, S208–S219.
Smyth G. K. (2004). Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. Statistical Applications in Genetics and Molecular Biology 3, 1–25.
Smyth G. K. (2005). Limma: linear models for microarray data. In: Bioinformatics and Computational Biology Solutions Using R and Bioconductor. New York, NY: Springer, pp. 397–420.
Sweeney E. M., Shinohara R. T., Shiee N., Mateen F. J., Chudgar A. A., Cuzzocreo J. L., Calabresi P. A., Pham D. L., Reich D. S. and Crainiceanu C. M. (2013). Oasis is automated statistical inference for segmentation, with applications to multiple sclerosis lesion segmentation in MRI. NeuroImage: Clinical 2, 402–413.
Tabelow K., Clayden J. D., Lafaye de Micheaux P., Polzehl J., Schmid V. J. and Whitcher B. J. (2011). Image analysis and statistical inference in neuroimaging with R. NeuroImage 55, 1686–1693.
Therneau T. M. and Grambsch P. M. (2000). Modeling Survival Data: Extending the Cox Model. New York: Springer.
Therneau T. M. (2015). A Package for Survival Analysis in S. version 2.38.
Carey V. J. Ported to R by Thomas Lumley and Brian Ripley. (2015). gee: Generalized Estimation Equation Solver. R package version 4.13-19.
Tustison N. J., Avants B. B., Cook P. A., Zheng Y., Egan A., Yushkevich P. A. and Gee J. C. (2010). N4itk: improved N3 bias correction. IEEE Transactions on Medical Imaging 29, 1310–1320.
Van Essen D. C., Smith S. M., Barch D. M., Behrens T. E. J., Yacoub E., Ugurbil K., WU-Minn HCP Consortium, and others. (2013). The WU-Minn human connectome project: an overview. Neuroimage 80, 62–79.
Wahl R. L., Jacene H., Kasamon Y. and Lodge M. A. (2009). From RECIST to PERCIST: evolving considerations for PET response criteria in solid tumors. Journal of Nuclear Medicine 50, 122S–150S.
Wang H. and Yushkevich P. (2013). Multi-atlas segmentation with joint label fusion and corrective learning an open source implementation. Frontiers in Neuroinformatics 7, 27.
Weber M. J., Detre J. A., Thompson-Schill S. L. and Avants B. B. (2013). Reproducibility of functional network metrics and network structure: a comparison of task-related BOLD, resting ASL with BOLD contrast, and resting cerebral blood flow. Cognitive, Affective, & Behavioral Neuroscience 13, 627–640.
Whitcher B., Schmid V. J., Collins D. J., Orton M. R., Koh D.-M., de Corcuera I. D., Parera M., del Campo J. M., Leach M. O., Harrington K. and others. (2011a). A Bayesian hierarchical model for DCE-MRI to evaluate treatment response in a phase II study in advanced squamous cell carcinoma of the head and neck. Magnetic Resonance Materials in Physics, Biology and Medicine 24, 85–96.
Whitcher B., Schmid V. J. and Thornton A. (2011b). Working with the DICOM and NIfTI data standards in R. Journal of Statistical Software 44, 1–28.
Wickham H. (2009). ggplot2: Elegant Graphics for Data Analysis. New York, NY: Springer Science & Business Media.
Wickham H. and Chang W. (2017). devtools: Tools to Make Developing R Packages Easier. R package version 1.12.0.9000.
Xie Y. (2013). knitr: A General-purpose Package for Dynamic Report Generation in R. R package version, vol. 1(7), 1.
Yoo T. S., Ackerman M. J., Lorensen W. E., Schroeder W., Chalana V., Aylward S., Metaxas D. and Whitaker R. (2002). Engineering and algorithm design for an image processing Api: a technical report on ITK-the insight toolkit. Studies in Health Technology and Informatics 85, 586–592.
Zhang Y., Brady M. and Smith S. (2001). Segmentation of brain MR images through a hidden Markov random field model and the expectation-maximization algorithm. IEEE Transactions on Medical Imaging 20, 45–57.

© The Author 2018. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

Journal

Biostatistics, Oxford University Press

Published: Jan 6, 2018

