TY - JOUR AU - Bastow, Ruth AB - Abstract High-throughput sequencing technologies have rapidly moved from large international sequencing centres to individual laboratory benchtops. These changes have driven the ‘data deluge’ of modern biology. Submissions of nucleotide sequences to GenBank, for example, have doubled in size every year since 1982, and individual data sets now frequently reach terabytes in size. While ‘big data’ present exciting opportunities for scientific discovery, data analysis skills are not part of the typical wet bench biologist's experience. Knowing what to do with data, how to visualize and analyse them, make predictions, and test hypotheses are important barriers to success. Many researchers also lack adequate capacity to store and share these data, creating further bottlenecks to effective collaboration between groups and institutes. The US National Science Foundation-funded iPlant Collaborative was established in 2008 to form part of the data collection and analysis pipeline and help alleviate the bottlenecks associated with the big data challenge in plant science. Leveraging the power of high-performance computing facilities, iPlant provides free-to-use cyberinfrastructure to enable terabytes of data storage, improve analysis, and facilitate collaborations. To help train UK plant science researchers to use the iPlant platform and understand how it can be exploited to further research, GARNet organized a four-day Data mining with iPlant workshop at Warwick University in September 2013. This report provides an overview of the workshop, and highlights the power of the iPlant environment for lowering barriers to using complex bioinformatics resources, furthering discoveries in plant science research and providing a platform for education and outreach programmes. 
Big data, cloud computing, cyberinfrastructure, data analysis, data storage, high-performance computing, resource, workshop report Introduction The iPlant Collaborative was established in 2008 in response to a US National Science Foundation (NSF) call for proposals to create ‘a new type of organization: a cyberinfrastructure (CI) collaborative for plant science’ (NSF, http://www.nsf.gov/funding/pgm_summ.jsp?pims_id=13704, last accessed 18 September 2014). The call was a response to increasingly accessible genome sequencing and other high-throughput biology techniques that make large data sets, several terabytes in size, a common feature in modern plant biology. While ‘big data’ create exciting opportunities for scientific discovery, the data collected outpace our capacity to store, share, analyse, and curate these huge outputs. iPlant ‘is designed to support the computational needs of the research community and facilitate progress toward solutions of major problems in plant biology’ (Goff et al., 2011). The power of high-performance computing (HPC) hardware (located at the University of Arizona and the Extreme Science and Engineering Discovery Environment [XSEDE] supercomputer facilities at the Texas Advanced Computing Center [TACC]) can be harnessed via a number of integrated middleware services; this has led to the development of a suite of iPlant products that facilitate collaboration in order to develop ‘better tools, workflows, algorithms, and ontologies’ (Stanzione, 2011) (Fig. 1). Fig. 1. iPlant Services Layer Cake. At the base of iPlant cyberinfrastructure are hardware resources including storage and supercomputing hardware. Low-level services such as iRODS and Condor furnish functionality that higher-level resources take advantage of. This means that users can carry out analyses on Atmosphere, the Discovery Environment, or other community-facing resources, while keeping all their data in one place. 
The low-level resources are packaged into foundational services, which are suites of functionalities that iPlant products use; other projects can also build on these services and developers typically interact with iPlant at this level. Finally, the top layer of community-facing resources comprises the portals with which most users will interact: everything from graphical interfaces to bioinformatics applications to user-friendly cloud services. (This figure is available in colour at JXB online.) Based in the US, iPlant's CI is a platform open to anyone around the world; its 18 500 users can freely access a comprehensive suite of intuitive, user-friendly tools and resources created to address the data storage, sharing, and analysis challenges relevant to the life sciences. iPlant is open source. Users can adapt existing resources, develop new ones, build automated workflows, and share these with the global community. Biology is increasingly data-driven (Smith et al., 2011). 
Submissions of nucleotide sequences to GenBank, for example, have doubled in size every year since 1982 (Scherm et al., 2014), and individual data sets frequently reach the order of terabytes in size. Despite this, many wet laboratory plant biologists lack the skills and experience to use computational and bioinformatics data analysis methods confidently (Oliver et al., 2013). GARNet, the UK Arabidopsis research network, identified a need to address this skills gap within the UK plant science community and held a four-day Data mining with iPlant workshop at Warwick University in 2013. In sessions led by iPlant Collaborative instructors, attendees were taught, via hands-on exercises and tutorials, how to use iPlant's three core services: the Data Store, for cloud-based large data storage and retrieval; the Discovery Environment, for user-friendly data analysis software; and Atmosphere, a platform allowing researchers to custom-build virtual workbenches and share these with collaborators anywhere in the world. Attendees were also introduced to DNA Subway, an interactive educational tool designed to facilitate the teaching of genomic analysis. Training plant scientists, particularly early career researchers and PhD students, in how to use the data storage and analysis resources provided by iPlant's CI could have wide-reaching benefits. Access to supercomputing power could increase the capacity for better and more rapid data sharing between research institutions, allow faster and more reliable data analysis, and provide increased opportunities to collaborate effectively with international research groups. The capability to share existing data with the global community within an open access framework may increase the frequency and impact of scientific discoveries that address important, global research challenges. 
In addition, because novel applications and workflows are easier to share within iPlant, the global plant science community could benefit from the wealth of programming expertise that exists in the UK, and vice versa. This report provides an overview of the workshop and highlights the benefits of iPlant's CI for the plant science community in terms of making complex bioinformatics resources more accessible to researchers lacking specific computational expertise, furthering discoveries in plant science research, and providing a platform for education and outreach programmes. Exploring iPlant The first day of the Data mining with iPlant workshop gave an introduction to iPlant's core services: the Data Store, Discovery Environment, and Atmosphere, iPlant's cloud resource. More than 80 researchers, from PhD students to Principal Investigators, attended the introductory day, coming mostly from the UK but with some visiting from as far afield as the Philippines and Saudi Arabia. On the remaining three days, workshop participants were guided through a series of hands-on exercises to provide more in-depth training on how to use and manipulate iPlant's applications. Storing, analysing, and sharing data The Data Store Cloud-based data storage services such as Dropbox, Google Drive, or iCloud are ideal for storing and sharing small files between scientists engaged in research collaborations. However, they are inadequate, and can be expensive, when more than a few gigabytes of data need to be easily accessed and disseminated. Delegates at the Data mining with iPlant workshop were trained in the use of iPlant's solution to this challenge: the Data Store. Using an open source data grid technology called iRODS, developed at the University of California, San Diego (Rajasekar et al., 2006), the Data Store provides users with 100 GB of rapid-access, cloud-based storage, with terabyte allocations available upon request. 
This storage capacity is federated and linked between the physical HPC facilities in Texas and Arizona (Goff et al., 2011). To safeguard data privacy where required, the user can allow or deny access to specific users, or openly share data files with the global community. This is useful in collaborative situations so that, for example, data can be kept confidential within a team while research is being conducted and a paper is being written, but once the research has been published the data can then be made public. Furthermore, data in iPlant's Data Store can easily be imported into, downloaded from, and manipulated using iPlant's other CI applications. The Discovery Environment To help close the gap between the requirement for complex data analysis methods and the average bench biologist's computational skills, iPlant has created the Discovery Environment (DE) platform, a web-based, graphical interface that allows users to access iPlant's compute power, as well as its modular, integrated data analysis applications. A web-based interface and applications (apps) with user-friendly ‘wrappers’ that look and feel familiar to non-expert users are far more accessible to researchers who are used to working with Windows or Mac operating systems, or who have little or no experience with UNIX or Linux command line programming. Visually and functionally, the DE web interface looks like a computer desktop, with data and apps side by side on screen and accessible to the user at all times. For expert users, command line access is also available when using the Atmosphere cloud and via iPlant's science application programming interfaces (APIs) (Goff et al., 2011). To demonstrate one of iPlant's most commonly used apps, attendees of the GARNet–iPlant workshop were guided through a simple multiple sequence alignment using a small RNA-Seq file available from the community Data Store. 
The same exercise was then repeated using a much larger file to demonstrate that, although not limitless, access to HPC resources via open source CI can be very powerful in terms of scalability. In addition to RNA-Seq analysis tools, the iPlant DE has over 400 apps designed for specific uses, and these can also be combined into custom workflows for very complex analyses. Fifteen of the most popular public apps in the DE, ranked by how frequently they are used, are shown in Table 1. Workshop attendees were shown how to use apps for genotyping-by-sequencing workflows (e.g. TASSEL package apps), genome-wide association studies (GWAS; e.g. Qxpak, TASSEL, PLINK, FaST-LMM), quantitative trait loci (QTL) analysis (e.g. QTL Cartographer), and ChIP-Seq and genomic interval analysis (e.g. MACS, PeakRanger, Chromatra L, Chromatra T).
Table 1. The top 15 apps in the iPlant Discovery Environment, listed in order of frequency of use in 2013–2014. A complete list of apps in the Discovery Environment can be found at http://iplantcollaborative.org/apps1 (last accessed 18 September 2014).
App name | Brief description
TopHat2-PE | TopHat 2.0.9 (for paired-end reads) is a fast splice junction mapper for RNA-Seq reads. It aligns RNA-Seq reads to mammalian-sized genomes using the short read aligner Bowtie and analyses the mapping results to identify splice junctions between exons.
TopHat2-SE | A fast splice junction mapper as above, but for single-end reads.
FastQC 0.10.1 | FastQC aims to provide simple quality control checks for raw sequence data from high-throughput sequencing pipelines. It provides a modular set of analyses to help you quickly see whether your data have any problems you should be aware of before analysing further.
Cuffdiff2 | Cuffdiff performs differential transcript abundance analysis for two or more RNA-Seq samples.
Uncompress files with gunzip | An app used to uncompress one or more .gz files.
FASTX trimmer | FASTX Trimmer is used for shortening reads in FASTQ or FASTA files, often for the purposes of removing barcodes or noise.
Word Count | Counts and summarizes the number of lines, words, and bytes in a target file.
Cufflinks2 | Version 2 of Cufflinks, which assembles transcripts using sequence alignments (BAM) generated by the TopHat/Bowtie apps.
NCBI SRA Toolkit fastq-dump 2.1.9 | Part of the NCBI SRA Toolkit used to convert archive files in the .SRA format into .FASTQ format suitable for downstream analysis.
Concatenate multiple files | Concatenate Multiple Files joins files head-to-tail. It's useful for combining data files, adding headers to files, and a host of other purposes.
BWA aln 0.5.9 HPC (Customizable) | HPC-boosted BWA alignment with user-specified genome file.
BWA 0.5.9 | BWA is a software package for mapping low-divergent sequences against a large reference genome.
NCBI SRA Import | NCBI SRA Import is a utility for importing accessions from the NCBI read archives using the fast Aspera network protocol. Files imported this way, rather than via the FTP interface, will appear in the iPlant Data Store anywhere from 5-25x more quickly.
Pre-Process Reads | Illumina read quality control pipeline for paired and unpaired reads (maintains pairing).
Sickle-quality-based-trimming | An application for trimming FASTQ files based on quality values. Includes paired-end trimming. Sickle performs quality trimming of reads and works with paired or single reads.
The DE itself is an open source platform, rather than a single tool, much in the same way that the web browsers Mozilla Firefox or Google Chrome are open platforms that allow users to develop add-ons and extensions. Users with the necessary domain knowledge can expand, improve, and adapt apps already compatible with the DE to customize them for their, or their laboratories’, particular needs. Users familiar with command line applications are also empowered to create entirely new ones. This increases the sustainability and longevity of the platform in the light of potential future upscaling, changing technologies or operating systems. These apps are designed to work individually, or together in workflows, to address particular data analysis requirements (Goff et al., 2011). For example, at the recent international Plant and Animal Genome XXII (PAG 2014) conference (San Diego, USA, January 2014), David Horvath (United States Department of Agriculture) participated in a workshop where several users detailed successes using iPlant. Horvath's presentation explained how iPlant CI was used in his project to sequence and analyse the genome of an invasive weed, leafy spurge (Euphorbia esula) (Horvath et al., 2014). The ‘SOAPdenovo’ and ‘BLAT’ apps were used to identify Illumina sequences conserved between leafy spurge and three related, fully sequenced species (poplar, cassava, and castor bean), while the ‘Trinity’ app yielded contigs from those conserved sequences. Genes of interest were selected and a bespoke program, ‘PRICE-TI’, developed by Dr Graham Ruby from the University of California, San Francisco (Ruby et al., 2013), was run on a virtual machine using iPlant's Atmosphere platform (see below) to fill in missing promoter and intron sequences. Atmosphere Another way to engage with iPlant's CI is via Atmosphere. 
This is a cloud-based platform similar to Amazon Cloud, but freely available without a subscription, which enables users to run a software-based emulation of a computer called a virtual machine (VM). With an Atmosphere VM, a researcher can use their own desktop or laptop computer to launch a base ‘image’ of a virtual computer using physical supercomputing resources located in the US, and then build upon this to custom-design a virtual workbench that precisely meets their needs. For example, regardless of whether a user's physical machine is a Mac/Windows/Linux/Unix computer, or even a device such as a smartphone or tablet, an image can be designed to run a virtual desktop with preferred specifications. The image can also be configured to run whichever software the researcher chooses, whether ready-made programs installed from the web or uploaded from a local system, or bespoke, self-coded applications. The compute power of the image can be scaled to suit; small, simple tasks may only require a small number of central processing units (CPUs), whereas more complex jobs may require greater capacity. Furthermore, being in sync with iPlant's other CI resources, a VM can be fully integrated with public or private data saved into the iRODS-based iPlant Data Store, giving long-term storage and back-up (Goff et al., 2011). A major advantage of the VM is that once an image has been built, it can be saved and shared exclusively with other laboratory members or collaborators so that everyone has access to the same computing resources. The ability to launch parallel ‘instances’ of the same image eliminates the need for individuals to install software on their own computers and avoids problems with version updates, operating system compatibility, and resource capacity. 
Approved users may also be granted access to a live instance; for example, if one researcher has a complex analysis running overnight, a collaborator in a different time zone may connect to the same instance to check on the progress of the analysis. The ability to set permissions on a given image means it may be made private to individuals within a research team, or made publicly available to the global community. This would be useful when collaborating on a research paper: during manuscript preparation, data can be kept private, but once the scientific journal paper is published, details of the specific image used can also be published, giving open access to other researchers wishing to reproduce those results faithfully and assuring backwards compatibility. This is particularly helpful in next generation sequencing, where rapidly advancing technology and software might otherwise cause a particular approach to become obsolete. Speaking at PAG 2014, Jon Duvick of Iowa State University illustrated how Atmosphere has been used to create and share a community resource. Using a Linux-based VM, Duvick and colleagues built xGDBvm, a feature-rich platform to bridge the gap between assembled data and a fully annotated browsable genome. By launching the ‘xGDBvm-beta’ instance, anyone with an iPlant account can access and use the platform, and take advantage of HPC power via the cloud (https://pag.confex.com/pag/xxii/webprogram/Paper10310.html, last accessed 18 September 2014). Community resource building Open working and the iPlant community One of the biggest benefits of iPlant's CI is that it provides an open source platform for many applications, which can be built upon, improved, and expanded, rather than a single tool that can be outgrown. 
With over 400 unique applications currently available in the DE, ranging from tools for simple tasks such as format conversion or file splitting to full-service analysis software packages that can be combined in complex workflows, it is likely that, for a given data analysis challenge, there is already an iPlant app that can help, thus reducing the need for de novo programming. iPlant's goal of helping biologists avoid ‘reinventing the wheel’ extends even to its platform documentation and help services. Just as iPlant's CI is fully open source, so too are all CI and app documentation files, organized as editable wikis that may be annotated and improved by the community as appropriate. Together with the apps’ intuitive design, this means research productivity is increased because biologists do not require high-level skills to be able to launch and use them. To add a level of integrity to the user-generated content shared on iPlant, the community has the ability to rate apps and workflows and comment on their usefulness and effectiveness. Provision of feedback not only lets users understand what each app can and cannot do, but also highlights to the programming community those apps with bugs or limitations that can be improved upon. Users can contribute new apps directly within the Discovery Environment. Users can also use this mechanism or contact iPlant support to suggest new apps that should be integrated. ‘Ask iPlant’, an open forum available throughout the iPlant CI (http://ask.iplantcollaborative.org/questions/, last accessed 18 September 2014), gives not only iPlant staff, but also any member of the community, the opportunity to ask and respond to user questions and suggestions for new apps. In this way, iPlant is a truly interactive and collaborative community that works together. 
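The idea of combining small, single-purpose apps into a workflow can be illustrated with a short sketch. The Python below is only a toy analogy and not iPlant code or an iPlant API: the function names (`concatenate`, `trim_reads`, `word_count`, `run_workflow`) are hypothetical stand-ins loosely modelled on the descriptions of the 'Concatenate multiple files', 'FASTX trimmer', and 'Word Count' apps in Table 1, and the two reads are made-up sample data.

```python
from functools import partial
from typing import Callable, Dict, List

# A workflow step takes file contents as text and returns new text.
Step = Callable[[str], str]


def concatenate(*files: str) -> str:
    """Join file contents head-to-tail (toy analogue of a concatenation app)."""
    return "".join(files)


def trim_reads(fastq: str, keep: int) -> str:
    """Keep only the first `keep` characters of each sequence and quality line.

    Assumes the simple four-line FASTQ layout: header, sequence, '+', quality.
    """
    out = []
    for i, line in enumerate(fastq.splitlines()):
        # Within each 4-line record, index 1 is the sequence and index 3 the quality string.
        out.append(line[:keep] if i % 4 in (1, 3) else line)
    return "\n".join(out) + "\n" if out else ""


def word_count(text: str) -> Dict[str, int]:
    """Summarize lines, words, and bytes in the text (toy analogue of a word-count app)."""
    return {
        "lines": text.count("\n"),
        "words": len(text.split()),
        "bytes": len(text.encode("utf-8")),
    }


def run_workflow(data: str, steps: List[Step]) -> str:
    """Feed the output of each step into the next one, in order."""
    for step in steps:
        data = step(data)
    return data


# Two made-up single-read FASTQ files.
sample_a = "@read1\nACGTACGT\n+\nIIIIIIII\n"
sample_b = "@read2\nTTGGCCAA\n+\nHHHHHHHH\n"

combined = concatenate(sample_a, sample_b)
trimmed = run_workflow(combined, [partial(trim_reads, keep=4)])
summary = word_count(trimmed)
print(trimmed, summary)
```

Because every step has the same text-in, text-out shape, steps can be reordered or swapped without changing `run_workflow`, which is the property that makes small apps easy to chain.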
Educational resources As well as providing practical platforms, software, and resources for researchers, the iPlant Collaborative is helping to address the skills gap in bench biology by developing tools to teach the principles of bioinformatics and computational data analysis to students. At the GARNet–iPlant workshop, delegates were introduced to a useful tool for high-level genomics analysis called DNA Subway. DNA Subway (Fig. 2) takes a step-by-step approach to genomic analysis using the visual analogy of a subway map: by hovering over the interconnected ‘stations’ along a ‘line’, students can learn more about the steps in a given process. In the current version release, students can explore DNA sequence annotation, mine a genome for sequences related to a gene or region of interest, determine sequence relationships, and measure differential expression. The app includes sample data to work with, or students may upload their own data in FASTA format. Fig. 2. Screenshot of the DNA Subway homepage. DNA Subway is an educational tool designed to help teach genomics analysis to undergraduate students. (This figure is available in colour at JXB online.) As with the rest of iPlant's CI, DNA Subway is freely available to be used by anyone around the world, and registered users have the additional benefit of being able to save and store their data. An international collaborative As illustrated throughout this report, the iPlant CI provides many benefits for open science, collaborations, and training, and researchers are strongly encouraged to take advantage of them. 
However, as methods and technologies improve, and as big data become increasingly important in research, it is very likely that more and more researchers will wish to take advantage of the iPlant CI. The user base for iPlant is, therefore, likely to grow beyond the plant sciences. Although iPlant was originally established to support the plant research community, the underlying CI is domain agnostic and can easily be utilized to support a wide range of biological data and user community needs. With an expanding user base it is important that due consideration is given to the long-term availability, access, support, and financing of the iPlant platform as an internationally recognized and utilized resource. For example, one barrier that had to be overcome for the GARNet workshop, which was the first iPlant training workshop to take place outside the US, was that real-time user support is only available during US office hours. Special provision, therefore, had to be made for online support engineers to be available during the UK working day. As iPlant is a US-funded initiative, US-based researchers also have the advantage of being able to apply for long-term collaborative support. On the final day of the workshop, one quarter of attendees took the opportunity to have a personal or small-group consultation with an iPlant trainer, demonstrating the potential future demand for those outside the US to work directly and collaboratively with iPlant experts on scientific challenges. Capacity is another potential stumbling block with increased international usage of iPlant's CI: despite the extensive HPC resources, the infrastructure sometimes struggled to cope with resource demand during the iPlant workshop, when up to 80 people were logging on to the servers at once. A possible solution to these problems is at hand. 
At its inception, iPlant was structured as a distributed model within the US to spread effort, expertise, and resources between the Texas Advanced Computing Center (TACC), Cold Spring Harbor Laboratory, and the University of Arizona, and as such the platform has been designed with extension and replication in mind. It would therefore be possible to take advantage of the federation capabilities of the iPlant CI to develop iPlant nodes across the globe. This would help to reduce the impact on US-based resources, distribute costs, and allow for the development of nodes that can meet nation-specific needs and provide support within users’ own time zones. Concluding comments User feedback revealed that the Data mining with iPlant workshop was successful in helping attendees learn how iPlant's CI can facilitate data storage and data analysis. The user-friendly interfaces and designed openness of the platform make the iPlant CI valuable for biologists and bioinformatics program developers alike. The fact that iPlant community resources are open access and open source makes them very useful for plant scientists in the UK or further afield, whether simply to use existing applications, or to create new ones to address specific data challenges. As the amount of data continues to grow, the iPlant platform will grow alongside it to become an integral, valuable part of biological sciences research. Notes from the workshop, including tutorials, presentation slides, and additional information, can be found on the iPlant Wiki at: https://pods.iplantcollaborative.org/wiki/display/Events/Data+Mining+with+iPlant (last accessed 18 September 2014). References Goff SA, Vaughn M, McKay S, et al. 2011. The iPlant Collaborative: cyberinfrastructure for plant biology. Frontiers in Plant Science 2, 34. Horvath D, Anderson JV, Dogramaci M, Scheffler B. 2014. Progress in sequencing the genome of an invasive polyploid weed (leafy spurge). 
Plant and Animal Genome XXII. San Diego, USA. Oliver SL, Lenards AL, Barthelson RA, Merchant N, McKay SJ. 2013. Using the iPlant collaborative discovery environment. Current Protocols in Bioinformatics 42, 1.22.21–21.22.26. Rajasekar A, Wan M, Moore R, Schroeder W. 2006. A prototype rule-based distributed data management system. Proceedings of High Performance Distributed Computing Workshop on Next Generation Distributed Data Management. Ruby JG, Bellare P, DeRisi JL. 2013. PRICE: software for the targeted assembly of components of (meta)genomic sequence data. G3: Genes, Genomes, Genetics 3, 865–880. Scherm H, Thomas CS, Garrett KA, Olsen JM. 2014. Meta-analysis and other approaches for synthesizing structured and unstructured data in plant pathology. Annual Reviews in Phytopathology 52, 453–476. Smith A, Balazinska M, Baru C, Gomelsky M, McLennan M, Rose L, Smith B, Stewart E, Kolker E. 2011. Biology and data-intensive scientific discovery in the beginning of the 21st century. OMICS: A Journal of Integrative Biology 15, 209–212. Stanzione D. 2011. The iPlant Collaborative: cyberinfrastructure to feed the world. Computer 44, 44–52. © The Author 2014. Published by Oxford University Press on behalf of the Society for Experimental Biology. All rights reserved. For permissions, please email: journals.permissions@oup.com TI - Data mining with iPlant: A meeting report from the 2013 GARNet workshop, Data mining with iPlant JF - Journal of Experimental Botany DO - 10.1093/jxb/eru402 DA - 2014-10-17 UR - https://www.deepdyve.com/lp/oxford-university-press/data-mining-with-iplant-a-meeting-report-from-the-2013-garnet-workshop-gqAYix9T1Q SP - 1 EP - 6 VL - 66 IS - 1 DP - DeepDyve ER -