1399 H&E-stained sentinel lymph node sections of breast cancer patients: the CAMELYON dataset

Abstract

Background: The presence of lymph node metastases is one of the most important factors in breast cancer prognosis. The most common way to assess regional lymph node status is the sentinel lymph node procedure. The sentinel lymph node is the most likely lymph node to contain metastasized cancer cells and is excised, histopathologically processed, and examined by a pathologist. This tedious examination process is time-consuming and can lead to small metastases being missed. However, recent advances in whole-slide imaging and machine learning have opened an avenue for analysis of digitized lymph node sections with computer algorithms. For example, convolutional neural networks, a type of machine-learning algorithm, can be used to automatically detect cancer metastases in lymph nodes with high accuracy. To train machine-learning models, large, well-curated datasets are needed.

Results: We released a dataset of 1,399 annotated whole-slide images (WSIs) of lymph nodes, both with and without metastases, in 3 terabytes of data in the context of the CAMELYON16 and CAMELYON17 Grand Challenges. Slides were collected from five medical centers to cover a broad range of image appearance and staining variations. Each WSI has a slide-level label indicating whether it contains no metastases, macro-metastases, micro-metastases, or isolated tumor cells. Furthermore, for 209 WSIs, detailed hand-drawn contours for all metastases are provided. Last, open-source software tools to visualize and interact with the data have been made available.

Conclusions: A unique dataset of annotated, whole-slide digital histopathology images has been provided with high potential for re-use.

Received: 18 December 2017; Revised: 26 March 2018; Accepted: 22 May 2018

© The Author(s) 2018. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

Downloaded from https://academic.oup.com/gigascience/article-abstract/7/6/giy065/5026175 by Ed 'DeepDyve' Gillespie user on 21 June 2018

Keywords: breast cancer; lymph node metastases; whole-slide images; grand challenge; sentinel node

Background

Breast cancer is one of the most common and deadly cancers in women worldwide [1]. Although prognosis for breast cancer patients is generally good, with an average 5-year overall survival rate of 90% and 10-year survival rate of 83%, it significantly deteriorates when breast cancer metastasizes [2]. While localized breast cancer has a 5-year survival rate of 99%, this drops to 85% in the case of regional (lymph node) metastases and only 26% in the case of distant metastases. As such, it is of the utmost importance to establish whether metastases are present to allow adequate treatment and the best chance of survival. This is formally captured in the tumor, node, metastasis (TNM) staging criteria [3].

The first step in determining the presence of metastases is to examine the regional lymph nodes. Not only is the presence of metastases in these lymph nodes a poor prognostic factor by itself, it is also an important predictive factor for the presence of distant metastases [4]. In breast cancer, the most common way to assess the regional lymph node status is the sentinel lymph node procedure [5, 6]. With this procedure, a blue dye and/or radioactive tracer is injected near the tumor. The first lymph node reached by the injected substance, the sentinel node, is most likely to contain the metastasized cancer cells and is excised. Subsequently, it is submitted for histopathological processing and examination by a pathologist.

The pathologist examines a glass slide containing a tissue section of the lymph node stained with hematoxylin and eosin (H&E). Examples are shown in Fig. 1. Based on solitary tumor cells or the diameter of clusters of tumor cells, metastases can be divided into one of three categories: macro-metastases, micro-metastases, or isolated tumor cells (ITC). The size criteria for each of these categories are shown in Table 1. Based on the presence or absence of one or more of these metastases, an initial pathological N-stage (pN-stage) is assigned to a patient. Based on this initial stage, in combination with characteristics of the main tumor, further lymph node dissection or axillary radiotherapy may be performed. These axillary lymph nodes are then also pathologically assessed to come to a final pN-stage. pN categorization is mostly based on metastasis size and the number of lymph nodes involved but also on the anatomical location of the lymph nodes. A small excerpt of the pN-stages is shown in Table 2; for a full listing, refer to the 7th edition of the TNM staging criteria for breast cancer [7].

Table 1: Rules for assigning clusters of metastasized tumor cells to a metastasis category

| Category | Size |
|---|---|
| Macro-metastasis | Larger than 2 mm |
| Micro-metastasis | Larger than 0.2 mm and/or containing more than 200 cells, but not larger than 2 mm |
| Isolated tumor cells | Single tumor cells or a cluster of tumor cells not larger than 0.2 mm or fewer than 200 cells |

Table 2: Selection of N-stages for staging of breast cancer based on the 7th edition of the TNM staging criteria

| Stage | Description |
|---|---|
| N0 | Cancer has not spread to nearby lymph nodes |
| N0(i+) | Lymph nodes only contain ITCs |
| N1mi | Micro-metastases in 1 to 3 axillary lymph nodes |
| N1a | Cancer has spread to 1 to 3 axillary lymph nodes, with at least 1 macro-metastasis |
| N1b | Cancer has spread to internal mammary lymph nodes, but this spread could only be found on sentinel lymph node biopsy |
| N1c | Both N1a and N1b apply |
| N2a | Cancer has spread to 4 to 9 lymph nodes under the arm, with at least 1 macro-metastasis |
| N2b | Metastases in clinically detected internal mammary lymph nodes in the absence of axillary lymph node metastases |

A key challenge for pathologists in assessing lymph node status is the large area of tissue that has to be examined to identify metastases that can be as small as single cells. Examples of a macro-metastasis, micro-metastasis, and ITC are shown in Fig. 1 and Fig. 2. For sentinel lymph nodes, at least three sections at different levels through the lymph node have to be examined; for non-sentinel lymph nodes, one section of at least 10 lymph nodes has to be examined [8, 9]. This tedious examination process is time-consuming, and pathologists may miss small metastases [10]. In the Netherlands, a secondary examination using an immunohistochemical staining for cytokeratin has to be performed if inspection of the H&E slide identifies no metastases. However, even in this secondary examination, metastases can still be missed [11].

Today, advances in whole-slide imaging and machine learning have opened an avenue for analysis of digitized lymph node sections with computer algorithms. Whole-slide imaging is a technique where high-speed slide scanners digitize glass slides at very high resolution (e.g., 240 nm per pixel). This results in images with a size on the order of 10 gigapixels, typically called whole-slide images (WSIs). This large amount of data makes WSIs ideally suited for analysis with machine-learning algorithms. Although machine-learning algorithms have been applied to digitized pathology data as early as 1994 [12], WSIs have only appeared since the early 2000s. Since then, many researchers have described the use of machine-learning algorithms in WSIs, e.g., for breast or prostate cancer classification [13, 14]. Over the past five years, so-called deep learning algorithms, such as convolutional neural networks (CNNs), have become incredibly popular. For example, we were the first to show that training CNNs to detect cancer metastases in lymph nodes was possible and potentially could result in improved efficiency and accuracy of histopathologic diagnostics [15].

Large, well-curated datasets are needed both to train machine-learning models and to evaluate their performance accurately. To allow the broader computer vision community to replicate and build on our results, we publicly released a large dataset of annotated WSIs of lymph nodes, both with and without metastases, in the context of the CAMELYON16 and CAMELYON17 challenges (CAncer MEtastases in LYmph nOdes challeNge) [16, 17].

The concept of challenges in medical imaging and computer vision has been around for nearly a decade. In medical imaging it primarily started with the liver segmentation challenge at the annual MICCAI conference in 2007 [18], and in computer vision, the ImageNet Challenge is most widely known [19]. The main goal of challenges, both in medical imaging and in computer vision, is to allow a meaningful comparison of algorithms. In scientific literature, this was often not the case as authors present results on their own, often proprietary, datasets with their own choice of evaluation metrics. In medical imaging, this was specifically a problem as sharing medical data is often difficult. Challenges change this by making datasets available and enforcing standardized evaluation. Furthermore, challenges have the added benefit of opening up meaningful research questions to a large community who normally might not have access to the necessary datasets.

The CAMELYON dataset was collected at different Dutch medical centers to cover the heterogeneity encountered in clinical practice. It contains 1,399 WSIs, resulting in approximately 3 terabytes of image data. We released a part of the dataset with the reference standard (i.e., the training set) to allow other groups to build algorithms to detect metastases. Subsequently, the rest of the dataset was released without a reference standard (i.e., the test set). Participating teams could submit their algorithm output on the test set to us, after which we evaluated their performance on a predefined set of metrics to allow fair and standardized comparison to other teams. To enable participation of teams that are not familiar with WSIs, we released a publicly available software package for viewing WSIs, annotations, and algorithmic results, dubbed the automated slide analysis platform (ASAP) [20].

Table 3: WSI-level characteristics for the CAMELYON16 part of the dataset

| Center | Total WSIs | Metastases: None | Metastases: Macro | Metastases: Micro |
|---|---|---|---|---|
| RUMC | 249 | 150 | 48 | 51 |
| UMCU | 150 | 90 | 34 | 26 |

Table 4: WSI-level characteristics for the CAMELYON17 part of the dataset (metastasis counts refer to the training set)

| Center | Total WSIs (Train) | Total WSIs (Test) | None | Macro | Micro | ITC |
|---|---|---|---|---|---|---|
| CWZ | 100 | 100 | 64 | 15 | 10 | 11 |
| LPON | 100 | 100 | 64 | 25 | 4 | 7 |
| RST | 100 | 100 | 60 | 11 | 22 | 7 |
| RUMC | 100 | 100 | 60 | 19 | 13 | 8 |
| UMCU | 100 | 100 | 75 | 15 | 8 | 2 |
| Total | 500 | 500 | 323 | 85 | 57 | 35 |

Table 5: Patient-level characteristics for the CAMELYON17 part of the dataset
| Center | Total patients (Train) | Total patients (Test) | pN0 | pN0(i+) | pN1mi | pN1 | pN2 |
|---|---|---|---|---|---|---|---|
| CWZ | 20 | 20 | 4 | 3 | 5 | 7 | 1 |
| LPON | 20 | 20 | 6 | 2 | 2 | 7 | 3 |
| RST | 20 | 20 | 4 | 2 | 6 | 5 | 3 |
| RUMC | 20 | 20 | 3 | 2 | 4 | 8 | 3 |
| UMCU | 20 | 20 | 8 | 2 | 4 | 3 | 3 |
| Total | 100 | 100 | 25 | 11 | 21 | 30 | 13 |

Stage counts refer to the training set.

Here, we describe the CAMELYON dataset in detail and cover the following topics:

- Sample collection
- Slide digitization and conversion
- Challenge dataset construction and statistics
- Instructions on the use of ASAP to view and analyze slides
- Suggestions for data re-use

Data description

The CAMELYON dataset is a combination of the WSIs of sentinel lymph node tissue sections collected for the CAMELYON16 and CAMELYON17 challenges, which contained 399 WSIs and 1,000 WSIs, respectively. This resulted in 1,399 unique WSIs and a total data size of 2.95 terabytes. The dataset is currently publicly available after registration via the CAMELYON17 website [17]. At the time of writing, it had been accessed by more than 1,000 registered users worldwide. It has been licensed under the Creative Commons CC0 license.

Data collection

Collection of the data was approved by the local ethics committee of the Radboud University Medical Center (RUMC) under 2016-2761, and the need for informed consent was waived. Data were collected at five medical centers in the Netherlands: the RUMC, the Utrecht University Medical Center (UMCU), the Rijnstate Hospital (RST), the Canisius-Wilhelmina Hospital (CWZ), and LabPON (LPON). An example of digitized slides from these centers can be seen in Fig. 1.

Figure 1: Low-resolution example of a WSI from each of the five centers contributing data.

Figure 2: Representative samples of the different sizes of breast cancer metastases in sentinel lymph nodes.

Initial identification of cases eligible for inclusion was based on local pathology reports of sentinel lymph node procedures between 2006 and 2016. The exact years varied from center to center but did not affect data distribution or quality. After the lists of sentinel node procedures and the corresponding glass slides containing H&E-stained tissue sections were obtained, slides were randomly selected for inclusion. As the vast majority of sentinel lymph nodes are negative for metastases, selection was stratified for the presence of macro-metastases, micro-metastases, and ITCs based on the original pathology reports. This was done to obtain a good representation of differing metastasis appearance without the need for an excessively large dataset.

Data were acquired in two stages, corresponding to the time periods for organization of the CAMELYON16 and CAMELYON17 challenges. Within the CAMELYON16 challenge, only data from the RUMC and UMCU were acquired, and no slides containing only ITCs were included. For CAMELYON17, data were included from all five centers, and glass slides containing only ITCs were obtained as well. A categorization of the slides can be found in Tables 3 and 4.

After glass slides were selected, they were digitized with different slide scanners such that scan variability across centers was captured in addition to H&E staining procedure variability. The slides from RUMC, CWZ, and RST were scanned with the 3DHistech Pannoramic Flash II 250 scanner at the RUMC. At the UMCU, slides were scanned with a Hamamatsu NanoZoomer-XR C12000-01 scanner, and at LPON with a Philips Ultrafast Scanner.

As all slides are initially stored in an original vendor format that makes re-use challenging, slides were converted to a common, generic TIFF (tagged image file format) using an open-source file converter, part of the ASAP package [20]. As there are no open-source tools to convert the iSyntax format produced by the Philips Ultrafast Scanner, a proprietary converter was used to convert files to a special TIFF format [21] that can be read by the open-source package OpenSlide [22] and the ASAP package [20]. Some basic descriptors are shown in Table 6.

Table 6: Basic descriptors for the TIFF used in the CAMELYON dataset

| Descriptor | Value |
|---|---|
| Format | Tiled TIFF (bigTIFF) |
| Tile size | 512 pixels |
| Pixel resolution | 0.23 μm to 0.25 μm |
| Channels per pixel | 3 (red, green, blue) |
| Bits per channel | 8 |
| Data type | Unsigned char |
| Compression | JPEG |

After digitization, the reference standard for each slide needed to be established. The reference standard for each WSI consisted of a slide-level label indicating the largest metastasis within a slide (i.e., no metastasis, macro-metastasis, micro-metastasis, or ITC). In addition, for all 399 WSIs that were part of the CAMELYON16 challenge and an additional 50 WSIs from the CAMELYON17 challenge, detailed contours were drawn along the boundaries of metastases within the WSI. For the 50 slides of the CAMELYON17 challenge, 10 slides from each center were used to allow users of the dataset to analyze metastasis appearance differences across different centers.

Initial slide-level labels were assigned based on the pathology reports obtained from clinical routine. For the CAMELYON16 part of the dataset, all slides were subsequently examined and metastases outlined by an experienced lab technician (M.H.) and a clinical PhD student (Q.M.). Afterward, all annotations were inspected by one of two expert breast pathologists (P.B. or P.v.D.). Some slides contained two consecutive tissue sections of the same lymph node, in which case only one of the two sections was annotated as this did not affect the slide-level label. In total, 15 slides may contain unlabeled metastatic areas and are indicated via a descriptive text file that is part of the dataset.

For the CAMELYON17 part of the dataset, an experienced general pathologist (M.v.D.) inspected all the slides to assess the slide-level labels. For the 50 slides with detailed annotations, experienced observers (M.v.D., M.H., Q.M., O.G., and R.vd.L.) annotated all metastases. Subsequently, these annotations were double-checked by one of the other observers or one of two pathology residents (A.H. and R.V.).

For the entire dataset, when the slide-level label was unclear during the inspection of the H&E-stained slide, an additional WSI with a consecutive tissue section, immunohistochemically stained for cytokeratin, was used to confirm the classification. Furthermore, this stain was also used to aid in drawing the outlines in both CAMELYON16 and CAMELYON17, which helps limit observer variability. As both the H&E and IHC slides are digital, they can be viewed simultaneously, allowing observers to easily identify the same areas in both slides. This stain is also used in daily clinical pathology practice to resolve diagnosis in the case of metastasis-negative H&E [23, 24]. An example of an H&E WSI and the corresponding consecutive cytokeratin immunohistochemical section is shown in Fig. 3.

Figure 3: H&E-stained tissue section and a consecutive section immunohistochemically stained for cytokeratin. The top row shows the low-resolution images and the bottom row a high-resolution image, centered at a metastasis. The metastasis is difficult to see in H&E but easy to identify in the immunohistochemically stained slide. A yellow bounding box indicates the metastasis location in the images in the top row.

In the CAMELYON17 dataset, after establishing the reference standard, slides were divided into artificial patients, covering the different pN-stages (see Table 2). Each artificial patient only had WSIs from one center. For each artificial patient in the training part of the dataset, the pN-stage and the slide-level labels were provided. This was done to assess the potential of participating algorithms within the challenge to perform automated pN-staging. However, all WSIs can be used independently of their patient-level labels. A complete overview of the patient-level characteristics is shown in Table 5.

After the dataset and reference standard were established, we uploaded the entire dataset to Google Drive and to BaiduPan. These two options were chosen to reach as wide an audience as possible, given that Google Drive is not accessible everywhere (e.g., People's Republic of China). A link to the data was shared with participants after registration at the CAMELYON websites [16, 17].

Data validation and quality control

All glass slides included in the CAMELYON dataset were part of routine clinical care and are thus of diagnostic quality. However, during the acquisition process, scanning can fail or result in out-of-focus images. As a quality-control measure, all slides were inspected manually after scanning. The inspection was performed by an experienced technician (Q.M. and N.S. for UMCU, M.H. or R.vd.L. for the other centers) to assess the quality of the scan; when in doubt, a pathologist was consulted on whether scanning issues might affect diagnosis.

Due to the inclusion of IHC for establishing the reference standard, the chance of errors being made can be considered limited, as pathologists make few mistakes in identifying metastases with IHC [25]. Furthermore, all slides were checked twice. However, to further ensure the quality of the reference standard, we looked at algorithmic results submitted to the challenge to identify slides where the best performing algorithms disagreed with the reference standard. This led to a correction of the reference standard in 3 of the 1,399 slides.

Tools for data use

Several tools are available to visualize and interact with the CAMELYON dataset. Here, we present examples of how to use the data with an open-source package developed by us, ASAP [20]. Other open-source packages are also available, such as OpenSlide [26], but those do not contain functionality for reading annotations or storing image analysis results.

- Project name: Automated Slide Analysis Platform (ASAP)
- Project home page: https://github.com/GeertLitjens/ASAP
- Operating system(s): Linux, Windows
- Programming language: C++, Python
- Other requirements: CMake (www.cmake.org)
- License: GNU GPL v2.0

ASAP contains several components, one of which is a viewer/annotation application (Fig. 4). This can be started via the ASAP executable within the installation folder of the package. After opening an image file from the CAMELYON dataset, one can explore the data via a Google Maps-like interface. The provided reference standard can be loaded via the annotation plugin. In addition, new annotations can be made with the annotation tools provided. Last, the viewer is not limited to files from the CAMELYON dataset but can visualize most WSI formats.

Figure 4: Interface of the ASAP viewer. Visible items are the annotation tools in the toolbar, the viewport showing the WSI, and the plugin panel on the left.

In addition to the viewer application and C++ library for reading and writing WSI images, we also provide Python-wrapped modules, through which the data can be accessed from Python.

The annotations are provided in human-readable XML format and can be parsed using the ASAP package. However, other XML reading libraries can also be used. Annotations are stored as polygons. Each polygon consists of a list of (x, y) coordinates at the highest resolution level of the image. With the ASAP package, annotations can be converted to binary images.

The Python package can also be used to perform image processing or machine-learning tasks on the data and to write out an image result, for example a background mask generated with basic thresholding. These results can subsequently be visualized using the viewer component of ASAP, which also supports floating point images. An example of such a result can be seen in Fig. 4B.

The ASAP package also supports writing your own image processing routines and integrating them as plugins into the viewer component. Some existing examples such as color deconvolution and nuclei detection are provided.

Re-use potential

The CAMELYON dataset is currently being used within the CAMELYON17 challenge, which is open for new participants and submissions. In this context, the dataset enables testing of new machine-learning and image analysis strategies against the current state-of-the-art. Within CAMELYON, we evaluate the algorithms based on a weighted Cohen's kappa at the pN-stage level [27]. This statistic measures the categorical agreement between the algorithm and the reference standard where a value of 0 indicates agreement at the level of chance and 1 is perfect agreement. The quadratic weighting penalizes deviations of more than one category more severely. Conclusions arising from such experiments may have significance for the broader field of computational pathology, rather than being restricted to this particular application. For example, experiments with weakly supervised machine learning in histopathology may benefit from the CAMELYON dataset, with an established baseline based on fully supervised machine learning.

The dataset has also been used by companies experienced in machine-learning applications as a first foray into digital pathology, e.g., Google [28]. Because of its extent, observer experiments with pathologists may be performed to assess the value of algorithms within a diagnostic setting. For example, a comparison of algorithms competing in the CAMELYON16 challenge to pathologists in clinical practice was recently published [29]. Experiments with the dataset may serve to identify relevant issues with implementation, validation, and regulatory affairs with respect to computational pathology.

A key example of implementation issues with respect to machine-learning algorithms in medical imaging is generalization to different centers. In pathology, centers can differ in tissue preparation, staining protocol, and scanning equipment. This can have a profound impact on image appearance. In the CAMELYON dataset, we included data from five centers and three scanners. We are confident that algorithms trained with these data will generalize well. Users of the dataset can even explicitly evaluate this as we have indicated for each image the center from which it was obtained. By leaving out one center and evaluating performance on that center specifically, the participants can assess the robustness of their algorithms.

We believe the usefulness of the dataset also extends beyond its initial use within the CAMELYON challenge. For example, it can be used for evaluation of color normalization algorithms and for cell detection/segmentation algorithms.

Ethical approval

The collection of the data was approved by the local ethics committee (Commissie Mensgebonden Onderzoek regio Arnhem - Nijmegen) under 2016-2761, and the need for informed consent was waived.

Competing interests

JvdL, PvD, and AB are members of the scientific advisory board of Philips Digital Pathology (Best, The Netherlands).
JvdL is also part of the scientific advisory board of ContextVision (Stockholm, Sweden), and PvD is part of the scientific advisory board of Sectra (Linköping, Sweden).

Funding

Data collection and annotation were funded by Stichting IT Projecten and by the Fonds Economische Structuurversterking (tEPIS/TRAIT project; LSH-FES Program 2009; DFES1029161 and FES1103JJTBU). This work was also supported by grant 601040 from the FP7-funded VPH-PRISM project of the European Union.

Author Contributions

GL and JvdL designed the study and supervised the collection of the dataset. GL wrote the initial draft and final version of the manuscript. PBu, OG, BEB, MB, MH, QM, AB, NS, PvD, MvD, and CW were involved in sample collection. GL, PBa, and NS were involved in data anonymization and conversion. PBu, OG, MH, MB, MvD, QM, AH, RV, and PvD were involved in establishing the reference standard. All authors were involved in reviewing and finalizing the paper.

Availability of supporting data

CAMELYON16 and CAMELYON17 datasets are open access and shared publicly via the CAMELYON17 [17] website. Snapshots of this data and the code of ASAP [20] are also hosted in the GigaScience GigaDB database [30].

Abbreviations

ASAP: automated slide analysis platform; CNN: convolutional neural network; CWZ: Canisius-Wilhelmina Hospital; H&E: hematoxylin and eosin; IHC: immunohistochemistry; ITC: isolated tumor cell; LPON: LabPON; pN-stage: pathological N-stage; RST: Rijnstate Hospital; RUMC: Radboud University Medical Center; TIFF: tagged image file format; TNM: tumor, node, metastasis; UMCU: Utrecht University Medical Center; WSI: whole-slide image

References

1. Siegel RL, Miller KD, Jemal A. Cancer statistics, 2016. CA Cancer J Clin 2016;66(1):7–30.
2. Howlader N, Noone AM, Krapcho M, et al. SEER Cancer Statistics Review, 1975-2014. National Cancer Institute, Bethesda, MD. Based on November 2016 SEER data submission, posted to the SEER web site, April 2017; http://seer.cancer.gov/csr/1975_2014/.
3. Amin MB, Edge SB, Greene FL, et al. AJCC Cancer Staging Manual. Springer-Verlag GmbH; 2016. http://www.ebook.de/de/product/26196032/ajcc_cancer_staging_manual.html.
4. Voogd AC, Nielsen M, Peterse JL, et al. Differences in risk factors for local and distant recurrence after breast-conserving therapy or mastectomy for stage I and II breast cancer: pooled results of two large European randomized trials. J Clin Oncol 2001;19:1688–97.
5. Giuliano AE, Hunt KK, Ballman KV, et al. Axillary dissection vs no axillary dissection in women with invasive breast cancer and sentinel node metastasis: a randomized clinical trial. JAMA 2011;305:569–75.
6. Giuliano AE, Ballman KV, McCall L, et al. Effect of axillary dissection vs no axillary dissection on 10-year overall survival among women with invasive breast cancer and sentinel node metastasis: the ACOSOG Z0011 (Alliance) randomized clinical trial. JAMA 2017;318:918–26.
7. Edge SB, Compton CC. The American Joint Committee on Cancer: the 7th edition of the AJCC cancer staging manual and the future of TNM. Ann Surg Oncol 2010;17:1471–4.
8. Weaver DL. Pathology evaluation of sentinel lymph nodes in breast cancer: protocol recommendations and rationale. Mod Pathol 2010;23(Suppl 2):S26–32.
9. Somner JEA, Dixon JMJ, Thomas JSJ. Node retrieval in axillary lymph node dissections: recommendations for minimum numbers to be confident about node negative status. J Clin Pathol 2004;57:845–8.
10. van Diest PJ, van Deurzen CHM, Cserni G. Pathology issues related to SN procedures and increased detection of micrometastases and isolated tumor cells. Breast Disease 2010;31:65–81.
11. Vestjens J, Pepels M, de Boer M, et al. Relevant impact of central pathology review on nodal classification in individual breast cancer patients. Ann Oncol 2012;23(10):2561–6.
12. Wolberg WH, Street WN, Mangasarian OL. Machine learning techniques to diagnose breast cancer from image-processed nuclear features of fine needle aspirates. Cancer Letters 1994;77:163–71.
13. Diamond J, Anderson NH, Bartels PH, et al. The use of morphological characteristics and texture analysis in the identification of tissue composition in prostatic neoplasia. Hum Pathol 2004;35:1121–31.
14. Petushi S, Garcia FU, Haber MM, et al. Large-scale computations on histology images reveal grade-differentiating parameters for breast cancer. BMC Medical Imaging 2006;6:14.
15. Litjens G, Sánchez CI, Timofeeva N, et al. Deep learning as a tool for increased accuracy and efficiency of histopathological diagnosis. Nat Sci Rep 2016;6:26286.
16. The CAMELYON16 Challenge; 2017. https://camelyon16.grand-challenge.org. Accessed 13 November 2017.
17. The CAMELYON17 Challenge; 2017. https://camelyon17.grand-challenge.org. Accessed 13 November 2017.
18. Heimann T, van Ginneken B, Styner M, et al. Comparison and evaluation of methods for liver segmentation from CT datasets. IEEE Trans Med Imaging 2009;28:1251–65.
19. Deng J, Dong W, Socher R, et al. Imagenet: a large-scale hierarchical image database. In: Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on IEEE; 2009. pp. 248–55.
20. Litjens GJS. Automated Slide Analysis Platform (ASAP); 2017. https://github.com/geertlitjens/ASAP. Accessed 17 October 2017.
21. Description of Philips TIFF file format; 2017. http://openslide.org/formats/philips/. Accessed 17 October 2017.
22. Goode A, Gilbert B, Harkes J, et al. OpenSlide: a vendor-neutral software foundation for digital pathology. J Pathology Informatics 2013;4(1):27.
23. Chagpar A, Middleton LP, Sahin AA, et al. Clinical outcome of patients with lymph node-negative breast carcinoma who have sentinel lymph node micrometastases detected by immunohistochemistry. Cancer 2005;103:1581–6.
24. Reed J, Rosman M, Verbanac KM, et al. Prognostic implications of isolated tumor cells and micrometastases in sentinel nodes of patients with invasive breast cancer: 10-year analysis of patients enrolled in the prospective East Carolina University/Anne Arundel Medical Center Sentinel Node Multicenter Study. J Am Coll Surg 2009;208:333–40.
25. Roberts CA, Beitsch PD, Litz CE, et al. Interpretive disparity among pathologists in breast sentinel lymph node evaluation. Am J Surg 2003;186(4):324–9.
26. OpenSlide; 2017. http://openslide.org. Accessed 17 October 2017.
27. Cohen J. A coefficient of agreement for nominal scales. Educational and Psychological Measurement 1960;20(1):37–46.
28. Liu Y, Gadepalli K, Norouzi M, et al. Detecting Cancer Metastases on Gigapixel Pathology Images. arXiv:1703.02442.
29. Ehteshami Bejnordi B, Veta M, van Diest PJ, et al. Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer. JAMA 2017;318:2199–2210.
30. Litjens G, Bandi P, Bejnordi BE, et al. Supporting data for "1399 H&E-stained sentinel lymph node sections of breast cancer patients: the CAMELYON dataset." GigaScience Database 2018; http://dx.doi.org/10.5524/100439.

GigaScience, Oxford University Press

Publisher
Oxford University Press
Copyright
© The Author(s) 2018. Published by Oxford University Press.
eISSN
2047-217X
D.O.I.
10.1093/gigascience/giy065

Abstract

Background: The presence of lymph node metastases is one of the most important factors in breast cancer prognosis. The most common way to assess regional lymph node status is the sentinel lymph node procedure. The sentinel lymph node is the most likely lymph node to contain metastasized cancer cells and is excised, histopathologically processed, and examined by a pathologist. This tedious examination process is time-consuming and can lead to small metastases being missed. However, recent advances in whole-slide imaging and machine learning have opened an avenue for analysis of digitized lymph node sections with computer algorithms. For example, convolutional neural networks, a type of machine-learning algorithm, can be used to automatically detect cancer metastases in lymph nodes with high accuracy. To train machine-learning models, large, well-curated datasets are needed.

Results: We released a dataset of 1,399 annotated whole-slide images (WSIs) of lymph nodes, both with and without metastases, in 3 terabytes of data in the context of the CAMELYON16 and CAMELYON17 Grand Challenges. Slides were collected from five medical centers to cover a broad range of image appearance and staining variations. Each WSI has a slide-level label indicating whether it contains no metastases, macro-metastases, micro-metastases, or isolated tumor cells. Furthermore, for 209 WSIs, detailed hand-drawn contours for all metastases are provided. Last, open-source software tools to visualize and interact with the data have been made available.

Conclusions: A unique dataset of annotated, whole-slide digital histopathology images has been provided with high potential for re-use.

Received: 18 December 2017; Revised: 26 March 2018; Accepted: 22 May 2018

© The Author(s) 2018. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

Keywords: breast cancer; lymph node metastases; whole-slide images; grand challenge; sentinel node

Background

Breast cancer is one of the most common and deadly cancers in women worldwide [1]. Although prognosis for breast cancer patients is generally good, with an average 5-year overall survival rate of 90% and a 10-year survival rate of 83%, it deteriorates significantly when breast cancer metastasizes [2]. While localized breast cancer has a 5-year survival rate of 99%, this drops to 85% in the case of regional (lymph node) metastases and to only 26% in the case of distant metastases. As such, it is of the utmost importance to establish whether metastases are present to allow adequate treatment and the best chance of survival. This is formally captured in the tumor, node, metastasis (TNM) staging criteria [3].

The first step in determining the presence of metastases is to examine the regional lymph nodes. Not only is the presence of metastases in these lymph nodes a poor prognostic factor by itself, it is also an important predictive factor for the presence of distant metastases [4]. In breast cancer, the most common way to assess the regional lymph node status is the sentinel lymph node procedure [5, 6]. With this procedure, a blue dye and/or radioactive tracer is injected near the tumor. The first lymph node reached by the injected substance, the sentinel node, is most likely to contain the metastasized cancer cells and is excised. Subsequently, it is submitted for histopathological processing and examination by a pathologist.

The pathologist examines a glass slide containing a tissue section of the lymph node stained with hematoxylin and eosin (H&E). Examples are shown in Fig. 1. Based on the presence of solitary tumor cells or on the diameter of clusters of tumor cells, metastases are assigned to one of three categories: macro-metastases, micro-metastases, or isolated tumor cells (ITC). The size criteria for each of these categories are shown in Table 1. Based on the presence or absence of one or more of these metastases, an initial pathological N-stage (pN-stage) is assigned to a patient. Based on this initial stage, in combination with characteristics of the main tumor, further lymph node dissection or axillary radiotherapy may be performed. These axillary lymph nodes are then also pathologically assessed to come to a final pN-stage. pN categorization is mostly based on metastasis size and the number of lymph nodes involved but also on the anatomical location of the lymph nodes. A small excerpt of the pN-stages is shown in Table 2; for a full listing, refer to the 7th edition of the TNM staging criteria for breast cancer [7].

Table 1: Rules for assigning clusters of metastasized tumor cells to a metastasis category

Category               Size
Macro-metastasis       Larger than 2 mm
Micro-metastasis       Larger than 0.2 mm and/or containing more than 200 cells, but not larger than 2 mm
Isolated tumor cells   Single tumor cells or a cluster of tumor cells not larger than 0.2 mm or fewer than 200 cells

Table 2: Selection of N-stages for staging of breast cancer based on the 7th edition of the TNM staging criteria

Stage     Description
N0        Cancer has not spread to nearby lymph nodes
N0(i+)    Lymph nodes only contain ITCs
N1mi      Micro-metastases in 1 to 3 axillary lymph nodes
N1a       Cancer has spread to 1 to 3 axillary lymph nodes, with at least 1 macro-metastasis
N1b       Cancer has spread to internal mammary lymph nodes, but this spread could only be found on sentinel lymph node biopsy
N1c       Both N1a and N1b apply
N2a       Cancer has spread to 4 to 9 lymph nodes under the arm, with at least 1 macro-metastasis
N2b       Metastases in clinically detected internal mammary lymph nodes in the absence of axillary lymph node metastases

A key challenge for pathologists in assessing lymph node status is the large area of tissue that has to be examined to identify metastases that can be as small as single cells. Examples of a macro-metastasis, micro-metastasis, and ITC are shown in Fig. 1 and Fig. 2. For sentinel lymph nodes, at least three sections at different levels through the lymph node have to be examined; for non-sentinel lymph nodes, one section of at least 10 lymph nodes has to be examined [8, 9]. This tedious examination process is time-consuming, and pathologists may miss small metastases [10]. In the Netherlands, a secondary examination using an immunohistochemical staining for cytokeratin has to be performed if inspection of the H&E slide identifies no metastases. However, even in this secondary examination, metastases can still be missed [11].

Today, advances in whole-slide imaging and machine learning have opened an avenue for analysis of digitized lymph node sections with computer algorithms. Whole-slide imaging is a technique where high-speed slide scanners digitize glass slides at very high resolution (e.g., 240 nm per pixel). This results in images with a size on the order of 10 gigapixels, typically called whole-slide images (WSIs). This large amount of data makes WSIs ideally suited for analysis with machine-learning algorithms. Although machine-learning algorithms have been applied to digitized pathology data as early as 1994 [12], WSIs have only appeared since early 2000. Since then, many researchers have described the use of machine-learning algorithms in WSIs, e.g., for breast or prostate cancer classification [13, 14]. Over the past five years, so-called deep learning algorithms, such as convolutional neural networks (CNNs), have become very popular. For example, we were the first to show that training CNNs to detect cancer metastases in lymph nodes was possible and potentially could result in improved efficiency and accuracy of histopathologic diagnostics [15].

Large, well-curated datasets are needed both to train machine-learning models and to evaluate their performance accurately. To allow the broader computer vision community to replicate and build on our results, we publicly released a large dataset of annotated WSIs of lymph nodes, both with and without metastases, in the context of the CAMELYON16 and CAMELYON17 challenges (CAncer MEtastases in LYmph nOdes challeNge) [16, 17].

The concept of challenges in medical imaging and computer vision has been around for nearly a decade. In medical imaging it primarily started with the liver segmentation challenge at the annual MICCAI conference in 2007 [18], and in computer vision, the ImageNet Challenge is most widely known [19]. The main goal of challenges, both in medical imaging and in computer vision, is to allow a meaningful comparison of algorithms. In scientific literature, this was often not the case as authors presented results on their own, often proprietary, datasets with their own choice of evaluation metrics. In medical imaging, this was specifically a problem as sharing medical data is often difficult. Challenges change this by making datasets available and enforcing standardized evaluation. Furthermore, challenges have the added benefit of opening up meaningful research questions to a large community who normally might not have access to the necessary datasets.

The CAMELYON dataset was collected at different Dutch medical centers to cover the heterogeneity encountered in clinical practice. It contains 1,399 WSIs, resulting in approximately 3 terabytes of image data. We released a part of the dataset with the reference standard (i.e., the training set) to allow other groups to build algorithms to detect metastases. Subsequently, the rest of the dataset was released without a reference standard (i.e., the test set). Participating teams could submit their algorithm output on the test set to us, after which we evaluated their performance on a predefined set of metrics to allow fair and standardized comparison to other teams. To enable participation of teams that are not familiar with WSIs, we released a publicly available software package for viewing WSIs, annotations, and algorithmic results, dubbed the automated slide analysis platform (ASAP) [20].

Table 3: WSI-level characteristics for the CAMELYON16 part of the dataset

Center   Total WSIs   Metastases
                      None   Macro   Micro
RUMC     249          150    48      51
UMCU     150          90     34      26

Table 4: WSI-level characteristics for the CAMELYON17 part of the dataset

Center   Total WSIs       Metastases (Train)
         Train   Test     None   Macro   Micro   ITC
CWZ      100     100      64     15      10      11
LPON     100     100      64     25      4       7
RST      100     100      60     11      22      7
RUMC     100     100      60     19      13      8
UMCU     100     100      75     15      8       2
Total    500     500      323    85      57      35

Table 5: Patient-level characteristics for the CAMELYON17 part of the dataset
Center   Total patients   Stages (Train)
         Train   Test     pN0   pN0(i+)   pN1mi   pN1   pN2
CWZ      20      20       4     3         5       7     1
LPON     20      20       6     2         2       7     3
RST      20      20       4     2         6       5     3
RUMC     20      20       3     2         4       8     3
UMCU     20      20       8     2         4       3     3
Total    100     100      25    11        21      30    13

Here, we describe the CAMELYON dataset in detail and cover the following topics:

- Sample collection
- Slide digitization and conversion
- Challenge dataset construction and statistics
- Instructions on the use of ASAP to view and analyze slides
- Suggestions for data re-use

Data description

The CAMELYON dataset is a combination of the WSIs of sentinel lymph node tissue sections collected for the CAMELYON16 and CAMELYON17 challenges, which contained 399 WSIs and 1,000 WSIs, respectively. This resulted in 1,399 unique WSIs and a total data size of 2.95 terabytes. The dataset is currently publicly available after registration via the CAMELYON17 website [17]. At the time of writing, it had been accessed by more than 1,000 registered users worldwide. It has been licensed under the Creative Commons CC0 license.

Data collection

Collection of the data was approved by the local ethics committee of the Radboud University Medical Center (RUMC) under 2016-2761, and the need for informed consent was waived. Data were collected at five medical centers in the Netherlands: the RUMC, the Utrecht University Medical Center (UMCU), the Rijnstate Hospital (RST), the Canisius-Wilhelmina Hospital (CWZ), and LabPON (LPON). An example of digitized slides from these centers can be seen in Fig. 1.

Figure 1: Low-resolution example of a WSI from each of the five centers contributing data.

Figure 2: Representative samples of the different sizes of breast cancer metastases in sentinel lymph nodes.

Initial identification of cases eligible for inclusion was based on local pathology reports of sentinel lymph node procedures between 2006 and 2016. The exact years varied from center to center but did not affect data distribution or quality. After the lists of sentinel node procedures and the corresponding glass slides containing H&E-stained tissue sections were obtained, slides were randomly selected for inclusion. As the vast majority of sentinel lymph nodes are negative for metastases, selection was stratified for the presence of macro-metastases, micro-metastases, and ITCs based on the original pathology reports. This was done to obtain a good representation of differing metastasis appearance without the need for an excessively large dataset.

Data were acquired in two stages, corresponding to the time periods for organization of the CAMELYON16 and CAMELYON17 challenges. Within the CAMELYON16 challenge, only data from the RUMC and UMCU were acquired, and no slides containing only ITCs were included. For CAMELYON17, data were included from all five centers, and glass slides containing only ITCs were obtained as well. A categorization of the slides can be found in Tables 3 and 4.

After glass slides were selected, they were digitized with different slide scanners such that scan variability across centers was captured in addition to H&E staining procedure variability. The slides from RUMC, CWZ, and RST were scanned with the 3DHistech Pannoramic Flash II 250 scanner at the RUMC. At the UMCU, slides were scanned with a Hamamatsu NanoZoomer-XR C12000-01 scanner, and at LPON with a Philips Ultrafast Scanner.

As all slides are initially stored in an original vendor format that makes re-use challenging, slides were converted to a common, generic TIFF (tagged image file format) using an open-source file converter, part of the ASAP package [20]. As there are no open-source tools to convert the iSyntax format produced by the Philips Ultrafast Scanner, a proprietary converter was used to convert files to a special TIFF format [21] that can be read by the open-source package OpenSlide [22] and the ASAP package [20]. Some basic descriptors are shown in Table 6.

Table 6: Basic descriptors for the TIFF used in the CAMELYON dataset

Format               Tiled TIFF (bigTIFF)
Tile size            512 pixels
Pixel resolution     0.23 μm to 0.25 μm
Channels per pixel   3 (red, green, blue)
Bits per channel     8
Data type            Unsigned char
Compression          JPEG

After digitization, the reference standard for each slide needed to be established. The reference standard for each WSI consisted of a slide-level label indicating the largest metastasis within a slide (i.e., no metastasis, macro-metastasis, micro-metastasis, or ITC). In addition, for all 399 WSIs that were part of the CAMELYON16 challenge and an additional 50 WSIs from the CAMELYON17 challenge, detailed contours were drawn along the boundaries of metastases within the WSI. For the 50 slides of the CAMELYON17 challenge, 10 slides from each center were used to allow users of the dataset to analyze metastasis appearance differences across different centers.

Initial slide-level labels were assigned based on the pathology reports obtained from clinical routine. For the CAMELYON16 part of the dataset, all slides were subsequently examined and metastases outlined by an experienced lab technician (M.H.) and a clinical PhD student (Q.M.). Afterward, all annotations were inspected by one of two expert breast pathologists (P.B. or P.v.D.). Some slides contained two consecutive tissue sections of the same lymph node, in which case only one of the two sections was annotated as this did not affect the slide-level label. In total, 15 slides may contain unlabeled metastatic areas and are indicated via a descriptive text file that is part of the dataset.

For the CAMELYON17 part of the dataset, an experienced general pathologist (M.v.D.) inspected all the slides to assess the slide-level labels. For the 50 slides with detailed annotations, experienced observers (M.v.D., M.H., Q.M., O.G., and R.vd.L.) annotated all metastases. Subsequently, these annotations were double-checked by one of the other observers or one of two pathology residents (A.H. and R.V.).

For the entire dataset, when the slide-level label was unclear during the inspection of the H&E-stained slide, an additional WSI with a consecutive tissue section, immunohistochemically stained for cytokeratin, was used to confirm the classification. Furthermore, this stain was also used to aid in drawing the outlines in both CAMELYON16 and CAMELYON17, which helps limit observer variability. As both the H&E and IHC slides are digital, they can be viewed simultaneously, allowing observers to easily identify the same areas in both slides. This stain is also used in daily clinical pathology practice to resolve diagnosis in the case of metastasis-negative H&E [23, 24]. An example of an H&E WSI and the corresponding consecutive cytokeratin immunohistochemical section is shown in Fig. 3.

Figure 3: H&E-stained tissue section and a consecutive section immunohistochemically stained for cytokeratin. The top row shows the low-resolution images and the bottom row a high-resolution image, centered at a metastasis. The metastasis is difficult to see in H&E but easy to identify in the immunohistochemically stained slide. A yellow bounding box indicates the metastasis location in the images in the top row.

In the CAMELYON17 dataset, after establishing the reference standard, slides were divided into artificial patients, covering the different pN-stages (see Table 2). Each artificial patient only had WSIs from one center. For each artificial patient in the training part of the dataset, the pN-stage and the slide-level labels were provided. This was done to assess the potential of participating algorithms within the challenge to perform automated pN-staging. However, all WSIs can be used independently of their patient-level labels. A complete overview of the patient-level characteristics is shown in Table 5.

After the dataset and reference standard were established, we uploaded the entire dataset to Google Drive and to BaiduPan. These two options were chosen to reach as wide an audience as possible, given that Google Drive is not accessible everywhere (e.g., People's Republic of China). A link to the data was shared with participants after registration at the CAMELYON websites [16, 17].

Data validation and quality control

All glass slides included in the CAMELYON dataset were part of routine clinical care and are thus of diagnostic quality. However, during the acquisition process, scanning can fail or result in out-of-focus images. As a quality-control measure, all slides were inspected manually after scanning. The inspection was performed by an experienced technician (Q.M. and N.S. for UMCU, M.H. or R.vd.L. for the other centers) to assess the quality of the scan; when in doubt, a pathologist was consulted on whether scanning issues might affect diagnosis.

Due to the inclusion of IHC for establishing the reference standard, the chance of errors being made can be considered limited, as pathologists make few mistakes in identifying metastases with IHC [25]. Furthermore, all slides were checked twice. However, to further ensure the quality of the reference standard, we looked at algorithmic results submitted to the challenge to identify slides where the best performing algorithms disagreed with the reference standard. This led to a correction of the reference standard in 3 of the 1,399 slides.

Tools for data use

Several tools are available to visualize and interact with the CAMELYON dataset. Here, we present examples of how to use the data with an open-source package developed by us, ASAP [20]. Other open-source packages are also available, such as OpenSlide [26], but those do not contain functionality for reading annotations or storing image analysis results.

Project name: Automated Slide Analysis Platform (ASAP)
Project home page: https://github.com/GeertLitjens/ASAP
Operating system(s): Linux, Windows
Programming language: C++, Python
Other requirements: CMake (www.cmake.org)
License: GNU GPL v2.0

ASAP contains several components, of which one is a viewer/annotation application (Fig. 4). This can be started via the ASAP executable within the installation folder of the package. After opening an image file from the CAMELYON dataset, one can explore the data via a Google Maps-like interface. The provided reference standard can be loaded via the annotation plugin. In addition, new annotations can be made with the annotation tools provided. Last, the viewer is not limited to files from the CAMELYON dataset but can visualize most WSI formats.

Figure 4: Interface of the ASAP viewer. Visible items are the annotation tools in the toolbar, the viewport showing the WSI, and the plugin panel on the left.

In addition to the viewer application and C++ library for reading and writing WSI images, we also provide Python-wrapped modules. With these modules, the data can be accessed from Python with a short code snippet.

The annotations are provided in human-readable XML format and can be parsed using the ASAP package. However, other XML reading libraries can also be used. Annotations are stored as polygons. Each polygon consists of a list of (x, y) coordinates at the highest resolution level of the image. Annotations can be converted to binary images with a similarly short code snippet.

The Python package can also be used to perform image processing or machine-learning tasks on the data and to write out an image result. For example, basic thresholding can be used to generate a background mask. These results can subsequently be visualized using the viewer component of ASAP, which also supports floating point images. An example of such a result can be seen in Fig. 4B.

The ASAP package also supports writing your own image processing routines and integrating them as plugins into the viewer component. Some existing examples such as color deconvolution and nuclei detection are provided.

Re-use potential

The CAMELYON dataset is currently being used within the CAMELYON17 challenge, which is open for new participants and submissions. In this context, the dataset enables testing of new machine-learning and image analysis strategies against the current state-of-the-art. Within CAMELYON, we evaluate the algorithms based on a weighted Cohen's kappa at the pN-stage level [27]. This statistic measures the categorical agreement between the algorithm and the reference standard where a value of 0 indicates agreement at the level of chance and 1 is perfect agreement. The quadratic weighting penalizes deviations of more than one category more severely. Conclusions arising from such experiments may have significance for the broader field of computational pathology, rather than being restricted to this particular application. For example, experiments with weakly supervised machine learning in histopathology may benefit from the CAMELYON dataset, with an established baseline based on fully supervised machine learning.

The dataset has also been used by companies experienced in machine-learning applications as a first foray into digital pathology, e.g., Google [28]. Because of its extent, observer experiments with pathologists may be performed to assess the value of algorithms within a diagnostic setting. For example, a comparison of algorithms competing in the CAMELYON16 challenge to pathologists in clinical practice was recently published [29]. Experiments with the dataset may serve to identify relevant issues with implementation, validation, and regulatory affairs with respect to computational pathology.

A key example of implementation issues with respect to machine-learning algorithms in medical imaging is generalization to different centers. In pathology, centers can differ in tissue preparation, staining protocol, and scanning equipment. This can have a profound impact on image appearance. In the CAMELYON dataset, we included data from five centers and three scanners. We are confident that algorithms trained with these data will generalize well. Users of the dataset can even explicitly evaluate this as we have indicated for each image the center from which it was obtained. By leaving out one center and evaluating performance on that center specifically, the participants can assess the robustness of their algorithms.

We believe the usefulness of the dataset also extends beyond its initial use within the CAMELYON challenge. For example, it can be used for evaluation of color normalization algorithms and for cell detection/segmentation algorithms.

Ethical approval

The collection of the data was approved by the local ethics committee (Commissie Mensgebonden Onderzoek regio Arnhem - Nijmegen) under 2016-2761, and the need for informed consent was waived.

Competing interests

JvdL, PvD, and AB are members of the scientific advisory board of Philips Digital Pathology (Best, The Netherlands).
JvdL is also part of the scientific advisory board of ContextVision (Stockholm, Sweden), and PvD is part of the scientific advisory board of Sectra (Linköping, Sweden).

Funding

Data collection and annotation were funded by Stichting IT Projecten and by the Fonds Economische Structuurversterking (tEPIS/TRAIT project; LSH-FES Program 2009; DFES1029161 and FES1103JJTBU). This work was also supported by grant 601040 from the FP7-funded VPH-PRISM project of the European Union.

Author Contributions

GL and JvdL designed the study and supervised the collection of the dataset. GL wrote the initial draft and final version of the manuscript. PBu, OG, BEB, MB, MH, QM, AB, NS, PvD, MvD, and CW were involved in sample collection. GL, PBa, and NS were involved in data anonymization and conversion. PBu, OG, MH, MB, MvD, QM, AH, RV, and PvD were involved in establishing the reference standard. All authors were involved in reviewing and finalizing the paper.

Availability of supporting data

CAMELYON16 and CAMELYON17 datasets are open access and shared publicly via the CAMELYON17 [17] website. Snapshots of this data and the code of ASAP [20] are also hosted in the GigaScience GigaDB database [30].

Abbreviations

ASAP: automated slide analysis platform; CNN: convolutional neural network; CWZ: Canisius-Wilhelmina Hospital; H&E: hematoxylin and eosin; IHC: immunohistochemistry; ITC: isolated tumor cell; LPON: LabPON; pN-stage: pathological N-stage; RST: Rijnstate Hospital; RUMC: Radboud University Medical Center; TIFF: tagged image file format; TNM: tumor, node, metastasis; UMCU: Utrecht University Medical Center; WSI: whole-slide image

References

1. Siegel RL, Miller KD, Jemal A. Cancer statistics, 2016. CA Cancer J Clin 2016;66(1):7–30.
2. Howlader N, Noone AM, Krapcho M, et al. SEER Cancer Statistics Review, 1975-2014. National Cancer Institute, Bethesda, MD. http://seer.cancer.gov/csr/1975_2014/, based on November 2016 SEER data submission, posted to the SEER web site, April 2017.
3. Amin MB, Edge SB, Greene FL, et al. AJCC Cancer Staging Manual. Springer-Verlag GmbH; 2016. http://www.ebook.de/de/product/26196032/ajcc_cancer_staging_manual.html.
4. Voogd AC, Nielsen M, Peterse JL, et al. Differences in risk factors for local and distant recurrence after breast-conserving therapy or mastectomy for stage I and II breast cancer: pooled results of two large European randomized trials. J Clin Oncol 2001;19:1688–97.
5. Giuliano AE, Hunt KK, Ballman KV, et al. Axillary dissection vs no axillary dissection in women with invasive breast cancer and sentinel node metastasis: a randomized clinical trial. JAMA 2011;305:569–75.
6. Giuliano AE, Ballman KV, McCall L, et al. Effect of axillary dissection vs no axillary dissection on 10-year overall survival among women with invasive breast cancer and sentinel node metastasis: the ACOSOG Z0011 (Alliance) randomized clinical trial. JAMA 2017;318:918–26.
7. Edge SB, Compton CC. The American Joint Committee on Cancer: the 7th edition of the AJCC cancer staging manual and the future of TNM. Ann Surg Oncol 2010;17:1471–4.
8. Weaver DL. Pathology evaluation of sentinel lymph nodes in breast cancer: protocol recommendations and rationale. Mod Pathol 2010;23(Suppl 2):S26–32.
9. Somner JEA, Dixon JMJ, Thomas JSJ. Node retrieval in axillary lymph node dissections: recommendations for minimum numbers to be confident about node negative status. J Clin Pathol 2004;57:845–8.
10. van Diest PJ, van Deurzen CHM, Cserni G. Pathology issues related to SN procedures and increased detection of micrometastases and isolated tumor cells. Breast Disease 2010;31:65–81.
11. Vestjens J, Pepels M, de Boer M, et al. Relevant impact of central pathology review on nodal classification in individual breast cancer patients. Ann Oncol 2012;23(10):2561–6.
12. Wolberg WH, Street WN, Mangasarian OL. Machine learning techniques to diagnose breast cancer from image-processed nuclear features of fine needle aspirates. Cancer Letters 1994;77:163–71.
13. Diamond J, Anderson NH, Bartels PH, et al. The use of morphological characteristics and texture analysis in the identification of tissue composition in prostatic neoplasia. Hum Pathol 2004;35:1121–31.
14. Petushi S, Garcia FU, Haber MM, et al. Large-scale computations on histology images reveal grade-differentiating parameters for breast cancer. BMC Medical Imaging 2006;6:14.
15. Litjens G, Sánchez CI, Timofeeva N, et al. Deep learning as a tool for increased accuracy and efficiency of histopathological diagnosis. Nat Sci Rep 2016;6:26286.
16. The CAMELYON16 Challenge; 2017. https://camelyon16.grand-challenge.org. Accessed 13 November 2017.
17. The CAMELYON17 Challenge; 2017. https://camelyon17.grand-challenge.org. Accessed 13 November 2017.
18. Heimann T, van Ginneken B, Styner M, et al. Comparison and evaluation of methods for liver segmentation from CT datasets. IEEE Trans Med Imaging 2009;28:1251–65.
19. Deng J, Dong W, Socher R, et al. Imagenet: a large-scale hierarchical image database. In: Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on. IEEE; 2009. pp. 248–55.
20. Litjens GJS. Automated Slide Analysis Platform (ASAP); 2017. https://github.com/geertlitjens/ASAP. Accessed 17 October 2017.
21. Description of Philips TIFF file format; 2017. http://openslide.org/formats/philips/. Accessed 17 October 2017.
22. Goode A, Gilbert B, Harkes J, et al. OpenSlide: a vendor-neutral software foundation for digital pathology. J Pathology Informatics 2013;4(1):27.
23. Chagpar A, Middleton LP, Sahin AA, et al. Clinical outcome of patients with lymph node-negative breast carcinoma who have sentinel lymph node micrometastases detected by immunohistochemistry. Cancer 2005;103:1581–6.
24. Reed J, Rosman M, Verbanac KM, et al. Prognostic implications of isolated tumor cells and micrometastases in sentinel nodes of patients with invasive breast cancer: 10-year analysis of patients enrolled in the prospective East Carolina University/Anne Arundel Medical Center Sentinel Node Multicenter Study. J Am Coll Surg 2009;208:333–40.
25. Roberts CA, Beitsch PD, Litz CE, et al. Interpretive disparity among pathologists in breast sentinel lymph node evaluation. Am J Surg 2003;186(4):324–9.
26. OpenSlide; 2017. http://openslide.org. Accessed 17 October 2017.
27. Cohen J. A coefficient of agreement for nominal scales. Educational and Psychological Measurement 1960;20(1):37–46.
28. Liu Y, Gadepalli K, Norouzi M, et al. Detecting Cancer Metastases on Gigapixel Pathology Images. arXiv:1703.02442.
29. Ehteshami Bejnordi B, Veta M, van Diest PJ, et al. Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer. JAMA 2017;318:2199–2210.
30. Litjens G, Bandi P, Bejnordi BE, et al. Supporting data for "1399 H&E-stained sentinel lymph node sections of breast cancer patients: the CAMELYON dataset." GigaScience Database 2018; http://dx.doi.org/10.5524/100439.

Journal

GigaScience, Oxford University Press

Published: May 31, 2018
