

Dedicated computer-aided detection software for automated 3D breast ultrasound; an efficient tool for the radiologist in supplemental screening of women with dense breasts

Objectives To determine the effect of computer-aided detection (CAD) software for automated breast ultrasound (ABUS) on reading time (RT) and performance in screening for breast cancer.

Material and methods Unilateral ABUS examinations of 120 women with dense breasts were randomly selected from a multi-institutional archive of cases including 30 malignant (20/30 mammography-occult), 30 benign, and 60 normal cases with histopathological verification or ≥ 2 years of negative follow-up. Eight radiologists read once with CAD (CAD-ABUS) and once without CAD (ABUS), with > 8 weeks between reading sessions. Readers provided a BI-RADS score and a level of suspiciousness (0-100). RT, sensitivity, specificity, PPV and area under the curve (AUC) were compared.

Results Average RT was significantly shorter using CAD-ABUS (133.4 s/case, 95% CI 129.2-137.6) compared with ABUS (158.3 s/case, 95% CI 153.0-163.3) (p < 0.001). Sensitivity was 0.84 for CAD-ABUS (95% CI 0.79-0.89) and ABUS (95% CI 0.78-0.88) (p = 0.90). Three out of eight readers showed significantly higher specificity using CAD. Pooled specificity (0.71, 95% CI 0.68-0.75 vs. 0.67, 95% CI 0.64-0.70, p = 0.08) and PPV (0.50, 95% CI 0.45-0.55 vs. 0.44, 95% CI 0.39-0.49, p = 0.07) were higher in CAD-ABUS vs. ABUS, respectively, albeit not significantly. Pooled AUC for CAD-ABUS was comparable with ABUS (0.83 vs. 0.82, respectively, p = 0.53).

Conclusion CAD software for ABUS may decrease the time needed to screen for breast cancer without compromising the screening performance of radiologists.

Key Points
• ABUS with CAD software may speed up reading time without compromising radiologists' accuracy.
• CAD software for ABUS might prevent non-detection of malignant breast lesions by radiologists.
• Radiologists reading ABUS with CAD software might improve their specificity without losing sensitivity.
Keywords Ultrasonography · Breast neoplasms · Diagnosis, Computer-assisted · Mammography · Early detection of cancer

* Jan C. M. van Zelst
jan.vanzelst@radboudumc.nl

Department of Radiology and Nuclear Medicine, Radboud University Medical Centre Nijmegen (NL), Geert Grooteplein 10, 6525 GA Nijmegen, The Netherlands
Department of Radiology, Jeroen Bosch Hospital, 's-Hertogenbosch (NL), 's-Hertogenbosch, Netherlands
Department of Radiology, University Medical Centre Utrecht (NL), Utrecht, Netherlands
Department of Radiology, Centre Diagnosi per la Imatge Tarragona (E), Tarragona, Spain
Center for Medical Imaging and Department of Radiology, University Medical Centre Groningen (NL), Groningen, Netherlands
MeVis Medical Solutions, Bremen (DE), Bremen, Germany
Department of Gynaecology and Obstetrics, Universitäts-Frauenklinik Heidelberg (D), Heidelberg, Germany
Department of Biomedical Imaging and Image Guided Therapy, Division of Molecular and Gender Imaging, Medical University of Vienna/Vienna General Hospital (A), Vienna, Austria

Eur Radiol (2018) 28:2996–3006

Abbreviations
ABUS Automated breast ultrasound
ABVS Automated breast volume scanner
AFROC Alternative free-response receiver-operator characteristics
ANOVA Analysis of variance
AUC Area under the ROC curve
CAD Computer-aided detection
FFDM Full-field digital mammography
GEE Generalised estimation equation
HER2 Human epidermal growth factor receptor 2 status
HR Hormone receptor status
IRB Institutional review board
LOS Level of suspiciousness
MinIP Minimum intensity projection
MIP Maximum intensity projection
MRI Magnetic resonance imaging
MRMC Multiple reader multiple case
PPV Positive predictive value
ROC Receiver-operator characteristics
RT Reading time
US Ultrasound
WBUS Whole-breast ultrasound

Introduction

In mammographic screening the sensitivity in women with extremely dense breasts is only 61% [1]. A four times higher interval cancer rate is reported for these women compared with women with fatty breasts [1]. Supplemental ultrasound (US) is an effective imaging method to detect mammography-negative early stage invasive breast cancer in women with heterogeneously and extremely dense breasts [2–4], thus reducing the frequency of symptomatic interval carcinomas [5]. This is crucial, because detection of breast cancer at an early stage substantially improves prognosis, even when using modern therapy regimes [6]. This explains the rationale and ratification of the breast density inform laws in many states in the USA [7, 8] and the introduction of supplemental whole-breast ultrasound (WBUS) screening in Austria [9].

Performing supplemental WBUS with handheld devices has limitations. It is relatively time consuming and difficult to compare to prior examinations. Furthermore, handheld WBUS screening is operator dependent and should therefore be performed by trained sonographers, which consequently requires substantial resources [10]. Automated 3D breast US (ABUS) devices have been developed to improve the reproducibility of WBUS and decrease the need for highly trained sonographers. An ABUS examination consists of a set of large 3D volumes for each breast acquired with a wide automatically driven linear array transducer. The number of volumes depends on the size of the breast and in large breasts up to five volumes per breast are acquired. There is mounting evidence that, similar to handheld ultrasound, ABUS devices also lead to the detection of mammography-negative invasive breast cancers [11–15].

A downside of supplemental ultrasound screening is the detection of mammographically occult benign lesions that warrant histological verification [11, 13, 16], thus decreasing the specificity of screening. ABUS devices do allow storage of full breast ultrasound volumes, which enables the radiologist to compare examinations with relevant priors, which is expected to improve specificity in follow-up examinations.

Due to the large number of images in the scan, reading a full ABUS examination can be lengthy and cancers may easily be overlooked [12]. Computer-aided detection (CAD) software for ABUS has been developed to aid radiologists in the interpretation of ABUS studies [17]. CAD software should reduce the reading time of supplemental ABUS and may have the potential to improve the screening performance of radiologists. To assess the effectiveness of this approach, we investigated the effect of commercially available CAD software for ABUS on the reading time and screening performance of breast radiologists.

Materials and methods

The need for informed consent for this study was waived by the institutional review board (IRB).

ABUS acquisitions

ABUS examinations were performed with ACUSON S2000 Automated Breast Volume Scanner systems (Siemens, Erlangen, Germany). This ABUS system acquires 3D B-mode ultrasound volumes over an area of 154 mm × 156 mm using a mechanically driven linear array transducer (14L5). Adequate depth and focus can be obtained using predefined settings for different breast cup sizes. All ABUS examinations were performed by technicians. To ensure coverage of the entire breast, two to five overlapping acquisitions were performed at predefined locations. The number of acquisitions depends on the size of the breasts and the possibility to compress the breasts. Per acquisition 318 slices of 0.5 mm thickness are obtained. A dedicated ABUS workstation reconstructs the transverse slices into a 3D volume that can be read in a multiplanar hanging, also showing sagittal and coronal reconstructions.

Data and gold standard

Cases were selected from a large multi-institutional imaging archive that consisted of 2158 ABUS examinations in 1086 women acquired between August 2010 and February 2015 from screening programmes for women at average, intermediate, and high risk and from symptomatic women. For each woman a full-field digital mammography (FFDM) examination was also available.

To select only cases with high breast density, breast density was determined using an automated volumetric software package (Volpara Density, Matakina Ltd., Wellington, New Zealand) on 1657 available unprocessed FFDM images. For 501 examinations, where unprocessed FFDM images were not available, breast density was visually assessed according to the BI-RADS lexicon. Examinations of 115 women with a history of breast surgery were excluded; 1187 unilateral examinations of breasts in 715 women were scored as Volpara Density Grade 3 and 4 or BI-RADS density categories C or D. We categorised these dense cases as "normal" (n = 919), "benign" (n = 140), or "malignant" (n = 128) based on radiology and pathology reports from histopathological examinations. "Normal" and non-biopsied "benign" cases were only considered if at least 2 years of negative follow-up was available. Subsequently, from these women with dense breasts, we included all cases with a mammography-negative malignant lesion (n = 20), ten randomly selected malignant cases that were positive on both mammography and ABUS, 30 biopsied benign cases and 60 "normal" cases in the study data set. The study data set thus consisted of 120 unilateral ABUS evaluations, yielding a total of 375 ABUS volumes. The selected cases were anonymised and stripped of information such as age, study date, and imaging institute. All lesions were annotated by a breast imaging researcher with > 3 years of experience with ABUS based on pathology and radiology reports. These annotations served as the ground truth for observer and CAD software detection performance.

CAD software and reading workstation

A prototype workstation was designed and developed specifically for the task of high-throughput ABUS screening in this observer study (MeVis Medical Solutions, Bremen, Germany). In this prototype, each user action was logged with time stamps that were subsequently used to estimate the time spent per case. Commercially developed CAD software (QVCAD, Qview Medical Inc., Los Altos, CA) was integrated into this workstation. This CAD software is designed to detect suspicious region candidates in an ABUS volume and mark them with so-called CAD marks (Fig. 1). In addition, QVCAD software provides an "intelligent" minimum intensity projection (MinIP) of the breast tissue in a 3D ABUS volume that can be used for rapid navigation through ABUS scans and enhances possible suspicious regions. The number of CAD marks displayed can be adjusted by setting the average number of false-positive CAD marks per ABUS volume. In this study, we chose the default setting of one false-positive CAD mark per ABUS volume.

Fig. 1 CAD-based minimum intensity projection (MinIP) integrated in a multiplanar hanging protocol for ABUS that shows the conventional ABUS planes. The top plane shows the transverse acquisitions, the lower left plane the coronal reconstructions, and the lower right plane the sagittal reconstruction. The MinIP (bottom row in the middle) is a 2D image where lower intensity regions in the 3D ABUS volume are enhanced as dark spots. By clicking on the dark spot, the 3D multiplanar hanging automatically snaps to the corresponding 3D location. The CAD marks (coloured square) are displayed on the MinIP

Readers

Seven breast radiologists and one gynaecologist specialised in breast imaging were invited to participate in this study. By inviting readers from different institutes and countries we aimed to increase the applicability of our results to breast imaging practices in different countries, realising that different readers might have slightly varying standards and customs. In some countries, other clinicians are also involved in interpreting breast-imaging examinations. Therefore, we also invited a non-radiologist (gynaecologist) who specialises in breast imaging with approximately 10 years of experience in breast ultrasound and mammography and 8 years of experience with ABUS. Experience with breast imaging for reader one to reader eight was 7, 10, 4, 8, 8, 20, 4, and 20 years and specifically with ABUS was 5, 8, 0, 5, 5, 5, 0, and 0 years, respectively.

Study design

All eight readers evaluated all cases twice in two separate reading sessions in an independent crossover multi-reader-multi-case (MRMC) study. In each session half of the ABUS cases were read conventionally and half of the cases were read using a CAD-based workflow designed for this study. We counterbalanced the reading modes and changed the case order by randomisation for each reader per reading session. The reading sessions were at least 8 weeks apart (average 11.0 weeks, range 8.3-13.1) to further minimise any effect of memory bias.

Standard ABUS reading was performed in a multiplanar hanging without CAD software. CAD-based reading was performed according to specific instructions of a two-step reading protocol. The first step was to evaluate all CAD marks and dark spots on the MinIP in a case. Subsequently, readers were instructed to scan the coronal reconstruction of each ABUS view in a hanging protocol where coronal reconstructions of all ABUS views of a breast are simultaneously shown.

The readers performed a training session of 20 cases to become familiar with the workstation, reading protocol, and CAD software. Readers were given a rough estimate (10-30%) of the prevalence of cancer in the study data set because the criteria for a recall may vary between radiologists who, as in our study, work at different institutes and in different countries [18] and may depend on the prevalence of cancers they expect.

In both CAD-based and conventional reading the readers were instructed to mark and rate lesions by placing a finding marker and subsequently determine a BI-RADS assessment score. Because a quasi-continuous linear scale is required to perform receiver-operating characteristic (ROC) analysis, readers were also asked to provide a level of suspiciousness (LOS) score on a scale from 0-100. Note that LOS is not a probability of malignancy as described in the BI-RADS atlas. Instead, readers were recommended to use anchor points referring to the BI-RADS scores with LOS values of 21, 41, 61, and 81 corresponding to the BI-RADS 1/2, 2/3, 3/4, and 4/5 transitions.

Statistical analysis

We determined the sensitivity, specificity, and positive predictive value (PPV) in both reading modes based on BI-RADS scores and compared these parameters per reader using paired McNemar's and chi-square tests with bootstrapping (1000 samples) to determine the 95% confidence intervals (CI) for individual readers and generalised estimation equation (GEE) for pooled data to correct for repeated measurements. An examination was considered positive if a BI-RADS 3 score (and its anchor point equivalent of 41 on the LOS scale) or higher was given.

Furthermore, we determined the area under the curve (AUC) and 95% CI using an alternative free-response receiver-operating characteristics (AFROC) [19, 20]. For these analyses, when multiple findings were present in a case, the finding with the highest rating was used. Ratings in malignant cases where the marker was placed outside of the annotated lesion margin were not included in the analysis and regarded as a false negative (missed cancer). By doing so, readers are not rewarded for a recall based on a false-positive finding accidentally occurring in a malignant case. We compared the AUCs for both reading modes for each reader individually and also pooled over all readers (random readers, random cases). Reading time was compared for each reader individually by using Student's t-test with 1000 bootstraps to determine the 95% CI and GEE for pooled data. Only the readings recorded within the 95th percentile were included in the analysis to correct for inactivity of the reader during the reading sessions.

The ROC analyses were performed using MRMC software (JAFROC, version 4.2.1). The GEE was performed using the 'geese' function in the 'geepack' package in R (v. 3.2.3, R Foundation for Statistical Computing, Vienna, Austria). All other analyses were performed with SPSS statistics 20.0 (IBM Statistics, Armonk, NY).

Results

Reading time

Table 4 summarises the reading time for each individual reader. On average, reading unilateral ABUS examinations using CAD software decreases the overall reading time by 24.9 s/case (SE 3.43; p < 0.001) (Fig. 4), which is a reduction of 15.7%. All readers were faster using CAD software (range, 3.1%-26.3%). In six out of eight readers, the CAD-based workflow was significantly faster. The average reading time for malignant cases decreased by 12.1% (20.5 s/case, SE 6.97, p = 0.003), for benign cases by 17.3% (28.2 s/case, SE 6.77, p ≤ 0.001), and for normal cases by 16.8% (25.3 s/case, SE 4.76, p ≤ 0.001).
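The percentage reductions quoted for reading time follow from the mean reading times reported in Table 4. As a quick arithmetic check, the minimal Python sketch below recomputes them from the rounded pooled means. The helper name `percent_decrease` is hypothetical and is not part of the study's analysis, which was carried out in SPSS and R; because the inputs are rounded means, small deviations from values computed on the raw data are possible.

```python
def percent_decrease(baseline_s: float, with_cad_s: float) -> float:
    """Relative reduction in mean reading time, in percent, rounded to one decimal.

    baseline_s: mean reading time without CAD (seconds per case)
    with_cad_s: mean reading time with CAD (seconds per case)
    """
    if baseline_s <= 0:
        raise ValueError("baseline reading time must be positive")
    return round(100.0 * (baseline_s - with_cad_s) / baseline_s, 1)


# Pooled mean reading times (s/case) as reported in Table 4: (ABUS, CAD-ABUS)
pooled_means = {
    "all cases": (158.3, 133.4),
    "normal": (151.0, 125.7),
    "benign": (163.0, 134.8),
    "malignant": (169.3, 148.8),
}

for label, (abus, cad) in pooled_means.items():
    print(f"{label}: {percent_decrease(abus, cad)}% shorter with CAD")
```

With the pooled means for all cases this yields 15.7%, in agreement with the reduction reported above.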
Patient characteristics

Table 1 summarises the patient characteristics in women with breast cancer and Table 2 summarises patient characteristics of women with a "normal" or "benign" ABUS examination.

Table 1 Characteristics of the malignant cases in the data set

| Malignant cases | Mean age (SD) | N symptomatic:screening | N FFDM neg:pos | Mean lesion size in mm (SD) | Lymph node metastasis | HR+ HER2- | HR+ HER2+ | HR- HER2- | HR- HER2+ | Unknown receptor status | †Grade I | †Grade II | †Grade III | Grade unknown |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Total (n = 30) | 49.8 (12.1) | 17:13 | 20:10 | 16.0 (8.8) | 8 | 16 | 2 | 3 | 4 | 4 | 4 | 14 | 10 | 2 |
| Invasive ductal carcinoma (n = 22) | 48 (11.1) | 15:7 | 14:8 | 16.9 (9.9) | 6 | 11 | 2 | 3 | 3 | 3 | 2 | 10 | 8 | 2 |
| Invasive lobular carcinoma (n = 3) | 73.5 (4.9) | 1:2 | 2:1 | 14.7 (5.5) | 1 | 3 | 0 | 0 | 0 | 0 | 0 | 3 | 0 | 0 |
| Invasive metaplastic carcinoma (n = 2) | 47.0 (14.1) | 1:1 | 1:1 | 16.5 (2.1) | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 2 | 0 |
| Invasive tubular carcinoma (n = 1) | 52 | 0:1 | 1:0 | 7 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
| Invasive intracystic papillary carcinoma (n = 1) | 45 | 0:1 | 1:0 | 12 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
| Non-invasive intracystic papillary carcinoma (n = 1) | 49 | 0:1 | 1:0 | 14 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 |

† Nottingham histological grade (modified Bloom-Richardson-Elston)
FFDM Full-field digital mammography
HR Hormone receptor status (oestrogen and progesterone receptors)
HER2 Human epidermal growth factor receptor 2 status

Table 2 Characteristics of women with an ABUS examination labelled as 'benign' and 'normal'

| | Mean age (SD) | N symptomatic:screening | Mean size (SD) |
|---|---|---|---|
| Normal cases (n = 60) | 42.0 (9.5) | 4:56 | N/A |
| Benign cases total (n = 30) | 44.9 (9.1) | 15:15 | 12.4 (5.1) |
| Fibroadenoma (n = 12) | 42.9 (5.3) | 7:5 | 12.4 (5.7) |
| Fibrosis/adenosis (n = 5) | 43.6 (6.3) | 1:4 | 10.2 (4.1) |
| Cystic lesions (n = 5) | 46.6 (8.8) | 3:2 | 14.8 (7.8) |
| Other benign breast tissue (n = 5) | 54.6 (13.0) | 3:2 | 12.2 (1.9) |
| Papilloma (n = 2) | 38.5 (9.2) | 1:1 | 14.0 (2.8) |
| Complex sclerosing lesion (n = 1) | 30.0 | 0:1 | 8.0 (0.0) |

SD Standard deviation

Screening performance

Figure 2 and Table 3 summarise the screening performance per reader. On average, the sensitivity of unaided conventional ABUS reading (84%, 95% CI 78-88) was similar to the sensitivity in the CAD-based ABUS reading protocol (84%, 95% CI 79-89) (p = 0.90). Nevertheless, half of the readers detected more cancers with CAD, while only two readers detected fewer cancers using the CAD-based reading protocol. In the CAD-based readings 6 out of 8 readers placed markers on a total of 11 lesions that were actually malignant, but still classified them as benign (BI-RADS 2). In the unaided ABUS reading this happened only in four readers and a total of five malignant lesions. Hence CAD helped in the detection of additional cancers but could not always induce an adequate classification by the readers.

The average specificity for conventional ABUS reading was 67% (95% CI 64-70) and this increased to 71% (95% CI 68-75) in the CAD-based reading strategy, although this did not reach statistical significance (p = 0.08). The PPV was on average 13.6% higher for the CAD-based ABUS reading (50.0%, 95% CI 45-55) compared to the conventional ABUS reading (44.0%, 95% CI 39-49) (also not significant, p = 0.07). Overall, seven out of eight readers had higher specificity and PPV with CAD than without. Specificity was significantly higher in three out of eight readers (readers 1, 4, and 6; Table 3). Nevertheless, the AUCs did not statistically differ between the conventional ABUS reading and the CAD-based workflow (0.82, 95% CI 0.73-0.92 and 0.83, 95% CI 0.75-0.92, respectively) (p = 0.53) (Fig. 3).

Discussion

Our study shows that CAD software for ABUS can help radiologists to evaluate ABUS examinations more efficiently. Radiologists who screen for breast cancer may use CAD software to evaluate batches of ABUS examinations 15.7% faster, without decreasing their performance in terms of cancer detection. Interestingly, the higher specificity and PPV of the CAD-based reading mode suggest that the use of CAD software for ABUS may help radiologists avoid unnecessary recalls of healthy women, although this did not reach statistical significance. Our results might facilitate further implementation of ABUS. Supplemental ABUS in women with mammographically dense breasts helps radiologists detect early stage cancers that are occult on mammography [11–13]. Supplemental US screening reduces the interval cancer rate in women with dense breasts [2, 21], which in general is associated with improved outcome [6]. Unfortunately, 31% of cancers in supplemental US screening are found to be already visible on a prior screening US examination and could still have been detected earlier [22]. Reasons for non-detection in WBUS screening are usually misinterpretation and oversight errors. In our study, oversight errors in malignant cases were more often observed in conventional ABUS reading than in the CAD-based reading. In fact, half of the readers detected and correctly classified more cancers in the CAD-based readings than in conventional ABUS reading. Nevertheless, of the missed cancers several were still marked by six readers in the CAD-based reading, but wrongly classified as benign. Therefore, it appears that the CAD software has the potential to prevent oversight errors in ABUS but might require further development to also aid in characterising lesions. The very limited experience all readers had with the CAD system might also have partly contributed to the misclassification of malignant lesions.

Supplemental ABUS has been shown to increase the recall rate in breast cancer screening programmes [11, 13]. The implementation of an intelligent MinIP into the reading environment therefore also aims at improvement of specificity. The MinIP uses the greyscale contrast in B-mode ultrasound between lesions and healthy tissue to summarise the 3D volume in a 2D image; hence normal tissue appears lighter than cancers that show up as dark spots on the MinIP. Moreover, the CAD software also enhances the more suspicious regions by lowering the intensity of the lesion on the MinIP and strengthening the coronal retraction sign, which is highly suggestive of breast cancer in ABUS [23]. Consequently, the MinIP points out relevant lesions and reduces the suspiciousness of irrelevant regions in ABUS volumes. Our study indicates that using this CAD software might indeed decrease unnecessary recalls in ABUS by improving the specificity and PPV of radiologists.
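To make the projection idea concrete, the following is a minimal sketch of a plain minimum intensity projection in Python. It only illustrates the principle that the darkest voxel along the projection axis determines each pixel of the 2D summary image. It is not the QVCAD "intelligent" MinIP, which additionally enhances suspicious regions. The function name, the nested-list volume layout indexed as [slice][row][column], the choice of projection axis, and the toy intensities are all assumptions made for this example.

```python
from typing import List, Sequence

Volume = Sequence[Sequence[Sequence[float]]]  # assumed layout: [slice][row][column]


def min_intensity_projection(volume: Volume) -> List[List[float]]:
    """Collapse a 3D volume to 2D by keeping the minimum intensity along the slice axis."""
    slices = list(volume)
    if not slices or not slices[0] or not slices[0][0]:
        raise ValueError("volume must contain at least one non-empty slice")
    n_rows, n_cols = len(slices[0]), len(slices[0][0])
    return [
        [min(s[r][c] for s in slices) for c in range(n_cols)]
        for r in range(n_rows)
    ]


# Toy volume: bright tissue (200) with one dark, lesion-like voxel (30) in the middle slice.
volume = [
    [[200, 200], [200, 200]],
    [[200, 30], [200, 200]],
    [[200, 200], [200, 200]],
]
projection = min_intensity_projection(volume)
print(projection)
```

In the toy volume the single dark voxel survives the projection as a dark pixel at row 0, column 1, while the bright background stays bright, which is the behaviour that lets a reader spot a hypoechoic region without scrolling through every slice.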
Although the overall results were not significant, a positive effect was still seen in seven out of eight readers. Whether ABUS CAD software in actual supplemental screening truly helps to decrease the recall rate and improve radiologists' specificity still needs to be investigated prospectively.

Fig. 2 Increment in sensitivity and specificity per reader after subtracting the sensitivity and the specificity of the conventional ABUS reading session from those of the CAD-based workflow reading session. Ideally all readers perform within the upper right quadrant

Table 3 Individual performance per reader for the conventional ABUS reading and the CAD-based workflow reading

| Reader (years of ABUS experience) | Mode | Sensitivity | 95% CI (low, high) | p value | Specificity | 95% CI (low, high) | p value | PPV | 95% CI (low, high) | p value | AUC | 95% CI (low, high) | p value |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 (5) | ABUS | 0.80 | 0.67, 0.93 | | 0.79 | 0.71, 0.88 | | 0.56 | 0.42, 0.70 | | 0.77 | 0.64, 0.91 | |
| | CAD | 0.77 | 0.60, 0.90 | 1.00 | 0.89 | 0.82, 0.96 | 0.03 | 0.70 | 0.55, 0.85 | 0.22 | 0.83 | 0.71, 0.94 | 0.34 |
| 2 (8) | ABUS | 0.83 | 0.67, 0.97 | | 0.69 | 0.60, 0.79 | | 0.47 | 0.34, 0.60 | | 0.79 | 0.66, 0.92 | |
| | CAD | 0.83 | 0.7, 0.97 | 1.00 | 0.71 | 0.62, 0.80 | 0.82 | 0.49 | 0.35, 0.63 | 0.85 | 0.8 | 0.67, 0.93 | 0.93 |
| 3 (0) | ABUS | 0.73 | 0.57, 0.87 | | 0.73 | 0.64, 0.82 | | 0.48 | 0.35, 0.61 | | 0.73 | 0.60, 0.87 | |
| | CAD | 0.80 | 0.63, 0.93 | 0.63 | 0.74 | 0.66, 0.83 | 1.00 | 0.51 | 0.36, 0.64 | 0.76 | 0.78 | 0.67, 0.90 | 0.27 |
| 4 (5) | ABUS | 0.80 | 0.63, 0.90 | | 0.64 | 0.54, 0.74 | | 0.43 | 0.30, 0.55 | | 0.85 | 0.75, 0.95 | |
| | CAD | 0.80 | 0.63, 0.90 | 1.00 | 0.80 | 0.71, 0.88 | 0.001 | 0.57 | 0.43, 0.71 | 0.16 | 0.87 | 0.78, 0.96 | 0.51 |
| 5 (5) | ABUS | 0.87 | 0.73, 0.97 | | 0.68 | 0.58, 0.77 | | 0.47 | 0.35, 0.60 | | 0.88 | 0.79, 0.98 | |
| | CAD | 0.90 | 0.80, 1.00 | 1.00 | 0.71 | 0.61, 0.80 | 0.70 | 0.51 | 0.38, 0.64 | 0.70 | 0.87 | 0.77, 0.97 | 0.79 |
| 6 (5) | ABUS | 0.93 | 0.83, 1.00 | | 0.42 | 0.32, 0.52 | | 0.35 | 0.25, 0.45 | | 0.87 | 0.79, 0.96 | |
| | CAD | 0.77 | 0.60, 0.90 | 0.06 | 0.68 | 0.58, 0.78 | < 0.001 | 0.44 | 0.31, 0.58 | 0.29 | 0.81 | 0.71, 0.91 | 0.21 |
| 7 (0) | ABUS | 0.87 | 0.73, 0.97 | | 0.74 | 0.66, 0.83 | | 0.53 | 0.39, 0.67 | | 0.88 | 0.78, 0.96 | |
| | CAD | 0.93 | 0.83, 1.00 | 0.5 | 0.82 | 0.74, 0.90 | 0.19 | 0.64 | 0.48, 0.77 | 0.30 | 0.92 | 0.85, 0.99 | 0.21 |
| 8 (0) | ABUS | 0.83 | 0.70, 0.97 | | 0.51 | 0.41, 0.61 | | 0.36 | 0.25, 0.48 | | 0.81 | 0.70, 0.92 | |
| | CAD | 0.90 | 0.80, 1.00 | 0.5 | 0.43 | 0.33, 0.53 | 0.35 | 0.35 | 0.24, 0.46 | 0.84 | 0.81 | 0.71, 0.91 | 0.96 |
| Pooled | ABUS | 0.84 | 0.78, 0.88 | | 0.67 | 0.64, 0.70 | | 0.44 | 0.39, 0.49 | | 0.82 | 0.73, 0.92 | |
| | CAD | 0.84 | 0.79, 0.89 | 0.90 | 0.71 | 0.68, 0.75 | 0.08 | 0.50 | 0.45, 0.55 | 0.07 | 0.83 | 0.75, 0.92 | 0.53 |

Sensitivity, specificity, and PPV are based on the BI-RADS assessment per case. The AUC is based on a BI-RADS-based linear rating scale from 0-100
ABUS Automated breast ultrasound reading
CAD Computer-aided detection-based workflow reading
PPV Positive predictive value (for all recommendations other than routine screening follow-up)
AUC Area under the curve
95% CI 95% confidence interval

Fig. 3 Alternative free-response receiver-operating characteristic curves for conventional ABUS reading (striped intervals) and computer-aided detection based workflow reading (straight line). No statistical difference is observed between the areas under the curves

In a previous pilot study, we investigated the effect of CAD software for ABUS on the screening performance of readers when screening for breast cancer [24]. Our previous study showed that concurrent reading with CAD software may improve the accuracy of radiologists for evaluation of single ABUS volumes. In the current study, the CAD software was implemented into a specific CAD-based screening workflow to boost the reading speed during batch reading of whole-breast ABUS examinations. The purpose of this study was therefore to investigate the effect of CAD software on the efficiency rather than on the accuracy. In addition, this study was performed using whole-breast examinations only from women with heterogeneously dense or extremely dense breasts, thus creating a data set that is representative for supplementary screening with ABUS in dense breasts.

The mean reading time of a unilateral ABUS examination with an average of three volumes per breast without CAD software in our study was 158.3 s, which is in line with the previously reported 3-9 min for a bilateral WBUS examination [11, 16, 25]. However, our study data set was enriched with cancers and suspicious benign cases, which likely increases the reading time per case. Our CAD-based reading workflow decreased the average reading time by 15.7% to 133.4 s per unilateral ABUS examination. The improvement in reading speed was higher in normal and benign cases than in malignant cases. We therefore expect that this gain in efficiency in a true screening setting could be higher than in our study.

Navigation of the ABUS examinations using the CAD-enhanced MinIP can be performed relatively quickly. But in our study the readers were instructed to evaluate all dark spots and CAD marks in the MinIP and subsequently also scan the coronal reconstructions of the ABUS volumes. As a consequence our instructions prolonged the reading time in the CAD-based reading sessions. Most breast radiologists are familiar with the concept of summarising relevant information of 3D breast imaging in a 2D image, as is common practice in tomosynthesis (synthetic mammogram) and in dynamic contrast-enhanced breast MRI [maximum intensity projections (MIP)]. Kuhl et al. reported that looking only at MIPs is a reliable and fast (3-30 s per case) approach to breast cancer screening with MRI [26].
The CAD-enhanced MinIP in our study could theoretically be used in a similar way, thus further reducing the reading time required per ABUS volume. However, future studies need to elucidate the effect this may have on the sensitivity of ABUS.

Our study has limitations. We did not show corresponding mammograms with the ABUS examinations although these modalities are complementary in most screening regimes of women with dense breasts and this might positively or negatively affect the screening performance. Furthermore, we enriched the data set with benign and malignant lesions from both screening and diagnostic examinations to increase the power in this study. By doing so, our study data set does not represent clinical practice where the prevalence of benign and malignant lesions is lower. Finally, multiple readers had little experience with ABUS and all readers were inexperienced with the CAD software package that we implemented in our screening environment, which may have negatively affected the screening performance and reading time.

Table 4 Average reading time per reader for both conventional ABUS reading and the CAD-based reading workflow

| Reader (years of ABUS experience) | Average reading time ABUS (s) | 95% CI (low, high) | Average reading time CAD-ABUS (s) | 95% CI (low, high) | Percentage decrease | p value |
|---|---|---|---|---|---|---|
| 1 (5) | 171.2 | 156.5, 186.5 | 166.0 | 150.4, 181.0 | 3.1 | 0.56 |
| 2 (8) | 145.4 | 132.4, 159.1 | 136.1 | 124.5, 149.6 | 6.5 | 0.24 |
| 3 (0) | 146.7 | 132.6, 162.2 | 123.4 | 113.0, 134.3 | 15.9 | < 0.001 |
| 4 (5) | 175.2 | 158.7, 190.8 | 140.8 | 130.2, 150.1 | 19.7 | 0.001 |
| 5 (5) | 101.2 | 95.7, 108.4 | 91.2 | 84.7, 97.7 | 9.9 | 0.008 |
| 6 (5) | 138.6 | 127.1, 151.1 | 110 | 100.1, 119.4 | 20.6 | 0.001 |
| 7 (0) | 217.2 | 197.9, 236.2 | 160.1 | 148.0, 172.3 | 26.3 | 0.001 |
| 8 (0) | 173.3 | 173.3, 185.2 | 140.9 | 132.3, 150.0 | 18.7 | 0.001 |
| Pooled: average | 158.3 | 153.0, 163.6 | 133.4 | 129.2, 137.6 | 15.7 | < 0.001 |
| Pooled: normal | 151.0 | 143.6, 158.4 | 125.7 | 120.0, 131.4 | 16.8 | < 0.001 |
| Pooled: benign | 163.0 | 152.6, 173.3 | 134.8 | 126.4, 143.1 | 17.3 | < 0.001 |
| Pooled: malignant | 169.3 | 158.8, 180.0 | 148.8 | 140.2, 157.5 | 12.1 | 0.003 |

All readers were faster with CAD software. Six of eight readers were significantly faster
ABUS Automated breast ultrasound
CAD Computer-aided detection software

In conclusion, our study shows that the CAD software developed for ABUS has the potential to improve the efficiency of reading ABUS by significantly improving the reading speed without decreasing the screening performance. Further research is warranted in a prospective study to investigate the effect of CAD on breast cancer detection, screening recalls, and the interval cancer rate in screening programmes.

Fig. 4 Histograms for reading time needed to read all cases in a conventional ABUS protocol (striped interval) and for reading in a CAD-based workflow protocol (straight)

Funding This study has received funding by European Union's Seventh Framework programme FP7 under grant agreement no. 306088.

Compliance with ethical standards

Guarantor The scientific guarantor of this publication is Prof. Dr. N. Karssemeijer.

Conflict of interest The authors of this manuscript declare relationships with the following companies: Dr. N. Karssemeijer is CEO of Screenpoint Medical Inc. and a shareholder in Qview Medical Inc. and Matakina Ltd. Dr. R. Mann is speaker for Siemens Healthcare.

Statistics and biometry One of the authors has significant statistical expertise.

Ethical approval Institutional Review Board approval was obtained.

Informed consent Written informed consent was waived by the Institutional Review Board.

Methodology
• retrospective
• multiple case-multiple reader study
• performed at one institution

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

References

1. Wanders JOP, Holland K, Veldhuis WB et al (2017) Volumetric
8. Hooley RJ (2017) Breast density legislation and clinical evidence. Radiol Clin North Am 55:513–526
9. The Austrian Breast Cancer Early Detection Programma. http://www.frueh-erkennen.at/. Accessed 25 June 2017
10. Berg WA, Blume JD, Cormack JB, Mendelson EB (2006) Operator dependence of physician-performed whole-breast US: lesion detection and characterization. Radiology 241:355–365
11. Brem RF, Tabár L, Duffy SW et al (2015) Assessing improvement in detection of breast cancer with three-dimensional automated breast US in women with dense breast tissue: The SomoInsight Study. Radiology 274:663–673
12. Wilczek B, Wilczek HE, Rasouliyan L, Leifland K (2016) Adding 3D automated breast ultrasound to mammography screening in women with heterogeneously and extremely dense breasts: Report from a hospital-based, high-volume, single-center breast cancer screening program. Eur J Radiol 85:1554–1563
13. Giuliano V, Giuliano C (2012) Improved breast cancer detection in asymptomatic women using 3D-automated breast ultrasound in mammographically dense breasts. Clin Imaging 37:480–486
14. Choi WJ, Cha JH, Kim HH et al (2014) Comparison of automated breast volume scanning and hand-held ultrasound in the detection of breast cancer: an analysis of 5,566 patient evaluations. Asian Pac J Cancer Prev 15:9101–9105
15. Vourtsis A, Kachulis A (2017) The performance of 3D ABUS versus HHUS in the visualisation and BI-RADS characterisation of breast lesions in a large cohort of 1,886 women. Eur Radiol 1–10. https://doi.org/10.1007/s00330-017-5011-9
16. Kelly KM, Dean J, Comulada WS, Lee S-JJ (2010) Breast cancer detection using automated whole breast ultrasound and mammography in radiographically dense breasts. Eur Radiol 20:734–742
17. Tan T, Mordang J-J, van Zelst J et al (2015) Computer-aided detection of breast cancers using Haar-like features in automated 3D breast ultrasound. Med Phys 42:1498–1504
18. Evans KK, Birdwell RL, Wolfe JM (2013) If you don't find it often, you often don't find it: why some cancers are missed in breast cancer screening. PLoS One 8:e64366
19. Hillis SL, Berbaum KS, Metz CE (2008) Recent developments in the Dorfman-Berbaum-Metz procedure for multireader ROC study analysis. Acad Radiol 15:647–661
20. Dorfman DD, Berbaum KS, Metz CE (1992) Receiver operating characteristic rating analysis: Generalization to the population of readers and patients with the jackknife method.
Invest Radiol 27: breast density affects performance of digital screening mammogra- 723–731 phy. Breast Cancer Res Treat 162:95–103 21. Corsetti V, Houssami N, Ghirardi M et al (2011) Evidence of the 2. Ohuchi N, Suzuki A, Sobue T et al (2016) Sensitivity and specific- effect of adjunct ultrasound screening in women with ity of mammography and adjunctive ultrasonography to screen for mammography-negative dense breasts: interval breast cancers at 1 breast cancer in the Japan Strategic Anti-cancer Randomized Trial year follow-up. Eur J Cancer 47:1021–1026 (J-START): a randomised controlled trial. Lancet Jan 23;387:341– 22. Song SE, Cho N, Chu A et al (2015) Undiagnosed breast cancer: features at supplemental screening US. Radiology 277:372–380 3. Shen S, Zhou Y, Xu Y et al (2015) A multi-centre randomised trial 23. Van Zelst JCM, Platel B, Karssemeijer N, Mann RM (2015) comparing ultrasound vs mammography for screening breast can- Multiplanar reconstructions of 3D automated breast ultrasound im- cer in high-risk Chinese women. Br J Cancer 112:998–1004 prove lesion differentiation by radiologists. Acad Radiol. Dec;22: 4. Berg WA, Blume JD, Cormack JB et al (2008) Combined screening 1489-1496 with ultrasound and mammography vs mammography alone in 24. Van Zelst JCM, Tan T, Platel B et al (2017) Improved cancer de- women at elevated risk of breast cancer. JAMA 299:2151–2163 tection in automated breast ultrasound by radiologists using com- 5. Bae MS, Moon WK, Chang JM et al (2014) Breast cancer detected puter aided detection. Eur J Radiol 89:54–59 with screening US: reasons for nondetection at mammography. 25. Skaane P, Gullien R, Eben EB et al (2015) Interpretation of auto- Radiology 270:369–377 mated breast ultrasound (ABUS) with and without knowledge of 6. Saadatmand S, Bretveld R, Siesling S, Tilanus-Linthorst MMA mammography: a reader performance study. 
Acta Radiol 56:404– (2015) Influence of tumour stage at breast cancer detection on sur- vival in modern times: population based study in 173 797 patients. 26. Kuhl CK, Schrading S, Strobel K et al (2014) Abbreviated breast BMJ Oct 6;351:h4901 magnetic resonance imaging (MRI): first postcontrast subtracted 7. Durand MA, Hooley RJ (2017) Implementation of whole-breast images and maximum-intensity projection-a novel approach to screening ultrasonography. Radiol Clin North Am 55:527–539 breast cancer screening with MRI. J Clin Oncol 32:2304–2310 http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png European Radiology Springer Journals
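The pooled figures reported in Table 4 and the recall rule described in the statistical analysis (a case counts as positive at BI-RADS 3 or higher) lend themselves to a quick arithmetic cross-check. The following is a minimal sketch, not the authors' analysis code: the `Reading` record layout, the function names, and the toy readings are assumptions made for illustration, and the sketch omits the study's extra rule that a marker placed outside the annotated lesion counts as a missed cancer.

```python
from dataclasses import dataclass

# Pooled mean reading times (s/case) as printed in Table 4.
ABUS_MEAN_S = 158.3
CAD_ABUS_MEAN_S = 133.4


def percentage_decrease(baseline: float, new: float) -> float:
    """Relative reduction of `new` versus `baseline`, in percent."""
    return (baseline - new) / baseline * 100.0


@dataclass
class Reading:
    """One reader's assessment of one case (hypothetical record layout)."""
    birads: int       # highest BI-RADS score the reader gave to the case
    malignant: bool   # ground truth from pathology or follow-up


def screening_metrics(readings, positive_from=3):
    """Sensitivity, specificity and PPV, treating BI-RADS >= positive_from as a recall."""
    tp = sum(r.malignant and r.birads >= positive_from for r in readings)
    fn = sum(r.malignant and r.birads < positive_from for r in readings)
    fp = sum((not r.malignant) and r.birads >= positive_from for r in readings)
    tn = sum((not r.malignant) and r.birads < positive_from for r in readings)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
    }


# Toy readings, invented for illustration only (not study data).
TOY_READINGS = (
    [Reading(b, True) for b in (4, 5, 2, 3)]
    + [Reading(b, False) for b in (1, 2, 3, 1, 2, 4)]
)

if __name__ == "__main__":
    saved = ABUS_MEAN_S - CAD_ABUS_MEAN_S
    decrease = percentage_decrease(ABUS_MEAN_S, CAD_ABUS_MEAN_S)
    print(f"{saved:.1f} s/case saved, {decrease:.1f}% decrease")
    print(screening_metrics(TOY_READINGS))
```

With the pooled means from Table 4 the first function reproduces the reported 15.7% reduction (24.9 s/case). The confidence intervals and p values in the tables come from bootstrapping and GEE models, which this sketch does not attempt.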

Dedicated computer-aided detection software for automated 3D breast ultrasound; an efficient tool for the radiologist in supplemental screening of women with dense breasts

Publisher
Springer Journals
Copyright
Copyright © 2018 by The Author(s)
Subject
Medicine & Public Health; Imaging / Radiology; Diagnostic Radiology; Interventional Radiology; Neuroradiology; Ultrasound; Internal Medicine
ISSN
0938-7994
eISSN
1432-1084
DOI
10.1007/s00330-017-5280-3
pmid
29417251

Abstract

Objectives To determine the effect of computer-aided-detection (CAD) software for automated breast ultrasound (ABUS) on reading time (RT) and performance in screening for breast cancer.

Material and methods Unilateral ABUS examinations of 120 women with dense breasts were randomly selected from a multi-institutional archive of cases including 30 malignant (20/30 mammography-occult), 30 benign, and 60 normal cases with histopathological verification or ≥ 2 years of negative follow-up. Eight radiologists read once with (CAD-ABUS) and once without CAD (ABUS) with > 8 weeks between reading sessions. Readers provided a BI-RADS score and a level of suspiciousness (0-100). RT, sensitivity, specificity, PPV and area under the curve (AUC) were compared.

Results Average RT was significantly shorter using CAD-ABUS (133.4 s/case, 95% CI 129.2-137.6) compared with ABUS (158.3 s/case, 95% CI 153.0-163.3) (p < 0.001). Sensitivity was 0.84 for CAD-ABUS (95% CI 0.79-0.89) and ABUS (95% CI 0.78-0.88) (p = 0.90). Three out of eight readers showed significantly higher specificity using CAD. Pooled specificity (0.71, 95% CI 0.68-0.75 vs. 0.67, 95% CI 0.64-0.70, p = 0.08) and PPV (0.50, 95% CI 0.45-0.55 vs. 0.44, 95% CI 0.39-0.49, p = 0.07) were higher in CAD-ABUS vs. ABUS, respectively, albeit not significantly. Pooled AUC for CAD-ABUS was comparable with ABUS (0.82 vs. 0.83, p = 0.53, respectively).

Conclusion CAD software for ABUS may decrease the time needed to screen for breast cancer without compromising the screening performance of radiologists.

Key Points
• ABUS with CAD software may speed up reading time without compromising radiologists' accuracy.
• CAD software for ABUS might prevent non-detection of malignant breast lesions by radiologists.
• Radiologists reading ABUS with CAD software might improve their specificity without losing sensitivity.

Keywords Ultrasonography; Breast neoplasms; Diagnosis, Computer-assisted; Mammography; Early detection of cancer

* Jan C. M. van Zelst
jan.vanzelst@radboudumc.nl

Department of Radiology and Nuclear Medicine, Radboud University Medical Centre Nijmegen (NL), Geert Grooteplein 10, 6525 GA Nijmegen, The Netherlands
Department of Radiology, Jeroen Bosch Hospital, 's-Hertogenbosch (NL), 's-Hertogenbosch, Netherlands
Department of Radiology, University Medical Centre Utrecht (NL), Utrecht, Netherlands
Department of Radiology, Centre Diagnosi per la Imatge Tarragona (E), Tarragona, Spain
Center for Medical Imaging and Department of Radiology, University Medical Centre Groningen (NL), Groningen, Netherlands
MeVis Medical Solutions, Bremen (DE), Bremen, Germany
Department of Gynaecology and Obstetrics, Universitäts-Frauenklinik Heidelberg (D), Heidelberg, Germany
Department of Biomedical Imaging and Image Guided Therapy, Division of Molecular and Gender Imaging, Medical University of Vienna/Vienna General Hospital (A), Vienna, Austria

Abbreviations

ABUS Automated breast ultrasound
ABVS Automated breast volume scanner
AFROC Alternative free-response receiver-operator characteristics
ANOVA Analysis of variance
AUC Area under the ROC curve
CAD Computer-aided detection
FFDM Full-field digital mammography
GEE Generalised estimation equation
HER2 Human epidermal growth factor receptor 2 status
HR Hormone receptor status
IRB Institutional review board
LOS Level of suspiciousness
MinIP Minimum intensity projection
MIP Maximum intensity projection
MRI Magnetic resonance imaging
MRMC Multiple reader multiple case
PPV Positive predictive value
ROC Receiver-operator characteristics
RT Reading time
US Ultrasound
WBUS Whole-breast ultrasound

Introduction

In mammographic screening the sensitivity in women with extremely dense breasts is only 61% [1]. A four times higher interval cancer rate is reported for these women compared with women with fatty breasts [1]. Supplemental ultrasound (US) is an effective imaging method to detect mammography-negative early stage invasive breast cancer in women with heterogeneously and extremely dense breasts [2–4], thus reducing the frequency of symptomatic interval carcinomas [5]. This is crucial, because detection of breast cancer at an early stage substantially improves prognosis, even when using modern therapy regimes [6]. This explains the rationale and ratification of the breast density inform laws in many states in the USA [7, 8] and the introduction of supplemental whole-breast ultrasound (WBUS) screening in Austria [9].

Performing supplemental WBUS with handheld devices has limitations. It is relatively time consuming and difficult to compare to prior examinations. Furthermore, handheld WBUS screening is operator dependent and should therefore be performed by trained sonographers, which consequently requires substantial resources [10]. Automated 3D breast US (ABUS) devices have been developed to improve the reproducibility of WBUS and decrease the need for highly trained sonographers. An ABUS examination consists of a set of large 3D volumes for each breast acquired with a wide automatically driven linear array transducer. The number of volumes depends on the size of the breast and in large breasts up to five volumes per breast are acquired. There is mounting evidence that, similar to handheld ultrasound, ABUS devices also lead to the detection of mammography-negative invasive breast cancers [11–15].

A downside of supplemental ultrasound screening is the detection of mammographically occult benign lesions that warrant histological verification [11, 13, 16], thus decreasing the specificity of screening. ABUS devices do allow storage of full breast ultrasound volumes, which enables the radiologist to compare examinations with relevant priors, which is expected to improve specificity in follow-up examinations.

Due to the large number of images in the scan, reading a full ABUS examination can be lengthy and cancers may easily be overlooked [12]. Computer-aided detection (CAD) software for ABUS has been developed to aid radiologists in the interpretation of ABUS studies [17]. CAD software should reduce the reading time of supplemental ABUS and may have the potential to improve the screening performance of radiologists. To investigate the effectiveness of this approach, we investigated the effect of commercially available CAD software for ABUS on the reading time and screening performance of breast radiologists.

Materials and methods

The need for informed consent for this study was waived by the institutional review board (IRB).

ABUS acquisitions

ABUS examinations were performed with ACUSON S2000 Automated Breast Volume Scanner systems (Siemens, Erlangen, Germany). This ABUS system acquires 3D B-mode ultrasound volumes over an area of 154 mm × 156 mm using a mechanically driven linear array transducer (14L5). Adequate depth and focus can be obtained using predefined settings for different breast cup sizes. All ABUS examinations were performed by technicians. To ensure coverage of the entire breast two to five overlapping acquisitions were performed at predefined locations. The number of acquisitions depends on the size of the breasts and the possibility to compress the breasts. Per acquisition 318 slices of 0.5 mm thickness are obtained. A dedicated ABUS workstation reconstructs the transverse slices into a 3D volume that can be read in a multiplanar hanging, also showing sagittal and coronal reconstructions.

Data and gold standard

Cases were selected from a large multi-institutional imaging archive that consisted of 2158 ABUS examinations in 1086 women acquired between August 2010 and February 2015 from screening programmes for women at average, intermediate, and high risk and symptomatic women. For each woman a full-field digital mammography (FFDM) examination was also available.

To select only cases with high breast density, breast density was determined using an automated volumetric software package (Volpara Density, Matakina Ltd., Wellington, New Zealand) on 1657 available unprocessed FFDM images. For 501 examinations, where unprocessed FFDM images were not available, breast density was visually assessed according to the BI-RADS lexicon. Examinations of 115 women with a history of breast surgery were excluded; 1187 unilateral examinations of breasts in 715 women were scored as Volpara Density Grade 3 and 4 or BI-RADS density categories C or D. We categorised these dense cases as "normal" (n = 919), "benign" (n = 140), or "malignant" (n = 128) based on radiology and pathology reports from histopathological examinations. "Normal" and non-biopsied "benign" cases were only considered if at least 2 years of negative follow-up was available. Subsequently, from these women with dense breasts, we included all cases with a mammography-negative malignant lesion (n = 20), ten randomly selected malignant cases that were positive on both mammography and ABUS, 30 biopsied benign cases and 60 "normal" cases in the study data set. The study data set thus consisted of 120 unilateral ABUS evaluations, yielding a total of 375 ABUS volumes. The selected cases were anonymised and stripped from information such as age, study date, and imaging institute. All lesions were annotated by a breast imaging researcher with > 3 years of experience with ABUS based on pathology and radiology reports. These annotations served as the ground truth for observer and CAD software detection performance.

CAD software and reading workstation

A prototype workstation was designed and developed specifically for the task of high-throughput ABUS screening in this observer study (MeVis Medical Solutions, Bremen, Germany). In this prototype, each user action was logged with time stamps that were subsequently used to estimate the time spent per case. Commercially developed CAD software (QVCAD, Qview Medical Inc., Los Altos, CA) was integrated into this workstation. This CAD software is designed to detect suspicious region candidates in an ABUS volume and mark them with so-called CAD marks (Fig. 1). In addition, QVCAD software provides an "intelligent" minimum intensity projection (MinIP) of the breast tissue in a 3D ABUS volume that can be used for rapid navigation through ABUS scans and enhances possible suspicious regions. The number of CAD marks displayed can be adjusted by setting the average number of false-positive CAD marks per ABUS volume. In this study, we chose the default setting of one false-positive CAD mark per ABUS volume.

Readers

Seven breast radiologists and one gynaecologist specialised in breast imaging were invited to participate in this study. By inviting readers from different institutes and countries we aimed to increase the applicability of our results to breast imaging practices in different countries, realising that different readers might have slightly varying standards and customs. In some countries, other clinicians are also involved in interpreting breast-imaging examinations. Therefore, we also invited a non-radiologist (gynaecologist) who specialises in breast imaging with approximately 10 years of experience in breast ultrasound and mammography and 8 years of experience with ABUS. Experience with breast imaging for reader one to reader eight was 7, 10, 4, 8, 8, 20, 4, and 20 years and specifically with ABUS was 5, 8, 0, 5, 5, 5, 0, and 0 years, respectively.

Study design

All eight readers evaluated all cases twice in two separate reading sessions in an independent crossover multi-reader-multi-case (MRMC) study. In each session half of the ABUS cases were read conventionally and half of the cases were read using a CAD-based workflow designed for this study. We counterbalanced the reading modes and changed the case order by randomisation for each reader per reading session. The reading sessions were at least 8 weeks apart (average 11.0 weeks, range 8.3-13.1) to further minimise any effect of memory bias.

Standard ABUS reading was performed in a multiplanar hanging without CAD software. CAD-based reading was performed according to specific instructions of a two-step reading protocol. The first step was to evaluate all CAD marks and dark spots on the MinIP in a case. Subsequently, readers were instructed to scan the coronal reconstruction of each ABUS view in a hanging protocol where coronal reconstructions of all ABUS views of a breast are simultaneously shown.

The readers performed a training session of 20 cases to become familiar with the workstation, reading protocol, and CAD software. Readers were given a rough estimate (10-30%) of the prevalence of cancer in the study data set because the criteria for a recall may vary between radiologists who, as in our study, work at different institutes and in different countries [18] and may depend on the prevalence of cancers they expect.

In both CAD-based and conventional reading the readers were instructed to mark and rate lesions by placing a finding marker and subsequently determine a BI-RADS assessment score. Because a quasi-continuous linear scale is required to perform receiver-operating characteristic (ROC) analysis, readers were also asked to provide a level of suspiciousness (LOS) score on a scale from 0-100. Note that LOS is not a probability of malignancy as described in the BI-RADS atlas. Instead, readers were recommended to use anchor points referring to the BI-RADS scores with LOS values of 21, 41, 61, and 81 corresponding to the BI-RADS 1/2, 2/3, 3/4, and 4/5 transitions.

Fig. 1 CAD-based minimum intensity projection (MinIP) integrated in a multiplanar hanging protocol for ABUS that shows the conventional ABUS planes. The top plane shows the transverse acquisitions, the lower left plane the coronal reconstructions, and the lower right plane the sagittal reconstruction. The MinIP (bottom row in the middle) is a 2D image where lower intensity regions in the 3D ABUS volume are enhanced as dark spots. By clicking on the dark spot, the 3D multiplanar hanging automatically snaps to the corresponding 3D location. The CAD marks (coloured square) are displayed on the MinIP

Statistical analysis

We determined the sensitivity, specificity, and positive predictive value (PPV) in both reading modes based on BI-RADS scores and compared these parameters per reader using paired McNemar's and chi-square tests with bootstrapping (1000 samples) to determine the 95% confidence intervals (CI) for individual readers and generalised estimation equation (GEE) for pooled data to correct for repeated measurements. An examination was considered positive if a BI-RADS 3 score (and its anchor point equivalent of 41 on the LOS scale) or higher was given.

Furthermore, we determined the area under the curve (AUC) and 95% CI using an alternative free-response receiver-operating characteristics (AFROC) [19, 20]. For these analyses, when multiple findings were present in a case, the finding with the highest rating was used. Ratings in malignant cases where the marker was placed outside of the annotated lesion margin were not included in the analysis and regarded as a false negative (missed cancer). By doing so, readers are not rewarded for a recall based on a false-positive finding accidentally occurring in a malignant case. We compared the AUCs for both reading modes for each reader individually and also pooled over all readers (random readers, random cases). Reading time was compared for each reader individually by using Student's t-test with 1000 bootstraps to determine the 95% CI and GEE for pooled data. Only the readings recorded within the 95th percentile were included in the analysis to correct for inactivity of the reader during the reading sessions.

The ROC analyses were performed using MRMC software (JAFROC, version 4.2.1). The GEE was performed using the 'geese' function in the 'geepack' package in R (v. 3.2.3, R Foundation for Statistical Computing, Vienna, Austria). All other analyses were performed with SPSS statistics 20.0 (IBM Statistics, Armonk, NY).

Results

Patient characteristics

Table 1 summarises the patient characteristics in women with breast cancer and Table 2 summarises patient characteristics of women with a "normal" or "benign" ABUS examination.

Table 1 Characteristics of the malignant cases in the data set

| Malignant cases | Mean age (SD) | N symptomatic:screening | N FFDM neg:pos | Mean lesion size in mm (SD) | Lymph node metastasis | HR+ HER2- | HR+ HER2+ | HR- HER2+ | HR- HER- | Unknown receptor status | †Grade I | †Grade II | †Grade III | Grade unknown |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Total (n = 30) | 49.8 (12.1) | 17:13 | 20:10 | 16.0 (8.8) | 8 | 16 | 2 | 3 | 4 | 4 | 4 | 14 | 10 | 2 |
| Invasive ductal carcinoma (n = 22) | 48 (11.1) | 15:7 | 14:8 | 16.9 (9.9) | 6 | 11 | 2 | 3 | 3 | 3 | 2 | 10 | 8 | 2 |
| Invasive lobular carcinoma (n = 3) | 73.5 (4.9) | 1:2 | 2:1 | 14.7 (5.5) | 1 | 3 | 0 | 0 | 0 | 0 | 0 | 3 | 0 | 0 |
| Invasive metaplastic carcinoma (n = 2) | 47.0 (14.1) | 1:1 | 1:1 | 16.5 (2.1) | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 2 | 0 |
| Invasive tubular carcinoma (n = 1) | 52 | 0:1 | 1:0 | 7 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
| Invasive intracystic papillary carcinoma (n = 1) | 45 | 0:1 | 1:0 | 12 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
| Non-invasive intracystic papillary carcinoma (n = 1) | 49 | 0:1 | 1:0 | 14 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 |

† Nottingham histological grade (modified Bloom-Richardson-Elston)
FFDM Full-field digital mammography
HR Hormone receptor status (oestrogen and progesterone receptors)
HER2 Human epidermal growth factor receptor 2 status

Table 2 Characteristics of women with an ABUS examination labelled as 'benign' and 'normal'

| | Mean age (SD) | N symptomatic:screening | Mean size (SD) |
|---|---|---|---|
| Normal cases (n = 60) | 42.0 (9.5) | 4:56 | N/A |
| Benign cases total (n = 30) | 44.9 (9.1) | 15:15 | 12.4 (5.1) |
| Fibroadenoma (n = 12) | 42.9 (5.3) | 7:5 | 12.4 (5.7) |
| Fibrosis/adenosis (n = 5) | 43.6 (6.3) | 1:4 | 10.2 (4.1) |
| Cystic lesions (n = 5) | 46.6 (8.8) | 3:2 | 14.8 (7.8) |
| Other benign breast tissue (n = 5) | 54.6 (13.0) | 3:2 | 12.2 (1.9) |
| Papilloma (n = 2) | 38.5 (9.2) | 1:1 | 14.0 (2.8) |
| Complex sclerosing lesion (n = 1) | 30.0 | 0:1 | 8.0 (0.0) |

SD Standard deviation

Screening performance

Figure 2 and Table 3 summarise the screening performance per reader. On average, the sensitivity of unaided conventional ABUS reading (84%, 95% CI 78-88) was similar to the sensitivity in the CAD-based ABUS reading protocol (84%, 95% CI 79-89) (p = 0.90). Nevertheless, half of the readers detected more cancers with CAD, while only two readers detected fewer cancers using the CAD-based reading protocol. In the CAD-based readings 6 out of 8 readers placed markers on a total of 11 lesions that were actually malignant, but still classified them as benign (BI-RADS 2). In the unaided ABUS reading this happened only in four readers and a total of five malignant lesions. Hence CAD helped in the detection of additional cancers but could not always induce an adequate classification by the readers.

The average specificity for conventional ABUS reading was 67% (95% CI 64-70) and this increased to 71% (95% CI 68-75) in the CAD-based reading strategy, although this did not reach statistical significance (p = 0.08). The PPV was on average 13.6% higher for the CAD-based ABUS reading (50.0%, 95% CI 45-55) compared to the conventional ABUS reading (44.0%, 95% CI 39-49) (also not significant, p = 0.07). Overall, seven out of eight readers had higher specificity and PPV with CAD than without. Specificity was significantly higher in three out of eight readers (readers 1, 4, and 6; Table 3). Nevertheless, the AUCs did not statistically differ between the conventional ABUS reading and the CAD-based workflow (0.82, 95% CI 0.73-0.92 and 0.83, 95% CI 0.75-0.92, respectively) (p = 0.53) (Fig. 3).

Fig. 2 Increment in sensitivity and specificity per reader after subtracting the sensitivity and the specificity of the conventional ABUS reading session from the CAD-based workflow reading session. Ideally all readers perform within the upper right quadrant

Table 3 Individual performance per reader for the conventional ABUS reading and the CAD-based workflow reading

| Reader (years of ABUS experience) | Mode | Sensitivity (95% CI) | p value | Specificity (95% CI) | p value | PPV (95% CI) | p value | AUC (95% CI) | p value |
|---|---|---|---|---|---|---|---|---|---|
| 1 (5) | ABUS | 0.80 (0.67, 0.93) | | 0.79 (0.71, 0.88) | | 0.56 (0.42, 0.70) | | 0.77 (0.64, 0.91) | |
| | CAD | 0.77 (0.60, 0.90) | 1.00 | 0.89 (0.82, 0.96) | 0.03 | 0.70 (0.55, 0.85) | 0.22 | 0.83 (0.71, 0.94) | 0.34 |
| 2 (8) | ABUS | 0.83 (0.67, 0.97) | | 0.69 (0.60, 0.79) | | 0.47 (0.34, 0.60) | | 0.79 (0.66, 0.92) | |
| | CAD | 0.83 (0.7, 0.97) | 1.00 | 0.71 (0.62, 0.80) | 0.82 | 0.49 (0.35, 0.63) | 0.85 | 0.8 (0.67, 0.93) | 0.93 |
| 3 (0) | ABUS | 0.73 (0.57, 0.87) | | 0.73 (0.64, 0.82) | | 0.48 (0.35, 0.61) | | 0.73 (0.60, 0.87) | |
| | CAD | 0.80 (0.63, 0.93) | 0.63 | 0.74 (0.66, 0.83) | 1.00 | 0.51 (0.36, 0.64) | 0.76 | 0.78 (0.67, 0.90) | 0.27 |
| 4 (5) | ABUS | 0.80 (0.63, 0.90) | | 0.64 (0.54, 0.74) | | 0.43 (0.30, 0.55) | | 0.85 (0.75, 0.95) | |
| | CAD | 0.80 (0.63, 0.90) | 1.00 | 0.80 (0.71, 0.88) | 0.001 | 0.57 (0.43, 0.71) | 0.16 | 0.87 (0.78, 0.96) | 0.51 |
| 5 (5) | ABUS | 0.87 (0.73, 0.97) | | 0.68 (0.58, 0.77) | | 0.47 (0.35, 0.60) | | 0.88 (0.79, 0.98) | |
| | CAD | 0.90 (0.80, 1.00) | 1.00 | 0.71 (0.61, 0.80) | 0.70 | 0.51 (0.38, 0.64) | 0.70 | 0.87 (0.77, 0.97) | 0.79 |
| 6 (5) | ABUS | 0.93 (0.83, 1.00) | | 0.42 (0.32, 0.52) | | 0.35 (0.25, 0.45) | | 0.87 (0.79, 0.96) | |
| | CAD | 0.77 (0.60, 0.90) | 0.06 | 0.68 (0.58, 0.78) | < 0.001 | 0.44 (0.31, 0.58) | 0.29 | 0.81 (0.71, 0.91) | 0.21 |
| 7 (0) | ABUS | 0.87 (0.73, 0.97) | | 0.74 (0.66, 0.83) | | 0.53 (0.39, 0.67) | | 0.88 (0.78, 0.96) | |
| | CAD | 0.93 (0.83, 1.00) | 0.5 | 0.82 (0.74, 0.90) | 0.19 | 0.64 (0.48, 0.77) | 0.30 | 0.92 (0.85, 0.99) | 0.21 |
| 8 (0) | ABUS | 0.83 (0.70, 0.97) | | 0.51 (0.41, 0.61) | | 0.36 (0.25, 0.48) | | 0.81 (0.70, 0.92) | |
| | CAD | 0.90 (0.80, 1.00) | 0.5 | 0.43 (0.33, 0.53) | 0.35 | 0.35 (0.24, 0.46) | 0.84 | 0.81 (0.71, 0.91) | 0.96 |
| Pooled | ABUS | 0.84 (0.78, 0.88) | | 0.67 (0.64, 0.70) | | 0.44 (0.39, 0.49) | | 0.82 (0.73, 0.92) | |
| | CAD | 0.84 (0.79, 0.89) | 0.90 | 0.71 (0.68, 0.75) | 0.08 | 0.50 (0.45, 0.55) | 0.07 | 0.83 (0.75, 0.92) | 0.53 |

Sensitivity, specificity, and PPV are based on the BI-RADS assessment per case. The AUC is based on a BI-RADS-based linear rating scale from 0-100
ABUS Automated breast ultrasound reading
CAD Computer-aided detection-based workflow reading
PPV Positive predictive value (for all recommendations other than routine screening follow-up)
AUC Area under the curve
95% CI 95% confidence interval

Fig. 3 Alternative free-response receiver-operating characteristic curves for conventional ABUS reading (striped intervals) and computer-aided detection based workflow reading (straight line). No statistical difference is observed between the areas under the curves

Reading time

Table 4 summarises the reading time for each individual reader. On average, reading unilateral ABUS examinations using CAD software decreases the overall reading time by 24.9 s/case (SE 3.43; p < 0.001) (Fig. 4), which is a reduction of 15.7%. All readers were faster using CAD software (range, 3.1%-26.3%). In six out of eight readers, the CAD-based workflow was significantly faster.

The average reading time for malignant cases decreased by 12.1% (20.5 s/case, SE 6.97, p = 0.003), for benign cases by 17.3% (28.2 s/case, SE 6.77, p ≤ 0.001), and for normal cases by 16.8% (25.3 s/case, SE 4.76) (p ≤ 0.001).

Discussion

Our study shows that CAD software for ABUS can help radiologists to evaluate ABUS examinations more efficiently. Radiologists who screen for breast cancer may use CAD software to evaluate batches of ABUS examinations 15.7% faster, without decreasing their performance in terms of cancer detection. Interestingly, the higher specificity and PPV of the CAD-based reading mode suggest that the use of CAD software for ABUS may help radiologists avoid unnecessary recalls of healthy women, albeit this did not reach statistical significance. Our results might facilitate further implementation of ABUS. Supplemental ABUS in women with mammographically dense breasts helps radiologists detect early stage cancers that are occult on mammography [11–13]. Supplemental US screening reduces the interval cancer rate in women with dense breasts [2, 21], which in general is associated with improved outcome [6]. Unfortunately, 31% of cancers in supplemental US screening are found to be already visible on a prior screening US examination and could still have been detected earlier [22]. Reasons for non-detection in WBUS screening are usually misinterpretation and oversight errors. In our study, oversight errors in malignant cases were more often observed in conventional ABUS reading than in the CAD-based reading. In fact, half of the readers detected and correctly classified more cancers in the CAD-based readings than in conventional ABUS reading. Nevertheless, of the missed cancers several were still marked by six readers in the CAD-based reading, but wrongly classified as benign. Therefore, it appears that the CAD software has the potential to prevent oversight errors in ABUS but might require further development to also aid in characterising lesions. Also the very limited experience all readers had with the CAD system might have partly contributed to the misclassification of malignant lesions.

Supplemental ABUS has been shown to increase the recall rate in breast cancer screening programmes [11, 13]. The implementation of an intelligent MinIP into the reading environment therefore also aims at improvement of specificity. The MinIP uses the greyscale contrast in B-mode ultrasound between lesions and healthy tissue to summarise the 3D volume in a 2D image; hence normal tissue appears lighter than cancers that show up as dark spots on the MinIP. Moreover the CAD software also enhances the more suspicious regions by lowering the intensity of the lesion on the MinIP and strengthening the coronal retraction sign, which is highly suggestive of breast cancer in ABUS [23]. Consequently, the MinIP points out relevant lesions and reduces the suspiciousness of irrelevant regions in ABUS volumes. Our study indicates that using this CAD software might indeed decrease unnecessary recalls in ABUS by improving the specificity and PPV of radiologists. Although the overall results were not significant, a positive effect was still seen in seven out of eight readers. Whether ABUS CAD software in actual supplemental screening truly helps to decrease the recall rate and improve radiologists' specificity still needs to be investigated prospectively.

In a previous pilot study, we investigated the effect of CAD software for ABUS on the screening performance of readers when screening for breast cancer [24]. Our previous study showed that concurrent reading CAD software may improve the accuracy of radiologists for evaluation of single ABUS volumes. In the current study, the CAD software was implemented into a specific CAD-based screening workflow to boost the reading speed during batch reading of whole-breast ABUS examinations. The purpose of this study was therefore to investigate the effect of CAD software on the efficiency rather than on the accuracy. In addition, this study was performed using whole-breast examinations only from women with heterogeneously dense or extremely dense breasts, thus creating a data set that is representative for supplementary screening with ABUS in dense breasts. The mean reading time of a unilateral ABUS examination with an average of three volumes per breast without CAD software in our study was 158.3 s, which is in line with previously reported 3-9 min for a bilateral WBUS examination [11, 16, 25]. However, our study data set was enriched with cancers and suspicious benign cases, which likely increases the reading time per case. Our CAD-based reading workflow decreased the average reading time by 15.7% to 133.4 s per unilateral ABUS examination. The improvement in reading speed was higher in normal and benign cases than in malignant cases. We therefore expect that this gain in efficiency in a true screening setting could be higher than in our study.

Navigation of the ABUS examinations using the CAD-enhanced MinIP can be performed relatively quickly. But in our study the readers were instructed to evaluate all dark spots and CAD marks in the MinIP and subsequently also scan the coronal reconstructions of the ABUS volumes. As a consequence our instructions prolonged the reading time in the CAD-based reading sessions. Most breast radiologists are familiar with the concept of summarising relevant information of 3D breast imaging in a 2D image, as is common practice in tomosynthesis (synthetic mammogram) and in dynamic contrast-enhanced breast MRI [maximum intensity projections (MIP)]. Kuhl et al. reported that looking only at MIPs is a reliable and fast (3-30 s per case) approach to breast cancer screening with MRI [26]. The CAD-enhanced MinIP in our study could theoretically be used in a similar way, thus further reducing the reading time required per ABUS volume. However, future studies need to elucidate the effect this may have on the sensitivity of ABUS.

Our study has limitations. We did not show corresponding mammograms with the ABUS examinations although these modalities are complementary in most screening regimes of women with dense breasts and this might positively or negatively affect the screening performance. Furthermore, we enriched the data set with benign and malignant lesions from both screening and diagnostic examinations to increase the power in this study.
By doing so, our study data set does not Eur Radiol (2018) 28:2996–3006 3005 Table 4 Average reading time per reader for both conventional ABUS reading and reading the CAD-based reading workflow Reader (years Average reading 95% CI (low, high) Average reading time 95% CI Percentage p value experience ABUS) time ABUS (s) CAD-ABUS (s) (low, high) decrease 1 (5) 171.2 156.5 186.5 166.0 150.4 181.0 3.1 0.56 2 (8) 145.4 132.4 159.1 136.1 124.5 149.6 6.5 0.24 3 (0) 146.7 132.6 162.2 123.4 113.0 134.3 15.9 < 0.001 4 (5) 175.2 158.7 190.8 140.8 130.2 150.1 19.7 0.001 5 (5) 101.2 95.7 108.4 91.2 84.7 97.7 9.9 0.008 6 (5) 138.6 127.1 151.1 110 100.1 119.4 20.6 0.001 7 (0) 217.2 197.9 236.2 160.1 148.0 172.3 26.3 0.001 8 (0) 173.3 173.3 185.2 140.9 132.3 150.0 18.7 0.001 Pooled Average 158.3 153.0 163.6 133.4 129.2 137.6 15.7 < 0.001 Normal 151.0 143.6 158.4 125.7 120.0 131.4 16.8 < 0.001 Benign 163.0 152.6 173.3 134.8 126.4 143.1 17.3 < 0.001 Malignant 169.3 158.8 180.0 148.8 140.2 157.5 12.1 0.003 All readers were faster with CAD software. Six of eight readers were significantly faster ABUS Automated breast ultrasound CAD Computer-aided detection software represent clinical practice where the prevalence of be- In conclusion, our study shows that the CAD software de- nign and malignant lesions is lower. Finally, multiple veloped for ABUS has the potential to improve the efficiency readers had little experience with ABUS and all readers of reading ABUS by significantly improving the reading were inexperienced with the CAD software package that speed without decreasing the screening performance. Further we implemented in our screening environment, which research is warranted in a prospective study to investigate the may have negatively affected the screening performance effect of CAD on breast cancer detection, screening recalls, and reading time. and the interval cancer rate in screening programmes. Fig. 
4 Histograms for reading time needed to read all cases in a conventional ABUS protocol (striped interval) and for reading in a CAD-based workflow protocol (straight) 3006 Eur Radiol (2018) 28:2996–3006 Funding This study has received funding by European Union's Seventh 8. Hooley RJ (2017) Breast density legislation and clinical evidence. Framework programme FP7 under grant agreement no. 306088. Radiol Clin North Am 55:513–526 9. The Austrian Breast Cancer Early Detection Programma. http:// www.frueh-erkennen.at/. Accessed 25 June 2017 Compliance with ethical standards 10. Berg WA, Blume JD, Cormack JB, Mendelson EB (2006) Operator dependence of physician-performed whole-breast US: lesion detec- Guarantor The scientific guarantor of this publication is Prof. Dr. N. tion and characterization. Radiology 241:355–365 Karssemeijer. 11. Brem RF, Tabár L, Duffy SW et al (2015) Assessing improvement in detection of breast cancer with three-dimensional automated Conflict of interest The authors of this manuscript declare relationships breast US in women with dense breast tissue: The SomoInsight with the following companies: Dr. N. Karssemeijer is CEO of Study. Radiology 274:663–673 Screenpoint Medical Inc. and a shareholder in Qview Medical Inc. and 12. Wilczek B, Wilczek HE, Rasouliyan L, Leifland K (2016) Adding Matakina Ltd. Dr. R. Mann is speaker for Siemens Healthcare. 3D automated breast ultrasound to mammography screening in women with heterogeneously and extremely dense breasts: Report Statistics and biometry One of the authors has significant statistical from a hospital-based, high-volume, single-center breast cancer expertise. screening program. Eur J Radiol 85:1554–1563 13. Giuliano V, Giuliano C (2012) Improved breast cancer detection in asymptomatic women using 3D-automated breast ultrasound in Ethical approval Institutional Review Board approval was obtained. mammographically dense breasts. Clin Imaging 37:480–486 14. 
Choi WJ, Cha JH, Kim HH et al (2014) Comparison of automated Informed consent Written informed consent was waived by the breast volume scanning and hand- held ultrasound in the detection Institutional Review Board. of breast cancer: an analysis of 5,566 patient evaluations. Asian Pac J Cancer Prev 15:9101–9105 Methodology 15. Vourtsis A, Kachulis A (2017) The performance of 3D ABUS ver- � retrospective sus HHUS in the visualisation and BI-RADS characterisation of � multiple case-multiple reader study breast lesions in a large cohort of 1,886 women. Eur Radiol 1–10. � performed at one institution https://doi.org/10.1007/s00330-017-5011-9 16. Kelly KM, Dean J, Comulada WS, Lee S-JJ (2010) Breast cancer Open Access This article is distributed under the terms of the Creative detection using automated whole breast ultrasound and mammog- Commons Attribution 4.0 International License (http:// raphy in radiographically dense breasts. Eur Radiol 20:734–742 creativecommons.org/licenses/by/4.0/), which permits unrestricted use, 17. Tan T, Mordang J-J, van Zelst J et al (2015) Computer-aided detec- distribution, and reproduction in any medium, provided you give appro- tion of breast cancers using Haar-like features in automated 3D priate credit to the original author(s) and the source, provide a link to the breast ultrasound. Med Phys 42:1498–1504 Creative Commons license, and indicate if changes were made. 18. Evans KK, Birdwell RL, Wolfe JM (2013) If you don’t find it often, you often don’t find it: why some cancers are missed in breast cancer screening. PLoS One 8:e64366 19. Hillis SL, Berbaum KS, Metz CE (2008) Recent developments in the Dorfman-Berbaum-Metz procedure for multireader ROC study References analysis. Acad Radiol 15:647–661 20. Dorfman DD, Berbaum KS, Metz CE (1992) Receiver operating characteristic rating analysis: Generalization to the population of 1. Wanders JOP, Holland K, Veldhuis WB et al (2017) Volumetric readers and patients with the jackknife method. 
Invest Radiol 27: breast density affects performance of digital screening mammogra- 723–731 phy. Breast Cancer Res Treat 162:95–103 21. Corsetti V, Houssami N, Ghirardi M et al (2011) Evidence of the 2. Ohuchi N, Suzuki A, Sobue T et al (2016) Sensitivity and specific- effect of adjunct ultrasound screening in women with ity of mammography and adjunctive ultrasonography to screen for mammography-negative dense breasts: interval breast cancers at 1 breast cancer in the Japan Strategic Anti-cancer Randomized Trial year follow-up. Eur J Cancer 47:1021–1026 (J-START): a randomised controlled trial. Lancet Jan 23;387:341– 22. Song SE, Cho N, Chu A et al (2015) Undiagnosed breast cancer: features at supplemental screening US. Radiology 277:372–380 3. Shen S, Zhou Y, Xu Y et al (2015) A multi-centre randomised trial 23. Van Zelst JCM, Platel B, Karssemeijer N, Mann RM (2015) comparing ultrasound vs mammography for screening breast can- Multiplanar reconstructions of 3D automated breast ultrasound im- cer in high-risk Chinese women. Br J Cancer 112:998–1004 prove lesion differentiation by radiologists. Acad Radiol. Dec;22: 4. Berg WA, Blume JD, Cormack JB et al (2008) Combined screening 1489-1496 with ultrasound and mammography vs mammography alone in 24. Van Zelst JCM, Tan T, Platel B et al (2017) Improved cancer de- women at elevated risk of breast cancer. JAMA 299:2151–2163 tection in automated breast ultrasound by radiologists using com- 5. Bae MS, Moon WK, Chang JM et al (2014) Breast cancer detected puter aided detection. Eur J Radiol 89:54–59 with screening US: reasons for nondetection at mammography. 25. Skaane P, Gullien R, Eben EB et al (2015) Interpretation of auto- Radiology 270:369–377 mated breast ultrasound (ABUS) with and without knowledge of 6. Saadatmand S, Bretveld R, Siesling S, Tilanus-Linthorst MMA mammography: a reader performance study. 
Acta Radiol 56:404– (2015) Influence of tumour stage at breast cancer detection on sur- vival in modern times: population based study in 173 797 patients. 26. Kuhl CK, Schrading S, Strobel K et al (2014) Abbreviated breast BMJ Oct 6;351:h4901 magnetic resonance imaging (MRI): first postcontrast subtracted 7. Durand MA, Hooley RJ (2017) Implementation of whole-breast images and maximum-intensity projection-a novel approach to screening ultrasonography. Radiol Clin North Am 55:527–539 breast cancer screening with MRI. J Clin Oncol 32:2304–2310
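As a supplementary arithmetic check, the pooled percentage decreases in reading time reported in Table 4 can be recomputed from the published mean reading times. The sketch below is illustrative only and is not part of the authors' analysis: the names `READING_TIMES`, `percentage_decrease` and `summarise` are hypothetical, and the inputs are the rounded means from Table 4, so per-reader rows recomputed this way may differ from the published values in the last digit.

```python
# Mean reading times in seconds per case, copied from the pooled rows of
# Table 4 as (conventional ABUS, CAD-ABUS). The dictionary name and layout
# are assumptions made for this illustration.
READING_TIMES = {
    "Pooled average": (158.3, 133.4),
    "Normal": (151.0, 125.7),
    "Benign": (163.0, 134.8),
    "Malignant": (169.3, 148.8),
}


def percentage_decrease(baseline_s: float, with_cad_s: float) -> float:
    """Relative reduction in reading time, as a percentage of the baseline."""
    if baseline_s <= 0:
        raise ValueError("baseline reading time must be positive")
    return 100.0 * (baseline_s - with_cad_s) / baseline_s


def summarise(times=READING_TIMES) -> dict:
    """Percentage decrease per case category, rounded to one decimal."""
    return {
        label: round(percentage_decrease(abus_s, cad_s), 1)
        for label, (abus_s, cad_s) in times.items()
    }


if __name__ == "__main__":
    for label, pct in summarise().items():
        print(f"{label}: {pct:.1f}% shorter reading time with CAD-ABUS")
```

Run on the pooled rows, this reproduces the 15.7% overall decrease quoted in the discussion, and shows the smaller gain in malignant cases (12.1%) compared with normal (16.8%) and benign (17.3%) cases.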

Journal

European Radiology, Springer Journals

Published: Feb 7, 2018

References