Purpose: The most frequently used model for simulating multireader multicase (MRMC) data that emulates confidence-of-disease ratings from diagnostic imaging studies has been the Roe and Metz (RM) model, proposed by Roe and Metz in 1997 and later generalized by Hillis (2012), Abbey et al. (2013), and Gallas and Hillis (2014). A problem with these models is that it has been difficult to set model parameters such that the simulated data are similar to MRMC data encountered in practice. To remedy this situation, Hillis (2018) mapped parameters from the RM model to Obuchowski–Rockette (OR) model parameters that describe the distribution of the empirical AUC outcomes computed from the RM model simulated data. We continue that work by providing the reverse mapping, i.e., by deriving an algorithm that expresses RM parameters as functions of the OR empirical AUC distribution parameters.

Approach: We solve for the corresponding RM parameters in terms of the OR parameters using numerical methods.

Results: An algorithm is developed that results in, at most, one solution of RM parameter values that correspond to inputted OR parameter values. The algorithm can be implemented using an R software function. Examples are provided that illustrate the use of the algorithm. A simulation study validates the algorithm.

Conclusions: The resulting algorithm makes it possible to easily determine RM model parameter values such that simulated data emulate a specific real-data study. Thus, MRMC analysis methods can be empirically tested using simulated data similar to that encountered in practice.

© The Authors. Published by SPIE under a Creative Commons Attribution 4.0 International License. Distribution or reproduction of this work in whole or in part requires full attribution of the original publication, including its DOI. [DOI: 10.1117/1.JMI.9.4.045501]

Keywords: ROC curve; diagnostic radiology; Roe and Metz; Obuchowski and Rockette; simulated data.

Paper 21345GR received Dec. 30, 2021; accepted for publication Jun. 7, 2022; published online Jul. 8, 2022.

*Address all correspondence to Stephen L. Hillis, steve-hillis@uiowa.edu

Journal of Medical Imaging 045501-1 Jul/Aug 2022 Vol. 9(4)
Hillis, Smith, and Chen: Determining Roe and Metz model parameters for simulating multireader multicase. . .

1 Introduction

For the typical diagnostic radiology study, several readers (typically radiologists) assign confidence-of-disease ratings to each case (i.e., subject) based on one or more corresponding radiologic images. The resulting data are called multireader multicase (MRMC) data. These studies are typically used to compare different imaging modalities with respect to reader performance. Often measures of reader performance are functions of the estimated receiver-operating-characteristic (ROC) curve, such as the area under the ROC curve (AUC). The Obuchowski and Rockette method (OR) is a commonly used method of analyzing reader performance outcomes which results in conclusions that generalize to both the reader and case populations.

The most frequently used model for simulating MRMC data that emulate confidence-of-disease ratings from such studies has been the model first proposed by Roe and Metz and later generalized by Hillis [Ref. 3], Abbey [Ref. 4], and Gallas and Hillis [Ref. 5]. We will refer to each of these models as the “Roe and Metz” (RM) model when there is no need to distinguish between them. Numerous studies have used this model for evaluating MRMC analysis and sample size methods. As discussed by Hillis, the RM model generates continuous confidence-of-disease ratings based on an underlying binormal model for each reader–test combination, with the separation between the normal and abnormal rating distributions varying across readers.
Because RM model parameters are expressed in terms of the latent rating data distribution, in contrast to MRMC analysis results that are almost always expressed in terms of parameters that describe the distribution of the reader performance outcomes, it has been difficult to set RM model parameter values such that the simulated data exhibit characteristics that are similar to MRMC data encountered in practice. To remedy this situation, Gallas and Hillis mapped the RM model parameters to variance and covariance parameters that describe the distribution of the empirical AUC outcomes computed from RM simulated data. Similarly, Hillis mapped the RM model parameters to OR parameters that describe the distribution of empirical AUC outcomes computed from RM simulated data. This paper continues that work by developing a numerical algorithm that expresses the RM parameters as functions of the empirical AUC distribution OR parameters. This result makes it easy to determine RM model parameter values such that the simulated data emulate a specific real-data study. The primary uses for the proposed algorithm are testing MRMC analysis methods and computing power estimates, using simulated MRMC data that match real data sets with respect to the empirical AUC distribution OR parameter estimates. An outline of this paper is as follows. In Sec. 2, we discuss the original Roe and Metz model, the Hillis generalization of it, and the OR model and analysis method. In Sec. 3, we discuss the numerical OR-to-RM algorithm that maps OR parameters to RM parameters, which is derived in Appendix A for the Hillis generalization of the original RM model. In Sec. 4, we illustrate using the OR-to-RM algorithm and the previously derived RM-to-OR algorithm to simulate data emulating a real-data study, along with other examples and remarks concerning the use of the two algorithms. The paper concludes in Secs. 5 and 6. 
2 Previous Methods

2.1 Roe and Metz Models: Original and Constrained Unequal-Variance

2.1.1 Original RM model

Let $X$ denote a confidence-of-disease rating assigned by a reader to a case; $X$ is often called a decision variable (DV). The original RM simulation model proposed by Roe and Metz is a mixed four-factor (test, reader, case, and truth) ANOVA model for $X$ with case nested within truth; test, reader, and truth crossed; test and truth treated as fixed factors; and reader and case treated as random factors. Note that we use “test” as a general term that can refer to a diagnostic test, imaging modality, or a treatment. Throughout this paper, we only consider the situation of comparing two tests. Using the RM notation, the model is given as

$$X_{ijkt} = \mu_t + \tau_{it} + R_{jt} + C_{kt} + (\tau R)_{ijt} + (\tau C)_{ikt} + (RC)_{jkt} + (\tau RC)_{ijkt} + E_{ijkt}, \tag{1}$$

where $X_{ijkt}$ denotes the confidence-of-disease rating for test $i$, reader $j$, case $k$ of truth state $t$, and $t = -, +$, with “−” indicating a nondiseased case and “+” indicating a diseased case. Here, $\mu_t$ is the effect of truth state $t$, $\tau_{it}$ is the interaction effect of test $i$ and truth state $t$, $R_{jt}$ is the interaction effect of reader $j$ and truth state $t$, $C_{kt}$ is the effect of case $k$ nested within truth state $t$, the multiple symbols in parentheses denote interactions, and $E_{ijkt}$ is the error term. Thus, $X_{ijkt}$ denotes the confidence-of-disease rating assigned to case $k$ of truth state $t$ by reader $j$ when reading under test $i$. All effects are random except for $\mu_t$ and $\tau_{it}$. The random effects are mutually independent and normally distributed with zero means. Roe and Metz denote the corresponding variance components by $\sigma_R^2$, $\sigma_C^2$, $\sigma_{\tau R}^2$, $\sigma_{\tau C}^2$, $\sigma_{RC}^2$, $\sigma_{\tau RC}^2$, and $\sigma_E^2$. They note that $\sigma_{\tau RC}^2$ and $\sigma_E^2$ cannot be estimated separately for this model with no replications, as re-reading images in radiological studies is uncommon due to the cost, and hence define

$$\sigma_\varepsilon^2 \equiv \sigma_{\tau RC}^2 + \sigma_E^2 .$$

Although not mentioned by Roe and Metz, the omission of test, reader, and test-by-reader effects that do not depend on truth is justified by the invariance of the ROC curve to location shifts; thus inclusion of these terms would not change the ROC curve for a given reader. Note that interactions with truth are denoted only by a $t$ subscript in Eq. (1).

Roe and Metz constrain the sum of the error variance and variance components involving case to be equal to one:

$$\sigma_C^2 + \sigma_{\tau C}^2 + \sigma_{RC}^2 + \sigma_\varepsilon^2 = 1. \tag{2}$$

It follows from this constraint that the fixed-reader nondiseased and diseased DV distributions have unit variances (and hence their ROC curves are symmetric about the negative 45 deg diagonal), with the fixed-reader AUCs varying across the reader population.

Without loss of generality, Roe and Metz impose the constraints

$$\mu_- = \tau_{1-} = \tau_{2-} = \tau_{1+} = \tau_{2+} = 0, \tag{3}$$

which result in the same DV distributions for both tests 1 and 2. Under this constraint, it can be shown that the mean and median separation of the nondiseased and diseased DV distributions across the reader population is given by $\mu_+$ and the median reader-specific AUC is given by $A_z = \Phi(\mu_+/\sqrt{2})$, where $\Phi$ is the cumulative distribution function of the standard normal distribution.
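As a small numerical illustration of constraint Eq. (2) and the median AUC formula, the following Python sketch solves Eq. (2) for $\sigma_\varepsilon^2$ and evaluates $A_z = \Phi(\mu_+/\sqrt{2})$. The function names and the parameter values are hypothetical choices made for this illustration; they are not taken from the paper or from any software package.

```python
from math import sqrt
from statistics import NormalDist


def rm_error_variance(var_c, var_tc, var_rc):
    """Solve constraint Eq. (2) for the error variance sigma_eps^2."""
    var_eps = 1.0 - (var_c + var_tc + var_rc)
    if var_eps < 0:
        raise ValueError("case-related variance components sum to more than 1")
    return var_eps


def rm_median_auc(mu_plus):
    """Median reader-specific AUC under Eq. (3): A_z = Phi(mu_plus / sqrt(2))."""
    return NormalDist().cdf(mu_plus / sqrt(2.0))


# Hypothetical illustrative values (assumptions, not estimates from a study).
var_eps = rm_error_variance(var_c=0.3, var_tc=0.3, var_rc=0.2)
auc = rm_median_auc(mu_plus=1.5)
print(var_eps, auc)
```

A separation of $\mu_+ = 0$ gives a median AUC of 0.5, which is the chance-level case excluded by requiring a nonnegative separation.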
2.1.2 Unequal test DV distributions

Although Roe and Metz only consider simulations for equal test DV distributions for each reader, the model can be easily modified to allow for test DV distributions that differ in their median AUC values by not setting $\tau_{2+}$ to zero, that is, only the constraints

$$\mu_- = \tau_{1-} = \tau_{2-} = \tau_{1+} = 0 \tag{4}$$

are imposed. It follows that the median AUCs for tests 1 and 2 are equal to $A_z^{(i)} = \Phi(\delta_i/\sqrt{2})$, $i = 1, 2$, respectively, where

$$\delta_i = \mu_+ + \tau_{i+}, \quad i = 1, 2 \tag{5}$$

are the mean and median separations of the nondiseased and diseased DV distributions for tests 1 and 2, respectively, across the reader population. From constraints Eq. (4), it follows that $\delta_1 = \mu_+$ for test 1 and $\delta_2 = \mu_+ + \tau_{2+}$ for test 2. To ensure that $A_z^{(i)} \geq 0.5$, we assume

$$\delta_i \geq 0, \quad i = 1, 2. \tag{6}$$

Note that the RM model that allows for test-dependent AUCs is completely defined by seven parameters:

$$\delta_1,\ \delta_2,\ \sigma_R^2,\ \sigma_{\tau R}^2,\ \sigma_C^2,\ \sigma_{\tau C}^2,\ \text{and } \sigma_{RC}^2 . \tag{7}$$

Note that $\sigma_\varepsilon^2$ can be computed using Eqs. (2) and (7).

2.1.3 Constrained unequal-variance RM model (RMH model)

In practice, estimated binormal-model nondiseased and diseased distribution variances for a reader-test combination are often different, with diseased subjects typically having more variable test results. Thus to better emulate real data, Hillis modified the original RM model by allowing variance components involving cases to depend on truth, with variance components involving diseased cases set equal to those involving nondiseased cases multiplied by the factor $1/b^2$, $b > 0$. Specifically, the model is given by Eq. (1) with variance components (using an obvious notation) denoted by $\sigma_R^2$, $\sigma_{\tau R}^2$, $\sigma_{C(-)}^2$, $\sigma_{\tau C(-)}^2$, $\sigma_{RC(-)}^2$, $\sigma_{\varepsilon(-)}^2$, $\sigma_{C(+)}^2$, $\sigma_{\tau C(+)}^2$, $\sigma_{RC(+)}^2$, and $\sigma_{\varepsilon(+)}^2$, with $\sigma_{C(+)}^2 = b^{-2}\sigma_{C(-)}^2$, $\sigma_{\tau C(+)}^2 = b^{-2}\sigma_{\tau C(-)}^2$, $\sigma_{RC(+)}^2 = b^{-2}\sigma_{RC(-)}^2$, and $\sigma_{\varepsilon(+)}^2 = b^{-2}\sigma_{\varepsilon(-)}^2$. Similar to Eq. (2), the constraint

$$\sigma_{C(-)}^2 + \sigma_{\tau C(-)}^2 + \sigma_{RC(-)}^2 + \sigma_{\varepsilon(-)}^2 = 1 \tag{8}$$

is imposed. It follows that

$$\sigma_{C(+)}^2 + \sigma_{\tau C(+)}^2 + \sigma_{RC(+)}^2 + \sigma_{\varepsilon(+)}^2 = b^{-2} .$$

Constraint Eq. (6) is also imposed. We will refer to this model as the constrained unequal-variance RM model or simply as the RMH model, with the “H” in RMH indicating that it is the generalization of the original RM model proposed by Hillis.

Similar to the original RM model, imposing constraint Eq. (3) results in the null model with $A_z^{(1)} = A_z^{(2)} = \Phi\left(\mu_+/\sqrt{1 + b^{-2}}\right)$, and imposing constraint Eq. (4) results in the nonnull model with

$$A_z^{(i)} = \Phi\left[(\mu_+ + \tau_{i+})/\sqrt{1 + b^{-2}}\right] = \Phi\left(\delta_i/\sqrt{1 + b^{-2}}\right), \quad i = 1, 2,$$

where again $A_z^{(i)}$ denotes the median AUC across the reader population for test $i$, $\delta_i$ is defined by Eq. (5), and $\delta_i$ is the mean and median DV separation for test $i$ across readers.

The algorithm discussed in this paper will be for the RMH model, which includes the original RM model as a special case when $b$ is set equal to 1. Note that the RMH model that allows for test-dependent AUCs is completely defined by the eight linearly independent parameters $b$, $\delta_1$, $\delta_2$, $\sigma_R^2$, $\sigma_{\tau R}^2$, $\sigma_{C(-)}^2$, $\sigma_{\tau C(-)}^2$, and $\sigma_{RC(-)}^2$. We let $\beta_{RMH}$ denote the vector of these parameters:

$$\beta_{RMH} = (b, \delta_1, \delta_2, \sigma_R^2, \sigma_{\tau R}^2, \sigma_{C(-)}^2, \sigma_{\tau C(-)}^2, \sigma_{RC(-)}^2). \tag{9}$$

2.2 Obuchowski–Rockette Model

Obuchowski and Rockette proposed a test × reader factorial ANOVA model for the AUC estimates, but unlike a conventional ANOVA model, the errors are assumed to be correlated to account for correlation due to each reader evaluating the same cases. Their model, which we refer to as the OR model, is given as

$$\hat{\theta}_{ij} = \mu_{OR} + \tau_{i:OR} + R_{j:OR} + (\tau R)_{ij:OR} + \varepsilon_{ij:OR}, \tag{10}$$

where $\mu_{OR}$ is the intercept term, $\tau_{i:OR}$ denotes the fixed effect of test $i$, $R_{j:OR}$ denotes the random effect of reader $j$, $(\tau R)_{ij:OR}$ denotes the random test × reader interaction, and $\varepsilon_{ij:OR}$ is the error term. The $R_{j:OR}$ and $(\tau R)_{ij:OR}$ are assumed to be mutually independent and normally distributed with zero means and respective variances $\sigma_{R:OR}^2$ and $\sigma_{TR:OR}^2$. (OR in the subscripts is to distinguish OR effects and variance components from similarly notated RMH-model quantities.) The $\varepsilon_{ij:OR}$ are assumed to be normally distributed with mean zero and variance $\sigma_{\varepsilon:OR}^2$ and are assumed uncorrelated with the $R_{j:OR}$ and $(\tau R)_{ij:OR}$. Three possible error covariances are assumed:
$$\mathrm{Cov}(\varepsilon_{ij:OR}, \varepsilon_{i'j':OR}) = \begin{cases} \mathrm{Cov}_1 & i \neq i',\ j = j' \ \text{(different test, same reader)} \\ \mathrm{Cov}_2 & i = i',\ j \neq j' \ \text{(same test, different reader)} \\ \mathrm{Cov}_3 & i \neq i',\ j \neq j' \ \text{(different test, different reader)}. \end{cases} \tag{11}$$

The OR model assumes

$$\mathrm{Cov}_1 \geq \mathrm{Cov}_3, \quad \mathrm{Cov}_2 \geq \mathrm{Cov}_3, \quad \mathrm{Cov}_3 \geq 0. \tag{12}$$

These error variance–covariance parameters are typically estimated by averaging corresponding conditional-on-readers estimates computed using the jackknife [Refs. 8–10], bootstrap [Refs. 10, 11], the method proposed by DeLong et al. (for empirical AUC estimates), or the method proposed by Metz et al. based on the semiparametric binormal ROC model. These four estimation methods are consistent but are not unbiased. An unbiased error covariance estimation method (unbiased method) was recently proposed by Hillis [Refs. 6, 14] for use when empirical AUC is the outcome. This method utilizes the unbiased fixed-reader method discussed by Gallas [Ref. 15, p 362] for estimating the error variance, and extensions of it for estimating the error covariances. This method results in unbiased OR parameter estimates when data are generated from the RMH model. OR analysis using this method is included in the freely available R software package MRMCaov.

The $\varepsilon_{ij:OR}$ can be interpreted as AUC measurement error attributable to the random selection of cases and within-reader variability that describes how a fixed reader interprets the same image in different ways on different occasions.

The OR model can alternatively be described with population correlations $r_i = \mathrm{Cov}_i/\sigma_{\varepsilon:OR}^2$ replacing corresponding $\mathrm{Cov}_i$. Defining

$$\mu_{1:OR} = \mu_{OR} + \tau_{1:OR}, \qquad \mu_{2:OR} = \mu_{OR} + \tau_{2:OR},$$

the OR model for two tests, similar to the RMH model, is defined by eight linearly independent parameters:

$$\mu_{1:OR},\ \mu_{2:OR},\ \sigma_{R:OR}^2,\ \sigma_{TR:OR}^2,\ \sigma_{\varepsilon:OR}^2,\ \mathrm{Cov}_1,\ \mathrm{Cov}_2,\ \text{and } \mathrm{Cov}_3, \tag{13}$$

or equivalently, by

$$\mu_{1:OR},\ \mu_{2:OR},\ \sigma_{R:OR}^2,\ \sigma_{TR:OR}^2,\ \sigma_{\varepsilon:OR}^2,\ r_1,\ r_2,\ \text{and } r_3. \tag{14}$$

We let $\beta_{OR}$ denote the vector of these parameters:

$$\beta_{OR} = (\mu_{1:OR}, \mu_{2:OR}, \sigma_{R:OR}^2, \sigma_{TR:OR}^2, \sigma_{\varepsilon:OR}^2, r_1, r_2, r_3). \tag{15}$$

Note that when the outcome is the empirical AUC, $\mu_{1:OR}$ and $\mu_{2:OR}$ are the test 1 and test 2 expected values for the empirical AUC estimates across readers and cases.

3 Proposed Methods

3.1 OR-to-RMH Algorithm for Estimating RMH Parameter Values When the Goal Is to Emulate a Real-Data MRMC Study

The RMH-to-OR mapping, previously derived by Hillis, and the new OR-to-RMH algorithm that maps OR parameters to RMH parameters and its development are provided in Tables 6 and 10, respectively, in Appendix A. In this section, we discuss the main points of the OR-to-RMH algorithm when the goal is to emulate data from a real study with the RMH model; i.e., to determine RMH parameter values such that the expected values of the OR parameter estimates from the simulated MRMC samples are described by the $\beta_{OR}$ vector Eq. (15), estimated from a real study.

The $\beta_{OR}$ vector Eq. (15) implicitly provides information about the shape of the underlying ROC curve through the value of $\sigma_{\varepsilon:OR}^2$, which is a function of the RMH $b$ parameter in the RMH-to-OR mapping. The method used for estimating the RMH $b$ parameter for the OR-to-RMH algorithm is called the b_method. To estimate a $\beta_{RMH}$ vector Eq. (9) that maps to a particular $\beta_{OR}$ vector Eq. (15), the algorithm requires use of the option b_method = unspecified, which we assume throughout this section. Two other options for b_method and the situations where they are useful will be discussed in Sec. 3.2.

3.1.1 Overview of OR-to-RMH algorithm

Table 6 in Appendix A gives the previously derived analytical RMH-to-OR mapping formulas. Mathematically, we describe this transformation by the function $f$ that maps the RMH parameter vector and the case sample sizes that will be used for the simulations to the resulting OR parameter vector:

$$f(\beta_{RMH}; n_0, n_1) = \beta_{OR}. \tag{16}$$

This function is analytical and thus does not require a numerical algorithm.

The OR-to-RMH algorithm requires inputted values for $\beta_{OR}$, $n_0$, $n_1$, and b_method, where $\beta_{OR}$ is given by Eq. (15) and $n_0$ and $n_1$ are the corresponding real-study nondiseased and diseased case sizes. To derive the OR-to-RMH algorithm, we first assume that there exists an RMH parameter vector $\beta_{RMH}$ corresponding to $\beta_{OR}$ such that Eq. (16) is true. We then express the OR parameters in terms of the RMH parameters and solve for the RMH parameters using numerical methods (see Appendix A for details).

It is possible that there are several $\beta_{RMH}$ vectors satisfying Eq. (16), in which case the corresponding $\beta_{RMH}$ vectors will differ only in their $b$ values, as discussed in Appendix A. It is also possible that there is no $\beta_{RMH}$ vector that satisfies Eq. (16). To force the OR-to-RMH algorithm to produce, at most, only one output, the $\beta_{RMH}$ vector with $b$ closest to 1 with $0.01 \leq b \leq 1$ is chosen; if no corresponding $\beta_{RMH}$ vector has $0.01 \leq b \leq 1$, then the corresponding $\beta_{RMH}$ vector with $b$ closest to 1 with $1 < b \leq 4$ is chosen. If there are no corresponding $\beta_{RMH}$ solution vectors with $0.01 \leq b \leq 4$, the algorithm does not return a solution for $\beta_{RMH}$; see Sec. 3.1.3 for what to do when this happens.
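The rule for choosing a single $b$ value among several candidate solutions can be sketched in a few lines of Python. This is only an illustration of the selection rule as stated above; the function name `select_b` and the assumption that candidate $b$ values arrive as a plain list are hypothetical, and how the candidates are found numerically (Appendix A) is not shown.

```python
def select_b(candidates, lower=0.01, upper=4.0):
    """Choose one b from candidate solutions, or return None if none qualifies.

    Preference order, following the rule described in the text:
      1. the candidate closest to 1 among those with lower <= b <= 1;
      2. otherwise the candidate closest to 1 among those with 1 < b <= upper;
      3. otherwise no solution.
    """
    at_or_below_one = [b for b in candidates if lower <= b <= 1.0]
    if at_or_below_one:
        return max(at_or_below_one)   # largest value <= 1 is closest to 1
    above_one = [b for b in candidates if 1.0 < b <= upper]
    if above_one:
        return min(above_one)         # smallest value > 1 is closest to 1
    return None


print(select_b([0.3, 0.8, 1.2]))   # a candidate in [0.01, 1] wins over 1.2
print(select_b([1.5, 3.0]))        # falls back to the (1, 4] range
print(select_b([0.005, 5.0]))      # nothing inside [0.01, 4]
```

Note that a candidate in $[0.01, 1]$ is preferred even when a candidate above 1 is numerically closer to 1, which is how the rule reads.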
Let $g_1$ denote the function defined by the OR-to-RMH algorithm, with b_method = unspecified, that maps $\beta_{OR}$ to a solution for $\beta_{RMH}$, denoted by $\beta_{RMH;solution}$; i.e.,

$$g_1(\beta_{OR}; n_0, n_1) = \beta_{RMH;solution}. \tag{17}$$

Ideally, $\beta_{RMH;solution}$ will be such that the RMH-to-OR mapping will return the original OR parameter, i.e.,

$$f(\beta_{RMH;solution}; n_0, n_1) = \beta_{OR}. \tag{18}$$

However, it is possible for the OR-to-RMH algorithm to return a solution such that Eq. (18) holds only approximately, i.e.,

$$f(\beta_{RMH;solution}; n_0, n_1) \approx \beta_{OR}. \tag{19}$$

The approximation results because of constraints on the RMH parameters that are imposed by the algorithm, as discussed in Appendix A and given in Eq. (23) in Table 7. For example, if the inputted value of $r_3$ exceeds that of $r_2$, then the solution $\beta_{RMH;solution}$ will be such that $r_2 \geq r_3$ in $f(\beta_{RMH;solution}; n_0, n_1)$.

Rationale for the b limits. The lower and upper limits for $b$ of 0.01 and 4 are chosen because $b$ values outside these limits are not realistic for most real data sets. In most situations, a meaningful DV should be an increasing transformation of the likelihood ratio (likelihood of being diseased divided by likelihood of not being diseased). A DV having this property and its corresponding ROC curve are said to be proper; otherwise they are said to be improper [Ref. 18, pp. 19, 37]. A proper ROC curve is concave (down) and never crosses the chance line. It follows that an ROC curve that has “hooks” and crosses the chance line is improper. Pan and Metz note that hooks for fitted binormal ROC curves do not appear when fitting curves to reliable data sets, which strongly suggests that the true underlying ROC curves do not show such hooks for real-data studies. Thus, we have limited the underlying ROC curves to have $b$ values between 0.01 and 4.0 since for typical AUC values (≤0.95) it can be shown that ROC curves with $b$ values outside of these boundaries have noticeable hooks. For example, Fig. 1 shows ROC curves with AUCs of 0.8, 0.9, 0.95 for values of $b = 0.01, 0.5$ [Fig. 1(a)] and $b = 2, 4$ [Fig. 1(b)]. We see that the ROC curves for the extreme cases of $b = 0.01$ [Fig. 1(a)] and $b = 4$ [Fig. 1(b)] are noticeably improper because they have hooks in the upper right and lower left corner, respectively, with the ROC curves below the chance line in those regions. Although not shown, the improperness becomes more noticeable as $b$ decreases below 0.01 or increases above 4.0, or as the AUC decreases below 0.8. The ROC curves were computed using the equation $\mathrm{TPF} = \Phi(a + b\,\Phi^{-1}(\mathrm{FPF}))$, with $a = b\,\Phi^{-1}(\mathrm{AUC})\sqrt{1 + (1/b)^2}$ and TPF and FPF denoting the true positive fraction (sensitivity) and false positive fraction (1 − specificity), respectively. (The expression for $a$ results from the conventional binormal ROC relationships $\mu = \Phi^{-1}(\mathrm{AUC})\sqrt{1 + (1/b)^2}$ and $\mu = a/b$.)

[Fig. 1 ROC curves as a function of AUC and b. Panel (a): b < 1, with curves for b = 0.01 and b = 0.50 and the chance line. Panel (b): b > 1, with curves for b = 4 and b = 2 and the chance line. Each panel shows AUC = 0.8, 0.9, and 0.95; axes are TPF versus FPF. TPF, true positive fraction (or sensitivity); FPF, false positive fraction (or 1 – specificity).]

Simulation of data to emulate a real-data study. Figure 2 summarizes how the OR-to-RMH and RMH-to-OR algorithms can be used to simulate data that emulate a real-data study. The OR-to-RMH algorithm (with b_method = unspecified) is applied to OR estimates ($\beta_{OR;input}$) obtained from a real-data study, resulting in the corresponding RMH model. This model is then used for generating MRMC samples for any specified number of readers and cases; the case numbers used for the simulations need not equal the case numbers $n_0$ and $n_1$ of the original real-data study. The distribution of the empirical AUCs for the simulated data is described by $\beta_{OR;output}$. We recommend always checking how closely the simulated data emulate the study data by comparing $\beta_{OR;input}$ and $\beta_{OR;output}$ when the simulation model generates samples with the same case sizes as the original study, i.e., with simulation case numbers equal to $n_0$ and $n_1$.

[Fig. 2 Flowchart illustrating the use of the OR-to-RMH and RMH-to-OR algorithms to simulate MRMC data that emulate a real-data study.]

3.1.2 Should the simulated ROC curves resemble the original study ROC curves?

We emphasize that even when simulating data using an RMH model such that $\beta_{OR;output} = \beta_{OR;input}$ in Fig. 3, we do not claim that the resulting empirical ROC curves will be visually similar to those estimated from a real-data study. Rather, we only claim that the expected values of the OR parameter estimates for the simulated data will be the same as those computed from the original real-data study, given by Eq. (13). (Note that Eq. (13) contains the error covariances rather than the error correlations.) However, because of the robustness of the binormal model assumption for fitting ROC curves to real data [Refs. 20–22], we typically expect there will be some resemblance, although the degree of resemblance will be limited by the RMH model having only eight parameters.
In particular, we note that the RMH model requires each reader’s ROC curve to have the same $b$ value, which will determine the shape of the ROC curve for a given reader AUC value; this result follows from the one-to-one correspondence between $(b, \mathrm{AUC})$ and $(a, b)$, with $a = b\,\Phi^{-1}(\mathrm{AUC})\sqrt{1 + (1/b)^2}$, as mentioned in Sec. 3.1.1.

3.1.3 Reasons for neither an exact nor approximate solution

OR-to-RMH algorithm does not work because there is not a solution for b. For given values of the RMH parameters $\delta_1$, $\delta_2$, $\sigma_R^2$, and $\sigma_{\tau R}^2$ (computed in steps 1 to 3 of the OR-to-RMH algorithm in Table 10), the value of $b$ (computed in step 4) determines the value of $\sigma_{\varepsilon:OR}^2$. It can happen that the algorithm does not produce a value of $b$ that yields the inputted value of $\sigma_{\varepsilon:OR}^2$ for the values of $\delta_1$, $\delta_2$, $\sigma_R^2$, and $\sigma_{\tau R}^2$ computed in the previous steps, either because no such value exists or because it is <0.01 or >4.0. When this occurs, one can choose to use one of the other two methods for estimating $b$, as discussed in Sec. 3.2.

OR-to-RMH algorithm does not work because there is not a solution for an RMH parameter other than b. When required, the algorithm imposes the constraints in Table 7(b) by altering somewhat the inputted OR parameter values, which can lead to an approximate solution as given by Eq. (19). However, when other constraints, which are implied by the RMH-to-OR mapping in Table 6, do not hold, the result is a missing value for the particular RMH parameter and for all other RMH parameters requiring it for their computation. For example, from the equations in Tables 8 and 9, it can be shown that there is an upper limit for $\sigma_{R:OR}^2$, which is a function of the inputted values for $\mu_{1:OR}$ and $\mu_{2:OR}$. Similarly, it can be shown that there are upper limits for $\sigma_{TR:OR}^2$, $r_1$, $r_2$, and $r_3$, which are functions of parameters computed in previous steps. When one of these values exceeds its upper limit, the algorithm does not yield a solution.

This problem is more likely to happen when inputted values for $\beta_{OR}$ are conjectured than when they are estimates from a real-data study. If this problem occurs, we first recommend that the inputted values be checked for entry errors. If there are none, then we suggest inputting a different (typically smaller) value for the OR parameter corresponding to the RMH parameter that cannot be estimated. See Appendix A and Table 5 for more details and Sec. 4.3.7 for examples illustrating this problem.

3.2 OR-to-RMH Algorithm for Estimating RMH Parameter Values When the Goal Is to Emulate AUCs, OR Correlations and Variance Components, But Not $\sigma_{\varepsilon:OR}^2$

As discussed by Hillis [Refs. 6, 23], the OR parameters $\mu_{1:OR}$, $\mu_{2:OR}$, $\sigma_{R:OR}^2$, and $\sigma_{TR:OR}^2$ have meaningful interpretations that do not depend on sample size, and $r_1$, $r_2$, and $r_3$ have meaningful interpretations that remain approximately (but not exactly) constant as the sample sizes change. On the other hand, $\sigma_{\varepsilon:OR}^2$ varies with the sample sizes. In this section, we discuss two approaches for determining RMH parameters that result in simulated MRMC data for which the empirical AUC distribution matches conjectured values of the parameters in

$$\tilde{\beta}_{OR} = (\mu_{1:OR}, \mu_{2:OR}, \sigma_{R:OR}^2, \sigma_{TR:OR}^2, r_1, r_2, r_3).$$

Note that $\tilde{\beta}_{OR}$ is the same as $\beta_{OR}$ but without $\sigma_{\varepsilon:OR}^2$. The value of $\sigma_{\varepsilon:OR}^2$ for the simulated data will be determined by the sample sizes and the RMH parameters.

These approaches are useful when one is primarily interested in simulating data that match an OR correlation and variance component structure and a real-data value of $\sigma_{\varepsilon:OR}^2$ is not available. They also are useful when real-data estimates for $\beta_{OR}$ are available but there is no solution for $b$ using the OR-to-RMH algorithm with b_method = unspecified.

3.2.1 Overview

The two approaches are similar to that described in Sec. 3.1, except that estimation of $b$ does not depend on an inputted value for $\sigma_{\varepsilon:OR}^2$. Instead, $b$ is either (1) explicitly specified using b_method = specified and setting the value of the input variable b_input equal to the desired value for $b$; or (2) computed so as to result in a specified median mean-to-sigma ratio across readers, using b_method = mean_to_sigma and setting the value of the input variable mean_sig_input equal to the desired mean-to-sigma ratio.

Use of the OR-to-RMH and RMH-to-OR algorithms to simulate data using these two approaches is summarized in Fig. 3. Figure 3 is similar to Fig. 2 with these differences: (1) No input value for $\sigma_{\varepsilon:OR}^2$ is included because the input values are for $\tilde{\beta}_{OR}$ instead of for $\beta_{OR}$. (2) For the OR-to-RMH algorithm, the $g_2$ or $g_3$ function (as defined below) is used in the place of the $g_1$ function. Note that the outputted OR parameter values include a value for $\sigma_{\varepsilon:OR}^2$.

[Fig. 3 Flowchart illustrating the use of the OR-to-RMH and RMH-to-OR algorithms to simulate MRMC data that emulate OR AUCs, reader variance components, and OR correlations, but not $\sigma_{\varepsilon:OR}^2$.]

Approach 1: b_method = specified. With this approach, the value of $b$ is specified. For example, the parameter values for the original RM model can be determined by setting b_input = 1.

Let $g_2$ denote the function defined by the OR-to-RMH algorithm, with b_method = specified, that maps $\tilde{\beta}_{OR}$ and an inputted value of $b$ to a solution for $\beta_{RMH}$, denoted by $\beta_{RMH;solution}$; i.e.,

$$g_2(\tilde{\beta}_{OR}; n_0, n_1; \text{b\_input}) = \beta_{RMH;solution}. \tag{20}$$

Again, ideally $\beta_{RMH;solution}$ will be such that $f(\beta_{RMH;solution}; n_0, n_1)$ reproduces the inputted OR parameter values. However, similar to using b_method = unspecified, it is possible for the OR-to-RMH algorithm to return a solution for which $f(\beta_{RMH;solution}; n_0, n_1)$ reproduces them only approximately because of constraints on the RMH parameters, Eq. (23) in Table 7, that are imposed by the algorithm.

Approach 2: b_method = mean_to_sigma. Recall from Sec. 3.1.3 that when b_method = unspecified is used, the value of $b$ (based on the computed values of the RMH parameters $\delta_1$, $\delta_2$, $\sigma_R^2$, and $\sigma_{\tau R}^2$) is determined such that $\sigma_{\varepsilon:OR}^2$ for the simulated data will match the inputted value for $\sigma_{\varepsilon:OR}^2$. In contrast, when b_method = mean_to_sigma is used, the user specifies a desired median mean-to-sigma value (see discussion of the mean-to-sigma measure below) across readers for the test corresponding to the minimum of the inputted $\mu_{1:OR}$ and $\mu_{2:OR}$ values.

Let $g_3$ denote the function defined by the OR-to-RMH algorithm with b_method = mean_to_sigma that maps $\tilde{\beta}_{OR}$ and an inputted value of the mean-to-sigma ratio, denoted by $q$, to a solution for $\beta_{RMH}$:

$$g_3(\tilde{\beta}_{OR}; n_0, n_1; q) = \beta_{RMH;solution}. \tag{21}$$

As was the case for the other two $b$ estimation methods, ideally $f(\beta_{RMH;solution}; n_0, n_1)$ reproduces the inputted OR parameter values, but it is possible for this relationship to hold only approximately because of constraints on the RMH parameters.
3.2.2 Mean-to-sigma ratio

The mean-to-sigma ratio, denoted by $q$, is defined as the difference of the latent diseased and nondiseased DV means divided by the difference of their standard deviations. The mean-to-sigma ratio was first introduced by Swets, who noticed that it seemed to be approximately constant for a variety of experiments. Some support for this conclusion was provided by later analyses [Refs. 22, 25, 26]. For example, Green and Swets [Ref. 26] note that $q \approx 4$ is typical for many studies. As discussed by Hillis and Berbaum, $q$ can be used as a measure of improperness for a binormal ROC curve; specifically, it indicates that the ROC curve crosses the chance line at $\mathrm{fpf} = \Phi(q)$, where fpf is the false positive fraction. They point out that it follows that an absolute value <2 indicates a noticeably improper binormal curve and an absolute value of infinity indicates a symmetric curve ($b = 1$).

For the RMH model, the mean-to-sigma ratio varies across readers. To avoid simulating data based on visibly improper binormal curves, we suggest that the probability of a reader's true ROC curve being noticeably improper be small for each test, e.g., <0.025. This probability can be computed as a function of the RMH parameters, as discussed in Appendix B.1.

4 Results and Examples

4.1 R Language Functions

Two functions written in the R statistical software language that perform the OR-to-RMH and RMH-to-OR mappings are available within the freely available MRMCaov R package, which can be downloaded from the GitHub repository: https://github.com/brian-j-smith/MRMCaov. The function OR_to_RMH transforms OR parameters to RMH parameters using the numerical algorithm described in Table 10, and the function RMH_to_OR performs the analytical RMH-to-OR transformation described in Table 6.
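As a numerical illustration of the mean-to-sigma ratio defined in Sec. 3.2.2, the following Python sketch computes $q$ for a single pair of latent normal distributions and checks that the binormal ROC curve meets the chance line at $\mathrm{fpf} = \Phi(q)$. It is not part of the MRMCaov package; the function names and the example means and standard deviations are hypothetical and were chosen only to make the arithmetic easy to follow.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def mean_to_sigma(mu_nondis, sd_nondis, mu_dis, sd_dis):
    """Mean-to-sigma ratio q: difference of means over difference of SDs."""
    if sd_dis == sd_nondis:
        # Equal standard deviations (b = 1): symmetric curve, q is infinite.
        return float("inf")
    return (mu_dis - mu_nondis) / (sd_dis - sd_nondis)


def binormal_tpf(fpf, mu_nondis, sd_nondis, mu_dis, sd_dis):
    """True positive fraction of the binormal ROC curve at a given FPF."""
    a = (mu_dis - mu_nondis) / sd_dis  # binormal intercept
    b = sd_nondis / sd_dis             # binormal slope
    return STD_NORMAL.cdf(a + b * STD_NORMAL.inv_cdf(fpf))


# Hypothetical latent distributions: nondiseased N(0, 1), diseased N(1.5, 1.5^2).
q = mean_to_sigma(0.0, 1.0, 1.5, 1.5)
crossing_fpf = STD_NORMAL.cdf(q)
tpf_at_crossing = binormal_tpf(crossing_fpf, 0.0, 1.0, 1.5, 1.5)

print("q =", q)
print("fpf where the curve meets the chance line =", crossing_fpf)
print("tpf at that fpf =", tpf_at_crossing)
```

With these hypothetical inputs the crossing occurs very close to the upper right corner of the ROC plot, which is why larger absolute values of $q$ correspond to curves that look proper over the visible range.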
4.2 Example: Using the Algorithms to Simulate Data Emulating a Real-Data Study

4.2.1 Approach

In this section, we illustrate the use of the algorithms to simulate data that emulate data provided by Carolyn Van Dyke (VanDyke) [Ref. 28], which we have used for examples in previous papers [Refs. 29, 30], with empirical AUC being the reader performance metric. The study compared the relative performance of single spin-echo magnetic resonance imaging (SE MRI) to cinematic presentation of MRI (CINE MRI) for the detection of thoracic aortic dissection. There were $n_0 = 69$ patients without a dissection and $n_1 = 45$ patients with an aortic dissection imaged with both SE MRI and CINE MRI; cases were evaluated by five readers using a five-point ordinal confidence-of-disease scale. Similarly, each RMH simulated sample emulated five readers, each evaluating the same 69 nondiseased and 45 diseased cases.

We apply the OR-to-RMH algorithm to the set of parameter estimates ("original" values) obtained from an OR analysis of the data set to obtain the corresponding RMH parameter values, simulate 10,000 MRMC samples based on the RMH values, and analyze each simulated sample using an OR analysis with the unbiased error covariance method, with the outcome being the empirical AUC. We set b_method = unspecified for the OR-to-RMH algorithm.

Figure 4 shows the computation of the RMH simulation model and the "true values," which we define as the OR parameter values that describe the true distribution of the empirical AUCs computed from the simulated samples; i.e., the true values are the same as the outputted OR parameter values, given by $\beta_{OR;output}$. We see that for this data set the outputted values are the same as the inputted values, and hence the original OR estimates exactly describe the true distribution of the simulated empirical AUC estimates. The R code and output for the OR-to-RMH and RMH-to-OR functions used to produce the results in Fig. 4 are provided in Appendix C.1.
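The outcome used throughout this example is the empirical AUC. As a reminder of how it is computed for one reader and one test, here is a minimal Python sketch using the usual Mann-Whitney form, in which tied pairs count one half. The function name and the toy five-point ratings are illustrative assumptions; they are not VanDyke data and this is not the MRMCaov implementation.

```python
def empirical_auc(nondiseased, diseased):
    """Empirical AUC for one reader and one test.

    Proportion of (nondiseased, diseased) case pairs in which the diseased
    case receives the higher rating; tied pairs contribute 1/2.
    """
    total = 0.0
    for x in nondiseased:
        for y in diseased:
            if y > x:
                total += 1.0
            elif y == x:
                total += 0.5
    return total / (len(nondiseased) * len(diseased))


# Toy ratings on a five-point scale (hypothetical values).
toy_nondiseased = [1, 1, 2, 3]
toy_diseased = [2, 3, 4, 5]
print(empirical_auc(toy_nondiseased, toy_diseased))  # 14 of 16 pair-credits: 0.875
```

In the study design described above, each reader-by-test AUC would be based on $69 \times 45$ such case pairs.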
Fig. 4 Flowchart, analogous to Fig. 2, illustrating the use of the OR-to-RMH and RMH-to-OR algorithms to simulate MRMC data that emulate the VanDyke data.

4.2.2 Simulation study results

Table 1 presents the simulation study results. "Unbiased estimates" are the empirical estimates (the means across the simulated sample estimates) for the first eight parameters ($\mu_{1:OR}$, $\mu_{2:OR}$, $\sigma^2_{R:OR}$, ..., $Cov_3$), where OR estimates for each sample were computed using the OR method with the unbiased covariance estimation method discussed in Sec. 2.2. Because the sample estimates for the sample-level correlations $r_1$, $r_2$, and $r_3$ are not unbiased, instead of reporting the empirical estimates we report the quotients resulting from dividing the corresponding empirical covariance estimates by the empirical error variance estimate. For example, the estimate of 0.434 for $r_1$ is computed by dividing the $Cov_1$ estimate (0.000343) by the $\sigma^2_{\varepsilon:OR}$ estimate (0.000791). Because the resulting estimates are not the means of the sample-level correlations, empirical bias estimates and 95% confidence intervals for the correlations are not included.

"(Est − true)/true" is defined as (estimate − true value)/(true value); it describes the deviation of the estimate from the true value and is expressed as a percentage of the true value. For the first eight parameters (i.e., not the correlations), these values can also be interpreted as the empirical estimates of statistical bias expressed as a percentage of the true value. "Within 95% CI?" is "yes" if the empirical 95% confidence interval (not shown) includes the true value, and otherwise is "no."

We see that the unbiased estimates for the first eight parameters differ by <1.38% from the true values and that the correlation estimates differ by <0.53%.
Moreover, all of the 95% empirical confidence intervals include the true value. Thus, the unbiased estimates agree with the true parameter values and hence provide validation for the OR-to-RMH algorithm.

Table 1 Simulation study estimates of OR parameters.

| | $\mu_{1:OR}$ | $\mu_{2:OR}$ | $\sigma^2_{R:OR}$ | $\sigma^2_{TR:OR}$ | $\sigma^2_{\varepsilon:OR}$ | $Cov_1$ | $Cov_2$ | $Cov_3$ | $r_1$ | $r_2$ | $r_3$ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| True values | 0.897 | 0.941 | 0.001540 | 0.000208 | 0.000788 | 0.000341 | 0.000339 | 0.000236 | 0.433 | 0.430 | 0.299 |
| Unbiased-method estimates | 0.897 | 0.941 | 0.001537 | 0.000211 | 0.000791 | 0.000343 | 0.000341 | 0.000238 | 0.434 | 0.432 | 0.301 |
| (Est − true)/true | −0.03% | −0.04% | −0.18% | 1.38% | 0.40% | 0.59% | 0.78% | 0.93% | 0.19% | 0.38% | 0.53% |
| Within 95% CI? | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | n/a | n/a | n/a |
| DeLong estimates | n/a | n/a | 0.001537 | 0.000201 | 0.000802 | 0.000344 | 0.000343 | 0.000238 | 0.429 | 0.427 | 0.297 |
| (Est − true)/true | n/a | n/a | −0.21% | −3.15% | 1.80% | 0.91% | 1.12% | 1.20% | −0.87% | −0.67% | −0.59% |
| Within 95% CI? | n/a | n/a | Yes | No | No | No | No | Yes | n/a | n/a | n/a |

Notes: There were 10,000 simulated samples based on the Fig. 4 RMH model with 5 readers and $n_0 = 69$, $n_1 = 45$. "True values" are the $\beta_{OR;output}$ values from Fig. 4 and the corresponding error variance and covariances. For the first eight parameters, "unbiased-method estimates" and "DeLong estimates" are the empirical estimates (i.e., means across the 10,000 samples) corresponding to using unbiased and DeLong error covariance estimation methods with the OR method. The correlation estimates are the quotients from dividing the corresponding empirical covariance estimates by the empirical error variance estimate. "Within 95% CI?" is "yes" if the 95% confidence interval includes the true value and is "no" otherwise. For the DeLong estimates, results for $\mu_{1:OR}$ and $\mu_{2:OR}$ are omitted since they are exactly the same as for the unbiased estimates.

Plots of the empirical ROC curves for the VanDyke original data and for the first three simulated MRMC samples, based on the RMH model given in Fig. 4, are displayed in Fig. 5. Like the VanDyke study, each simulated sample has five independent readers reading the same set of 69 nondiseased and 45 diseased cases. The plots look somewhat different because the VanDyke plots are based on at most five distinct ratings, whereas the simulated-data plots are based on a continuous rating scale. In general, however, the simulated-data ROC curves show a definite resemblance to the VanDyke ROC curves, although this is only our subjective assessment.

Fig. 5 Comparison of empirical ROC curves computed from VanDyke data and three MRMC data samples that emulate the VanDyke data, generated from the RMH model in Fig. 4. Rows of panels: VanDyke, Sample 1, Sample 2, Sample 3, with one panel per reader; each panel plots TPF against FPF for CINE MRI and spin-echo MRI. TPF, true positive fraction (or sensitivity); FPF, false positive fraction (or 1 − specificity).

4.3 Other Remarks and Examples

4.3.1 DeLong error covariance estimation

For comparison, we also include in Table 1 results using the DeLong et al. (DeLong) error covariance estimation method. Results for $\mu_{1:OR}$ and $\mu_{2:OR}$ are omitted since they depend only on the AUC estimation method and hence remain the same. We see from the confidence intervals that the DeLong estimates for $\sigma^2_{\varepsilon:OR}$, $Cov_1$, and $Cov_2$ are positively biased and the $\sigma^2_{TR:OR}$ estimate is negatively biased. Similar results were obtained by Hillis. Although the DeLong method is biased, the estimates are relatively close to the true values, suggesting that results using the DeLong or another resampling error-covariance method, such as the jackknife or bootstrap, will typically be similar to those obtained using the unbiased method. This point is illustrated by the example in the next section.

4.3.2 Example of computing power

Suppose our goal is to estimate the power for detecting a difference in test AUCs for a study such as the VanDyke study, assuming that the reader-averaged empirical AUC estimates (0.897 and 0.941) are the true population values. This can be done by simulating similar data (as we did for Table 1) and then estimating power by the proportion of samples where the null hypothesis is rejected. The power estimates from doing this, based on the simulated samples used for Table 1, are 0.106 for the unbiased method and 0.107 for the DeLong method, illustrating that the choice of error covariance method makes almost no difference in our power estimates.

4.3.3 Ordinal rating scale

A limitation of the OR-to-RMH algorithm is that it applies only to continuous simulated ratings. For example, in Sec. 4, the simulation data emulated a continuous rating for which the empirical AUC distribution could be described by the original OR parameter values, but the VanDyke data set that yielded the original OR estimates consisted of ratings on a five-point ordinal scale. Although ordinal data can be simulated based on the RMH model by binning the simulated continuous data, the mapping from the RMH model to the corresponding OR parameters when the data are binned has not yet been developed, and hence neither has the corresponding OR-to-RMH algorithm.
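The binning of continuous ratings into ordinal categories mentioned above can be sketched in a few lines of Python. The thresholds and cumulative probabilities below are the values quoted in the notes to Table 2; the function names are hypothetical, and the treatment of a rating that falls exactly on a threshold is an arbitrary assumption that is immaterial for continuous ratings. The sketch also checks numerically that the quoted thresholds appear to be standard normal quantiles of the quoted cumulative probabilities, up to rounding.

```python
from bisect import bisect_right
from statistics import NormalDist

# Values quoted in the notes to Table 2.
CUM_PROBS = [0.4174, 0.8478, 0.9594, 0.9913]
TABLE2_THRESHOLDS = [-0.2085494, 1.0270435, 1.7437654, 2.3781446]


def thresholds_from_cum_probs(cum_probs):
    """Standard normal quantiles for a list of cumulative probabilities."""
    std_normal = NormalDist()
    return [std_normal.inv_cdf(p) for p in cum_probs]


def bin_rating(value, thresholds):
    """Map a continuous rating to an ordinal category 1..len(thresholds)+1."""
    # Assumption: a value equal to a threshold goes to the higher category.
    return bisect_right(thresholds, value) + 1


recomputed = thresholds_from_cum_probs(CUM_PROBS)
for quoted, value in zip(TABLE2_THRESHOLDS, recomputed):
    print(f"quoted {quoted: .4f}   recomputed {value: .4f}")

# Hypothetical continuous ratings binned into five categories.
print([bin_rating(r, TABLE2_THRESHOLDS) for r in (-1.0, 0.5, 1.5, 2.0, 3.0)])
```

Because the quoted probabilities are rounded to four decimals, the recomputed thresholds agree with the quoted ones only approximately.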
We conducted a simulation study to investigate how closely the original OR parameter values describe the distribution of the empirical AUC for ordinal ratings resulting from binning the continuous ratings generated by the RMH model given in Fig. 4. The simulation study was performed in the same way as the Table 1 study, except that five-category ordinal ratings were created by binning the simulated continuous ratings. The binning thresholds corresponded to the empirical cumulative probabilities for ratings 1,...,5 for the VanDyke nondiseased cases, pooled across readers. Results are presented in Table 2. As expected, the two AUC ($\mu_{1:OR}$, $\mu_{2:OR}$) estimates are less than for the continuous values, but only by a maximum of 1.44%. We also see that the correlations are similar to those for the continuous ratings (maximum deviation is −4.63%), with the relative values of the three even more similar: $r_1 \approx r_2$, as was the case for the continuous ratings, and $r_3$ is 0.12 lower than the other two, compared to being 0.13 lower for the continuous ratings. The maximum change in the error variance and covariance estimates was 8.07%, and there were 6.7% and −7.9% changes in $\sigma^2_{R:OR}$ and $\sigma^2_{TR:OR}$, respectively, which are in the same "ballpark" as for the continuous ratings.

We conclude that, compared with the continuous data, the empirical AUC distribution for the binned data has a similar correlation structure, similar AUC estimates, and somewhat similar values for the error variance, error covariances, $\sigma^2_{R:OR}$, and $\sigma^2_{TR:OR}$. Thus, this example shows that the simulated ordinal data approximately emulate the VanDyke data set.
Moreover, one could adjust the RMH parameters to obtain a closer emulation using an iterative approach, where each iteration consists of adjusting the original OR values based on results from the previous-iteration simulation study, computing the corresponding RMH values, and running a corresponding simulation study. For example, a first iteration might begin by upward adjustment of the $\mu_{1:OR}$ and $\mu_{2:OR}$ values.

4.3.4 Changing the numbers of readers and cases

In our examples thus far, we have set the numbers of readers, diseased cases, and nondiseased cases to be the same as those of the VanDyke data set. However, often a researcher will want to investigate the performance of a reader-performance metric for a range of these numbers.

Readers. For a given set of RMH parameter values, changing the number of readers has no effect on the corresponding OR parameters $\mu_{1:OR}$, $\mu_{2:OR}$, $\sigma^2_{R:OR}$, $\sigma^2_{TR:OR}$, $\sigma^2_{\varepsilon:OR}$, $Cov_1$, $Cov_2$, $Cov_3$, $r_1$, $r_2$, and $r_3$, as shown by the omission of the reader number in the RMH-to-OR algorithm formulas in Table 6 in Appendix A.

Cases. For a given set of RMH parameter values, changing the number of cases has no effect on $\mu_{1:OR}$, $\mu_{2:OR}$, $\sigma^2_{R:OR}$, or $\sigma^2_{TR:OR}$, as shown by the omission of the case sample sizes in the corresponding formulas in Table 6. In contrast, $\sigma^2_{\varepsilon:OR}$, $Cov_1$, $Cov_2$, and $Cov_3$ will be affected. Although the correlations are also affected, changes in the correlations will typically be small [Ref. 6, p. 2078].

For example, Table 3(b) shows that when the case sizes are doubled ($n_0 = 138$, $n_1 = 90$), $\sigma^2_{\varepsilon:OR}$ is reduced by 50%, the correlations are virtually unchanged (maximum change of 0.6%), and there is no change in $\sigma^2_{R:OR}$, $\sigma^2_{TR:OR}$, $q_1$, or $q_2$. Table 3(c) shows that when the case sample sizes are switched ($n_0 = 45$, $n_1 = 69$), $\sigma^2_{\varepsilon:OR}$ is reduced by 19% and there is a small increase in the correlations (maximum increase of 2.3%), with all other values remaining unchanged.
These results are computed using the RMH-to-OR formulas in Table 6, thus eliminating the need for simulations.

Table 2 Simulation results when continuous ratings are binned into a five-point ordinal scale.

| | $\mu_{1:OR}$ | $\mu_{2:OR}$ | $\sigma^2_{R:OR}$ | $\sigma^2_{TR:OR}$ | $\sigma^2_{\varepsilon:OR}$ | $Cov_1$ | $Cov_2$ | $Cov_3$ | $r_1$ | $r_2$ | $r_3$ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| True values (same as in Table 1) | 0.897 | 0.941 | 0.001540 | 0.000208 | 0.000788 | 0.000341 | 0.000339 | 0.000236 | 0.433 | 0.430 | 0.299 |
| Unbiased-method estimates from binned ratings | 0.884 | 0.930 | 0.001643 | 0.000192 | 0.000852 | 0.000352 | 0.000352 | 0.000250 | 0.413 | 0.414 | 0.293 |
| (Est − true)/true | −1.44% | −1.18% | 6.67% | −7.90% | 8.07% | 3.07% | 3.98% | 5.96% | −4.63% | −3.78% | −1.95% |

Notes: See notes for Table 1. OR parameter estimates are based on five-category ordinal ratings resulting from binning the continuous simulated ratings using the thresholds −0.2085494, 1.0270435, 1.7437654, and 2.3781446; these thresholds correspond to the empirical cumulative probabilities 0.4174, 0.8478, 0.9594, and 0.9913 for ratings 1 to 5 that were computed from the VanDyke data.

Table 3 Effect of different case sizes and RMH $\delta_1$ and $\delta_2$ values on OR parameters. Values in parentheses are percentage changes from the part (a) values.

| Change from OR true values | $q_1$ | $q_2$ | Pr1 | $\mu_{1:OR}$ | $\mu_{2:OR}$ | $\sigma^2_{R:OR}$ | $\sigma^2_{TR:OR}$ | $\sigma^2_{\varepsilon:OR}$ | $r_1$ | $r_2$ | $r_3$ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| (a) No change (true values from Fig. 4) | 4.56 | 5.64 | 0.004 | 0.897 | 0.941 | 0.00154 | 0.000208 | 0.000788 | 0.433 | 0.430 | 0.299 |
| (b) Case sizes doubled ($n_0 = 138$, $n_1 = 90$) | 4.56 | 5.64 | 0.004 | 0.897 | 0.941 | 0.00154 (0.00%) | 0.000208 (0.00%) | 0.000391 (−50%) | 0.435 (0.5%) | 0.432 (0.5%) | 0.301 (0.6%) |
| (c) Case sizes switched ($n_0 = 45$, $n_1 = 69$) | 4.56 | 5.64 | 0.004 | 0.897 | 0.941 | 0.00154 (0.00%) | 0.000208 (0.00%) | 0.000634 (−19%) | 0.441 (2.0%) | 0.438 (1.8%) | 0.306 (2.3%) |
| (d) Null model 1 ($\delta_1 = \delta_2 = 2.6452$) | 5.05 | 5.05 | 0.001 | 0.919 | 0.919 | 0.00164 (6.8%) | 0.000074 (−64%) | 0.000789 (0.13%) | 0.462 (6.7%) | 0.426 (−0.9%) | 0.319 (6.6%) |
| (e) Null model 2 ($\delta_1 = \delta_2 = 1.2759$) | 2.43 | 2.43 | 0.326 | 0.75 | 0.75 | 0.00701 (355%) | 0.000302 (45%) | 0.002346 (198%) | 0.522 (21%) | 0.515 (20%) | 0.401 (34.1%) |

Notes: Part (a) shows the set of OR parameter "true values" ($\beta_{OR;solution}$) from Fig. 4 that correspond to simulations using the RMH model parameters ($\beta_{RMH;solution}$) in Fig. 4 when $n_0 = 69$ and $n_1 = 45$. In addition, the median mean-to-sigma ratios $q_1$ and $q_2$ corresponding to the test 1 and test 2 latent RMH rating distributions are included, as well as Pr1, defined as the probability that a reader's true ROC curve is noticeably improper for test 1. Parts (b)-(e) show the corresponding values when the indicated changes are made to the case sizes for the simulated samples or to the RMH model values. The OR values are computed by applying the RMH-to-OR algorithm to the RMH model from Fig. 4 with the changes in the left column incorporated. Values in parentheses are the percentage changes in the OR parameters from the original values. See Appendix C.2 for the R code that produced these results.

4.3.5 Null and power simulations

The example in Sec. 4.3.2 showed how power could be easily computed for simulated data that emulate a particular study, assuming the effect size ($\mu_{1:OR} - \mu_{2:OR}$) is equal to the observed effect size. Other effect sizes can be investigated by adjusting $\delta_1$ and $\delta_2$ in the RMH parameter set accordingly, using the relationship (from Table 6):

$$\mu_{i:OR} = \Phi\left(\frac{\delta_i}{\sqrt{1 + b^{-2} + 2(\sigma^2_R + \sigma^2_{TR})}}\right),$$

which implies

$$\delta_i = \Phi^{-1}(\mu_{i:OR}) \sqrt{1 + b^{-2} + 2(\sigma^2_R + \sigma^2_{TR})}, \qquad (22)$$

where $\Phi$ is the cumulative standard normal distribution function.

In addition, often the researcher wants to empirically compute the type I error for testing $H_0: \mu_{1:OR} = \mu_{2:OR}$ versus $H_1: \mu_{1:OR} \neq \mu_{2:OR}$. This can be done by creating a null RMH model by setting $\delta_1 = \delta_2$, with the empirical type I error rate given by the proportion of simulated samples where $H_0$ is rejected. For example, in Table 3(d) we alter the RMH model given in Fig. 4 by setting $\delta_1 = \delta_2 = \delta$, with the value of $\delta$ determined such that the corresponding $\mu_{i:OR}$ values are both equal to $\mu_{OR} = 0.919$, the mean of the two original OR AUC values, 0.897 and 0.941, in Fig. 4. It follows from Eq. (22), with $\mu_{OR} = (0.897 + 0.941)/2$, that $\delta = 2.6452$, using the values for $b$, $\sigma^2_R$, and $\sigma^2_{TR}$ given in Fig. 4.

In Table 3(e), we similarly determine for a null RMH model the value of $\delta$ that corresponds to $\mu_{OR} = 0.75$. In both Table 3(d) and 3(e), we see that all of the original OR parameter values are changed, as well as the mean-to-sigma ratios, with Table 3(e) showing much more change. For this reason, we suggest that if the researcher wants to simulate data with error correlations and reader and reader-by-test variance components similar to those from an OR analysis of a real-data study, but with much different AUC values, the OR-to-RMH algorithm with b_method = mean_to_sigma should be used to determine the corresponding $\beta_{RMH}$ vector, as discussed in the next section.

The R code and output for the OR-to-RMH and RMH-to-OR functions used to produce the results in Table 3 are included in Appendix C.2.
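The relationship in Eq. (22) is easy to evaluate directly. The Python sketch below implements both directions and checks that they invert each other. The function names are hypothetical, $b = 0.65608$ is the value reported for b_method = unspecified in Table 4, and the two variance components are placeholder values chosen only for illustration (the Fig. 4 values are not reproduced in this section), so the printed $\delta$ is not expected to equal the 2.6452 reported in Table 3(d).

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def v_term(b, var_r, var_tr):
    """V = 1 + b^(-2) + 2 (sigma_R^2 + sigma_TR^2), as defined for Table 6."""
    return 1.0 + b ** -2 + 2.0 * (var_r + var_tr)


def auc_from_delta(delta, b, var_r, var_tr):
    """Expected empirical AUC: mu_{i:OR} = Phi(delta_i / sqrt(V))."""
    return STD_NORMAL.cdf(delta / sqrt(v_term(b, var_r, var_tr)))


def delta_from_auc(mu_or, b, var_r, var_tr):
    """Eq. (22): delta_i = Phi^{-1}(mu_{i:OR}) * sqrt(V)."""
    return STD_NORMAL.inv_cdf(mu_or) * sqrt(v_term(b, var_r, var_tr))


b = 0.65608                   # from Table 4 (b_method = unspecified)
var_r, var_tr = 0.05, 0.02    # placeholder RMH variance components (assumed)

# Null model: both tests share the mean of the two original AUC values.
mu_null = (0.897 + 0.941) / 2
delta_null = delta_from_auc(mu_null, b, var_r, var_tr)
print("delta for the null model:", delta_null)
print("AUC recovered from delta:", auc_from_delta(delta_null, b, var_r, var_tr))
```

A different target effect size would be handled the same way, by applying `delta_from_auc` separately to each desired $\mu_{i:OR}$ while holding $b$ and the variance components fixed.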
4.3.6 Mean-to-sigma ratios and the specified and mean_to_sigma b_methods

From Table 3, parts (a)-(c), we see that the mean-to-sigma ratios are $q_1 = 4.56$ and $q_2 = 5.64$ for the Fig. 4 RMH model latent distributions, as well as for the models when the case sizes are changed. However, in parts (d) and (e), we see that when the values for the RMH parameters $\delta_1$ and $\delta_2$ are changed, the mean-to-sigma ratios also change.

In Table 3, Pr1 is the probability that a reader's true ROC curve is noticeably improper for test 1. (See Appendix B.1 for how to compute Pr1.) We see that this probability is relatively small (≤0.004) for the first four models and thus is not of concern. In contrast, Pr1 = 0.326 for null model 2, and thus we recommend not using this model for a simulation study. (Note: although Pr2, the analogous probability for test 2, is not included in Table 3, conclusions based on it were the same.)

In Table 4, we see for the specified and mean_to_sigma b_methods that the OR parameters corresponding to the resulting RMH models are equal to all of the original OR values except for the error variance and covariances (not shown). The R code for generating Table 4 is included in Appendix C.3.

Table 4 Comparison of RMH parameter values and corresponding true OR values resulting from using the three different b_methods. The RMH parameter values ($\beta_{RMH;solution}$) are obtained by applying the OR-to-RMH algorithm to the "original" OR parameter values ($\beta_{OR;input}$) in Fig. 4. The true OR values ($\beta_{OR;output}$) result from applying the RMH-to-OR algorithm to the RMH parameter values. See Appendix C.4 for the corresponding R code and the complete sets of RMH and OR values. The columns $b$, $q_1$, $q_2$, and Pr1 are RMH parameter values; the remaining columns are true OR values.

| b_method | $n_0$ | $n_1$ | $b$ | $q_1$ | $q_2$ | Pr1 | $\mu_{1:OR}$ | $\mu_{2:OR}$ | $\sigma^2_{R:OR}$ | $\sigma^2_{TR:OR}$ | $\sigma^2_{\varepsilon:OR}$ | $r_1$ | $r_2$ | $r_3$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| unspecified | 69 | 45 | 0.65608 | 4.56 | 5.64 | 0.004 | 0.897 | 0.941 | 0.00154 | 0.000208 | 0.000788 | 0.433 | 0.43 | 0.299 |
| mean_to_sigma (used with mean_sig_input = 5.2) | 69 | 45 | 0.69297 | 5.20 | 6.43 | 0.002 | 0.897 | 0.941 | 0.00154 | 0.000208 | 0.000766 | 0.433 | 0.43 | 0.299 |
| specified (used with b_input = 1) | 69 | 45 | 1 | ∞ | ∞ | 0.000 | 0.897 | 0.941 | 0.00154 | 0.000208 | 0.000658 | 0.433 | 0.43 | 0.299 |

4.3.7 Troubleshooting

Table 5 provides examples where the OR-to-RMH algorithm fails to produce a solution. In each example, the OR-to-RMH algorithm is applied to the original parameter estimate values from the VanDyke study, given in Fig. 4, but with one value altered so that the algorithm fails. For example, in part (a) $\sigma^2_{R:OR}$ is changed from 0.00154 (original value) to 0.154 and the algorithm fails. Using Table 11 in Appendix A, we can identify which input value is causing the problem by checking for the first parameter in the sequence $x_1, x_2, x_3, x_4, b, x_5, x_6, x_7$ that is missing (NA), where $x_1, \ldots, x_7$ are the alternative RM parameters discussed in Appendix A. Noting that the first parameter with a missing value is $x_3$, the rules in Table 11 suggest reducing the value of $\sigma^2_{R:OR}$. Similarly, in part (b), $\sigma^2_{TR:OR}$ is increased and $x_4$ is the first parameter with a missing value; here, Table 11 suggests reducing the value of $\sigma^2_{TR:OR}$. In part (c), $\sigma^2_{\varepsilon:OR}$ is increased and $b$ is the first parameter with a missing value; here, Table 11 suggests either changing (reducing or increasing) the value of $\sigma^2_{\varepsilon:OR}$ or using b_method = specified or b_method = mean_to_sigma.

The R code for generating the results in Table 5 is provided in Appendix C.4. The values for the $x_1, \ldots, x_7$ parameters are by default not printed unless the option all = T is included in the print function, as illustrated in Appendix C.4. Also note in Appendix C.4 that the OR_to_RMH function suggests the remedy, based on Table 11, when the algorithm fails to produce a solution.

Table 5 Troubleshooting examples. For each example, one of the original parameter estimate values from the VanDyke study, as given by $\beta_{OR;orig}$ in Fig. 4, is replaced by a value that causes the OR-to-RMH algorithm to fail. These examples show how the value responsible for the algorithm failure can be identified from the alternative parameters $x_1, \ldots, x_7$ and $b$ values using the rules given in Table 11. All examples use b_method = unspecified. See Appendix C.4 for the R code that produced these results. Note that to print the $x_1, \ldots, x_7$ variables the option all = T must be included in the print function (see Appendix C.4 for examples).

(a) Original value: $\sigma^2_{R:OR} = 0.00154$. Altered value: $\sigma^2_{R:OR} = 0.154$. Output from applying the OR-to-RMH algorithm to the altered $\beta_{OR;orig}$ vector is shown below. Noting that $x_3$ is the first parameter in the sequence $x_1, x_2, x_3, x_4, b, x_5, x_6, x_7$ that is missing (NA), the rules in Table 11 suggest reducing the value of $\sigma^2_{R:OR}$.

    n0 n1 mu1 mu2 var_R var_TR var_C var_TC var_RC var_error
    69 45  NA  NA    NA     NA    NA     NA     NA        NA
       b_method mean_sig1 mean_sig2 mean_sig1_025 mean_sig2_025
    unspecified        NA        NA            NA            NA
          x1       x2 x3 x4  b x5 x6 x7
    1.264641 1.563224 NA NA NA NA NA NA

(b) Original value: $\sigma^2_{TR:OR} = 0.00028$. Altered value: $\sigma^2_{TR:OR} = 0.28$. Output from applying the OR-to-RMH algorithm to the altered $\beta_{OR;orig}$ vector is shown below. Noting that $x_4$ is the first parameter in the sequence $x_1, x_2, x_3, x_4, b, x_5, x_6, x_7$ that is missing (NA), the rules in Table 11 suggest reducing the value of $\sigma^2_{TR:OR}$.

    n0 n1 mu1 mu2 var_R var_TR var_C var_TC var_RC var_error
    69 45  NA  NA    NA     NA    NA     NA     NA        NA
       b_method mean_sig1 mean_sig2 mean_sig1_025 mean_sig2_025
    unspecified        NA        NA            NA            NA
          x1       x2         x3 x4  b x5 x6 x7
    1.264641 1.563224 0.06838082 NA NA NA NA NA

(c) Original value: $\sigma^2_{\varepsilon:OR} = 0.000788$. Altered value: $\sigma^2_{\varepsilon:OR} = 0.00788$. Output from applying the OR-to-RMH algorithm to the altered $\beta_{OR;orig}$ vector is shown below. Noting that $b$ is the first parameter in the sequence $x_1, x_2, x_3, x_4, b, x_5, x_6, x_7$ that is missing (NA), the rules in Table 11 suggest either changing (reducing or increasing) the value of $\sigma^2_{\varepsilon:OR}$, or using b_method = specified or b_method = mean_to_sigma.

    n0 n1 mu1 mu2 var_R var_TR var_C var_TC var_RC var_error
    69 45  NA  NA    NA     NA    NA     NA     NA        NA
       b_method mean_sig1 mean_sig2 mean_sig1_025 mean_sig2_025
    unspecified        NA        NA            NA            NA

4.3.8 Using the algorithm with Gallas parameter estimates

For a real-data MRMC study analyzed by the Gallas method [Refs. 15, 31], a method has been developed to convert the U-statistic parameters of empirical AUC and variance estimates to RM model parameters [Ref. 32]. Alternatively, it has been shown by Hillis [Ref. 14] that the Gallas MRMC method produces the same empirical AUC single-test and difference-of-two-tests variance estimates as the OR method, if the constraints given by Eq. (12) are not imposed on the OR estimates. As a result, OR parameter estimates can be computed from the Gallas parameter estimates using formulas provided in Hillis.
Hence, RMH model parameters that correspond to real-data studies can be derived using the OR-to-RMH algorithm applied to the transformed Gallas parameter estimates.

5 Discussion

A previous problem with the original RM model and later generalized versions of it was that the RM model parameters were expressed only in terms of the latent binormal rating distributions, as opposed to the more familiar reader performance measure distributions. Thus, it has been difficult to set RM model parameters such that the simulated data were similar to MRMC data encountered in practice. Assuming the constrained unequal-variance RM model, which we have referred to as the RMH model in this paper, Hillis recently remedied this problem by deriving formulas for computing the OR parameter values that describe the distribution of empirical AUC outcomes computed from RMH simulated data. However, that paper did not provide a reverse OR-to-RMH mapping. This paper overcomes that limitation by deriving a numerical OR-to-RMH algorithm that computes RMH parameter values from a specified set of OR parameter values and by providing an R function to implement the algorithm. The OR-to-RMH algorithm and its corresponding R function make it easy to calibrate the RMH model to produce simulated data that emulate specific real data sets with respect to the distribution of the empirical AUC estimates.

The original RM model paper presented several simulation structures that were supposed to represent ROC analyses of representative real data sets, which was useful because researchers could then assess the performance of MRMC analysis methods using a commonly accepted set of RM simulation structures. However, there was a mistake in some of the computations of the RM parameters, and the model was limited to equal-variance binormal ROC curves, which are not common.

The present approach has several limitations that we hope to remedy in future research.
It is limited to generating continuous rating data that emulate a set of inputted OR parameter values describing the distribution of the empirical AUC estimates. Although the simulated continuous rating data can be binned, the distribution of the empirical AUC estimates for the binned data will not as closely emulate the inputted OR parameter values. We suggested a method to adjust the parameter values to better fit ordinal discrete ratings through an iterative simulation approach, but this process is time consuming and we hope to develop RMH-to-OR and OR-to-RMH algorithms, similar to the ones in this paper, that are primarily designed for simulation of rating data with a few ordinal values (e.g., 1, 2, 3, 4, or 5).

The present approach is also limited to the empirical AUC as the reader performance measure. We hope to develop an approach that allows for a semiparametric outcome, such as the binormal AUC.

Finally, our algorithm is based on the RMH model, which assumes that the latent distributions are the same for both tests. Thus, another area for future research is to relax this assumption and develop algorithms for a more general RM model, such as the unconstrained unequal-variance model [Ref. 6], the generalized RM model [Ref. 5], or some other generalization of the original RM model.

6 Conclusions

The main contributions of this paper are the OR-to-RMH algorithm and the corresponding R software OR_to_RMH function; these contributions make it easy to calibrate RMH model parameters to match real-data OR parameter estimates, thus making it easy to simulate rating data that emulate real data sets for testing MRMC analysis methods or for performing power analysis. These contributions will allow researchers to develop sets of RMH simulation structures that are representative of a wide spectrum of MRMC studies, which can then be used to
validate MRMC analysis methods. We expect these new RMH simulation structures will replace the original RM model structures, which were not linked to specific real-world data sets and were limited to equal-variance ROC curves, making the representativeness of the structures difficult to evaluate.

7 Appendix A: Algorithm Details for Mapping OR Model Parameters to RMH Model Parameters

In this section, we derive the mapping from OR model parameters to RMH model parameters. For the mapping, we assume the RMH model because it has the same number of parameters as the OR model. The mapping from a more general RM model, which includes the RMH model as a special case, to the OR model was derived by Hillis. Modifying this more general RM model by constraining the error variance and variance components involving diseased cases to be equal to those involving nondiseased cases multiplied by $b^{-2}$, $b > 0$, results in the RMH model. Table 6 presents the resulting analytical RMH-to-OR mapping.

To facilitate the derivation of the reverse (OR-to-RMH) mapping, an alternative parameterization for the RMH model is presented in Table 7. Table 7(a) expresses the alternative RMH parameters in terms of the RMH parameters, Table 7(b) presents the constraints on these parameters, and Table 7(c) expresses the RMH model parameters in terms of the alternative RMH parameters. Table 8 expresses the OR parameters in terms of the alternative RMH parameters and Table 9 expresses the alternative RMH parameters in terms of the OR parameters.

The proposed algorithm is presented in Table 10. Steps 1 to 6 replace the OR parameters in Table 8 by specified values and then solve for the corresponding alternative RMH parameter values. Note that these steps incorporate the alternative parameter constraints given in Table 7(b).
Using Table 7(c) mappings, step 7 computes the final RMH parameter estimates as functions of the estimated alternative RMH parameter values.

Table 6 RMH-to-OR mapping: OR parameters expressed in terms of the RMH model parameters for the empirical AUC.

$$\mu_{i:OR} = \Phi\!\left(\frac{\delta_i}{\sqrt{V}}\right), \quad i = 1, 2$$

$$\sigma^2_{R:OR} = F_{BVN}\!\left(\frac{\delta_1}{\sqrt{V}}, \frac{\delta_2}{\sqrt{V}}; \frac{2\sigma_R^2}{V}\right) - \Phi\!\left(\frac{\delta_1}{\sqrt{V}}\right)\Phi\!\left(\frac{\delta_2}{\sqrt{V}}\right)$$

$$\sigma^2_{TR:OR} = .5\sum_{i=1}^{2}\left\{F_{BVN}\!\left(\frac{\delta_i}{\sqrt{V}}, \frac{\delta_i}{\sqrt{V}}; \frac{2(\sigma_R^2+\sigma_{TR}^2)}{V}\right) - \left[\Phi\!\left(\frac{\delta_i}{\sqrt{V}}\right)\right]^2\right\} - \sigma^2_{R:OR}$$

$$\mathrm{Cov}_1 = \sum_{m=1}^{4} c_m F_{BVN}\!\left(\frac{\delta_1}{\sqrt{V}}, \frac{\delta_2}{\sqrt{V}}; \frac{\rho_m(1+b^{-2}) + 2\sigma_R^2}{V}\right)$$

where $\rho_1 = \dfrac{\sigma^2_{C(-)}+\sigma^2_{RC(-)}+\sigma^2_{C(+)}+\sigma^2_{RC(+)}}{1+b^{-2}}$; $\rho_2 = \dfrac{\sigma^2_{RC(-)}+\sigma^2_{C(-)}}{1+b^{-2}}$; $\rho_3 = \dfrac{\sigma^2_{RC(+)}+\sigma^2_{C(+)}}{1+b^{-2}}$; $\rho_4 = 0$

$$\mathrm{Cov}_2 = \frac{1}{2}\sum_{i=1}^{2}\sum_{m=1}^{4} c_m F_{BVN}\!\left(\frac{\delta_i}{\sqrt{V}}, \frac{\delta_i}{\sqrt{V}}; \frac{\rho_m(1+b^{-2})}{V}\right)$$

where $\rho_1 = \dfrac{\sigma^2_{C(-)}+\sigma^2_{TC(-)}+\sigma^2_{C(+)}+\sigma^2_{TC(+)}}{1+b^{-2}}$; $\rho_2 = \dfrac{\sigma^2_{C(-)}+\sigma^2_{TC(-)}}{1+b^{-2}}$; $\rho_3 = \dfrac{\sigma^2_{C(+)}+\sigma^2_{TC(+)}}{1+b^{-2}}$; $\rho_4 = 0$

$$\mathrm{Cov}_3 = \sum_{m=1}^{4} c_m F_{BVN}\!\left(\frac{\delta_1}{\sqrt{V}}, \frac{\delta_2}{\sqrt{V}}; \frac{\rho_m(1+b^{-2})}{V}\right)$$

where $\rho_1 = \dfrac{\sigma^2_{C(-)}+\sigma^2_{C(+)}}{1+b^{-2}}$; $\rho_2 = \dfrac{\sigma^2_{C(-)}}{1+b^{-2}}$; $\rho_3 = \dfrac{\sigma^2_{C(+)}}{1+b^{-2}}$; $\rho_4 = 0$

$$\sigma^2_{\varepsilon:OR} = \frac{1}{2}\sum_{i=1}^{2}\sum_{m=1}^{4} c_m F_{BVN}\!\left(\frac{\delta_i}{\sqrt{V}}, \frac{\delta_i}{\sqrt{V}}; \frac{\rho_m(1+b^{-2}) + 2(\sigma_R^2+\sigma_{TR}^2)}{V}\right)$$

where $\rho_1 = 1$; $\rho_2 = \dfrac{\sigma^2_{TC(-)}+\sigma^2_{RC(-)}+\sigma^2_{C(-)}+\sigma^2_{\varepsilon(-)}}{1+b^{-2}}$; $\rho_3 = \dfrac{\sigma^2_{TC(+)}+\sigma^2_{RC(+)}+\sigma^2_{C(+)}+\sigma^2_{\varepsilon(+)}}{1+b^{-2}}$; $\rho_4 = 0$

Notes: The numbers of nondiseased and diseased cases are denoted by $n_0$ and $n_1$; $F_{BVN}(\cdot, \cdot; \rho)$ denotes the standardized bivariate normal distribution function with correlation $\rho$; $\mu_{i:OR} = \mu_{OR} + \tau_{i:OR}$; $\delta_i = \mu_+ + \tau_{i+}$; $V = 1 + b^{-2} + 2(\sigma_R^2 + \sigma_{TR}^2)$; $c_1 = \dfrac{1}{n_0 n_1}$; $c_2 = \dfrac{n_1 - 1}{n_0 n_1}$; $c_3 = \dfrac{n_0 - 1}{n_0 n_1}$; $c_4 = \dfrac{1 - n_0 - n_1}{n_0 n_1}$. This table is reprinted, adapted, and revised with permission from Hillis [Ref. 6, Table 3]; notation is the same except that $(\sigma^2_{fixed(-)} + \sigma^2_{fixed(+)})$ has been replaced by $1 + b^{-2}$, $b > 0$, which results in the RMH model.

Table 7 Alternative parameterization for the RMH model.

(a) Alternative RMH parameters expressed in terms of RMH model parameters. $V$ is defined by $V = 1 + b^{-2} + 2(\sigma_R^2 + \sigma_{TR}^2)$. Note that $x_6 = x_5 + \sigma^2_{TC(-)}$, $x_7 = x_5 + \sigma^2_{RC(-)}$, and $1 - x_4 = [1 + b^{-2}]/V$.

Alternative RMH model parameter = RMH model parameters
$b = b$
$x_1 = \delta_1/\sqrt{V}$
$x_2 = \delta_2/\sqrt{V}$
$x_3 = 2\sigma_R^2/V$
$x_4 = 2(\sigma_R^2 + \sigma_{TR}^2)/V$
$x_5 = \sigma^2_{C(-)}$
$x_6 = \sigma^2_{TC(-)} + \sigma^2_{C(-)}$
$x_7 = \sigma^2_{RC(-)} + \sigma^2_{C(-)}$

(b) Constraints on RMH alternative parameters. These follow from the equations in part (a), nonnegativity of the RMH variance components, and constraints Eqs. (6) and (8).

$$0 \le x_1,\; 0 \le x_2,\; 0 \le x_i \le 1,\; i = 3, \ldots, 7;\quad x_4 \ge x_3;\quad x_6 \ge x_5;\quad x_7 \ge x_5;\quad x_6 + x_7 - x_5 \le 1 \qquad (23)$$

(c) RMH parameters expressed in terms of alternative RMH parameters. Note that in terms of the alternative parameterization, $V = (1 + 1/b^2)/(1 - x_4)$.

RMH parameter = Alternative RMH parameters
$b = b$
$\delta_1 = x_1\sqrt{V}$
$\delta_2 = x_2\sqrt{V}$
$\sigma_R^2 = .5\,x_3 V$
$\sigma_{TR}^2 = .5\,V(x_4 - x_3)$
$\sigma^2_{C(-)} = x_5$
$\sigma^2_{TC(-)} = x_6 - x_5$
$\sigma^2_{RC(-)} = x_7 - x_5$
$\sigma^2_{\varepsilon(-)} = 1 - (x_6 + x_7 - x_5)$

From Table 9, it follows that for each of the alternative parameters other than $b$, there can be only one solution. It then follows from Table 7(c) that there can be only one solution for the RMH parameters other than $b$. Hence, if there is more than one solution, the solutions differ only in their $b$ values.

Sometimes there is neither an exact nor an approximate solution, and the OR-to-RMH algorithm returns missing values. When this happens, changing the values of the inputted OR parameters or changing the b_method option will generally result in a solution, as discussed in Sec. 3.1.3. The algorithm solves for the alternative RMH parameters in the following order: $x_1, x_2, x_3, x_4, b, x_5, x_6$, and $x_7$. Because each parameter may require estimates of preceding but not subsequent parameters, all parameters following a parameter with no solution are assigned a missing value by the algorithm. Table 11 describes the corrective action that will produce a solution for the OR-to-RMH algorithm, according to which alternative RMH parameter is the first one without a solution.

Table 8 RMH-to-OR mapping: OR parameters for the empirical AUC expressed in terms of the alternative parameterization of the RMH model given in Table 7.

$$\mu_{i:OR} = \Phi(x_i), \quad i = 1, 2$$

$$\sigma^2_{R:OR} = F_{BVN}(x_1, x_2; x_3) - \Phi(x_1)\Phi(x_2)$$

$$\sigma^2_{TR:OR} = .5\sum_{i=1}^{2}\left\{F_{BVN}(x_i, x_i; x_4) - [\Phi(x_i)]^2\right\} - \sigma^2_{R:OR}$$

$$\sigma^2_{\varepsilon:OR} = \frac{1}{2}\sum_{i=1}^{2}\sum_{m=1}^{4} c_m F_{BVN}(x_i, x_i; \rho_m(1 - x_4) + x_4)$$

where $\rho_1 = 1$; $\rho_2 = \dfrac{1}{1+b^{-2}}$; $\rho_3 = \dfrac{1}{1+b^{2}}$; $\rho_4 = 0$

$$\mathrm{Cov}_1 = \sum_{m=1}^{4} c_m F_{BVN}(x_1, x_2; \rho_m(1 - x_4) + x_3)$$

where $\rho_1 = x_7$; $\rho_2 = \dfrac{x_7}{1+b^{-2}}$; $\rho_3 = \dfrac{x_7}{1+b^{2}}$; $\rho_4 = 0$

$$\mathrm{Cov}_2 = \frac{1}{2}\sum_{i=1}^{2}\sum_{m=1}^{4} c_m F_{BVN}(x_i, x_i; \rho_m(1 - x_4))$$

where $\rho_1 = x_6$; $\rho_2 = \dfrac{x_6}{1+b^{-2}}$; $\rho_3 = \dfrac{x_6}{1+b^{2}}$; $\rho_4 = 0$

$$\mathrm{Cov}_3 = \sum_{m=1}^{4} c_m F_{BVN}(x_1, x_2; \rho_m(1 - x_4))$$

where $\rho_1 = x_5$; $\rho_2 = \dfrac{x_5}{1+b^{-2}}$; $\rho_3 = \dfrac{x_5}{1+b^{2}}$; $\rho_4 = 0$

Notes: This table results from replacing the RMH model parameters in Table 6 by the alternative RMH model parameters, as defined in Table 7(a). $F_{BVN}(\cdot, \cdot; \rho)$ is the standardized bivariate normal distribution function with correlation $\rho$; $V = 1 + b^{-2} + 2(\sigma_R^2 + \sigma_{TR}^2)$; $c_1 = \dfrac{1}{n_0 n_1}$; $c_2 = \dfrac{n_1 - 1}{n_0 n_1}$; $c_3 = \dfrac{n_0 - 1}{n_0 n_1}$; $c_4 = \dfrac{1 - n_0 - n_1}{n_0 n_1}$.

Table 9 Alternative RMH parameters expressed in terms of OR parameters.
$$x_1 = \Phi^{-1}(\mu_{1:OR})$$

$$x_2 = \Phi^{-1}(\mu_{2:OR})$$

$$x_3 = \left\{0 \le x_3 \le 1 : \sigma^2_{R:OR} - F_{BVN}(x_1, x_2; x_3) + \Phi(x_1)\Phi(x_2) = 0\right\}$$

$$x_4 = \left\{0 \le x_4 \le 1 : \sigma^2_{TR:OR} + \sigma^2_{R:OR} - .5\sum_{i=1}^{2}\left\{F_{BVN}(x_i, x_i; x_4) - [\Phi(x_i)]^2\right\} = 0\right\}$$

$$b = \left\{b > 0 : \sigma^2_{\varepsilon:OR} - \frac{1}{2}\sum_{i=1}^{2}\sum_{m=1}^{4} c_m F_{BVN}(x_i, x_i; \rho_m(1 - x_4) + x_4) = 0\right\}$$

where $\rho_1 = 1$; $\rho_2 = \dfrac{1}{1+b^{-2}}$; $\rho_3 = \dfrac{1}{1+b^{2}}$; $\rho_4 = 0$

$$x_5 = \left\{0 \le x_5 \le 1 : \mathrm{Cov}_3 - \sum_{m=1}^{4} c_m F_{BVN}(x_1, x_2; \rho_m(1 - x_4)) = 0\right\}$$

where $\rho_1 = x_5$; $\rho_2 = \dfrac{x_5}{1+b^{-2}}$; $\rho_3 = \dfrac{x_5}{1+b^{2}}$; $\rho_4 = 0$

$$x_6 = \left\{0 \le x_6 \le 1 : \mathrm{Cov}_2 - \frac{1}{2}\sum_{i=1}^{2}\sum_{m=1}^{4} c_m F_{BVN}(x_i, x_i; \rho_m(1 - x_4)) = 0\right\}$$

where $\rho_1 = x_6$; $\rho_2 = \dfrac{x_6}{1+b^{-2}}$; $\rho_3 = \dfrac{x_6}{1+b^{2}}$; $\rho_4 = 0$

$$x_7 = \left\{0 \le x_7 \le 1 : \mathrm{Cov}_1 - \sum_{m=1}^{4} c_m F_{BVN}(x_1, x_2; \rho_m(1 - x_4) + x_3) = 0\right\}$$

where $\rho_1 = x_7$; $\rho_2 = \dfrac{x_7}{1+b^{-2}}$; $\rho_3 = \dfrac{x_7}{1+b^{2}}$; $\rho_4 = 0$

Notes: These results follow from the Table 8 relationships and constraints Eq. (23) in Table 7. $F_{BVN}(\cdot, \cdot; \rho)$ is the standardized bivariate normal distribution function with correlation $\rho$; $c_1 = \dfrac{1}{n_0 n_1}$; $c_2 = \dfrac{n_1 - 1}{n_0 n_1}$; $c_3 = \dfrac{n_0 - 1}{n_0 n_1}$; $c_4 = \dfrac{1 - n_0 - n_1}{n_0 n_1}$.

Table 10 OR-to-RMH algorithm for computing parameter values for the RMH model that correspond to specified OR parameter values.

Step 1. Solve for $x_1$ and $x_2$:

$$x_1 = \Phi^{-1}(\hat\theta_1) \quad \text{and} \quad x_2 = \Phi^{-1}(\hat\theta_2)$$

Step 2. Solve for $x_3$, using the values for $x_1$ and $x_2$ obtained in step 1:

$$x_3 = \left\{0 \le x_3 < 1 : \hat\sigma^2_{R:OR} - F_{BVN}(x_1, x_2; x_3) + \Phi(x_1)\Phi(x_2) = 0\right\}$$

From the relationship $F_{BVN}(x, y; \rho_1) < F_{BVN}(x, y; \rho_2)$ if $\rho_1 < \rho_2$, where $F_{BVN}(\cdot, \cdot; \rho)$ is the standardized bivariate normal distribution function with correlation $\rho$, it follows that $\hat\sigma^2_{R:OR} - F_{BVN}(x_1, x_2; x_3) + \Phi(x_1)\Phi(x_2)$ is a strictly monotone (decreasing) function of $x_3$, and hence $x_3$ can be easily determined numerically. Numerical solutions for $x_4, x_5, x_6$, and $x_7$ can be similarly determined in steps 3 and 6.

Step 3. Solve for $x_4$, using the values for $x_1$ and $x_2$ obtained in step 1:

$$x_4 = \max\left[x_3,\; \left\{0 \le x_4 < 1 : \hat\sigma^2_{TR:OR} + \hat\sigma^2_{R:OR} - .5\sum_{i=1}^{2}\left\{F_{BVN}(x_i, x_i; x_4) - [\Phi(x_i)]^2\right\} = 0\right\}\right]$$

Step 4. Solve for $b$ using one of the following b_method options. The resulting value of $b$ is used for the remaining steps.

b_method = unspecified: Solve for $b$, using the values for $x_1, x_2$, and $x_4$ obtained in steps 1 and 3:

$$b = \left\{b > 0 : \hat\sigma^2_{\varepsilon:OR} - \frac{1}{2}\sum_{i=1}^{2}\sum_{m=1}^{4} c_m F_{BVN}(x_i, x_i; \rho_m(1 - x_4) + x_4) = 0\right\}$$

where $\rho_1 = 1$; $\rho_2 = \dfrac{1}{1+b^{-2}}$; $\rho_3 = \dfrac{1}{1+b^{2}}$; $\rho_4 = 0$. With this option there can be 0, 1, or 2 possible solutions for $b$. The algorithm returns the largest solution such that $0.001 \le b \le 1$ if it exists; otherwise, it returns the smallest solution such that $1 \le b \le 4$ if it exists, or a missing value if it does not exist.

b_method = specified: Use the specified value of $b$.

b_method = mean_to_sigma: Solve for the value of $b$ that corresponds to a specified mean-to-sigma ratio and the minimum of the specified values for the expected test 1 and test 2 AUCs. (See Sec. B.2 for details.)

Step 5. Compute OR covariance estimates to be used in step 6.

(a) If b_method = unspecified was used in step 4, compute $\widehat{\mathrm{Cov}}_i = \hat r_i\, \hat\sigma^2_{\varepsilon:OR}$, $i = 1, 2, 3$.

(b) If one of the other two methods was used in step 4, then using the computed value of $b$, compute a new value for the OR error variance, given by

$$\tilde\sigma^2_{\varepsilon:OR} = \frac{1}{2}\sum_{i=1}^{2}\sum_{m=1}^{4} c_m F_{BVN}(x_i, x_i; \rho_m(1 - x_4) + x_4),$$

where $\rho_1 = 1$; $\rho_2 = \dfrac{1}{1+b^{-2}}$; $\rho_3 = \dfrac{1}{1+b^{2}}$; $\rho_4 = 0$. Then, using the inputted correlations $\hat r_1, \hat r_2$, and $\hat r_3$, compute

$$\widehat{\mathrm{Cov}}_i = \hat r_i\, \tilde\sigma^2_{\varepsilon:OR}, \quad i = 1, 2, 3.$$

Step 6. Solve for $x_5, x_6$, and $x_7$, using the following equations and the values for $x_1, x_2, x_3, x_4$, $b$, and $\widehat{\mathrm{Cov}}_i$, $i = 1, 2, 3$, obtained in steps 1 to 5:

$$x_5 = \left\{0 \le x_5 < 1 : \widehat{\mathrm{Cov}}_3 - \sum_{m=1}^{4} c_m F_{BVN}(x_1, x_2; \rho_m(1 - x_4)) = 0\right\}$$

where $\rho_1 = x_5$; $\rho_2 = \dfrac{x_5}{1+b^{-2}}$; $\rho_3 = \dfrac{x_5}{1+b^{2}}$; $\rho_4 = 0$

$$x_6 = \max\left[x_5,\; \left\{0 \le x_6 < 1 : \widehat{\mathrm{Cov}}_2 - \frac{1}{2}\sum_{i=1}^{2}\sum_{m=1}^{4} c_m F_{BVN}(x_i, x_i; \rho_m(1 - x_4)) = 0\right\}\right]$$

where $\rho_1 = x_6$; $\rho_2 = \dfrac{x_6}{1+b^{-2}}$; $\rho_3 = \dfrac{x_6}{1+b^{2}}$; $\rho_4 = 0$

$$x_7 = \max\left[x_5,\; \left\{0 \le x_7 < 1 : \widehat{\mathrm{Cov}}_1 - \sum_{m=1}^{4} c_m F_{BVN}(x_1, x_2; \rho_m(1 - x_4) + x_3) = 0\right\}\right]$$

where $\rho_1 = x_7$; $\rho_2 = \dfrac{x_7}{1+b^{-2}}$; $\rho_3 = \dfrac{x_7}{1+b^{2}}$; $\rho_4 = 0$

Step 7. Solve for the estimated RMH parameter values as functions of the estimated alternative RMH parameter values using the mapping given in Table 7(c).

Notes: $\hat\theta_1$ and $\hat\theta_2$ denote specified values of the reader-averaged empirical AUCs for tests 1 and 2, respectively; $\hat\sigma^2_{R:OR}$, $\hat\sigma^2_{TR:OR}$, and $\hat\sigma^2_{\varepsilon:OR}$ denote specified values of the corresponding OR parameters, and $\hat r_1, \hat r_2$, and $\hat r_3$ denote specified values for the OR correlations defined by $r_i = \mathrm{Cov}_i/\sigma^2_{\varepsilon:OR}$. These specified values can be computed from real data or conjectured. $F_{BVN}(\cdot, \cdot; \rho)$ is the standardized bivariate normal distribution function with correlation $\rho$. Note that constraints Eq. (23) in Table 7 have been incorporated into the preceding steps.

Table 11 Troubleshooting the OR-to-RMH algorithm when missing parameter values result.
Alternative RMH parameter: When this parameter is the first parameter in the list to have a missing value, try the following corrective action
$x_1$: NA (should be no problem)
$x_2$: NA (should be no problem)
$x_3$: Reduce the value of $\sigma^2_{R:OR}$
$x_4$: Reduce the value of $\sigma^2_{TR:OR}$
$b$: If using b_method = unspecified, there are two possible solutions: (a) Change (reduce or increase) the value of $\sigma^2_{\varepsilon:OR}$; (b) Use one of the other two b_method options, which should always work
$x_5$: Reduce the value of $r_3$
$x_6$: Reduce the value of $r_2$
$x_7$: Reduce the value of $r_1$

8 Appendix B: Mean-to-Sigma Details

8.1 B.1 Computation of the Probability of a Noticeably Improper ROC Curve

For the RMH model, the mean-to-sigma ratio varies across readers. Letting $q_{ij}$ denote the mean-to-sigma ratio for test $i$ and reader $j$, Hillis shows the RMH model implies that

$$q_{ij} \sim N\!\left(q_i,\; \frac{2(\sigma_R^2 + \sigma_{TR}^2)}{(b^{-1} - 1)^2}\right), \quad i = 1, 2,$$

where

$$q_i = \frac{\delta_i}{b^{-1} - 1}.$$

It follows for test $i$ that the probability that a reader's ROC curve is noticeably improper (i.e., the absolute value of the mean-to-sigma ratio is less than 2, as discussed in Sec. 3.2.2) is given as

$$\Pr(|q_{ij}| < 2) = \Phi\!\left(\frac{2 - q_i}{\tilde\sigma}\right) - \Phi\!\left(\frac{-2 - q_i}{\tilde\sigma}\right),$$

where

$$\tilde\sigma = \sqrt{\frac{2(\sigma_R^2 + \sigma_{TR}^2)}{(b^{-1} - 1)^2}}.$$

8.2 B.2 Derivation of b in Step 4 in Table 10 when b_method = mean_to_sigma

Without loss of generality, we assume that test 1 has the lower OR AUC input; i.e., $\mu_{1:OR} = \min(\mu_{1:OR}, \mu_{2:OR})$. Let $\hat\theta_{1j}$ denote the empirical AUC estimate for a randomly selected RMH reader $j$ reading a random RMH sample of ratings for test 1. Given the solution values of $x_1, x_2, x_3$, and $x_4$ from steps 1 to 3 in Table 10, we want to solve for $b$ such that $E(\hat\theta_{1j}) = \mu_{1:OR}$ and $\mathrm{median}(q_j) = r$, where $q_j$ is the mean-to-sigma ratio for reader $j$ and $r$ is the specified mean-to-sigma ratio.

Recall that for test 1, the median separation between the latent normal and abnormal distributions across readers is equal to $\delta_1$. It follows that the median mean-to-sigma ratio is given by

$$r = \frac{\delta_1}{b^{-1} - 1},$$

and hence

$$\delta_1 = r(b^{-1} - 1). \qquad (24)$$

From Table 6, we can write

$$\Phi^{-1}(\mu_{1:OR}) = \frac{\delta_1}{\sqrt{V}} = \frac{\delta_1}{\sqrt{1 + b^{-2} + 2(\sigma_R^2 + \sigma_{TR}^2)}}.$$

Using the relationship $1 - x_4 = [1 + b^{-2}]/V$ from Table 7, it follows that

$$\Phi^{-1}(\mu_{1:OR}) = \frac{\delta_1}{\sqrt{\left[\frac{1}{1 - x_4}\right](1 + b^{-2})}}. \qquad (25)$$

Substituting expression Eq. (24) for $\delta_1$ into Eq. (25) yields

$$\Phi^{-1}(\mu_{1:OR}) = \frac{r(b^{-1} - 1)}{\sqrt{\left[\frac{1}{1 - x_4}\right](1 + b^{-2})}},$$

which implies

$$[\Phi^{-1}(\mu_{1:OR})]^2 = \frac{r^2 b^{-2} - 2r^2 b^{-1} + r^2}{\left[\frac{1}{1 - x_4}\right](1 + b^{-2})},$$

or equivalently

$$\frac{1}{1 - x_4}[\Phi^{-1}(\mu_{1:OR})]^2 (1 + b^{-2}) - r^2 b^{-2} + 2r^2 b^{-1} - r^2 = 0. \qquad (26)$$

Collecting terms in Eq. (26) results in a quadratic equation in $b^{-1}$:

$$b^{-2}\left(\frac{1}{1 - x_4}[\Phi^{-1}(\mu_{1:OR})]^2 - r^2\right) + b^{-1}(2r^2) + \frac{1}{1 - x_4}[\Phi^{-1}(\mu_{1:OR})]^2 - r^2 = 0.$$

Solving for $b^{-1}$ using the quadratic equation formula yields

$$b^{-1} = \frac{-b_1 \pm \sqrt{b_1^2 - 4a_1 c_1}}{2a_1},$$

where

$$a_1 = \frac{1}{1 - x_4}[\Phi^{-1}(\mu_{1:OR})]^2 - r^2, \qquad b_1 = 2r^2, \qquad c_1 = \frac{1}{1 - x_4}[\Phi^{-1}(\mu_{1:OR})]^2 - r^2.$$
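The Appendix B computations can be illustrated numerically with a short base R sketch. It is not the authors' code. The inputs are taken from Appendix C.3 (AUC1 = 0.897, mean_sig_input = 5.2) and from the x4 value printed in Appendix C.4. The choice of root is an assumption of this sketch: because the leading and constant coefficients of the quadratic are equal, its two roots are reciprocals of each other, and for a positive mean-to-sigma ratio only the root with $b^{-1} > 1$ gives a positive $\delta_1$ in Eq. (24).

```r
# Inputs: lower of the two AUCs, x4 from step 3 of Table 10, and the
# specified median mean-to-sigma ratio r (values from Appendix C)
mu1 <- 0.897
x4 <- 0.07127637
r <- 5.2

# Coefficients of the quadratic in b^{-1} obtained from Eq. (26)
K <- qnorm(mu1)^2 / (1 - x4)
a1 <- K - r^2
b1 <- 2 * r^2
c1 <- K - r^2

# Both roots of the quadratic; they are reciprocals because a1 equals c1
disc <- b1^2 - 4 * a1 * c1
roots <- (-b1 + c(1, -1) * sqrt(disc)) / (2 * a1)

# Assumption of this sketch: keep the root with b^{-1} > 1, so that
# delta1 = r * (b^{-1} - 1) is positive for r > 0
b_inv <- roots[roots > 1]
b <- 1 / b_inv

# Eq. (24): median latent separation for test 1
delta1 <- r * (b_inv - 1)

# Table 7(c): V in the alternative parameterization, and
# var_R + var_TR = .5 * x3 * V + .5 * V * (x4 - x3) = .5 * x4 * V
V <- (1 + b^(-2)) / (1 - x4)
var_R_plus_TR <- 0.5 * x4 * V

# Sec. B.1: probability that a reader's ROC curve is noticeably improper
q1 <- delta1 / (b_inv - 1)
sigma_tilde <- sqrt(2 * var_R_plus_TR) / abs(b_inv - 1)
Pr1_improper <- pnorm((2 - q1) / sigma_tilde) - pnorm((-2 - q1) / sigma_tilde)

print(c(b = b, delta1 = delta1, mean_to_sig1 = q1, Pr1_improper = Pr1_improper))
```

Under these assumptions the printed values should be close to the b, delta1, mean_to_sig1, and Pr1_improper entries of the mean_to_sigma row in the Appendix C.3 output.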
9 Appendix C: Commands and Output for Tables from Applying the OR_to_RMH and RMH_to_OR R Functions

This appendix includes the R commands and resulting output that were used to produce the content of Fig. 4 and Tables 3–5. Note that both the OR_to_RMH and RMH_to_OR functions return values for mean_to_sig1, mean_to_sig2, Pr1_improper, and Pr2_improper; these are not RMH-model or OR-model parameters but rather are parameters describing the distributions of the true reader AUC values.

9.1 C.1 R Commands and Output Corresponding to Fig. 4

9.1.1 C.1.1 Computation of RMH values by applying OR-to-RMH algorithm to VanDyke original OR values

> VanDyke_OR_orig_values <- data.frame(n0 = 69, n1 = 45, AUC1 = 0.897,
+   AUC2 = 0.941, var_R = 0.00154, var_TR = 0.000208,
+   error_var = 0.000788, corr1 = 0.433,
+   corr2 = 0.430, corr3 = 0.299)
> RM_values <- OR_to_RMH(VanDyke_OR_orig_values)
> print(RM_values)
  n0 n1   delta1   delta2     var_R      var_TR     var_C    var_TC
1 69 45 2.392224 2.957029 0.1223413 0.005180485 0.4716964 0.1222262
     var_RC var_error        b    b_method mean_to_sig1 mean_to_sig2
1 0.1091448 0.2969327 0.656081 unspecified     4.563553      5.64101
  Pr1_improper Pr2_improper
1  0.003896242 7.862956e-05

9.1.2 C.1.2 Computation of OR true values by applying RMH-to-OR algorithm to RMH values

> OR_true_values <- RMH_to_OR(RM_values)
> print(OR_true_values)
  n0 n1  AUC1  AUC2   var_R   var_TR    error_var         cov1
1 69 45 0.897 0.941 0.00154 0.000208 0.0007880002 0.0003412041
          cov2         cov3 corr1 corr2 corr3        b mean_to_sig1
1 0.0003388401 0.0002356121 0.433  0.43 0.299 0.656081     4.563553
  mean_to_sig2 Pr1_improper Pr2_improper
1      5.64101  0.003896242 7.862956e-05

9.2 C.2 R Commands and Output Corresponding to Table 3

> # Create data frame with 5 rows, with row 1 same as RM_values in Table 3a
> # and rows 2-5 changed slightly.
> VanDyke_OR_orig_values <- data.frame(n0 = 69, n1 = 45, AUC1 = 0.897,
+   AUC2 = 0.941, var_R = 0.00154, var_TR = 0.000208, error_var = 0.000788,
+   corr1 = 0.433, corr2 = 0.430, corr3 = 0.299)
> RM_values <- OR_to_RMH(VanDyke_OR_orig_values)
> RM_Table3 <- RM_values[c(1,1,1,1,1),] #creates data frame with 5 rows, each = RM_values
> RM_Table3[2,c("n0","n1")] <- c(138, 90)
> RM_Table3[3,c("n0","n1")] <- c(45, 69)
> RM_Table3[4,c("delta1","delta2")] <- c(2.6452, 2.6452)
> RM_Table3[5,c("delta1","delta2")] <- c(1.2759, 1.2759)
> print(RM_Table3)
     n0 n1   delta1   delta2     var_R      var_TR     var_C
1    69 45 2.392224 2.957029 0.1223413 0.005180485 0.4716964
1.1 138 90 2.392224 2.957029 0.1223413 0.005180485 0.4716964
1.2  45 69 2.392224 2.957029 0.1223413 0.005180485 0.4716964
1.3  69 45 2.645200 2.645200 0.1223413 0.005180485 0.4716964
1.4  69 45 1.275900 1.275900 0.1223413 0.005180485 0.4716964
       var_TC    var_RC var_error        b    b_method mean_to_sig1
1   0.1222262 0.1091448 0.2969327 0.656081 unspecified     4.563553
1.1 0.1222262 0.1091448 0.2969327 0.656081 unspecified     4.563553
1.2 0.1222262 0.1091448 0.2969327 0.656081 unspecified     4.563553
1.3 0.1222262 0.1091448 0.2969327 0.656081 unspecified     4.563553
1.4 0.1222262 0.1091448 0.2969327 0.656081 unspecified     4.563553
    mean_to_sig2 Pr1_improper Pr2_improper
1        5.64101  0.003896242 7.862956e-05
1.1      5.64101  0.003896242 7.862956e-05
1.2      5.64101  0.003896242 7.862956e-05
1.3      5.64101  0.003896242 7.862956e-05
1.4      5.64101  0.003896242 7.862956e-05
> OR_values_Table3 <- RMH_to_OR(RM_Table3)
> print(OR_values_Table3)
     n0 n1      AUC1      AUC2       var_R       var_TR    error_var
1    69 45 0.8970000 0.9410000 0.001540000 2.080000e-04 0.0007880002
1.1 138 90 0.8970000 0.9410000 0.001540000 2.080000e-04 0.0003912576
1.2  45 69 0.8970000 0.9410000 0.001540000 2.080000e-04 0.0006344427
1.3  69 45 0.9190000 0.9190000 0.001644069 7.426773e-05 0.0007890063
1.4  69 45 0.7500034 0.7500034 0.007014410 3.019443e-04 0.0023458109
            cov1         cov2         cov3     corr1     corr2
1   0.0003412041 0.0003388401 0.0002356121 0.4330000 0.4300000
1.1 0.0001703301 0.0001691406 0.0001176498 0.4353401 0.4322997
1.2 0.0002800701 0.0002778178 0.0001940871 0.4414426 0.4378927
1.3 0.0003644012 0.0003363961 0.0002513892 0.4618483 0.4263542
1.4 0.0012240655 0.0012083227 0.0009406161 0.5218091 0.5150981
        corr3        b mean_to_sig1 mean_to_sig2 Pr1_improper
1   0.2990000 0.656081     4.563553     5.641010  0.003896242
1.1 0.3006966 0.656081     4.563553     5.641010  0.003896242
1.2 0.3059174 0.656081     4.563553     5.641010  0.003896242
1.3 0.3186150 0.656081     5.046146     5.046146  0.000783834
1.4 0.4009769 0.656081     2.433985     2.433985  0.326185605
    Pr2_improper
1   7.862956e-05
1.1 7.862956e-05
1.2 7.862956e-05
1.3 7.838340e-04
1.4 3.261856e-01
9.3 C.3 R Commands and Output Corresponding to Table 4

> VanDyke_OR_orig_values <- data.frame(n0 = 69, n1 = 45, AUC1 = 0.897,
+   AUC2 = 0.941, var_R = 0.00154, var_TR = 0.000208, var_error = 0.000788,
+   corr1 = 0.433, corr2 = 0.430, corr3 = 0.299)
> Table4_OR1 <- VanDyke_OR_orig_values[c(1,1,1),] #creates data frame with 3 rows,
> # each the same as VanDyke_OR_orig_values
> Table4_OR2 <- data.frame(b_method=c("unspecified", "mean_to_sigma","specified"),
+   b_input = c(NA,NA,1), mean_sig_input = c(NA,5.2,NA))
> Table4_OR <- cbind(Table4_OR1, Table4_OR2)
> print("Original OR parameter values")
[1] "Original OR parameter values"
> print(Table4_OR)
    n0 n1  AUC1  AUC2   var_R   var_TR var_error corr1 corr2 corr3
1   69 45 0.897 0.941 0.00154 0.000208  0.000788 0.433  0.43 0.299
1.1 69 45 0.897 0.941 0.00154 0.000208 0.000788* 0.433  0.43 0.299
1.2 69 45 0.897 0.941 0.00154 0.000208 0.000788* 0.433  0.43 0.299
         b_method b_input mean_sig_input
1     unspecified      NA             NA
1.1 mean_to_sigma      NA            5.2
1.2     specified       1             NA
*Note that with b_method = mean_to_sigma or specified it is not necessary to specify a value for var_error, or the value can be NA
> Table4_RMH <- OR_to_RMH(Table4_OR)
> print("Table 4 RMH parameter values")
[1] "Table 4 RMH parameter values"
> print(Table4_RMH)
    n0 n1   delta1   delta2      var_R      var_TR     var_C
1   69 45 2.392224 2.957029 0.12234134 0.005180485 0.4716964
1.1 69 45 2.303940 2.847902 0.11347812 0.004805176 0.4674676
1.2 69 45 1.855834 2.293997 0.07362882 0.003117776 0.4498198
       var_TC    var_RC var_error         b      b_method
1   0.1222262 0.1091448 0.2969327 0.6560810   unspecified
1.1 0.1220955 0.1089342 0.3015027 0.6929693 mean_to_sigma
1.2 0.1215947 0.1080172 0.3205683 1.0000000     specified
    mean_to_sig1 mean_to_sig2 Pr1_improper Pr2_improper
1       4.563553     5.641010  0.003896242 7.862956e-05
1.1     5.200000     6.427723  0.001778344 2.748745e-05
1.2          Inf          Inf  0.000000000 0.000000e+00
> Table4_true_values <- RMH_to_OR(Table4_RMH)
> print("Table 4 True OR values")
[1] "Table 4 True OR values"
> print(Table4_true_values)
    n0 n1  AUC1  AUC2   var_R   var_TR    var_error         cov1
1   69 45 0.897 0.941 0.00154 0.000208 0.0007880002 0.0003412041
1.1 69 45 0.897 0.941 0.00154 0.000208 0.0007664249 0.0003318620
1.2 69 45 0.897 0.941 0.00154 0.000208 0.0006584975 0.0002851294
            cov2         cov3 corr1 corr2 corr3         b
1   0.0003388401 0.0002356121 0.433  0.43 0.299 0.6560810
1.1 0.0003295627 0.0002291610 0.433  0.43 0.299 0.6929693
1.2 0.0002831539 0.0001968908 0.433  0.43 0.299 1.0000000
    mean_to_sig1 mean_to_sig2 Pr1_improper Pr2_improper
1       4.563553     5.641010  0.003896242 7.862956e-05
1.1     5.200000     6.427723  0.001778344 2.748745e-05
1.2          Inf          Inf  0.000000000 0.000000e+00

9.4 C.4 R Commands and Output Corresponding to Table 5

9.4.1 C.4.1 Table 5(a) code ($\sigma^2_{R:OR}$ changed from 0.00154 to 0.154)

> VanDyke_OR_altered_values_a <- data.frame(n0 = 69, n1 = 45, AUC1 = 0.897,
+   AUC2 = 0.941, var_R = 0.154, var_TR = 0.000208, var_error = 0.000788,
+   corr1 = 0.433, corr2 = 0.430, corr3 = 0.299)
> RM_values = OR_to_RM(VanDyke_OR_altered_values_a)
Warning message:
In OR_to_RM.default(n0 = 69, n1 = 45, AUC1 = 0.897, AUC2 = 0.941, :
  Conversion failed. Try reducing the value of var_R.
> print(RM_values,all=T)
  n0 n1 delta1 delta2 var_R var_TR var_C var_TC var_RC var_error  b
1 69 45     NA     NA    NA     NA    NA     NA     NA        NA NA
     b_method mean_to_sig1 mean_to_sig2 Pr1_improper Pr2_improper
1 unspecified           NA           NA           NA           NA
        x1       x2 x3 x4 x5 x6 x7
1 1.264641 1.563224 NA NA NA NA NA

9.4.2 C.4.2 Table 5(b) code ($\sigma^2_{TR:OR}$ changed from 0.000208 to 0.208)

> VanDyke_OR_altered_values_b <- data.frame(n0 = 69, n1 = 45, AUC1 = 0.897,
+   AUC2 = 0.941, var_R = 0.00154, var_TR = 0.208, var_error = 0.000788,
+   corr1 = 0.433, corr2 = 0.430, corr3 = 0.299)
> RM_values <- OR_to_RM(VanDyke_OR_altered_values_b)
Warning message:
In OR_to_RM.default(n0 = 69, n1 = 45, AUC1 = 0.897, AUC2 = 0.941, :
  Conversion failed. Try reducing the value of var_TR.
> print(RM_values,all=T)
  n0 n1 delta1 delta2 var_R var_TR var_C var_TC var_RC var_error  b
1 69 45     NA     NA    NA     NA    NA     NA     NA        NA NA
     b_method mean_to_sig1 mean_to_sig2 Pr1_improper Pr2_improper
1 unspecified           NA           NA           NA           NA
        x1       x2         x3 x4 x5 x6 x7
1 1.264641 1.563224 0.06838082 NA NA NA NA

9.4.3 C.4.3 Table 5(c) code ($\sigma^2_{\varepsilon:OR}$ changed from 0.000788 to 0.00788)

> VanDyke_OR_altered_values_c <- data.frame(n0 = 69, n1 = 45, AUC1 = 0.897,
+   AUC2 = 0.941, var_R = 0.00154, var_TR = 0.000208, var_error = 0.00788,
+   corr1 = 0.433, corr2 = 0.430, corr3 = 0.299)
> RM_values <- OR_to_RM(VanDyke_OR_altered_values_c)
Warning message:
In OR_to_RM.default(n0 = 69, n1 = 45, AUC1 = 0.897, AUC2 = 0.941, :
  Conversion failed. If using b_method = "unspecified," there are two possible solutions: (a) Try changing (reduce or increase) the value of var_error. (b) Try using one of the other two b_method options, which should always work.
> print(RM_values,all=T)
  n0 n1 delta1 delta2 var_R var_TR var_C var_TC var_RC var_error  b
1 69 45     NA     NA    NA     NA    NA     NA     NA        NA NA
     b_method mean_to_sig1 mean_to_sig2 Pr1_improper Pr2_improper
1 unspecified           NA           NA           NA           NA
        x1       x2         x3         x4 x5 x6 x7
1 1.264641 1.563224 0.06838082 0.07127637 NA NA NA

Disclosures

No conflicts of interest, financial or otherwise, are declared by the authors.

Acknowledgments

For the first and second authors, this research was supported by the National Institute of Biomedical Imaging and Bioengineering (NIBIB) of the National Institutes of Health under Award No. R01EB025174. Some of the information presented in this paper was presented in a prior SPIE proceedings paper by the first author. We thank two reviewers and an associate editor for their helpful comments, which greatly improved the manuscript.

References

1. N. A. Obuchowski and H. E. Rockette, “Hypothesis testing of diagnostic accuracy for multiple readers and multiple tests: an ANOVA approach with dependent observations,” Commun. Stat. Simul. Comput. 24(2), 285–308 (1995).
2. C. A. Roe and C. E. Metz, “Dorfman–Berbaum–Metz method for statistical analysis of multireader, multimodality receiver operating characteristic data: validation with computer simulation,” Acad. Radiol. 4(4), 298–303 (1997).
3. S. L. Hillis, “Simulation of unequal-variance binormal multireader ROC decision data: an extension of the Roe and Metz simulation model,” Acad. Radiol. 19(12), 1518–1528 (2012).
4. C. K. Abbey, F. W. Samuelson, and B. D. Gallas, “Statistical power considerations for a utility endpoint in observer performance studies,” Acad. Radiol. 20(7), 798–806 (2013).
5. B. D. Gallas and S. L. Hillis, “Generalized Roe and Metz receiver operating characteristic model: analytic link between simulated decision scores and empirical AUC variances and covariances,” J. Med. Imaging 1(3), 031006 (2014).
6. S. L. Hillis, “Relationship between Roe and Metz simulation model for multireader diagnostic data and Obuchowski–Rockette model parameters,” Stat. Med. 37(13), 2067–2093 (2018).
7. S. L. Hillis, “A marginal-mean ANOVA approach for analyzing multireader multicase radiological imaging data,” Stat. Med. 33(2), 330–360 (2014).
8. M. Quenouille, “Approximate tests of correlation in time series,” J. R. Stat. Soc. Ser. B 11, 68–84 (1949).
9. J. Shao and D. Tu, The Jackknife and Bootstrap, Springer-Verlag, New York (1995).
10. B. Efron, The Jackknife, The Bootstrap and Other Resampling Plans, SIAM (1982).
11. B. Efron and R. Tibshirani, An Introduction to the Bootstrap, Chapman and Hall, New York (1993).
12. E. R. DeLong, D. M. DeLong, and D. L. Clarke-Pearson, “Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach,” Biometrics 44(3), 837–845 (1988).
13. C. Metz, B. Herman, and C. Roe, “Statistical comparison of two ROC-curve estimates obtained from partially-paired datasets,” Med. Decis. Making 18(1), 110–121 (1998).
14. S. L. Hillis, “Relationship between Obuchowski–Rockette–Hillis and Gallas methods for analyzing multi-reader diagnostic imaging data with empirical AUC as the reader performance measure,” Biostat. Epidemiol., in press (2022).
15. B. D. Gallas, “One-shot estimate of MRMC variance: AUC,” Acad. Radiol. 13(3), 353–362 (2006).
16. B. J. Smith, S. L. Hillis, and L. L. Pesce, “MRMCaov: multi-reader multi-case analysis of variance,” R package version 0.1.16 [computer software] (2021).
17. M. Pepe, The Statistical Evaluation of Medical Tests for Classification and Prediction, Oxford University Press, New York (2003).
18. J. P. Egan, Signal Detection Theory and ROC-Analysis, Academic Press (1975).
19. X. C. Pan and C. E. Metz, “The “proper” binormal model: parametric receiver operating characteristic curve estimation with degenerate data,” Acad. Radiol. 4(5), 380–389 (1997).
20. J. Hanley, “The robustness of the binormal assumptions used in fitting ROC curves,” Med. Decis. Making 8(3), 197–203 (1988).
21. J. A. Hanley, “The use of the ‘binormal’ model for parametric ROC analysis of quantitative diagnostic tests,” Stat. Med. 15(14), 1575–1585 (1996).
22. J. Swets, “Form of empirical ROCs in discrimination and diagnostic tasks: implications for theory and measurement of performance,” Psychol. Bull. 99(2), 181–198 (1986).
23. S. L. Hillis and K. M. Schartz, “Multireader sample size program for diagnostic studies: demonstration and methodology,” J. Med. Imaging 5(4), 1–27 (2018).
24. J. Swets, W. Tanner, and T. Birdsall, “Decision processes in perception,” Psychol. Rev. 68(5), 301–340 (1961).
25. J. Swets, “Indices of discrimination or diagnostic accuracy: their ROCs and implied models,” Psychol. Bull. 99(1), 100–117 (1986).
26. D. Green and J. Swets, Signal Detection Theory and Psychophysics, Peninsula Publishing, Los Altos (1988).
27. S. L. Hillis and K. S. Berbaum, “Using the mean-to-sigma ratio as a measure of the improperness of binormal ROC curves,” Acad. Radiol. 18(2), 143–154 (2011).
28. C. Van Dyke et al., “Cine MRI in the diagnosis of thoracic aortic dissection,” in 79th RSNA Meetings, Chicago, Illinois (1993).
29. S. L. Hillis et al., “A comparison of the Dorfman–Berbaum–Metz and Obuchowski–Rockette methods for receiver operating characteristic (ROC) data,” Stat. Med. 24, 1579–1607 (2005).
30. S. L. Hillis, “A comparison of denominator degrees of freedom methods for multiple observer ROC analysis,” Stat. Med. 26(3), 596–619 (2007).
31. B. D. Gallas et al., “A framework for random-effects ROC analysis: biases with the bootstrap and other variance estimators,” Commun. Stat. Theory Methods 38(15), 2586–2603 (2009).
32. X. Zhu and W. Chen, “Simulation of multi-reader multi-case study data with realistic ROC performance characteristics,” Proc. SPIE 11316, 113160M (2020).
33. D. P. Tihansky, “Properties of the bivariate normal cumulative distribution,” J. Am. Stat. Assoc. 67(340), 903–905 (1972).
34. S. L. Hillis, “Determining Roe and Metz model parameters for simulating multireader multicase confidence-of-disease rating data based on real-data or conjectured Obuchowski–Rockette parameter estimates,” Proc. SPIE 11316, 113160N (2020).

Stephen L. Hillis is a research professor in the Departments of Radiology and Biostatistics at the University of Iowa. He received his PhD in statistics in 1987 and his MFA degree in music in 1978, both from the University of Iowa. He is the author of more than 100 peer-reviewed journal articles and four book chapters. Since 1998, his research has focused on methodology for multireader diagnostic radiologic imaging studies.

Brian J. Smith is a professor in the Department of Biostatistics at the University of Iowa and director of the Biostatistics Core in the Holden Comprehensive Cancer Center. He received his PhD in biostatistics in 2001 from the University of Iowa. His research is cancer focused and includes statistical computing, predictive modeling, and methods for medical imaging studies.

Weijie Chen is a research physicist in the Division of Imaging, Diagnostics, and Software Reliability, Office of Science and Engineering Laboratories, CDRH, US FDA, where he conducts research and regulatory reviews of medical devices. He earned his PhD in Medical Physics in 2007 from the University of Chicago. He has published 36 peer-reviewed journal articles, 31 proceedings papers, two book chapters, two editorials, and one patent.
His research interests include performance characterization and assessment methodologies for imaging and AI/ML/CAD devices.

Journal of Medical Imaging 045501-34 Jul∕Aug 2022 Vol. 9(4)
Journal of Medical Imaging – SPIE
Published: Jul 1, 2022
Keywords: ROC curve; diagnostic radiology; Roe and Metz; Obuchowski and Rockette; simulated data