Confirmatory Composite Analysis

METHODS
published: 13 December 2018
doi: 10.3389/fpsyg.2018.02541

Florian Schuberth (1*), Jörg Henseler (1, 2) and Theo K. Dijkstra (3)

(1) Faculty of Engineering Technology, Chair of Product-Market Relations, University of Twente, Enschede, Netherlands
(2) Nova Information Management School, Universidade Nova de Lisboa, Lisbon, Portugal
(3) Faculty of Economics and Business, University of Groningen, Groningen, Netherlands

This article introduces confirmatory composite analysis (CCA) as a structural equation modeling technique that aims at testing composite models. It facilitates the operationalization and assessment of design concepts, so-called artifacts. CCA entails the same steps as confirmatory factor analysis: model specification, model identification, model estimation, and model assessment. Composite models are specified such that they consist of a set of interrelated composites, all of which emerge as linear combinations of observable variables. Researchers must ensure theoretical identification of their specified model. For the estimation of the model, several estimators are available; in particular, Kettenring's extensions of canonical correlation analysis provide consistent estimates. Model assessment mainly relies on the Bollen-Stine bootstrap to assess the discrepancy between the empirical and the estimated model-implied indicator covariance matrix. A Monte Carlo simulation examines the efficacy of CCA and demonstrates that CCA is able to detect various forms of model misspecification.

Keywords: artifacts, composite modeling, design research, Monte Carlo simulation study, structural equation modeling, theory testing

Edited by: Holmes Finch, Ball State University, United States
Reviewed by: Daniel Saverio John Costa, University of Sydney, Australia; Shenghai Dai, Washington State University, United States
*Correspondence: Florian Schuberth, [email protected]
Specialty section: This article was submitted to Quantitative Psychology and Measurement, a section of the journal Frontiers in Psychology
Received: 19 June 2018; Accepted: 28 November 2018; Published: 13 December 2018
Citation: Schuberth F, Henseler J and Dijkstra TK (2018) Confirmatory Composite Analysis. Front. Psychol. 9:2541. doi: 10.3389/fpsyg.2018.02541

1. INTRODUCTION

Structural equation modeling with latent variables (SEM) comprises confirmatory factor analysis (CFA) and path analysis, thus combining methodological developments from different disciplines such as psychology, sociology, and economics, while covering a broad variety of traditional multivariate statistical procedures (Bollen, 1989; Muthén, 2002). It is capable of expressing theoretical concepts by means of multiple observable indicators, connecting them via the structural model, and accounting for measurement error. Since SEM allows for statistical testing of the estimated parameters and even of entire models, it is an outstanding tool for confirmatory purposes such as assessing construct validity (Markus and Borsboom, 2013) or establishing measurement invariance (Van de Schoot et al., 2012). Apart from the original maximum likelihood estimator, robust versions and a number of alternative estimators were introduced to counter violations of the original assumptions in empirical work, such as the asymptotically distribution-free estimator (Browne, 1984) or the two-stage least squares (2SLS) estimator (Bollen, 2001). Over time, the initial model has been continuously improved upon to account for more complex theories. Consequently, SEM is able to deal with categorical (Muthén, 1984) as well as longitudinal data (Little, 2013) and can be used to model non-linear relationships between the constructs (Klein and Moosbrugger, 2000).¹
Footnote 1: For more details and a comprehensive overview, we refer to the following textbooks: Hayduk (1988), Bollen (1989), Marcoulides and Schumacker (2001), Raykov and Marcoulides (2006), Kline (2015), and Brown (2015).

Frontiers in Psychology | www.frontiersin.org | December 2018 | Volume 9 | Article 2541

Researchers across many streams of science appreciate SEM's versatility as well as its ability to test common factor models. In particular, in the behavioral and social sciences, SEM enjoys wide popularity, e.g., in marketing (Bagozzi and Yi, 1988; Steenkamp and Baumgartner, 2000), psychology (MacCallum and Austin, 2000), communication science (Holbert and Stephenson, 2002), operations management (Shah and Goldstein, 2006), and information systems (Gefen et al., 2011), to name a few. Additionally, beyond the realm of behavioral and social sciences, researchers have acknowledged the capabilities of SEM, such as in construction research (Xiong et al., 2015) or neurosciences (McIntosh and Gonzalez-Lima, 1994).

Over the last decades, the operationalization of the theoretical concept and the common factor have become more and more conflated, such that hardly any distinction is made between the terms (Rigdon, 2012). Although the common factor model has demonstrated its usefulness for concepts of behavioral research such as traits and attitudes, the limitation of SEM to the factor model is unfortunate because many disciplines besides and even within the social and behavioral sciences do not exclusively deal with behavioral concepts, but also with design concepts (so-called artifacts) and their interplay with behavioral concepts. Psychiatry is an example: on the one hand, it examines clinically relevant behavior to understand mental disorders; on the other hand, it also aims at developing treatments for mental disorders (Kirmayer and Crafa, 2014). Table 1 displays further examples of disciplines investigating behavioral concepts and artifacts.

TABLE 1 | Examples of behavioral concepts and artifacts across several disciplines.

Discipline            Behavioral concept                               Design concept (artifact)
Criminology           Criminal activity (Lussier et al., 2005)         Prevention strategy (Crowley, 2013)
Ecology               Sediment contamination (Malaeb et al., 2000)     Abiotic stress (Grace et al., 2010)
Education             Student's anxiety (Fong et al., 2016)            Teacher development program (Lee, 2005)
Epidemiology          Nutritional risk (Keller, 2006)                  Public health intervention (Wight et al., 2015)
Information Systems   Perceived ease of use (Venkatesh et al., 2003)   User-interface design (Vance et al., 2015)
Marketing             Brand attitude (Spears and Singh, 2004)          Marketing mix (Borden, 1964)

Typically, the common factor model is used to operationalize behavioral concepts, because it is well matched with the general understanding of measurement (Sobel, 1997). It assumes that each observable indicator is a manifestation of the underlying concept that is regarded as their common cause (Reichenbach, 1956), and that this concept therefore fully explains the covariation among its indicators. However, for artifacts the idea of measurement is unrewarding, as they are rather constructed to fulfill a certain purpose. To account for the constructivist character of the artifact, the composite has recently been suggested for its operationalization in SEM (Henseler, 2017). A composite is a weighted linear combination of observable indicators; therefore, in contrast to the common factor model, the indicators do not necessarily share a common cause.

At present, the validity of composite models cannot be systematically assessed. Current approaches are limited to assessing the indicators' collinearity (Diamantopoulos and Winklhofer, 2001) and their relations to other variables in the model (Bagozzi, 1994). A rigorous test of composite models in analogy to CFA does not exist so far. Not only does this situation limit the progress of composite models, it also represents an unnecessary weakness of SEM, as its application is mainly limited to behavioral concepts. For this reason, we introduce confirmatory composite analysis (CCA), wherein the concept under investigation, i.e., the artifact, is modeled as a composite. In this way, we make SEM accessible to a broader audience. We show that the composite model relaxes some of the restrictions imposed by the common factor model. However, it still provides testable constraints, which makes CCA a full-fledged method for confirmatory purposes. In general, it involves the same steps as CFA or SEM, without assuming that the underlying concept is necessarily modeled as a common factor.

While there is no exact instruction on how to apply SEM, a general consensus exists that SEM and CFA comprise at least the following four steps: model specification, model identification, model estimation, and model assessment (e.g., Schumacker and Lomax, 2009, Chap. 4). In line with this procedure, the remainder of the paper is structured as follows: Section 2 introduces the composite model, which provides the theoretical foundation for CCA, and shows how it can be specified; Section 3 considers the issue of identification in CCA and states the assumptions necessary to guarantee the unique solvability of the composite model; Section 4 presents one approach that can be used to estimate the model parameters in the framework of CCA; Section 5 provides a test for the overall model fit to assess how well the estimated model fits the observed data; Section 6 assesses the performance of this test by means of a Monte Carlo simulation and presents the results; and finally, the last section discusses the results and gives an outlook for future research. A brief example of how to estimate and assess a composite model within the statistical programming environment R is provided in the Supplementary Material.

2. SPECIFYING COMPOSITE MODELS

Composites have a long tradition in multivariate data analysis (Pearson, 1901). Originally, they are the outcome of dimension reduction techniques, i.e., the mapping of the data to a lower dimensional space. In this respect, they are designed to capture the most important characteristics of the data as efficiently as possible. Apart from dimension reduction, composites can serve as proxies for concepts (MacCallum and Browne, 1993). In marketing research, Fornell and Bookstein (1982) recognized that certain concepts like marketing mix or population change are not appropriately modeled by common factors and instead employed a composite to operationalize these concepts. In the recent past, more and more researchers recognized composites as a legitimate approach to operationalize concepts, e.g., in marketing science (Diamantopoulos and Winklhofer, 2001; Rossiter, 2002), business research (Diamantopoulos, 2008), environmental science (Grace and Bollen, 2008), and design research (Henseler, 2017).

In social and behavioral sciences, concepts are often understood as ontological entities such as abilities or attitudes, which rests on the assumption that the concept of interest exists in nature, regardless of whether it is the subject of scientific examination. Researchers follow a positivist research paradigm assuming that existing concepts can be measured.

In contrast, design concepts can be conceived as artifacts, i.e., objects designed to serve explicit goal(s) (Simon, 1969). Hence, they are inextricably linked to purposefulness, i.e., teleology (Horvath, 2004; Baskerville and Pries-Heje, 2010; Møller et al., 2012). This way of thinking has its origin in constructivist epistemology. The epistemological distinction between the ontological and constructivist nature of concepts has important implications when modeling the causal relationships among the concepts and their relationships to the observable indicators.

To operationalize behavioral concepts, the common factor model is typically used. It seeks to explore whether a certain concept exists by testing if collected measures of a concept are consistent with the assumed nature of that concept. It is based on the principle of common cause (Reichenbach, 1956), and therefore assumes that all covariation within a block of indicators can be fully explained by the underlying concept. In contrast, the composite model can be used to model artifacts as a linear combination of observable indicators. In doing so, it is more pragmatic in the sense that it examines whether a built artifact is useful at all. Figure 1 summarizes the differences between behavioral concepts and artifacts and their operationalization in SEM.

In the following part, we present the theoretical foundation of the composite model. Although the formal development of the composite model and the composite factor model (Henseler et al., 2014) was already laid out by Dijkstra (2013, 2015), it has not yet been put into a holistic framework. In the following, it is assumed that each artifact is modeled as a composite $c_j$ with $j = 1, \dots, J$. By definition, a composite is completely determined by a unique block of $K_j$ indicators, $x_j' = (x_{j1} \dots x_{jK_j})$: $c_j = w_j' x_j$. The weights of block $j$ are included in the column vector $w_j$ of length $K_j$. Usually, each weight vector is scaled to ensure that the composites have unit variance (see also Section 3). Here, we assume that each indicator is connected to only one composite. The theoretical covariance matrix $\Sigma$ of the indicators can be expressed as a partitioned matrix as follows:

$$\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} & \dots & \Sigma_{1J} \\ & \Sigma_{22} & \dots & \Sigma_{2J} \\ & & \ddots & \vdots \\ & & & \Sigma_{JJ} \end{pmatrix}. \qquad (1)$$

Footnote: In general, models containing common factors and composites are also conceivable but have not been considered here.

The intra-block covariance matrix $\Sigma_{jj}$ of dimension $K_j \times K_j$ is unconstrained and captures the covariation between the indicators of block $j$; thus, this effectively allows the indicators of one block to freely covary. Moreover, it can be shown that the indicator covariance matrix is positive-definite if and only if the following two conditions hold: (i) all intra-block covariance matrices are positive-definite, and (ii) the covariance matrix of the composites is positive-definite (Dijkstra, 2015, 2017). The covariances between the indicators of blocks $j$ and $l$ are captured in the inter-block covariance matrix $\Sigma_{jl}$, with $j \neq l$, of dimension $K_j \times K_l$. However, in contrast to the intra-block covariance matrix, the inter-block covariance matrix is constrained, since by assumption, the composites carry all information between the blocks:

$$\Sigma_{jl} = \rho_{jl} \Sigma_{jj} w_j w_l' \Sigma_{ll} = \rho_{jl} \lambda_j \lambda_l', \qquad (2)$$

where $\rho_{jl} = w_j' \Sigma_{jl} w_l$ equals the correlation between the composites $c_j$ and $c_l$. The vector $\lambda_j = \Sigma_{jj} w_j$ of length $K_j$ contains the composite loadings, which are defined as the covariances between the composite $c_j$ and the associated indicators $x_j$. Equation 2 is highly reminiscent of the corresponding equation where all concepts are modeled as common factors instead of composites. In a common factor model, the vector $\lambda_j$ captures the covariances between the indicators and their connected common factor, and $\rho_{jl}$ represents the correlation between common factors $j$ and $l$. Hence, both models show the rank-one structure for the covariance matrices between two indicator blocks.

Although the intra-block covariance matrices of the indicators $\Sigma_{jj}$ are not restricted, we emphasize that the composite model is still a model from the point of view of SEM. It assumes that all information between the indicators of two different blocks is conveyed by the composite(s), and therefore, it imposes rank-one restrictions on the inter-block covariance matrices of the indicators (see Equation 2). These restrictions can be exploited for testing the overall model fit (see Section 5). It is emphasized that the weights $w_j$ producing these matrices are the same across all inter-block covariance matrices $\Sigma_{jl}$ with $l = 1, \dots, J$ and $l \neq j$. Figure 2 illustrates an example of a composite model.

The artifact under investigation is modeled as the composite $c$, illustrated by a hexagon, and the observable indicators are represented by squares. The unconstrained covariance $\sigma_{12}$ between the indicators of block $x' = (x_1 \; x_2)$ forming the composite is highlighted by a double-headed arrow.

The observable variables $y$ and $z$ do not form the composite. They are allowed to freely covary among each other as well as with the composite. For example, they can be regarded as antecedents or consequences of the modeled artifact.

To emphasize the difference between the composite model and the common factor model typically used in CFA, we depict the composite model as a composite factor model (Dijkstra, 2013; Henseler et al., 2014). The composite factor model has the same model-implied indicator covariance matrix as the composite model, but the deduction of the model-implied covariances and the comparison to the common factor model are more straightforward. Figure 3 shows the same model as Figure 2 but in terms of a composite factor representation.
FIGURE 1 | Two types of concepts: behavioral concepts vs. artifacts.

FIGURE 2 | Example of a composite model.

FIGURE 3 | Example of a composite model displayed as composite factor model.

The composite loading $\lambda_i$, $i = 1, 2$, captures the covariance between the indicator $x_i$ and the composite $c$. In general, the error terms are included in the vector $\epsilon$, explaining the variance of the indicators and the covariances between the indicators of one block, which are not explained by the composite factor. As the composite model does not restrict the covariances between the indicators of one block, the error terms are allowed to freely covary. The covariations among the error terms as well as their variances are captured in the matrix $\Theta$. The model-implied covariance matrix of the example composite model, with rows and columns ordered as $y, x_1, x_2, z$, can be displayed as follows:

$$\Sigma = \begin{pmatrix} \sigma_{yy} & & & \\ \lambda_1 \sigma_{yc} & \sigma_{11} & & \\ \lambda_2 \sigma_{yc} & \lambda_1 \lambda_2 + \theta_{12} & \sigma_{22} & \\ \sigma_{yz} & \lambda_1 \sigma_{cz} & \lambda_2 \sigma_{cz} & \sigma_{zz} \end{pmatrix}. \qquad (3)$$

In comparison to the same model using a common factor instead of a composite, the composite model is less restrictive as it allows all error terms of one block to be correlated, which leads to a more general model (Henseler et al., 2014). In fact, the common factor model is always nested in the composite model since it uses the same restriction as the composite model; but additionally, it assumes that (some) covariances between the error terms of one block are restricted (usually to zero). Under certain conditions, it is possible to rescale the intra- and inter-block covariances of a composite model to match those of a common factor model (Dijkstra, 2013; Dijkstra and Henseler, 2015).

3. IDENTIFYING COMPOSITE MODELS

Like in SEM and CFA, model identification is an important issue in CCA. Since analysts can freely specify their models, it needs to be ensured that the model parameters have a unique solution (Bollen, 1989, Chap. 8). Therefore, model identification is necessary to obtain consistent parameter estimates and to reliably interpret them (Marcoulides and Chin, 2013).

Footnote: The existing literature sometimes mentions empirical (under-)identification in the context of model identification (Kenny, 1979). Since this expression refers to an issue of estimation rather than to the issue of model identification, this topic is not discussed in the following.

In general, the following three states of model identification can be distinguished: under-identified, just-identified, and over-identified. An under-identified model, also known as a not-identified model, offers several sets of parameters that are consistent with the model constraints, and thus, no unique solution for the model parameters exists. Therefore, only questionable conclusions can be drawn. In contrast, a just-identified model provides a unique solution for the model parameters and has the same number of free parameters as non-redundant elements of the indicator covariance matrix (the degrees of freedom (df) are 0). In empirical analysis, such models cannot be used to evaluate the overall model fit since they perfectly fit the data. An over-identified model also has a unique solution; however, it provides more non-redundant elements of the indicator covariance matrix than model parameters (df > 0). This can be exploited in empirical studies for assessing the overall model fit, as these constraints should hold for a sample within the limits of sampling error if the model is valid.

A necessary condition for ensuring identification is to normalize each weight vector. In doing so, we assume that all composites are scaled to have unit variance, $w_j' \Sigma_{jj} w_j = 1$. Besides the scaling of the composite, each composite must be connected to at least one composite or one variable not forming a composite. As a result, at least one inter-block covariance matrix $\Sigma_{jl}$, $l = 1, \dots, J$ with $l \neq j$, satisfies the rank-one condition. Along with the normalization of the weight vectors, all model parameters can be uniquely retrieved from the indicator covariance matrix since there is a non-zero inter-block covariance matrix for every loading vector. Otherwise, if a composite $c_j$ is isolated in the nomological network, all inter-block covariance matrices $\Sigma_{jl}$, $l = 1, \dots, J$ with $l \neq j$, belonging to this composite are of rank zero, and thus, the weights forming this composite cannot be uniquely retrieved. Although the non-isolation condition is required for identification, it also matches the idea of an artifact that is designed to fulfill a certain purpose. Without considering the artifact's antecedents and/or consequences, the artifact's purposefulness cannot be judged.

Footnote: Another way of normalization is to fix one weight of each block to a certain value.

Footnote: Furthermore, we ignore trivial regularity assumptions such as weight vectors consisting of zeros only; and similarly, we ignore cases where intra-block covariance matrices are singular.

In the following part, we describe how the number of degrees of freedom is counted in the case of the composite model. It is given by the difference between the number of non-redundant elements of the indicator population covariance matrix $\Sigma$ and the number of free parameters in the model. The number of free model parameters is given by the number of covariances among the composites, the number of covariances between composites and indicators not forming a composite, the number of covariances among indicators not forming a composite, the number of non-redundant off-diagonal elements of each intra-block covariance matrix, and the number of weights. Since we fix composite variances to one, one weight of each block can be expressed by the remaining ones of this block. Hence, we regain as many degrees of freedom as there are fixed composite variances, i.e., as there are blocks in the model. Equation 4 summarizes the way of determining the number of degrees of freedom of a composite model.

df = number of non-redundant off-diagonal elements of the indicator covariance matrix
   − number of free correlations among the composites
   − number of free covariances between the composites and indicators not forming a composite
   − number of covariances among the indicators not forming a composite
   − number of free non-redundant off-diagonal elements of each intra-block covariance matrix
   − number of weights
   + number of blocks                                                                          (4)

Footnote: The number of degrees of freedom can be helpful in determining whether a model is identified since an identified model has a non-negative number of degrees of freedom.

To illustrate our approach to calculating the number of degrees of freedom, we consider the composite model presented in Figure 2. As described above, the model consists of four (standardized) observable variables; thus, the indicator correlation matrix has six non-redundant off-diagonal elements. The number of free model parameters is counted as follows: no correlations among the composites as the model consists of only one composite, two correlations between the composite and the observable variables not forming a composite ($\sigma_{yc}$ and $\sigma_{cz}$), one correlation between the variables not forming a composite ($\sigma_{yz}$), one non-redundant off-diagonal element of the intra-block correlation matrix ($\sigma_{12}$), and two weights ($w_1$ and $w_2$) minus one, the number of blocks. As a result, we obtain the number of degrees of freedom as follows: df = 6 − 0 − 2 − 1 − 1 − 2 + 1 = 1. Once identification of the composite model is ensured, the model can be estimated in a next step.

4. ESTIMATING COMPOSITE MODELS

The existing literature provides various ways of constructing composites from blocks of indicators. The most common among them are principal component analysis (PCA, Pearson, 1901), linear discriminant analysis (LDA, Fisher, 1936), and (generalized) canonical correlation analysis ((G)CCA, Hotelling, 1936; Kettenring, 1971). All these approaches seek composites that "best" explain the data and can be regarded as prescriptions for dimension reduction (Dijkstra and Henseler, 2011). Further approaches are partial least squares path modeling (PLS-PM, Wold, 1975), regularized generalized canonical correlation analysis (RGCCA, Tenenhaus and Tenenhaus, 2011), and generalized structural component analysis (GSCA, Hwang and Takane, 2004). The use of predefined weights is also possible.

We follow Dijkstra (2010) and apply GCCA in a first step to estimate the correlation between the composites. In the following part, we give a brief description of GCCA. The vector of indicators $x$ of length $K$ is split up into $J$ subvectors $x_j$, so-called blocks, each of dimension $(K_j \times 1)$ with $j = 1, \dots, J$. We assume that the indicators are standardized to have means of zero and unit variances. Moreover, each indicator is connected to one composite only. Hence, the correlation matrix of the indicators can be calculated as $\Sigma = E(xx')$ and the intra-block correlation matrix as $\Sigma_{jj} = E(x_j x_j')$. Moreover, the correlation matrix of the composites $c_j = x_j' w_j$ is calculated as follows: $\Sigma_c = E(cc')$. In general, GCCA chooses the weights to maximize the correlation between the composites. In doing so, GCCA offers the following options: sumcor, maxvar, ssqcor, minvar, and genvar.

Footnote: GCCA builds composites in a way that they are maximally correlated.

Footnote: For an overview we refer to Kettenring (1971).

In the following part, we use maxvar under the constraint that each composite has unit variance, $w_j' \Sigma_{jj} w_j = 1$, to estimate the weights, the composites, and the resulting composite correlations. In doing so, the weights are chosen to maximize the largest eigenvalue of the composite correlation matrix. Thus, the total variation of the composites is explained as well as possible by one underlying "principal component," and the weights to form the composite $c_j$ are calculated as follows (Kettenring, 1971):

$$w_j = \Sigma_{jj}^{-\frac{1}{2}} \tilde{a}_j \Big/ \sqrt{\tilde{a}_j' \tilde{a}_j}. \qquad (5)$$

The subvector $\tilde{a}_j$, of length $K_j$, is the part belonging to block $j$ of the eigenvector that corresponds to the largest eigenvalue of the matrix $\Sigma_D^{-\frac{1}{2}} \Sigma \Sigma_D^{-\frac{1}{2}}$, where the matrix $\Sigma_D$, of dimension $K \times K$, is a block-diagonal matrix containing the intra-block correlation matrices $\Sigma_{jj}$, $j = 1, \dots, J$, on its diagonal. To obtain the estimates of the weights, the composites, and their correlations, the population matrix $\Sigma$ is replaced by its empirical counterpart $S$.

Footnote: In general, GCCA offers several composites (canonical variates); but in our study, we have focused only on the canonical variates of the first stage.

5. ASSESSING COMPOSITE MODELS

5.1. Tests of Overall Model Fit

In CFA and factor-based SEM, a test for overall model fit is naturally supplied by maximum-likelihood estimation in the form of the chi-square test (Jöreskog, 1967), while maxvar lacks such a test. In light of this, we propose a combination of a bootstrap procedure with several distance measures to statistically test how well the assumed composite model fits the collected data.

The existing literature provides several measures with which to assess the discrepancy between the perfect fit and the model fit. In fact, every distance measure known from CFA can be used to assess the overall fit of a composite model. They all capture the discrepancy between the sample covariance matrix $S$ and the estimated model-implied covariance matrix $\hat{\Sigma} = \Sigma(\hat{\theta})$ of the indicators. In our study, we consider the following three distance measures: squared Euclidean distance ($d_L$), geodesic distance ($d_G$), and standardized root mean square residual (SRMR).

The squared Euclidean distance between the sample and the estimated model-implied covariance matrix is calculated as follows:

$$d_L = \frac{1}{2} \sum_{i=1}^{K} \sum_{j=1}^{K} (s_{ij} - \hat{\sigma}_{ij})^2, \qquad (6)$$

where $K$ is the total number of indicators, and $s_{ij}$ and $\hat{\sigma}_{ij}$ are the elements of the sample and the estimated model-implied covariance matrix, respectively. The squared Euclidean distance is zero for a perfectly fitting model, $\hat{\Sigma} = S$.

Moreover, the geodesic distance stemming from a class of distance functions proposed by Swain (1975) can be used to measure the discrepancy between the sample and estimated model-implied covariance matrix. It is given by the following:

$$d_G = \frac{1}{2} \sum_{i=1}^{K} (\log(\varphi_i))^2, \qquad (7)$$

where $\varphi_i$ is the $i$-th eigenvalue of the matrix $S^{-1} \hat{\Sigma}$ and $K$ is the number of indicators. The geodesic distance is zero when and only when all eigenvalues equal one, i.e., when and only when the fit is perfect.

Finally, the SRMR (Hu and Bentler, 1999) can be used to assess the overall model fit. The SRMR is calculated as follows:

$$\mathrm{SRMR} = \sqrt{2 \sum_{i=1}^{K} \sum_{j=1}^{i} \left( (s_{ij} - \hat{\sigma}_{ij}) / (s_{ii} s_{jj}) \right)^2 \Big/ (K(K+1))}, \qquad (8)$$

where $K$ is the number of indicators. It reflects the average discrepancy between the empirical and the estimated model-implied correlation matrix. Thus, for a perfectly fitting model, the SRMR is zero, as $\hat{\sigma}_{ij}$ equals $s_{ij}$.

Since all distance measures considered are functions of the sample covariance matrix, a procedure proposed by Beran and Srivastava (1985) can be used to test the overall model fit: $H_0: \Sigma = \Sigma(\theta)$. The reference distribution of the distance measures as well as the critical values are obtained from the transformed sample data as follows:

$$X S^{-\frac{1}{2}} \hat{\Sigma}^{\frac{1}{2}}, \qquad (9)$$

where the data matrix $X$ of dimension $(N \times K)$ contains the $N$ observations of all $K$ indicators. This transformation ensures that the new dataset satisfies the null hypothesis; i.e., the sample covariance matrix of the transformed dataset equals the estimated model-implied covariance matrix. The reference distribution of the distance measures is obtained by bootstrapping from the transformed dataset. In doing so, the estimated distance based on the original dataset can be compared to the critical value from the reference distribution (typically the empirical 95% or 99% quantile) to decide whether the null hypothesis, $H_0: \Sigma = \Sigma(\theta)$, is rejected (Bollen and Stine, 1992).

Footnote: This procedure is known as the Bollen-Stine bootstrap (Bollen and Stine, 1992) in the factor-based SEM literature. The model must be over-identified for this test.

5.2. Fit Indices for Composite Models

In addition to the test of overall model fit, we provide some fit indices as measures of the overall model fit. In general, fit indices can indicate whether a model is misspecified by providing an absolute value of the misfit; however, we advise using them with caution as they are based on heuristic rules-of-thumb rather than statistical theory. Moreover, it is recommended to calculate the fit indices based on the indicator correlation matrix instead of the covariance matrix.

The standardized root mean square residual (SRMR) was already introduced as a measure of overall model fit (Henseler et al., 2014). As described above, it represents the average discrepancy between the sample and the model-implied indicator correlation matrix. Values below 0.10 and, following a more conservative view, below 0.08 indicate a good model fit (Hu and Bentler, 1998). However, these threshold values were proposed for common factor models, and their usefulness for composite models needs to be investigated.

Furthermore, the normed fit index (NFI) is suggested as a measure of goodness of fit (Bentler and Bonett, 1980). It measures the relative discrepancy between the fit of the baseline model and the fit of the estimated model. In this context, a model where all indicators are assumed to be uncorrelated (the model-implied correlation matrix equals the identity matrix) can serve as a baseline model (Lohmöller, 1989, Chap. 2.4.4). To assess the fit of the baseline model and the estimated model, several measures can be used, e.g., the log likelihood function used in CFA or the geodesic distance. Values of the NFI close to one imply a good model fit. However, cut-off values still need to be determined.

Finally, we suggest considering the root mean square residual covariance of the outer residuals (RMS_theta) as a further fit index (Lohmöller, 1989). It is defined as the square root of the average residual correlations. Since the indicators of one block are allowed to be freely correlated, the residual correlations within a block should be excluded and only the residual correlations across the blocks should be taken into account during its calculation. Small values close to zero for the RMS_theta indicate a good model fit. However, threshold values still need to be determined.

6. A MONTE CARLO SIMULATION

In order to assess our proposed procedure of statistically testing the overall model fit of composite models and to examine the behavior of the earlier presented discrepancy measures, we conduct a Monte Carlo simulation. In particular, we investigate the type I error rate (false positive rate) and the power, which are the most important characteristics of a statistical test. In designing the simulation, we choose a number of concepts used several times in the literature to examine the performance of fit indices and tests of overall model fit in CFA: a model containing two composites and a model containing three composites (Hu and Bentler, 1999; Heene et al., 2012). To investigate the power of the test procedure, we consider various misspecifications of these models. Figures 4 and 5 summarize the conditions investigated in our simulation study.

6.1. Model Containing Two Composites

All models containing two composites are estimated using the specification illustrated in the last column of Figure 4. The indicators $x_{11}$ to $x_{13}$ are specified to build composite $c_1$, while the remaining three indicators build composite $c_2$. Moreover, the composites are allowed to freely correlate. The parameters of interest are the correlation between the two composites and the weights $w_{11}$ to $w_{23}$. As column "Population model" of Figure 4 shows, we consider three types of population models with two composites.

6.1.1. Condition 1: No Misspecification

First, in order to examine whether the rejection rates of the test procedure are close to the predefined significance level in cases in which the null hypothesis is true, a population model is considered that has the same structure as the specified model. The correlation between the two composites is set to $\rho = 0.3$ and the composites are formed by their connected standardized indicators as follows: $c_i = x_i' w_i$ with $i = 1, 2$, where $w_1' = (0.6 \;\; 0.2 \;\; 0.4)$ and $w_2' = (0.4 \;\; 0.2 \;\; 0.6)$. All correlations between the indicators of one block are set to 0.5, which leads to the population correlation matrix given in Figure 4.

6.1.2. Condition 2: Confounded Indicators

The second condition is used to investigate whether the test procedure is capable of detecting misspecified models. It presents
a situation where the researcher falsely assigns two indicators to wrong constructs. The correlation between the two composites and the weights are the same as in population model 1: ρ = 0.3, $w_1' = (0.6 \;\; 0.2 \;\; 0.4)$, and $w_2' = (0.4 \;\; 0.2 \;\; 0.6)$. However, in contrast to population model 1, the indicators $x_{13}$ and $x_{21}$ are interchanged. Moreover, the correlations among all indicators of one block are 0.5. The population correlation matrix of the second model is presented in Figure 4.

FIGURE 4 | Simulation design for the model containing two composites.

FIGURE 5 | Simulation design for the model containing three composites.

6.1.3. Condition 3: Unexplained Correlation

The third condition is chosen to further investigate the capabilities of the test procedure to detect misspecified models. It shows a situation where the correlation between the two indicators $x_{13}$ and $x_{21}$ is not fully explained by the two composites. As in the two previously presented population models, the two composites have a correlation of ρ = 0.3. The correlations among the indicators of one block are set to 0.5, and the weights for the construction of the composites are set to $w_1' = (0.6 \;\; 0.2 \;\; 0.4)$, and $w_2' = (0.4 \;\; 0.2 \;\; 0.6)$. The population correlation matrix of the indicators is presented in Figure 4.

[Footnote] The model-implied correlation between the two indicators is calculated as follows: 0.8 · 0.3 · 0.8 ≠ 0.5.

6.2. Model Containing Three Composites

Furthermore, we investigate a more complex model consisting of three composites. Again, each composite is formed by three indicators, and the composites are allowed to freely covary. The column “Estimated model” of Figure 5 illustrates the specification to be estimated in case of three composites. We assume that the composites are built as follows: $c_1 = x_1' w_1$, $c_2 = x_2' w_2$, and $c_3 = x_3' w_3$. Again, we examine two different population models.

6.2.1. Condition 4: No Misspecification

The fourth condition is used to further investigate whether the rejection rates of the test procedure are close to the predefined significance level in cases in which the null hypothesis is true. Hence, the structure of the fourth population model matches the specified model. All composites are assumed to be freely correlated. In the population, the composite correlations are set to $\rho_{12} = 0.3$, $\rho_{13} = 0.5$, and $\rho_{23} = 0.4$. Each composite is built by three indicators using the following population weights: $w_1' = (0.6 \;\; 0.4 \;\; 0.2)$, $w_2' = (0.3 \;\; 0.5 \;\; 0.6)$, and $w_3' = (0.4 \;\; 0.5 \;\; 0.5)$. The indicator correlations of each block can be read from Figure 5. The indicator correlation matrix of population model 4 is given in Figure 5.

6.2.2. Condition 5: Unexplained Correlation

In the fifth condition, we investigate a situation where the correlation between two indicators is not fully explained by the underlying composites, similar to what is observed in Condition 3. Consequently, population model 5 does not match the model to be estimated and is used to investigate the power of the overall model test. It equals population model 4 with the exception that the correlation between the indicators $x_{13}$ and $x_{21}$ is only partly explained by the composites. Since the original correlation between these indicators is 0.084, a correlation of 0.25 presents only a weak violation. The remaining model stays untouched. The population correlation matrix is illustrated in Figure 5.

6.3. Further Simulation Conditions and Expectations

To assess the quality of the proposed test of the overall model fit, we generate 10,000 standardized samples from the multivariate normal distribution having zero means and a covariance matrix according to the respective population model. Moreover, we vary the sample size from 50 to 1,450 observations (with increments of 100) and the significance level α from 1% to 10%. To obtain the reference distribution of the discrepancy measures considered, 200 bootstrap samples are drawn from the transformed and standardized dataset. Each dataset is used in the maxvar procedure to estimate the model parameters.

All simulations are conducted in the statistical programming environment R (R Core Team, 2016). The samples are drawn from the multivariate normal distribution using the mvrnorm function of the MASS package (Venables and Ripley, 2002). The results for the test of overall model fit are obtained by user-written functions and the matrixpls package (Rönkkö, 2016).

[Footnote] These functions are provided by the contact author upon request.

Since population models 1 and 4 fit the respective specification, we expect rejection rates close to the predefined levels of significance α. Additionally, we expect that for an increasing sample size, the predefined significance level is kept with more precision. For population models 2, 3, and 5, much larger rejection rates are expected as these population models do not match the respective specification. Moreover, we expect that the power of the test to detect misspecifications would increase along with a larger sample size. Regarding the different discrepancy measures, we have no expectations, only that the squared Euclidean distance and the SRMR should lead to identical results. For standardized datasets, the only difference is a constant factor that does not affect the order of the observations in the reference distribution and, therefore, does not affect the decision about the null hypothesis.

6.4. Results

Figure 6 illustrates the rejection rates for population model 1, i.e., no misspecification. Besides the rejection rates, the figure also depicts the 95% confidence intervals (shaded area) constructed around the rejection rates to clarify whether a rejection rate is significantly different from the predefined significance level.

[Footnote] The limits of the 95% confidence interval are calculated as $\hat{p} \pm \Phi^{-1}(0.975)\sqrt{\hat{p}(1-\hat{p})/10000}$, where $\hat{p}$ represents the rejection rate and $\Phi^{-1}()$ is the quantile function of the standard normal distribution.

FIGURE 6 | Rejection rates for population model 1.

First, as expected, the squared Euclidean distance ($d_L$) as well as the SRMR lead to identical results. The test using the squared Euclidean distance and the SRMR rejects the model somewhat too rarely in case of α = 10% and α = 5% respectively; however, for an increasing sample size, the rejection rates converge to the predefined significance level without reaching it. For the 1% significance level, a similar picture is observed; however, for larger sample sizes, the significance level is retained more often compared to the larger significance levels. In contrast, the test using the geodesic distance mostly rejects the model too often for the 5% and 10% significance level. However, the obtained rejection rates are less often significantly different from the predefined significance level compared to the same situation where the SRMR or the Euclidean distance is used. In case of α = 1% and sample sizes larger than n = 100, the test using the geodesic distance rejects the model significantly too often.

Figure 7 displays the rejection rates for population models 2 and 3. The horizontal line at 80% depicts the commonly recommended power for a statistical test (Cohen, 1988). For the two cases where the specification does not match the underlying data generating process, the test using the squared Euclidean distance as well as the SRMR has more power than the test using the geodesic distance, i.e., the test using the former discrepancy measures rejects the wrong model more often. For model 2 (confounded indicators) the test produces higher or equal rejection rates compared to model 3 (unexplained correlation). Furthermore, as expected, the power decreases for a stricter significance level (smaller α) and increases with increasing sample sizes.

FIGURE 7 | Rejection rates for population model 2 and 3.

Figure 8 depicts the rejection rates for population models 4 and 5. Again, the 95% confidence intervals are illustrated for population model 4 (shaded area), which matches the specification estimated. Considering population model 4, which matches the estimated model, the test leads to similar results for all three discrepancy measures. However, the rejection rate of the test using the geodesic distance converges faster to the predefined significance level, i.e., for smaller sample sizes n ≥ 100. Again, among the three discrepancy measures considered, the geodesic distance performs best in terms of keeping the significance level.

As the extent of misspecification in population model 5 is minor, the test struggles to detect the model misspecification up to sample sizes n = 350, regardless of the discrepancy measure used. However, for sample sizes larger than 350 observations, the test detects the model misspecification satisfactorily. For sample sizes larger than 1,050 observations, the misspecification was identified in almost all cases regardless of the significance level and the discrepancy measure used. Again, this confirms the anticipated relationship between sample size and statistical power.

FIGURE 8 | Rejection rates for population model 4 and 5.

7. DISCUSSION

We introduced the confirmatory composite analysis (CCA) as a full-fledged technique for confirmatory purposes that employs composites to model artifacts, i.e., design concepts. It overcomes current limitations in CFA and SEM and carries the spirit of CFA and SEM to research domains studying artifacts. Its application is appropriate in situations where the research goal is to examine whether an artifact is useful rather than to establish whether a certain concept exists. It follows the same steps usually applied in SEM and enables researchers to analyze a variety of situations, in particular, beyond the realm of social and behavioral sciences. Hence, CCA allows for dealing with research questions that could not be appropriately dealt with yet in the framework of CFA or more generally in SEM.

The results of the Monte Carlo simulation confirmed that CCA can be used for confirmatory purposes. They revealed that the bootstrap-based test, in combination with different discrepancy measures, can be used to statistically assess the overall model fit of the composite model. For specifications matching the population model, the rejection rates were in the acceptable range, i.e., close to the predefined significance level. Moreover, the results of the power analysis showed that the bootstrap-based test can reliably detect misspecified models. However, caution is needed in case of small sample sizes where the rejection rates were low, which means that misspecified models were not reliably detected.

In future research, the usefulness of the composite model in empirical studies needs to be examined, accompanied and enhanced by simulation studies. In particular, the extensions outlined by Dijkstra (2017), to wit, interdependent systems of equations for the composites estimated by classical econometric methods (like 2SLS and three-stage least squares), warrant further analysis and scrutiny. Robustness with respect to non-normality and misspecification also appear to be relevant research topics. Additionally, devising ways to efficiently predict indicators and composites might be of particular interest (see for example the work by Shmueli et al., 2016).

Moreover, to contribute to the confirmatory character of CCA, we recommend further study of the performance and limitations of the proposed test procedure: consider more misspecifications and the ability of the test to reliably detect them, find further discrepancy measures and examine their performance, and investigate the behavior of the test under the violation of the normality assumption, similar to what Nevitt and Hancock (2001) did for CFA. Finally, cut-off values for the fit indices need to be determined for CCA.

AUTHOR CONTRIBUTIONS

FS conducted the literature review and wrote the majority of the paper (contribution: ca. 50%). JH initiated this paper and designed the simulation study (contribution: ca. 25%). TD proposed the composite model and developed the model fit test (contribution: ca. 25%).

SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2018.02541/full#supplementary-material

REFERENCES

Bagozzi, R. P. (1994). “Structural equation models in marketing research: basic principles,” in Principles of Marketing Research, eds R. P. Bagozzi (Oxford: Blackwell), 317–385.
Bagozzi, R. P., and Yi, Y. (1988). On the evaluation of structural equation models. J. Acad. Market. Sci. 16, 74–94. doi: 10.1007/BF02723327
Baskerville, R., and Pries-Heje, J. (2010). Explanatory design theory. Busin. Inform. Syst. Eng. 2, 271–282. doi: 10.1007/s12599-010-0118-4
Bentler, P. M., and Bonett, D. G. (1980). Significance tests and goodness of fit in the analysis of covariance structures. Psychol. Bull. 88, 588–606. doi: 10.1037/0033-2909.88.3.588
Beran, R., and Srivastava, M. S. (1985). Bootstrap tests and confidence regions for functions of a covariance matrix. Ann. Statist. 13, 95–115. doi: 10.1214/aos/1176346579
Bollen, K. A. (1989). Structural Equations with Latent Variables. New York, NY: John Wiley & Sons Inc.
Bollen, K. A. (2001). “Two-stage least squares and latent variable models: Simultaneous estimation and robustness to misspecifications,” in Structural Equation Modeling: Present and Future, A Festschrift in Honor of Karl Jöreskog, eds R. Cudeck, S. Du Toit, and D. Sörbom (Chicago: Scientific Software International), 119–138.
Bollen, K. A., and Stine, R. A. (1992). Bootstrapping goodness-of-fit measures in structural equation models. Sociol. Methods Res. 21, 205–229. doi: 10.1177/0049124192021002004
Borden, N. H. (1964). The concept of the marketing mix. J. Advert. Res. 4, 2–7.
Brown, T. A. (2015). Confirmatory Factor Analysis for Applied Research. New York, NY: Guilford Press.
Browne, M. W. (1984). Asymptotically distribution-free methods for the analysis of covariance structures. Br. J. Math. Statist. Psychol. 37, 62–83. doi: 10.1111/j.2044-8317.1984.tb00789.x
Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences, 2nd Edn. Hillsdale, NJ: Lawrence Erlbaum Associates.
Crowley, D. M. (2013). Building efficient crime prevention strategies. Criminol. Public Policy 12, 353–366. doi: 10.1111/1745-9133.12041
Diamantopoulos, A. (2008). Formative indicators: introduction to the special issue. J. Busin. Res. 61, 1201–1202. doi: 10.1016/j.jbusres.2008.01.008
Diamantopoulos, A., and Winklhofer, H. M. (2001). Index construction with formative indicators: an alternative to scale development. J. Market. Res. 38, 269–277. doi: 10.1509/jmkr.38.2.269.18845
Dijkstra, T. K. (2010). “Latent variables and indices: Herman Wold’s basic design and partial least squares,” in Handbook of Partial Least Squares (Berlin: Springer), 23–46.
Dijkstra, T. K. (2013). “Composites as factors: Canonical variables revisited,” in Working Paper. Groningen. Available online at: https://www.rug.nl/staff/t.k.dijkstra/composites-as-factors.pdf
Dijkstra, T. K. (2015). “All-inclusive versus single block composites,” in Working Paper. Groningen. Available online at: https://www.researchgate.net/profile/Theo_Dijkstra/publication/281443431_all-inclusive_and_single_block_composites/links/55e7509208ae65b63899564f/all-inclusive-and-single-block-composites.pdf
Dijkstra, T. K. (2017). “A perfect match between a model and a mode,” in Partial Least Squares Path Modeling, eds H. Latan, and R. Noonan (Cham: Springer), 55–80.
Dijkstra, T. K., and Henseler, J. (2011). Linear indices in nonlinear structural equation models: best fitting proper indices and other composites. Qual. Quant. 45, 1505–1518. doi: 10.1007/s11135-010-9359-z
Dijkstra, T. K., and Henseler, J. (2015). Consistent and asymptotically normal PLS estimators for linear structural equations. Computat. Statist. Data Anal. 81, 10–23. doi: 10.1016/j.csda.2014.07.008
Fisher, R. A. (1936). The use of multiple measurements in taxonomic problems. Ann. Eugen. 7, 179–188. doi: 10.1111/j.1469-1809.1936.tb02137.x
Fong, C. J., Davis, C. W., Kim, Y., Kim, Y. W., Marriott, L., and Kim, S. (2016). Psychosocial factors and community college student success. Rev. Educ. Res. 87, 388–424. doi: 10.3102/0034654316653479
Fornell, C., and Bookstein, F. L. (1982). Two structural equation models: LISREL and PLS applied to consumer exit-voice theory. J. Market. Res. 19, 440–452. doi: 10.2307/3151718
Gefen, D., Straub, D. W., and Rigdon, E. E. (2011). An update and extension to SEM guidelines for administrative and social science research. MIS Quart. 35, iii–xiv. doi: 10.2307/23044042
Grace, J. B., Anderson, T. M., Olff, H., and Scheiner, S. M. (2010). On the specification of structural equation models for ecological systems. Ecol. Monogr. 80, 67–87. doi: 10.1890/09-0464.1
Grace, J. B., and Bollen, K. A. (2008). Representing general theoretical concepts in structural equation models: the role of composite variables. Environ. Ecol. Statist. 15, 191–213. doi: 10.1007/s10651-007-0047-7
Hayduk, L. A. (1988). Structural Equation Modeling With LISREL: Essentials and Advances. Baltimore, MD: Johns Hopkins University Press.
Heene, M., Hilbert, S., Freudenthaler, H. H., and Buehner, M. (2012). Sensitivity of SEM fit indexes with respect to violations of uncorrelated errors. Struct. Equat. Model. 19, 36–50. doi: 10.1080/10705511.2012.634710
Henseler, J. (2017). Bridging design and behavioral research with variance-based structural equation modeling. J. Advert. 46, 178–192. doi: 10.1080/00913367.2017.1281780
Henseler, J., Dijkstra, T. K., Sarstedt, M., Ringle, C. M., Diamantopoulos, A., Straub, D. W., et al. (2014). Common beliefs and reality about PLS: comments on Rönkkö and Evermann (2013). Organ. Res. Methods 17, 182–209. doi: 10.1177/1094428114526928
Holbert, R. L., and Stephenson, M. T. (2002). Structural equation modeling in the communication sciences, 1995–2000. Hum. Commun. Res. 28, 531–551. doi: 10.1111/j.1468-2958.2002.tb00822.x
Horvath, I. (2004). A treatise on order in engineering design research. Res. Eng. Design 15, 155–181. doi: 10.1007/s00163-004-0052-x
Hotelling, H. (1936). Relations between two sets of variates. Biometrika 28, 321–377. doi: 10.1093/biomet/28.3-4.321
Hu, L., and Bentler, P. M. (1998). Fit indices in covariance structure modeling: sensitivity to underparameterized model misspecification. Psychol. Methods 3, 424–453. doi: 10.1037/1082-989X.3.4.424
Hu, L., and Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: conventional criteria versus new alternatives. Struc. Equat. Model. 6, 1–55. doi: 10.1080/10705519909540118
Hwang, H., and Takane, Y. (2004). Generalized structured component analysis. Psychometrika 69, 81–99. doi: 10.1007/BF02295841
Jöreskog, K. G. (1967). Some contributions to maximum likelihood factor analysis. Psychometrika 32, 443–482. doi: 10.1007/BF02289658
Keller, H. (2006). The SCREEN I (seniors in the community: risk evaluation for eating and nutrition) index adequately represents nutritional risk. J. Clin. Epidemiol. 59, 836–841. doi: 10.1016/j.jclinepi.2005.06.013
Kenny, D. A. (1979). Correlation and Causality. Hoboken, NJ: John Wiley & Sons Inc.
Kettenring, J. R. (1971). Canonical analysis of several sets of variables. Biometrika 58, 433–451. doi: 10.1093/biomet/58.3.433
Kirmayer, L. J., and Crafa, D. (2014). What kind of science for psychiatry? Front. Hum. Neurosci. 8:435. doi: 10.3389/fnhum.2014.00435
Klein, A., and Moosbrugger, H. (2000). Maximum likelihood estimation of latent interaction effects with the LMS method. Psychometrika 65, 457–474. doi: 10.1007/BF02296338
Kline, R. B. (2015). Principles and Practice of Structural Equation Modeling. New York, NY: Guilford Press.
Lee, H.-J. (2005). Developing a professional development program model based on teachers’ needs. Profess. Educ. 27, 39–49.
Little, T. D. (2013). Longitudinal Structural Equation Modeling. New York, NY: Guilford Press.
Lohmöller, J.-B. (1989). Latent Variable Path Modeling with Partial Least Squares. Heidelberg: Physica.
Lussier, P., LeBlanc, M., and Proulx, J. (2005). The generality of criminal behavior: a confirmatory factor analysis of the criminal activity of sex offenders in adulthood. J. Crim. Just. 33, 177–189. doi: 10.1016/j.jcrimjus.2004.12.009
MacCallum, R. C., and Austin, J. T. (2000). Applications of structural equation modeling in psychological research. Ann. Rev. Psychol. 51, 201–226. doi: 10.1146/annurev.psych.51.1.201
MacCallum, R. C., and Browne, M. W. (1993). The use of causal indicators in covariance structure models: Some practical issues. Psychol. Bull. 114, 533–541. doi: 10.1037/0033-2909.114.3.533
Malaeb, Z. A., Summers, J. K., and Pugesek, B. H. (2000). Using structural equation modeling to investigate relationships among ecological variables. Environ. Ecol. Statist. 7, 93–111. doi: 10.1023/A:1009662930292
Marcoulides, G. A., and Chin, W. W. (2013). “You write, but others read: common methodological misunderstandings in PLS and related methods,” in New Perspectives in Partial Least Squares and Related Methods, eds H. Abdi, V. E. Vinzi, G. Russolillo, and L. Trinchera (New York, NY: Springer), 31–64.
Marcoulides, G. A., and Schumacker, R. E., editors (2001). New Developments and Techniques in Structural Equation Modeling. Mahwah, NJ: Lawrence Erlbaum Associates.
Markus, K. A., and Borsboom, D. (2013). Frontiers of Test Validity Theory: Measurement, Causation, and Meaning. New York, NY: Routledge.
McIntosh, A., and Gonzalez-Lima, F. (1994). Structural equation modeling and its application to network analysis in functional brain imaging. Hum. Brain Mapp. 2, 2–22.
Møller, C., Brandt, C. J., and Carugati, A. (2012). “Deliberately by design, or? Enterprise architecture transformation at Arla Foods,” in Advances in Enterprise Information Systems II, eds C. Møller, and S. Chaudhry (Boca Raton, FL: CRC Press), 91–104.
Muthén, B. O. (1984). A general structural equation model with dichotomous, ordered categorical, and continuous latent variable indicators. Psychometrika 49, 115–132.
Muthén, B. O. (2002). Beyond SEM: general latent variable modeling. Behaviormetrika 29, 81–117. doi: 10.2333/bhmk.29.81
Nevitt, J., and Hancock, G. R. (2001). Performance of bootstrapping approaches to model test statistics and parameter standard error estimation in structural equation modeling. Struc. Equat. Model. 8, 353–377. doi: 10.1207/S15328007SEM0803_2
Pearson, K. (1901). On lines and planes of closest fit to systems of points in space. Philos. Magazine 6 2, 559–572. doi: 10.1080/14786440109462720
R Core Team (2016). R: A Language and Environment for Statistical Computing. Version 3.3.1. Vienna: R Foundation for Statistical Computing.
Raykov, T., and Marcoulides, G. A. (2006). A First Course in Structural Equation Modeling, 2nd Edn. Mahwah, NJ: Lawrence Erlbaum Associates.
Reichenbach, H. (1956). The Direction of Time. Berkeley, CA: University of California Press.
Rigdon, E. E. (2012). Rethinking partial least squares path modeling: in praise of simple methods. Long Range Plan. 45, 341–358. doi: 10.1016/j.lrp.2012.09.010
Rönkkö, M. (2016). matrixpls: Matrix-based Partial Least Squares Estimation. R package version 1.0.0. Available online at: https://cran.r-project.org/web/packages/matrixpls/vignettes/matrixpls-intro.pdf
Rossiter, J. R. (2002). The C-OAR-SE procedure for scale development in marketing. Int. J. Res. Market. 19, 305–335. doi: 10.1016/S0167-8116(02)00097-6
Schumacker, R. E., and Lomax, R. G. (2009). A Beginner’s Guide to Structural Equation Modeling, 3rd Edn. New York, NY: Routledge.
Shah, R., and Goldstein, S. M. (2006). Use of structural equation modeling in operations management research: looking back and forward. J. Operat. Manag. 24, 148–169. doi: 10.1016/j.jom.2005.05.001
Shmueli, G., Ray, S., Estrada, J. M. V., and Chatla, S. B. (2016). The elephant in the room: Predictive performance of PLS models. J. Busin. Res. 69, 4552–4564. doi: 10.1016/j.jbusres.2016.03.049
Simon, H. (1969). The Sciences of the Artificial. Cambridge: MIT Press.
Sobel, M. E. (1997). “Measurement, causation and local independence in latent variable models,” in Latent Variable Modeling and Applications to Causality, ed M. Berkane (New York, NY: Springer), 11–28.
Spears, N., and Singh, S. N. (2004). Measuring attitude toward the brand and purchase intentions. J. Curr. Iss. Res. Advert. 26, 53–66. doi: 10.1080/10641734.2004.10505164
Steenkamp, J.-B. E., and Baumgartner, H. (2000). On the use of structural equation models for marketing modeling. Int. J. Res. Market. 17, 195–202. doi: 10.1016/S0167-8116(00)00016-1
Swain, A. (1975). A class of factor analysis estimation procedures with common asymptotic sampling properties. Psychometrika 40, 315–335. doi: 10.1007/BF02291761
Tenenhaus, A., and Tenenhaus, M. (2011). Regularized generalized canonical correlation analysis. Psychometrika 76, 257–284. doi: 10.1007/s11336-011-9206-8
Van de Schoot, R., Lugtig, P., and Hox, J. (2012). A checklist for testing measurement invariance. Eur. J. Develop. Psychol. 9, 486–492. doi: 10.1080/17405629.2012.686740
Vance, A., Benjamin Lowry, P., and Eggett, D. (2015). Increasing accountability through user-interface design artifacts: a new approach to addressing the problem of access-policy violations. MIS Quart. 39, 345–366. doi: 10.25300/MISQ/2015/39.2.04
Venables, W. N., and Ripley, B. D. (2002). Modern Applied Statistics With S, 4th Edn. New York, NY: Springer.
Venkatesh, V., Morris, M. G., Davis, G. B., and Davis, F. D. (2003). User acceptance of information technology: toward a unified view. MIS Quart. 27:425. doi: 10.2307/30036540
Wight, D., Wimbush, E., Jepson, R., and Doi, L. (2015). Six steps in quality intervention development (6SQuID). J. Epidemiol. Commun. Health 70, 520–525. doi: 10.1136/jech-2015-205952
Wold, H. (1975). “Path models with latent variables: The NIPALS approach,” in Quantitative Sociology, eds H. Blalock, A. Aganbegian, F. Borodkin, R. Boudon, and V. Capecchi (New York, NY: Academic Press), 307–357.
Xiong, B., Skitmore, M., and Xia, B. (2015). A critical review of structural equation modeling applications in construction research. Automat. Construct. 49 (Pt A), 59–70. doi: 10.1016/j.autcon.2014.09.006

Conflict of Interest Statement: JH acknowledges a financial interest in ADANCO and its distributor, Composite Modeling.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Schuberth, Henseler and Dijkstra. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
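As a numerical companion to Sections 5.1, 6.1, and 6.4, the following Python sketch (not the authors' R code) rebuilds the indicator correlation matrix of population model 1 from the stated weights, the within-block correlation of 0.5, and ρ = 0.3. It then reproduces the value 0.8 · 0.3 · 0.8 from the footnote to Condition 3, evaluates the SRMR as written in Equation (8), and computes the confidence interval from the footnote in Section 6.4. All function names are hypothetical, and the cross-block rule (loading × ρ × loading) is the composite-model structure assumed here. The SRMR call compares two fixed matrices without re-estimating the model, so it only illustrates the formula.

```python
from math import sqrt
from statistics import NormalDist


def block_corr(k, r):
    """k x k correlation matrix with constant off-diagonal value r."""
    return [[1.0 if i == j else r for j in range(k)] for i in range(k)]


def loadings(block, w):
    """Covariances between a block's indicators and its composite (Sigma_jj w_j)."""
    return [sum(block[i][j] * w[j] for j in range(len(w))) for i in range(len(w))]


def composite_variance(block, w):
    """Variance of the composite c_j = x_j' w_j, i.e. w_j' Sigma_jj w_j."""
    return sum(wi * li for wi, li in zip(w, loadings(block, w)))


def implied_corr_two_blocks(w1, w2, r_within, rho):
    """Indicator correlation matrix implied by two correlated composites.

    Assumption: cross-block correlations equal loading_i * rho * loading_j.
    """
    b1, b2 = block_corr(len(w1), r_within), block_corr(len(w2), r_within)
    l1, l2 = loadings(b1, w1), loadings(b2, w2)
    k1, total = len(w1), len(w1) + len(w2)
    sigma = [[0.0] * total for _ in range(total)]
    for i in range(total):
        for j in range(total):
            if i < k1 and j < k1:
                sigma[i][j] = b1[i][j]
            elif i >= k1 and j >= k1:
                sigma[i][j] = b2[i - k1][j - k1]
            elif i < k1:
                sigma[i][j] = l1[i] * rho * l2[j - k1]
            else:
                sigma[i][j] = l2[i - k1] * rho * l1[j]
    return sigma


def srmr(s, sigma_hat):
    """SRMR as written in Equation (8): lower triangle including the diagonal."""
    k = len(s)
    total = 0.0
    for i in range(k):
        for j in range(i + 1):
            total += ((s[i][j] - sigma_hat[i][j]) / (s[i][i] * s[j][j])) ** 2
    return sqrt(2.0 * total / (k * (k + 1)))


def rejection_rate_ci(p_hat, n_rep=10000, level=0.95):
    """Normal-approximation interval for a rejection rate over n_rep replications."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z * sqrt(p_hat * (1.0 - p_hat) / n_rep)
    return p_hat - half_width, p_hat + half_width


if __name__ == "__main__":
    w1, w2 = [0.6, 0.2, 0.4], [0.4, 0.2, 0.6]
    sigma1 = implied_corr_two_blocks(w1, w2, r_within=0.5, rho=0.3)
    print("var(c1):", composite_variance(block_corr(3, 0.5), w1))
    print("implied corr(x13, x21):", sigma1[2][3])  # 0.8 * 0.3 * 0.8

    # Population model 3: same matrix, but corr(x13, x21) is set to 0.5.
    sigma3 = [row[:] for row in sigma1]
    sigma3[2][3] = sigma3[3][2] = 0.5
    print("SRMR(model 3 vs. model-1 structure):", srmr(sigma3, sigma1))
    print("95% CI for a 5% rejection rate:", rejection_rate_ci(0.05))
```

With these weights each composite has unit variance, which is why the cross-block correlations reduce to the product of two loadings and ρ.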

Confirmatory Composite Analysis

Frontiers in Psychology, Dec 13, 2018


Publisher: Unpaywall
ISSN: 1664-1078
DOI: 10.3389/fpsyg.2018.02541

Abstract

METHODS
published: 13 December 2018
doi: 10.3389/fpsyg.2018.02541

Confirmatory Composite Analysis

Florian Schuberth (1), Jörg Henseler (1,2) and Theo K. Dijkstra (3)

(1) Faculty of Engineering Technology, Chair of Product-Market Relations, University of Twente, Enschede, Netherlands
(2) Nova Information Management School, Universidade Nova de Lisboa, Lisbon, Portugal
(3) Faculty of Economics and Business, University of Groningen, Groningen, Netherlands

This article introduces confirmatory composite analysis (CCA) as a structural equation modeling technique that aims at testing composite models. It facilitates the operationalization and assessment of design concepts, so-called artifacts. CCA entails the same steps as confirmatory factor analysis: model specification, model identification, model estimation, and model assessment. Composite models are specified such that they consist of a set of interrelated composites, all of which emerge as linear combinations of observable variables. Researchers must ensure theoretical identification of their specified model. For the estimation of the model, several estimators are available; in particular Kettenring’s extensions of canonical correlation analysis provide consistent estimates. Model assessment mainly relies on the Bollen-Stine bootstrap to assess the discrepancy between the empirical and the estimated model-implied indicator covariance matrix. A Monte Carlo simulation examines the efficacy of CCA, and demonstrates that CCA is able to detect various forms of model misspecification.

Keywords: artifacts, composite modeling, design research, Monte Carlo simulation study, structural equation modeling, theory testing

Edited by: Holmes Finch, Ball State University, United States
Reviewed by: Daniel Saverio John Costa, University of Sydney, Australia; Shenghai Dai, Washington State University, United States
*Correspondence: Florian Schuberth, [email protected]
Specialty section: This article was submitted to Quantitative Psychology and Measurement, a section of the journal Frontiers in Psychology
Received: 19 June 2018
Accepted: 28 November 2018
Published: 13 December 2018
Citation: Schuberth F, Henseler J and Dijkstra TK (2018) Confirmatory Composite Analysis. Front. Psychol. 9:2541. doi: 10.3389/fpsyg.2018.02541

1. INTRODUCTION

Structural equation modeling with latent variables (SEM) comprises confirmatory factor analysis (CFA) and path analysis, thus combining methodological developments from different disciplines such as psychology, sociology, and economics, while covering a broad variety of traditional multivariate statistical procedures (Bollen, 1989; Muthén, 2002). It is capable of expressing theoretical concepts by means of multiple observable indicators to connect them via the structural model as well as to account for measurement error. Since SEM allows for statistical testing of the estimated parameters and even entire models, it is an outstanding tool for confirmatory purposes such as for assessing construct validity (Markus and Borsboom, 2013) or for establishing measurement invariance (Van de Schoot et al., 2012). Apart from the original maximum likelihood estimator, robust versions and a number of alternative estimators were also introduced to counter violations of the original assumptions in empirical work, such as the asymptotic distribution free (Browne, 1984) or the two-stage least squares (2SLS) estimator (Bollen, 2001). Over time, the initial model has been continuously improved upon to account for more complex theories. Consequently, SEM is able to deal with categorical (Muthén, 1984) as well as longitudinal data (Little, 2013) and can be used to model non-linear relationships between the constructs (Klein and Moosbrugger, 2000).

Footnote: For more details and a comprehensive overview, we refer to the following textbooks: Hayduk (1988), Bollen (1989), Marcoulides and Schumacker (2001), Raykov and Marcoulides (2006), Kline (2015), and Brown (2015).

Researchers across many streams of science appreciate SEM’s versatility as well as its ability to test common factor models. In particular, in the behavioral and social sciences, SEM enjoys wide popularity, e.g., in marketing (Bagozzi and Yi, 1988; Steenkamp and Baumgartner, 2000), psychology (MacCallum and Austin, 2000), communication science (Holbert and Stephenson, 2002), operations management (Shah and Goldstein, 2006), and information systems (Gefen et al., 2011), to name a few. Additionally, beyond the realm of behavioral and social sciences, researchers have acknowledged the capabilities of SEM, such as in construction research (Xiong et al., 2015) or neurosciences (McIntosh and Gonzalez-Lima, 1994).

Over the last decades, the operationalization of the theoretical concept and the common factor has become more and more conflated such that hardly any distinction is made between the terms (Rigdon, 2012). Although the common factor model has demonstrated its usefulness for concepts of behavioral research such as traits and attitudes, the limitation of SEM to the factor model is unfortunate because many disciplines besides and even within social and behavioral sciences do not exclusively deal with behavioral concepts, but also with design concepts (so-called artifacts) and their interplay with behavioral concepts. Psychiatry is an example: on the one hand, it examines clinically relevant behavior to understand mental disorders, but on the other hand, it also aims at developing treatments for mental disorders (Kirmayer and Crafa, 2014). Table 1 displays further examples of disciplines investigating behavioral concepts and artifacts.

TABLE 1 | Examples of behavioral concepts and artifacts across several disciplines.

Discipline | Behavioral Concept | Design Concept (Artifact)
Criminology | Criminal activity (Lussier et al., 2005) | Prevention strategy (Crowley, 2013)
Ecology | Sediment contamination (Malaeb et al., 2000) | Abiotic stress (Grace et al., 2010)
Education | Student’s anxiety (Fong et al., 2016) | Teacher development program (Lee, 2005)
Epidemiology | Nutritional risk (Keller, 2006) | Public health intervention (Wight et al., 2015)
Information Systems | Perceived ease of use (Venkatesh et al., 2003) | User-interface design (Vance et al., 2015)
Marketing | Brand attitude (Spears and Singh, 2004) | Marketing mix (Borden, 1964)

Typically, the common factor model is used to operationalize behavioral concepts, because it is well matched with the general understanding of measurement (Sobel, 1997). It assumes that each observable indicator is a manifestation of the underlying concept that is regarded as their common cause (Reichenbach, 1956), and therefore fully explains the covariation among its indicators. However, for artifacts the idea of measurement is unrewarding as they are rather constructed to fulfill a certain purpose. To account for the constructivist character of the artifact, the composite has been recently suggested for its operationalization in SEM (Henseler, 2017). A composite is a weighted linear combination of observable indicators, and therefore, in contrast to the common factor model, the indicators do not necessarily share a common cause.

At present, the validity of composite models cannot be systematically assessed. Current approaches are limited to assessing the indicators’ collinearity (Diamantopoulos and Winklhofer, 2001) and their relations to other variables in the model (Bagozzi, 1994). A rigorous test of composite models in analogy to CFA does not exist so far. Not only does this situation limit the progress of composite models, it also represents an unnecessary weakness of SEM as its application is mainly limited to behavioral concepts. For this reason, we introduce confirmatory composite analysis (CCA) wherein the concept, i.e., the artifact, under investigation is modeled as a composite. In this way, we make SEM accessible to a broader audience. We show that the composite model relaxes some of the restrictions imposed by the common factor model. However, it still provides testable constraints, which makes CCA a full-fledged method for confirmatory purposes. In general, it involves the same steps as CFA or SEM, without assuming that the underlying concept is necessarily modeled as a common factor.

While there is no exact instruction on how to apply SEM, a general consensus exists that SEM and CFA comprise at least the following four steps: model specification, model identification, model estimation, and model assessment (e.g., Schumacker and Lomax, 2009, Chap. 4). In line with this procedure, the remainder of the paper is structured as follows: Section 2 introduces the composite model, which provides the theoretical foundation for CCA, and shows how it can be specified; Section 3 considers the issue of identification in CCA and states the assumptions necessary to guarantee the unique solvability of the composite model; Section 4 presents one approach that can be used to estimate the model parameters in the framework of CCA; Section 5 provides a test for the overall model fit to assess how well the estimated model fits the observed data; Section 6 assesses the performance of this test in terms of a Monte Carlo simulation and presents the results; and finally, the last section discusses the results and gives an outlook for future research. A brief example on how to estimate and assess a composite model within the statistical programming environment R is provided in the Supplementary Material.

2. SPECIFYING COMPOSITE MODELS

Composites have a long tradition in multivariate data analysis (Pearson, 1901). Originally, they are the outcome of dimension reduction techniques, i.e., the mapping of the data to a lower dimensional space. In this respect, they are designed to capture the most important characteristics of the data as efficiently as possible. Apart from dimension reduction, composites can serve as proxies for concepts (MacCallum and Browne, 1993). In marketing research, Fornell and Bookstein (1982) recognized that certain concepts like marketing mix or population change are not appropriately modeled by common factors and instead employed a composite to operationalize these concepts. In the recent past, more and more researchers recognized composites as a legitimate approach to operationalize concepts, e.g., in marketing science (Diamantopoulos and Winklhofer, 2001; Rossiter, 2002), business research (Diamantopoulos, 2008), environmental science (Grace and Bollen, 2008), and in design research (Henseler, 2017).

In social and behavioral sciences, concepts are often understood as ontological entities such as abilities or attitudes, which rests on the assumption that the concept of interest exists in nature, regardless of whether it is the subject of scientific examination. Researchers follow a positivist research paradigm assuming that existing concepts can be measured.

In contrast, design concepts can be conceived as artifacts, i.e., objects designed to serve explicit goal(s) (Simon, 1969). Hence, they are inextricably linked to purposefulness, i.e., teleology (Horvath, 2004; Baskerville and Pries-Heje, 2010; Møller et al., 2012). This way of thinking has its origin in constructivist epistemology. The epistemological distinction between the ontological and constructivist nature of concepts has important implications when modeling the causal relationships among the concepts and their relationships to the observable indicators.

To operationalize behavioral concepts, the common factor model is typically used. It seeks to explore whether a certain concept exists by testing if collected measures of a concept are consistent with the assumed nature of that concept. It is based on the principle of common cause (Reichenbach, 1956), and therefore assumes that all covariation within a block of indicators can be fully explained by the underlying concept. On the contrary, the composite model can be used to model artifacts as a linear combination of observable indicators. In doing so, it is more pragmatic in the sense that it examines whether a built artifact is useful at all. Figure 1 summarizes the differences between behavioral concepts and artifacts and their operationalization in SEM.

In the following part, we present the theoretical foundation of the composite model. Although the formal development of the composite model and the composite factor model (Henseler et al., 2014) was already laid out by Dijkstra (2013, 2015), it has not yet been put into a holistic framework. In the following, it is assumed that each artifact is modeled as a composite $c_j$ with $j = 1, \dots, J$. By definition, a composite is completely determined by a unique block of $K_j$ indicators, $x_j' = (x_{j1} \dots x_{jK_j})$, $c_j = w_j' x_j$. The weights of block $j$ are included in the column vector $w_j$ of length $K_j$. Usually, each weight vector is scaled to ensure that the composites have unit variance (see also Section 3). Here, we assume that each indicator is connected to only one composite. The theoretical covariance matrix $\Sigma$ of the indicators can be expressed as a partitioned matrix as follows:

$$
\Sigma = \begin{pmatrix}
\Sigma_{11} & \Sigma_{12} & \dots & \Sigma_{1J} \\
 & \Sigma_{22} & \dots & \Sigma_{2J} \\
 & & \ddots & \vdots \\
 & & & \Sigma_{JJ}
\end{pmatrix}. \tag{1}
$$

Footnote: In general, models containing common factors and composites are also conceivable but have not been considered here.

The intra-block covariance matrix $\Sigma_{jj}$ of dimension $K_j \times K_j$ is unconstrained and captures the covariation between the indicators of block $j$; thus, this effectively allows the indicators of one block to freely covary. Moreover, it can be shown that the indicator covariance matrix is positive-definite if and only if the following two conditions hold: (i) all intra-block covariance matrices are positive-definite, and (ii) the covariance matrix of the composites is positive-definite (Dijkstra, 2015, 2017). The covariances between the indicators of block $j$ and $l$ are captured in the inter-block covariance matrix $\Sigma_{jl}$, with $j \neq l$, of dimension $K_j \times K_l$. However, in contrast to the intra-block covariance matrix, the inter-block covariance matrix is constrained, since by assumption, the composites carry all information between the blocks:

$$
\Sigma_{jl} = \rho_{jl}\, \Sigma_{jj} w_j w_l' \Sigma_{ll} = \rho_{jl}\, \lambda_j \lambda_l', \tag{2}
$$

where $\rho_{jl} = w_j' \Sigma_{jl} w_l$ equals the correlation between the composites $c_j$ and $c_l$. The vector $\lambda_j = \Sigma_{jj} w_j$ of length $K_j$ contains the composite loadings, which are defined as the covariances between the composite $c_j$ and the associated indicators $x_j$. Equation 2 is highly reminiscent of the corresponding equation where all concepts are modeled as common factors instead of composites. In a common factor model, the vector $\lambda_j$ captures the covariances between the indicators and their connected common factor, and $\rho_{jl}$ represents the correlation between common factor $j$ and $l$. Hence, both models show the rank-one structure for the covariance matrices between two indicator blocks.

Although the intra-block covariance matrices of the indicators $\Sigma_{jj}$ are not restricted, we emphasize that the composite model is still a model from the point of view of SEM. It assumes that all information between the indicators of two different blocks is conveyed by the composite(s), and therefore, it imposes rank-one restrictions on the inter-block covariance matrices of the indicators (see Equation 2). These restrictions can be exploited for testing the overall model fit (see Section 5). It is emphasized that the weights $w_j$ producing these matrices are the same across all inter-block covariance matrices $\Sigma_{jl}$ with $l = 1, \dots, J$ and $l \neq j$. Figure 2 illustrates an example of a composite model.

The artifact under investigation is modeled as the composite $c$, illustrated by a hexagon, and the observable indicators are represented by squares. The unconstrained covariance $\sigma_{12}$ between the indicators of block $x' = (x_1\; x_2)$ forming the composite is highlighted by a double-headed arrow.

The observable variables $y$ and $z$ do not form the composite. They are allowed to freely covary among each other as well as with the composite. For example, they can be regarded as antecedents or consequences of the modeled artifact.

To emphasize the difference between the composite model and the common factor model typically used in CFA, we depict the composite model as composite factor model (Dijkstra, 2013; Henseler et al., 2014). The composite factor model has the same model-implied indicator covariance matrix as the composite model, but the deduction of the model-implied covariances and the comparison to the common factor is more straightforward. Figure 3 shows the same model as Figure 2 but in terms of a composite factor representation.
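As a minimal numerical sketch of Equations 1 and 2 (not part of the original article), the following Python/NumPy snippet assembles the indicator covariance matrix of a model with two composites and checks the properties stated above: unit composite variances, the rank-one structure of the inter-block matrix, and recovery of the composite correlation as $w_1' \Sigma_{12} w_2$. The weights, the intra-block correlation of 0.5, and $\rho_{12} = 0.3$ are assumed values chosen only for illustration, and all variable names are hypothetical.

```python
import numpy as np

# Assumed illustration values (hypothetical, not estimates from data).
w1 = np.array([0.6, 0.2, 0.4])   # weights of block 1
w2 = np.array([0.4, 0.2, 0.6])   # weights of block 2
rho12 = 0.3                      # correlation between composites c_1 and c_2

# Unconstrained intra-block correlation matrices: all off-diagonals 0.5.
S11 = np.full((3, 3), 0.5) + 0.5 * np.eye(3)
S22 = S11.copy()

# Composite loadings, lambda_j = Sigma_jj w_j.
lam1 = S11 @ w1
lam2 = S22 @ w2

# Equation 2: Sigma_12 = rho_12 * lambda_1 lambda_2'.
S12 = rho12 * np.outer(lam1, lam2)

# Equation 1: assemble the partitioned indicator covariance matrix.
Sigma = np.block([[S11, S12], [S12.T, S22]])

# Checks of the properties stated in the text.
var_c1 = w1 @ S11 @ w1                      # composite variance, block 1
var_c2 = w2 @ S22 @ w2                      # composite variance, block 2
rank_S12 = np.linalg.matrix_rank(S12)       # rank of the inter-block matrix
rho_recovered = w1 @ S12 @ w2               # w_1' Sigma_12 w_2
is_pd = bool(np.all(np.linalg.eigvalsh(Sigma) > 0))

print(var_c1, var_c2, rank_S12, rho_recovered, is_pd)
```

With these assumed values both weight vectors already yield unit-variance composites, so no rescaling step is needed; for other weights one would divide each $w_j$ by $\sqrt{w_j' \Sigma_{jj} w_j}$ first.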
CCA FIGURE 1 | Two types of concepts: behavioral concepts vs. artifacts. The composite loading λ , i = 1, 2 captures the covariance between the indicator x and the composite c. In general, the error terms are included in the vector ǫ, explaining the variance of the indicators and the covariances between the indicators of one block, which are not explained by the composite factor. As the composite model does not restrict the covariances between the indicators of one block, the error terms are allowed to freely covary. The covariations among the error terms as well as their variances are captured in matrix 2. The model-implied covariance matrix of the example composite model can be displayed as follows: x x z 1 2   yy   λ σ σ 1 yc 11   6 = . (3)   λ σ λ λ + θ σ  2 yc 1 2 12 22 FIGURE 2 | Example of a composite model. σ λ σ λ σ σ yz 1 cz 2 cz zz In comparison to the same model using a common factor instead of a composite, the composite model is less restrictive as it allows 3. IDENTIFYING COMPOSITE MODELS all error terms of one block to be correlated, which leads to a more general model (Henseler et al., 2014). In fact, the common Like in SEM and CFA, model identification is an important factor model is always nested in the composite model since it uses issue in CCA. Since analysts can freely specify their models, it the same restriction as the composite model; but additionally, it needs to be ensured that the model parameters have a unique assumes that (some) covariances between the error terms of one solution (Bollen, 1989, Chap. 8). Therefore, model identification block are restricted (usually to zero). Under certain conditions, is necessary to obtain consistent parameter estimates and to it is possible to rescale the intra- and inter-block covariances of reliably interpret them (Marcoulides and Chin, 2013). 
a composite model to match those of a common factor model In general, the following three states of model identification (Dijkstra, 2013; Dijkstra and Henseler, 2015). can be distinguished: under-identified, just-identified, and Frontiers in Psychology | www.frontiersin.org 4 December 2018 | Volume 9 | Article 2541 Schuberth et al. CCA the indicator covariance matrix since there is a non-zero inter- block covariance matrix for every loading vector. Otherwise, if a composite c is isolated in the nomological network, all inter- block covariances 6 , l = 1, ..., J with l 6= j, belonging to jl this composite are of rank zero, and thus, the weights forming this composite cannot be uniquely retrieved. Although the non- isolation condition is required for identification, it also matches the idea of an artifact that is designed to fulfill a certain purpose. Without considering the artifact’s antecedents and/or consequences, the artifact’s purposefulness cannot be judged. In the following part, we give a description on how the number of degrees of freedom is counted in case of the composite model. It is given by the difference between the number of non-redundant elements of the indicator population covariance matrix 6 and the number of free parameters in the model. The number of free model parameters is given by the number of covariances among the composites, the number of covariances between composites and indicators not forming a composite, the number of covariances among indicators not forming a composite, the number of non-redundant off-diagonal elements FIGURE 3 | Example of a composite model displayed as composite factor model. of each intra-block covariance matrix, and the number of weights. Since we fix composite variances to one, one weight of each block can be expressed by the remaining ones of this block. Hence, we regain as many degrees of freedom as fixed composite over-identified. 
An under-identified model, also known as variances, i.e., as blocks in the model. Equation 4 summarizes not-identified model, offers several sets of parameters that are the way of determining the number of degrees of freedom of a consistent with the model constraints, and thus, no unique composite model. solution for the model parameters exists. Therefore, only questionable conclusions can be drawn. In contrast, a just- identified model provides a unique solution for the model df = number of non-redundant off-diagonal elements of the parameters and has the same number of free parameters as non- indicator covariance matrix redundant elements of the indicator covariance matrix (degrees − number of free correlations among the composites of freedom (df) are 0). In empirical analysis, such models − number of free covariances between the composites and cannot be used to evaluate the overall model fit since they perfectly fit the data. An over-identified model also has a unique indicators not forming a composite solution; however, it provides more non-redundant elements of − number of covariances among the indicators not forming the indicator covariance matrix than model parameters (df > 0). a composite (4) This can be exploited in empirical studies for assessing the overall model fit, as these constraints should hold for a sample within the − number of free non-redundant off-diagonal elements of limits of sampling error if the model is valid. each intra-block covariance matrix A necessary condition for ensuring identification is to − number of weights normalize each weight vector. In doing so, we assume that + number of blocks all composites are scaled to have a unit variance, w 6 w = jj j 1. Besides the scaling of the composite, each composite must To illustrate our approach to calculating the number be connected to at least one composite or one variable not of degrees of freedom, we consider the composite model forming a composite. 
As a result, at least one inter-block presented in Figure 2. As described above, the model consists covariance matrix 6 , l = 1, ..., J with l 6= j satisfies the jl of four (standardized) observable variables; thus, the indicator rank-one condition. Along with the normalization of the weight correlation matrix has six non-redundant off-diagonal elements. vectors, all model parameters can be uniquely retrieved from The number of free model parameters is counted as follows: no correlations among the composites as the models consists of only The existing literature sometimes mentions empirical (under-)identification in the one composite, two correlations between the composite and the context of model identification (Kenny, 1979). Since this expression refers to an observable variables not forming a composite (σ and σ ), one yc cz issue of estimation rather than to the issue of model identification, this topic is not correlation between the variables not forming a composite (σ ), yz discussed in the following. Another way of normalization is to fix one weight of each block to a certain value. Furthermore, we ignore trivial regularity assumptions such as weight The number of degrees of freedom can be helpful at determining whether a model vectors consisting of zeros only; and similarly, we ignore cases where intra-block is identified since an identified model has a non-negative number of degrees of covariance matrices are singular. freedom. Frontiers in Psychology | www.frontiersin.org 5 December 2018 | Volume 9 | Article 2541 Schuberth et al. CCA one non-redundant off-diagonal of the intra-block correlation dimension J × J, is a block-diagonal matrix containing the intra- matrix (σ ), and two weights (w and w ) minus one, the block correlation matrices 6 , j = 1, ..., J on its diagonal. To 12 1 2 jj number of blocks. 
As a result, we obtain the number of degrees obtain the estimates of the weights, the composites, and their of freedom as follows: df = 6 − 0 − 2 − 1 − 1 − 2 + 1 = 1. Once correlations, the population matrix 6 is replaced by its empirical identification of the composite model is ensured, in a next step counterpart S. the model can be estimated. 5. ASSESSING COMPOSITE MODELS 4. ESTIMATING COMPOSITE MODELS 5.1. Tests of Overall Model Fit The existing literature provides various ways of constructing In CFA and factor-based SEM, a test for overall model fit has composites from blocks of indicators. The most common been naturally supplied by the maximum-likelihood estimation among them are principal component analysis (PCA, Pearson, in the form of the chi-square test (Jöreskog, 1967), while maxvar 1901), linear discriminant analysis (LDA, Fisher, 1936), lacks in terms of such a test. In the light of this, we propose and (generalized) canonical correlation analysis ((G)CCA, a combination of a bootstrap procedure with several distance Hotelling, 1936; Kettenring, 1971). All these approaches seek measures to statistically test how well the assumed composite composites that “best” explain the data and can be regarded as model fits to the collected data. prescriptions for dimension reduction (Dijkstra and Henseler, The existing literature provides several measures with which to assess the discrepancy between the perfect fit and the model 2011). Further approaches are partial least squares path modeling (PLS-PM, Wold, 1975), regularized general canonical fit. In fact, every distance measure known from CFA can be used to assess the overall fit of a composite model. They all capture correlation analysis (RGCCA, Tenenhaus and Tenenhaus, 2011), and generalized structural component analysis (GSCA, the discrepancy between the sample covariance matrix S and the Hwang and Takane, 2004). 
The use of predefined weights is ˆ estimated model-implied covariance matrix 6 = 6(θ) of the also possible. indicators. In our study, we consider the following three distance We follow Dijkstra (2010) and apply GCCA in a first step measures: squared Euclidean distance (d ), geodesic distance to estimate the correlation between the composites. In the (d ), and standardized root mean square residual (SRMR). following part, we give a brief description of GCCA. The vector The squared Euclidean distance between the sample and of indicators x of length K is split up into J subvectors x , so j the estimated model-implied covariance matrix is calculated as called blocks, each of dimension (K × 1) with j = 1, . . . , J. We follows: assume that the indicators are standardized to have means of K K zero and unit variances. Moreover, each indicator is connected XX d = (s − σˆ ) , (6) L ij ij to one composite only. Hence, the correlation matrix of the i=1 j=1 indicators can be calculated as 6 = E(xx ) and the intra-block correlation matrix as 6 = E(x x ). Moreover, the correlation jj j where K is the total number of indicators, and s and σˆ are ij ij matrix of the composites c = x w is calculated as follows: j j ′ the elements of the sample and the estimated model-implied 6 = E(cc ). In general, GCCA chooses the weights to maximize covariance matrix, respectively. It is obvious that the squared the correlation between the composites. In doing so, GCCA Euclidean distance is zero for a perfectly fitting model, 6 = S. offers the following options: sumcor, maxvar, ssqcor, minvar, Moreover, the geodesic distance stemming from a class of and genvar. distance functions proposed by Swain (1975) can be used to In the following part, we use maxvar under the constraint measure the discrepancy between the sample and estimated that each composite has a unit variance, w 6 w = 1, to jj j model-implied covariance matrix. 
estimate the weights, the composites, and the resulting composite correlations. In doing so, the weights are chosen to maximize the largest eigenvalue of the composite correlation matrix. Thus, the total variation of the composites is explained as well as possible by one underlying "principal component," and the weights to form the composite $c_j$ are calculated as follows (Kettenring, 1971):

$$w_j = \Sigma_{jj}^{-1/2}\,\tilde{a}_j \Big/ \sqrt{\tilde{a}_j'\tilde{a}_j}. \tag{5}$$

The subvector $\tilde{a}_j$, of length J, corresponds to the largest eigenvalue of the matrix $\Sigma_D^{-1/2}\,\Sigma\,\Sigma_D^{-1/2}$, where the matrix $\Sigma_D$, of

(GCCA builds composites in a way that they are maximally correlated. For an overview we refer to Kettenring (1971).)

(In general, GCCA offers several composites (canonical variates); but in our study, we have focused only on the canonical variates of the first stage.)

The geodesic distance is given by the following:

$$d_G = \frac{1}{2}\sum_{i=1}^{K}\left(\log(\varphi_i)\right)^2, \tag{7}$$

where $\varphi_i$ is the i-th eigenvalue of the matrix $S^{-1}\hat{\Sigma}$ and K is the number of indicators. The geodesic distance is zero when and only when all eigenvalues equal one, i.e., when and only when the fit is perfect.

Finally, the SRMR (Hu and Bentler, 1999) can be used to assess the overall model fit. The SRMR is calculated as follows:

$$\mathrm{SRMR} = \sqrt{\left(2\sum_{i=1}^{K}\sum_{j=1}^{i}\left((s_{ij}-\hat{\sigma}_{ij})/(s_{ii}s_{jj})\right)^2\right)\Big/\left(K(K+1)\right)}, \tag{8}$$

where K is the number of indicators. It reflects the average discrepancy between the empirical and the estimated model-implied correlation matrix. Thus, for a perfectly fitting model, the SRMR is zero, as $\hat{\sigma}_{ij}$ equals $s_{ij}$.

Since all distance measures considered are functions of the sample covariance matrix, a procedure proposed by Beran and Srivastava (1985) can be used to test the overall model fit: $H_0\colon \Sigma = \Sigma(\theta)$. (This procedure is known as the Bollen-Stine bootstrap (Bollen and Stine, 1992) in the factor-based SEM literature. The model must be over-identified for this test.) The reference distribution of the distance measures as well as the critical values are obtained from the transformed sample data as follows:

$$X S^{-1/2}\,\hat{\Sigma}^{1/2}, \tag{9}$$

where the data matrix $X$ of dimension $(N \times K)$ contains the N observations of all K indicators. This transformation ensures that the new dataset satisfies the null hypothesis; i.e., the sample covariance matrix of the transformed dataset equals the estimated model-implied covariance matrix. The reference distribution of the distance measures is obtained by bootstrapping from the transformed dataset. In doing so, the estimated distance based on the original dataset can be compared to the critical value from the reference distribution (typically the empirical 95% or 99% quantile) to decide whether the null hypothesis, $H_0\colon \Sigma = \Sigma(\theta)$, is rejected (Bollen and Stine, 1992).

5.2. Fit Indices for Composite Models

In addition to the test of overall model fit, we provide some fit indices as measures of the overall model fit. In general, fit indices can indicate whether a model is misspecified by providing an absolute value of the misfit; however, we advise using them with caution as they are based on heuristic rules-of-thumb rather than statistical theory. Moreover, it is recommended to calculate the fit indices based on the indicator correlation matrix instead of the covariance matrix.

The standardized root mean square residual (SRMR) was already introduced as a measure of overall model fit (Henseler et al., 2014). As described above, it represents the average discrepancy between the sample and the model-implied indicator correlation matrix. Values below 0.10 and, following a more conservative view, below 0.08 indicate a good model fit (Hu and Bentler, 1998). However, these threshold values were proposed for common factor models and their usefulness for composite models needs to be investigated.

Furthermore, the normed fit index (NFI) is suggested as a measure of goodness of fit (Bentler and Bonett, 1980). It measures the relative discrepancy between the fit of the baseline model and the fit of the estimated model. In this context, a model where all indicators are assumed to be uncorrelated (the model-implied correlation matrix equals the unit matrix) can serve as a baseline model (Lohmöller, 1989, Chap. 2.4.4). To assess the fit of the baseline model and the estimated model, several measures can be used, e.g., the log likelihood function used in CFA or the geodesic distance. Values of the NFI close to one imply a good model fit. However, cut-off values still need to be determined.

Finally, we suggest considering the root mean square residual covariance of the outer residuals ($\mathrm{RMS}_{\mathrm{theta}}$) as a further fit index (Lohmöller, 1989). It is defined as the square root of the average residual correlations. Since the indicators of one block are allowed to be freely correlated, the residual correlations within a block should be excluded and only the residual correlations across the blocks should be taken into account during its calculation. Small values close to zero for the $\mathrm{RMS}_{\mathrm{theta}}$ indicate a good model fit. However, threshold values still need to be determined.

6. A MONTE CARLO SIMULATION

In order to assess our proposed procedure of statistically testing the overall model fit of composite models and to examine the behavior of the earlier presented discrepancy measures, we conduct a Monte Carlo simulation. In particular, we investigate the type I error rate (false positive rate) and the power, which are the most important characteristics of a statistical test. In designing the simulation, we choose a number of concepts used several times in the literature to examine the performance of fit indices and tests of overall model fit in CFA: a model containing two composites and a model containing three composites (Hu and Bentler, 1999; Heene et al., 2012). To investigate the power of the test procedure, we consider various misspecifications of these models. Figures 4 and 5 summarize the conditions investigated in our simulation study.

6.1. Model Containing Two Composites

All models containing two composites are estimated using the specification illustrated in the last column of Figure 4. The indicators $x_{11}$ to $x_{13}$ are specified to build composite $c_1$, while the remaining three indicators build composite $c_2$. Moreover, the composites are allowed to freely correlate. The parameters of interest are the correlation between the two composites, and the weights, $w_{11}$ to $w_{23}$. As column "Population model" of Figure 4 shows, we consider three types of population models with two composites.

6.1.1. Condition 1: No Misspecification

First, in order to examine whether the rejection rates of the test procedure are close to the predefined significance level in cases in which the null hypothesis is true, a population model is considered that has the same structure as the specified model. The correlation between the two composites is set to $\rho = 0.3$ and the composites are formed by their connected standardized indicators as follows: $c_i = x_i' w_i$ with $i = 1, 2$, where $w_1' = (0.6 \;\; 0.2 \;\; 0.4)$ and $w_2' = (0.4 \;\; 0.2 \;\; 0.6)$. All correlations between the indicators of one block are set to 0.5, which leads to the population correlation matrix given in Figure 4.

6.1.2. Condition 2: Confounded Indicators

The second condition is used to investigate whether the test procedure is capable of detecting misspecified models.
The second condition presents a situation where the researcher falsely assigns two indicators to the wrong constructs. The correlation between the two composites and the weights are the same as in population model 1: $\rho = 0.3$, $w_1' = (0.6 \;\; 0.2 \;\; 0.4)$, and $w_2' = (0.4 \;\; 0.2 \;\; 0.6)$. However, in contrast to population model 1, the indicators $x_{13}$ and $x_{21}$ are interchanged. Moreover, the correlations among all indicators of one block are 0.5. The population correlation matrix of the second model is presented in Figure 4.

FIGURE 4 | Simulation design for the model containing two composites.

FIGURE 5 | Simulation design for the model containing three composites.

6.1.3. Condition 3: Unexplained Correlation

The third condition is chosen to further investigate the capabilities of the test procedure to detect misspecified models. It shows a situation where the correlation between the two indicators $x_{13}$ and $x_{21}$ is not fully explained by the two composites. (The model-implied correlation between the two indicators is calculated as follows: $0.8 \cdot 0.3 \cdot 0.8 \neq 0.5$.) As in the two previously presented population models, the two composites have a correlation of $\rho = 0.3$. The correlations among the indicators of one block are set to 0.5, and the weights for the construction of the composites are set to $w_1' = (0.6 \;\; 0.2 \;\; 0.4)$ and $w_2' = (0.4 \;\; 0.2 \;\; 0.6)$. The population correlation matrix of the indicators is presented in Figure 4.

6.2. Model Containing Three Composites

Furthermore, we investigate a more complex model consisting of three composites. Again, each composite is formed by three indicators, and the composites are allowed to freely covary. The column "Estimated model" of Figure 5 illustrates the specification to be estimated in case of three composites. We assume that the composites are built as follows: $c_1 = x_1' w_1$, $c_2 = x_2' w_2$, and $c_3 = x_3' w_3$. Again, we examine two different population models.

6.2.1. Condition 4: No Misspecification

The fourth condition is used to further investigate whether the rejection rates of the test procedure are close to the predefined significance level in cases in which the null hypothesis is true. Hence, the structure of the fourth population model matches the specified model. All composites are assumed to be freely correlated. In the population, the composite correlations are set to $\rho_{12} = 0.3$, $\rho_{13} = 0.5$, and $\rho_{23} = 0.4$. Each composite is built by three indicators using the following population weights: $w_1' = (0.6 \;\; 0.4 \;\; 0.2)$, $w_2' = (0.3 \;\; 0.5 \;\; 0.6)$, and $w_3' = (0.4 \;\; 0.5 \;\; 0.5)$. The indicator correlations of each block can be read from Figure 5. The indicator correlation matrix of population model 4 is given in Figure 5.

6.2.2. Condition 5: Unexplained Correlation

In the fifth condition, we investigate a situation where the correlation between two indicators is not fully explained by the underlying composites, similar to what is observed in Condition 3. Consequently, population model 5 does not match the model to be estimated and is used to investigate the power of the overall model test. It equals population model 4 with the exception that the correlation between the indicators $x_{13}$ and $x_{21}$ is only partly explained by the composites. Since the original correlation between these indicators is 0.084, a correlation of 0.25 presents only a weak violation. The remaining model stays untouched. The population correlation matrix is illustrated in Figure 5.

6.3. Further Simulation Conditions and Expectations

To assess the quality of the proposed test of the overall model fit, we generate 10,000 standardized samples from the multivariate normal distribution having zero means and a covariance matrix according to the respective population model. Moreover, we vary the sample size from 50 to 1,450 observations (with increments of 100) and the significance level $\alpha$ from 1% to 10%. To obtain the reference distribution of the discrepancy measures considered, 200 bootstrap samples are drawn from the transformed and standardized dataset. Each dataset is used in the maxvar procedure to estimate the model parameters.

All simulations are conducted in the statistical programming environment R (R Core Team, 2016). The samples are drawn from the multivariate normal distribution using the mvrnorm function of the MASS package (Venables and Ripley, 2002). The results for the test of overall model fit are obtained by user-written functions and the matrixpls package (Rönkkö, 2016). (These functions are provided by the contact author upon request.)

Since population models 1 and 4 fit the respective specification, we expect rejection rates close to the predefined levels of significance $\alpha$. Additionally, we expect that for an increasing sample size, the predefined significance level is kept with more precision. For population models 2, 3, and 5, much larger rejection rates are expected as these population models do not match the respective specification. Moreover, we expect that the power of the test to detect misspecifications would increase along with a larger sample size. Regarding the different discrepancy measures, we have no expectations, only that the squared Euclidean distance and the SRMR should lead to identical results. For standardized datasets, the only difference is a constant factor that does not affect the order of the observations in the reference distribution and, therefore, does not affect the decision about the null hypothesis.

6.4. Results

Figure 6 illustrates the rejection rates for population model 1, i.e., no misspecification. Besides the rejection rates, the figure also depicts the 95% confidence intervals (shaded area) constructed around the rejection rates to clarify whether a rejection rate is significantly different from the predefined significance level. (The limits of the 95% confidence interval are calculated as $\hat{p} \pm \Phi^{-1}(0.975)\sqrt{\hat{p}(1-\hat{p})/10000}$, where $\hat{p}$ represents the rejection rate and $\Phi^{-1}()$ is the quantile function of the standard normal distribution.)

FIGURE 6 | Rejection rates for population model 1.

First, as expected, the squared Euclidean distance ($d_L$) as well as the SRMR lead to identical results. The test using the squared Euclidean distance and the SRMR rejects the model somewhat too rarely in case of $\alpha = 10\%$ and $\alpha = 5\%$ respectively; however, for an increasing sample size, the rejection rates converge to the predefined significance level without reaching it. For the 1% significance level, a similar picture is observed; however, for larger sample sizes, the significance level is retained more often compared to the larger significance levels. In contrast, the test using the geodesic distance mostly rejects the model too often for the 5% and 10% significance level. However, the obtained rejection rates are less often significantly different from the predefined significance level compared to the same situation where the SRMR or the Euclidean distance is used. In case of $\alpha = 1\%$ and sample sizes larger than n = 100, the test using the geodesic distance rejects the model significantly too often.

FIGURE 7 | Rejection rates for population model 2 and 3.

Figure 7 displays the rejection rates for population models 2 and 3. The horizontal line at 80% depicts the commonly recommended power for a statistical test (Cohen, 1988). For the two cases where the specification does not match the underlying data generating process, the test using the squared Euclidean distance as well as the SRMR has more power than the test using the geodesic distance, i.e., the test using the former discrepancy measures rejects the wrong model more often. For model 2 (confounded indicators) the test produces higher or equal rejection rates compared to model 3 (unexplained correlation). Furthermore, as expected, the power decreases for an increasing level of significance and increases with increasing sample sizes.

FIGURE 8 | Rejection rates for population model 4 and 5.

Figure 8 depicts the rejection rates for population models 4 and 5. Again, the 95% confidence intervals are illustrated for population model 4 (shaded area) matching the specification estimated. Considering population model 4, which matches the estimated model, the test leads to similar results for all three discrepancy measures. However, the rejection rate of the test using the geodesic distance converges faster to the predefined significance level, i.e., for smaller sample sizes n ≥ 100. Again, among the three discrepancy measures considered, the geodesic distance performs best in terms of keeping the significance level.

As the extent of misspecification in population model 5 is minor, the test struggles to detect the model misspecification up to sample sizes n = 350, regardless of the discrepancy measure used. However, for sample sizes larger than 350 observations, the test detects the model misspecification satisfactorily. For sample sizes larger than 1,050 observations, the misspecification was identified in almost all cases regardless of the significance level and the discrepancy measure used. Again, this confirms the anticipated relationship between sample size and statistical power.

7. DISCUSSION

We introduced the confirmatory composite analysis (CCA) as a full-fledged technique for confirmatory purposes that employs composites to model artifacts, i.e., design concepts. It overcomes current limitations in CFA and SEM and carries the spirit of CFA and SEM to research domains studying artifacts. Its application is appropriate in situations where the research goal is to examine whether an artifact is useful rather than to establish whether a certain concept exists. It follows the same steps usually applied in SEM and enables researchers to analyze a variety of situations, in particular, beyond the realm of social and behavioral sciences. Hence, CCA allows for dealing with research questions that could not be appropriately dealt with yet in the framework of CFA or more generally in SEM.

The results of the Monte Carlo simulation confirmed that CCA can be used for confirmatory purposes. They revealed that the bootstrap-based test, in combination with different discrepancy measures, can be used to statistically assess the overall model fit of the composite model. For specifications matching the population model, the rejection rates were in the acceptable range, i.e., close to the predefined significance level. Moreover, the results of the power analysis showed that the bootstrap-based test can reliably detect misspecified models. However, caution is needed in case of small sample sizes where the rejection rates were low, which means that misspecified models were not reliably detected.

In future research, the usefulness of the composite model in empirical studies needs to be examined, accompanied and enhanced by simulation studies. In particular, the extensions outlined by Dijkstra (2017), to wit, interdependent systems of equations for the composites estimated by classical econometric methods (like 2SLS and three-stage least squares), warrant further analysis and scrutiny. Robustness with respect to non-normality and misspecification also appear to be relevant research topics. Additionally, devising ways to efficiently predict indicators and composites might be of particular interest (see for example the work by Shmueli et al., 2016).

Moreover, to contribute to the confirmatory character of CCA, we recommend further study of the performance and limitations of the proposed test procedure: consider more misspecifications and the ability of the test to reliably detect them, find further discrepancy measures and examine their performance, and investigate the behavior of the test under the violation of the normality assumption, similar to what Nevitt and Hancock (2001) did for CFA. Finally, cut-off values for the fit indices need to be determined for CCA.

AUTHOR CONTRIBUTIONS

FS conducted the literature review and wrote the majority of the paper (contribution: ca. 50%). JH initiated this paper and designed the simulation study (contribution: ca. 25%). TD proposed the composite model and developed the model fit test (contribution: ca. 25%).

SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2018.02541/full#supplementary-material

REFERENCES

Bagozzi, R. P. (1994). "Structural equation models in marketing research: basic principles," in Principles of Marketing Research eds R. P. Bagozzi (Oxford: Blackwell), 317–385.
Bagozzi, R. P., and Yi, Y. (1988). On the evaluation of structural equation models. J. Acad. Market. Sci. 16, 74–94. doi: 10.1007/BF02723327
Fornell, C., and Bookstein, F. L. (1982). Two structural equation models: LISREL and PLS applied to consumer exit-voice theory. J. Market. Res. 19, 440–452. doi: 10.2307/3151718
Gefen, D., Straub, D. W., and Rigdon, E. E. (2011). An update and extension to SEM guidelines for administrative and social science research. MIS Quart. 35, iii–xiv. doi: 10.2307/23044042
Grace, J. B., Anderson, T. M., Olff, H., and Scheiner, S. M. (2010).
On the specification of structural equation models for ecological systems. Ecol. Monogr. 80, 67–87. doi: 10.1890/09-0464.1
Baskerville, R., and Pries-Heje, J. (2010). Explanatory design theory. Busin. Inform. Syst. Eng. 2, 271–282. doi: 10.1007/s12599-010-0118-4
Bentler, P. M., and Bonett, D. G. (1980). Significance tests and goodness of fit in the analysis of covariance structures. Psychol. Bull. 88, 588–606. doi: 10.1037/0033-2909.88.3.588
Beran, R., and Srivastava, M. S. (1985). Bootstrap tests and confidence regions for functions of a covariance matrix. Ann. Statist. 13, 95–115. doi: 10.1214/aos/1176346579
Bollen, K. A. (1989). Structural Equations with Latent Variables. New York, NY: John Wiley & Sons Inc.
Bollen, K. A. (2001). "Two-stage least squares and latent variable models: Simultaneous estimation and robustness to misspecifications," in Structural Equation Modeling: Present and Future, A Festschrift in Honor of Karl Jöreskog eds R. Cudeck, S. Du Toit, and D. Sörbom (Chicago: Scientific Software International), 119–138.
Bollen, K. A., and Stine, R. A. (1992). Bootstrapping goodness-of-fit measures in structural equation models. Sociol. Methods Res. 21, 205–229. doi: 10.1177/0049124192021002004
Borden, N. H. (1964). The concept of the marketing mix. J. Advert. Res. 4, 2–7.
Brown, T. A. (2015). Confirmatory Factor Analysis for Applied Research. New York, NY: Guilford Press.
Browne, M. W. (1984). Asymptotically distribution-free methods for the analysis of covariance structures. Br. J. Math. Statist. Psychol. 37, 62–83. doi: 10.1111/j.2044-8317.1984.tb00789.x
Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences, 2nd Edn. Hillsdale, NJ: Lawrence Erlbaum Associates.
Crowley, D. M. (2013). Building efficient crime prevention strategies. Criminol. Public Policy 12, 353–366. doi: 10.1111/1745-9133.12041
Diamantopoulos, A. (2008). Formative indicators: introduction to the special issue. J. Busin. Res. 61, 1201–1202. doi: 10.1016/j.jbusres.2008.01.008
Diamantopoulos, A., and Winklhofer, H. M. (2001). Index construction with formative indicators: an alternative to scale development. J. Market. Res. 38, 269–277. doi: 10.1509/jmkr.38.2.269.18845
Dijkstra, T. K. (2010). "Latent variables and indices: Herman Wold's basic design and partial least squares," in Handbook of Partial Least Squares (Berlin: Springer), 23–46.
Dijkstra, T. K. (2013). "Composites as factors: Canonical variables revisited," in Working Paper. Groningen. Available online at: https://www.rug.nl/staff/t.k.dijkstra/composites-as-factors.pdf
Dijkstra, T. K. (2015). "All-inclusive versus single block composites," in Working Paper. Groningen. Available online at: https://www.researchgate.net/profile/Theo_Dijkstra/publication/281443431_all-inclusive_and_single_block_composites/links/55e7509208ae65b63899564f/all-inclusive-and-single-block-composites.pdf
Dijkstra, T. K. (2017). "A perfect match between a model and a mode," in Partial Least Squares Path Modeling, eds H. Latan, and R. Noonan (Cham: Springer), 55–80.
Dijkstra, T. K., and Henseler, J. (2011). Linear indices in nonlinear structural equation models: best fitting proper indices and other composites. Qual. Quant. 45, 1505–1518. doi: 10.1007/s11135-010-9359-z
Dijkstra, T. K., and Henseler, J. (2015). Consistent and asymptotically normal PLS estimators for linear structural equations. Computat. Statist. Data Anal. 81, 10–23. doi: 10.1016/j.csda.2014.07.008
Fisher, R. A. (1936). The use of multiple measurements in taxonomic problems. Ann. Eugen. 7, 179–188. doi: 10.1111/j.1469-1809.1936.tb02137.x
Fong, C. J., Davis, C. W., Kim, Y., Kim, Y. W., Marriott, L., and Kim, S. (2016). Psychosocial factors and community college student success. Rev. Educ. Res. 87, 388–424. doi: 10.3102/0034654316653479
Grace, J. B., and Bollen, K. A. (2008). Representing general theoretical concepts in structural equation models: the role of composite variables. Environ. Ecol. Statist. 15, 191–213. doi: 10.1007/s10651-007-0047-7
Hayduk, L. A. (1988). Structural Equation Modeling With LISREL: Essentials and Advances. Baltimore, MD: Johns Hopkins University Press.
Heene, M., Hilbert, S., Freudenthaler, H. H., and Buehner, M. (2012). Sensitivity of SEM fit indexes with respect to violations of uncorrelated errors. Struct. Equat. Model. 19, 36–50. doi: 10.1080/10705511.2012.634710
Henseler, J. (2017). Bridging design and behavioral research with variance-based structural equation modeling. J. Advert. 46, 178–192. doi: 10.1080/00913367.2017.1281780
Henseler, J., Dijkstra, T. K., Sarstedt, M., Ringle, C. M., Diamantopoulos, A., Straub, D. W., et al. (2014). Common beliefs and reality about PLS: comments on Rönkkö and Evermann (2013). Organ. Res. Methods 17, 182–209. doi: 10.1177/1094428114526928
Holbert, R. L., and Stephenson, M. T. (2002). Structural equation modeling in the communication sciences, 1995–2000. Hum. Commun. Res. 28, 531–551. doi: 10.1111/j.1468-2958.2002.tb00822.x
Horvath, I. (2004). A treatise on order in engineering design research. Res. Eng. Design 15, 155–181. doi: 10.1007/s00163-004-0052-x
Hotelling, H. (1936). Relations between two sets of variates. Biometrika 28, 321–377. doi: 10.1093/biomet/28.3-4.321
Hu, L., and Bentler, P. M. (1998). Fit indices in covariance structure modeling: sensitivity to underparameterized model misspecification. Psychol. Methods 3, 424–453. doi: 10.1037/1082-989X.3.4.424
Hu, L., and Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: conventional criteria versus new alternatives. Struc. Equat. Model. 6, 1–55. doi: 10.1080/10705519909540118
Hwang, H., and Takane, Y. (2004). Generalized structured component analysis. Psychometrika 69, 81–99. doi: 10.1007/BF02295841
Jöreskog, K. G. (1967). Some contributions to maximum likelihood factor analysis. Psychometrika 32, 443–482. doi: 10.1007/BF02289658
Keller, H. (2006). The SCREEN I (seniors in the community: risk evaluation for eating and nutrition) index adequately represents nutritional risk. J. Clin. Epidemiol. 59, 836–841. doi: 10.1016/j.jclinepi.2005.06.013
Kenny, D. A. (1979). Correlation and Causality. Hoboken, NJ: John Wiley & Sons Inc.
Kettenring, J. R. (1971). Canonical analysis of several sets of variables. Biometrika 58, 433–451. doi: 10.1093/biomet/58.3.433
Kirmayer, L. J., and Crafa, D. (2014). What kind of science for psychiatry? Front. Hum. Neurosci. 8:435. doi: 10.3389/fnhum.2014.00435
Klein, A., and Moosbrugger, H. (2000). Maximum likelihood estimation of latent interaction effects with the LMS method. Psychometrika 65, 457–474. doi: 10.1007/BF02296338
Kline, R. B. (2015). Principles and Practice of Structural Equation Modeling. New York, NY: Guilford Press.
Lee, H.-J. (2005). Developing a professional development program model based on teachers' needs. Profess. Educ. 27, 39–49.
Little, T. D. (2013). Longitudinal Structural Equation Modeling. New York, NY: Guilford Press.
Lohmöller, J.-B. (1989). Latent Variable Path Modeling with Partial Least Squares. Heidelberg: Physica.
Lussier, P., LeBlanc, M., and Proulx, J. (2005). The generality of criminal behavior: a confirmatory factor analysis of the criminal activity of sex offenders in adulthood. J. Crim. Just. 33, 177–189. doi: 10.1016/j.jcrimjus.2004.12.009
MacCallum, R. C., and Austin, J. T. (2000). Applications of structural equation modeling in psychological research. Ann. Rev. Psychol. 51, 201–226. doi: 10.1146/annurev.psych.51.1.201
MacCallum, R. C., and Browne, M. W. (1993). The use of causal indicators in covariance structure models: Some practical issues. Psychol. Bull. 114, 533–541. doi: 10.1037/0033-2909.114.3.533
Malaeb, Z. A., Summers, J. K., and Pugesek, B. H. (2000). Using structural equation modeling to investigate relationships among ecological variables. Environ. Ecol. Statist. 7, 93–111. doi: 10.1023/A:1009662930292
Marcoulides, G. A., and Chin, W. W. (2013). "You write, but others read: common methodological misunderstandings in PLS and related methods," in New Perspectives in Partial Least Squares and Related Methods, eds H. Abdi, V. E. Vinzi, G. Russolillo, and L. Trinchera (New York, NY: Springer), 31–64.
Marcoulides, G. A., and Schumacker, R. E., editors (2001). New Developments and Techniques in Structural Equation Modeling. Mahwah, NJ: Lawrence Erlbaum Associates.
Markus, K. A., and Borsboom, D. (2013). Frontiers of Test Validity Theory: Measurement, Causation, and Meaning. New York, NY: Routledge.
McIntosh, A., and Gonzalez-Lima, F. (1994). Structural equation modeling and its application to network analysis in functional brain imaging. Hum. Brain Mapp. 2, 2–22.
Møller, C., Brandt, C. J., and Carugati, A. (2012). "Deliberately by design, or? Enterprise architecture transformation at Arla Foods," in Advances in Enterprise Information Systems II, eds C. Møller, and S. Chaudhry (Boca Raton, FL: CRC Press), 91–104.
Muthén, B. O. (1984). A general structural equation model with dichotomous, ordered categorical, and continuous latent variable indicators. Psychometrika 49, 115–132.
Muthén, B. O. (2002). Beyond SEM: general latent variable modeling. Behaviormetrika 29, 81–117. doi: 10.2333/bhmk.29.81
Nevitt, J., and Hancock, G. R. (2001). Performance of bootstrapping approaches to model test statistics and parameter standard error estimation in structural equation modeling. Struc. Equat. Model. 8, 353–377. doi: 10.1207/S15328007SEM0803_2
Pearson, K. (1901). On lines and planes of closest fit to systems of points in space. Philos. Magazine 6 2, 559–572. doi: 10.1080/14786440109462720
R Core Team (2016). R: A Language and Environment for Statistical Computing. Version 3.3.1. Vienna: R Foundation for Statistical Computing.
Raykov, T., and Marcoulides, G. A. (2006). A First Course in Structural Equation Modeling, 2nd Edn. Mahwah, NJ: Lawrence Erlbaum Associates.
Reichenbach, H. (1956). The Direction of Time. Berkeley, CA: University of California Press.
Rigdon, E. E. (2012). Rethinking partial least squares path modeling: in praise of simple methods. Long Range Plan. 45, 341–358. doi: 10.1016/j.lrp.2012.09.010
Rönkkö, M. (2016). matrixpls: Matrix-based Partial Least Squares Estimation. R package version 1.0.0. Available online at: https://cran.r-project.org/web/packages/matrixpls/vignettes/matrixpls-intro.pdf
Rossiter, J. R. (2002). The C-OAR-SE procedure for scale development in marketing. Int. J. Res. Market. 19, 305–335. doi: 10.1016/S0167-8116(02)00097-6
Schumacker, R. E., and Lomax, R. G. (2009). A Beginner's Guide to Structural Equation Modeling, 3rd Edn. New York, NY: Routledge.
Shah, R., and Goldstein, S. M. (2006). Use of structural equation modeling in operations management research: looking back and forward. J. Operat. Manag. 24, 148–169. doi: 10.1016/j.jom.2005.05.001
Shmueli, G., Ray, S., Estrada, J. M. V., and Chatla, S. B. (2016). The elephant in the room: Predictive performance of PLS models. J. Busin. Res. 69, 4552–4564. doi: 10.1016/j.jbusres.2016.03.049
Simon, H. (1969). The Sciences of the Artificial. Cambridge: MIT Press.
Sobel, M. E. (1997). "Measurement, causation and local independence in latent variable models," in Latent Variable Modeling and Applications to Causality, ed M. Berkane (New York, NY: Springer), 11–28.
Spears, N., and Singh, S. N. (2004). Measuring attitude toward the brand and purchase intentions. J. Curr. Iss. Res. Advert. 26, 53–66. doi: 10.1080/10641734.2004.10505164
Steenkamp, J.-B. E., and Baumgartner, H. (2000). On the use of structural equation models for marketing modeling. Int. J. Res. Market. 17, 195–202. doi: 10.1016/S0167-8116(00)00016-1
Swain, A. (1975). A class of factor analysis estimation procedures with common asymptotic sampling properties. Psychometrika 40, 315–335. doi: 10.1007/BF02291761
Tenenhaus, A., and Tenenhaus, M. (2011). Regularized generalized canonical correlation analysis. Psychometrika 76, 257–284. doi: 10.1007/s11336-011-9206-8
Van de Schoot, R., Lugtig, P., and Hox, J. (2012). A checklist for testing measurement invariance. Eur. J. Develop. Psychol. 9, 486–492. doi: 10.1080/17405629.2012.686740
Vance, A., Benjamin Lowry, P., and Eggett, D. (2015). Increasing accountability through user-interface design artifacts: a new approach to addressing the problem of access-policy violations. MIS Quart. 39, 345–366. doi: 10.25300/MISQ/2015/39.2.04
Venables, W. N., and Ripley, B. D. (2002). Modern Applied Statistics With S, 4th Edn. New York, NY: Springer.
Venkatesh, V., Morris, M. G., Davis, G. B., and Davis, F. D. (2003). User acceptance of information technology: toward a unified view. MIS Quart. 27:425. doi: 10.2307/30036540
Wight, D., Wimbush, E., Jepson, R., and Doi, L. (2015). Six steps in quality intervention development (6SQuID). J. Epidemiol. Commun. Health 70, 520–525. doi: 10.1136/jech-2015-205952
Wold, H. (1975). "Path models with latent variables: The NIPALS approach," in Quantitative Sociology, eds H. Blalock, A. Aganbegian, F. Borodkin, R. Boudon, and V. Capecchi (New York, NY: Academic Press), 307–357.
Xiong, B., Skitmore, M., and Xia, B. (2015). A critical review of structural equation modeling applications in construction research. Automat. Construct. 49 (Pt A), 59–70. doi: 10.1016/j.autcon.2014.09.006

Conflict of Interest Statement: JH acknowledges a financial interest in ADANCO and its distributor, Composite Modeling.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Schuberth, Henseler and Dijkstra. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

Frontiers in Psychology | www.frontiersin.org | December 2018 | Volume 9 | Article 2541

Journal

Frontiers in Psychology

Published: Dec 13, 2018
