M. Heene, H. Freudenthaler, M. Bühner (2012)
Sensitivity of SEM Fit Indexes With Respect to Violations of Uncorrelated Errors. Structural Equation Modeling: A Multidisciplinary Journal, 19
Gefen, Rigdon, Straub (2011)
Editor's Comments: An Update and Extension to SEM Guidelines for Administrative and Social Science Research. Management Information Systems Quarterly, 35
H. Simon (1970)
The Sciences of the Artificial
Charles Møller, C. Brandt, A. Carugati (2012)
Deliberately by Design, Or?: Enterprise Architecture Transformation at Arla Foods
R. Fisher (1936)
The Use of Multiple Measurements in Taxonomic Problems. Annals of Human Genetics, 7
Li-tze Hu, P. Bentler (1999)
Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6
P. Lussier, M. Leblanc, J. Proulx (2005)
The generality of criminal behavior: A confirmatory factor analysis of the criminal activity of sex offenders in adulthood. Journal of Criminal Justice, 33
Bo Xiong, M. Skitmore, Bo Xia (2015)
A critical review of structural equation modeling applications in construction research. Science & Engineering Faculty
T. Dijkstra (2013)
Composites as Factors: canonical variables revisited. Working paper
Jie Gong, Charles Møller (2012)
Towards a Toolbox for a Process Innovation Laboratory
G. Freytag (1964)
[Correlation and Causality]. Psychiatrie, Neurologie, und medizinische Psychologie, 16
H. Hotelling (1936)
Relations Between Two Sets of Variates. Biometrika, 28
K. Jöreskog (1967)
Some contributions to maximum likelihood factor analysis. Psychometrika, 32
D. Wight, E. Wimbush, R. Jepson, L. Doi (2014)
Six steps in quality intervention development (6SQuID). Journal of Epidemiology and Community Health, 70
T. K. Dijkstra (2013)
Composites as factors: Canonical variables revisited. Working Paper
H. Wold (1975)
Path Models with Latent Variables: The NIPALS Approach
T. Raykov, G. Marcoulides (2000)
A First Course in Structural Equation Modeling
Andreas Klein, H. Moosbrugger (2000)
Maximum likelihood estimation of latent interaction effects with the LMS method. Psychometrika, 65
Charles Møller, S. Chaudhry (2012)
Advances in Enterprise Information Systems II
Bagozzi (1988)
On the evaluation of structural equation models. J. Acad. Market. Sci., 16
A. Diamantopoulos, H. Winklhofer (2001)
Index Construction with Formative Indicators: An Alternative to Scale Development. Journal of Marketing Research, 38
K. Bollen, R. Stine (1992)
Bootstrapping Goodness-of-Fit Measures in Structural Equation Models. Sociological Methods & Research, 21
Bagozzi (1994)
Structural equation models in marketing research: basic principles
Phil Wood (2008)
Confirmatory Factor Analysis for Applied Research. The American Statistician, 62
J. Lohmöller (1989)
Latent Variable Path Modeling with Partial Least Squares
A. Swain (1975)
A class of factor analysis estimation procedures with common asymptotic sampling properties. Psychometrika, 40
A. McIntosh, F. Gonzalez-Lima (1994)
Structural equation modeling and its application to network analysis in functional brain imaging. Human Brain Mapping, 2
D. Crowley (2013)
Building Efficient Crime Prevention Strategies: Considering the Economics of Investing in Human Development. Criminology & Public Policy, 12(2)
J. Grace, T. Anderson, H. Olff, S. Scheiner (2010)
On the specification of structural equation models for ecological systems. Ecological Monographs, 80
A. Tenenhaus, M. Tenenhaus (2011)
Regularized Generalized Canonical Correlation Analysis. Psychometrika, 76
A. Diamantopoulos (2008)
Formative indicators: Introduction to the special issue. Journal of Business Research, 61
V. Venkatesh, Michael Morris, G. Davis, Fred Davis (2003)
User Acceptance of Information Technology: Toward a Unified View. Institutions & Transition Economics: Microeconomic Issues eJournal
B. Muthén (2002)
Beyond SEM: General Latent Variable Modeling. Behaviormetrika, 29
Laurence Kirmayer, D. Crafa (2014)
What kind of science for psychiatry? Frontiers in Human Neuroscience, 8
Jon Kettenring (1971)
Canonical analysis of several sets of variables. Biometrika, 58
Edward Rigdon (2012)
Rethinking Partial Least Squares Path Modeling: In Praise of Simple Methods. Long Range Planning, 45
M. Sobel (1997)
Measurement, Causation and Local Independence in Latent Variable Models
Ziad Malaeb, J. Summers, B. Pugesek (2000)
Using structural equation modeling to investigate relationships among ecological variables. Environmental and Ecological Statistics, 7
T. Dijkstra (2010)
Latent Variables and Indices: Herman Wold's Basic Design and Partial Least Squares
T. Little (2013)
Longitudinal Structural Equation Modeling
B. Muthén (1984)
A general structural equation model with dichotomous, ordered categorical, and continuous latent variable indicators. Psychometrika, 49
(2017)
Estimating hierarchical constructs using Partial Least Squares: the case of second order composites of factors
J. Grace, K. Bollen (2008)
Representing general theoretical concepts in structural equation models: the role of composite variables. Environmental and Ecological Statistics, 15
Chong-sun Kim (1973)
Canonical Analysis of Several Sets of Variables
Li-tze Hu, P. Bentler (1998)
Fit indices in covariance structure modeling: Sensitivity to underparameterized model misspecification. Psychological Methods, 3
D. Finkelstein (2005)
A Beginner's Guide to Structural Equation Modeling. Technometrics, 47
J. Nevitt, G. Hancock (2001)
Performance of Bootstrapping Approaches to Model Test Statistics and Parameter Standard Error Estimation in Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 8
C. Fornell, F. Bookstein (1982)
Two Structural Equation Models: LISREL and PLS Applied to Consumer Exit-Voice Theory. Journal of Marketing Research, 19
T. Dijkstra, J. Henseler (2015)
Consistent and asymptotically normal PLS estimators for linear structural equations. Comput. Stat. Data Anal., 81
Hea-Jin Lee (2005)
Developing a Professional Development Program Model Based on Teachers' Needs. The Professional Educator, 27
(2001)
Two-stage least squares and latent variable models: Simultaneous estimation and robustness to misspecifications
J. Steenkamp, H. Baumgartner (2000)
On the use of structural equation models for marketing modeling. International Journal of Research in Marketing, 17
R. Baskerville, J. Pries-Heje (2010)
Explanatory Design Theory. Business & Information Systems Engineering, 2
O. Penrose (1962)
The Direction of Time, 79
J. Henseler, C. Ringle, M. Sarstedt (2016)
Testing measurement invariance of composites using partial least squares. International Marketing Review, 33
G. Marcoulides, R. Schumacker (2001)
New developments and techniques in structural equation modeling
(2016)
matrixpls: Matrix-based Partial Least Squares Estimation
R. Schoot, P. Lugtig, J. Hox (2012)
A checklist for testing measurement invariance. European Journal of Developmental Psychology, 9
K. Pearson (1901)
LIII. On lines and planes of closest fit to systems of points in space. Philosophical Magazine Series 1, 2
P. Bentler, D. Bonett (1980)
Significance Tests and Goodness of Fit in the Analysis of Covariance Structures. Psychological Bulletin, 88
L. Hayduk (1987)
Structural equation modeling with LISREL: essentials and advances. Social Forces, 14
M. Browne (1984)
Asymptotically distribution-free methods for the analysis of covariance structures. The British Journal of Mathematical and Statistical Psychology, 37(Pt 1)
J. Henseler (2017)
Bridging Design and Behavioral Research With Variance-Based Structural Equation Modeling. Journal of Advertising, 46
R. Holbert, M. Stephenson, Stephenson, Robert Hauser, R. Hoyle (2002)
Structural Equation Modeling in the Communication Sciences, 1995–2000. Human Communication Research, 28
G. Marcoulides, Wynne Chin (2013)
You Write, but Others Read: Common Methodological Misunderstandings in PLS and Related Methods
R. Kline (1998)
Principles and Practice of Structural Equation Modeling
R. Maccallum, M. Browne (1993)
The use of causal indicators in covariance structure models: some practical issues. Psychological Bulletin, 114(3)
N. Borden (1964)
The Concept of the Marketing Mix
Rachna Shah, S. Goldstein (2006)
Use of structural equation modeling in operations management research: Looking back and forward. Journal of Operations Management, 24
Nancy Spears, Surendra Singh (2004)
Measuring Attitude toward the Brand and Purchase Intentions. Journal of Current Issues & Research in Advertising, 26
T. Dijkstra (2017)
A Perfect Match Between a Model and a Mode
(2015)
All-inclusive versus single block composites
T. Dijkstra, J. Henseler (2011)
Linear indices in nonlinear structural equation models: best fitting proper indices and other composites. Quality & Quantity, 45
Carlton Fong, Coreen Davis, Yughi Kim, Young Kim, L. Marriott, Sooyeon Kim (2017)
Psychosocial Factors and Community College Student Success. Review of Educational Research, 87
J. Henseler, T. Dijkstra, M. Sarstedt, C. Ringle, A. Diamantopoulos, D. Straub, D. Ketchen, Joseph Hair, G. Hult, R. Calantone (2014)
Common Beliefs and Reality About PLS. Organizational Research Methods, 17
(2018)
Dealing with hierarchical models containing composites of composites using PLS path modeling. Working Paper
Heungsun Hwang, Y. Takane (2004)
Generalized structured component analysis. Psychometrika, 69
Keith Markus, D. Borsboom (2013)
Frontiers of Test Validity Theory: Measurement, Causation, and Meaning
A. Ritter (2016)
Structural Equations With Latent Variables
R. Team (2014)
R: A language and environment for statistical computing. MSOR Connections, 1
H. Keller (2006)
The SCREEN I (Seniors in the Community: Risk Evaluation for Eating and Nutrition) index adequately represents nutritional risk. Journal of Clinical Epidemiology, 59(8)
Alan Sawyer, T. Page (1984)
The use of incremental goodness of fit indices in structural equation models in marketing research. Journal of Business Research, 12
R. Bagozzi, Youjae Yi (1988)
On the evaluation of structural equation models. Journal of the Academy of Marketing Science, 16
I. Horváth (2004)
A treatise on order in engineering design research. Research in Engineering Design, 15
J. Rossiter (2002)
The C-OAR-SE procedure for scale development in marketing. International Journal of Research in Marketing, 19
G. Shmueli, Soumya Ray, Juan Estrada, S. Chatla (2016)
The elephant in the room: Predictive performance of PLS models. Journal of Business Research, 69
R. Beran, M. Srivastava (1985)
Bootstrap Tests and Confidence Regions for Functions of a Covariance Matrix. Annals of Statistics, 13
R. Maccallum, James Austin (2000)
Applications of structural equation modeling in psychological research. Annual Review of Psychology, 51
Anthony Vance, P. Lowry, D. Eggett (2015)
Increasing Accountability Through User-Interface Design Artifacts: A New Approach to Addressing the Problem of Access-Policy Violations. MIS Q., 39
W. Venables, B. Ripley (2010)
Modern Applied Statistics with S
METHODS
published: 13 December 2018
doi: 10.3389/fpsyg.2018.02541

Confirmatory Composite Analysis

Florian Schuberth (1), Jörg Henseler (1,2) and Theo K. Dijkstra (3)

(1) Faculty of Engineering Technology, Chair of Product-Market Relations, University of Twente, Enschede, Netherlands
(2) Nova Information Management School, Universidade Nova de Lisboa, Lisbon, Portugal
(3) Faculty of Economics and Business, University of Groningen, Groningen, Netherlands

This article introduces confirmatory composite analysis (CCA) as a structural equation modeling technique that aims at testing composite models. It facilitates the operationalization and assessment of design concepts, so-called artifacts. CCA entails the same steps as confirmatory factor analysis: model specification, model identification, model estimation, and model assessment. Composite models are specified such that they consist of a set of interrelated composites, all of which emerge as linear combinations of observable variables. Researchers must ensure theoretical identification of their specified model. For the estimation of the model, several estimators are available; in particular Kettenring's extensions of canonical correlation analysis provide consistent estimates. Model assessment mainly relies on the Bollen-Stine bootstrap to assess the discrepancy between the empirical and the estimated model-implied indicator covariance matrix. A Monte Carlo simulation examines the efficacy of CCA, and demonstrates that CCA is able to detect various forms of model misspecification.

Keywords: artifacts, composite modeling, design research, Monte Carlo simulation study, structural equation modeling, theory testing

Edited by: Holmes Finch, Ball State University, United States
Reviewed by: Daniel Saverio John Costa, University of Sydney, Australia; Shenghai Dai, Washington State University, United States
*Correspondence: Florian Schuberth, [email protected]
Specialty section: This article was submitted to Quantitative Psychology and Measurement, a section of the journal Frontiers in Psychology
Received: 19 June 2018
Accepted: 28 November 2018
Published: 13 December 2018
Citation: Schuberth F, Henseler J and Dijkstra TK (2018) Confirmatory Composite Analysis. Front. Psychol. 9:2541. doi: 10.3389/fpsyg.2018.02541

1. INTRODUCTION

Structural equation modeling with latent variables (SEM) comprises confirmatory factor analysis (CFA) and path analysis, thus combining methodological developments from different disciplines such as psychology, sociology, and economics, while covering a broad variety of traditional multivariate statistical procedures (Bollen, 1989; Muthén, 2002). It is capable of expressing theoretical concepts by means of multiple observable indicators to connect them via the structural model as well as to account for measurement error. Since SEM allows for statistical testing of the estimated parameters and even entire models, it is an outstanding tool for confirmatory purposes such as for assessing construct validity (Markus and Borsboom, 2013) or for establishing measurement invariance (Van de Schoot et al., 2012). Apart from the original maximum likelihood estimator, robust versions and a number of alternative estimators were also introduced to counter violations of the original assumptions in empirical work, such as the asymptotically distribution-free estimator (Browne, 1984) or the two-stage least squares (2SLS) estimator (Bollen, 2001). Over time, the initial model has been continuously improved upon to account for more complex theories. Consequently, SEM is able to deal with categorical (Muthén, 1984) as well as longitudinal data (Little, 2013) and can be used to model non-linear relationships between the constructs (Klein and Moosbrugger, 2000).
For more details and a comprehensive overview, we refer to the following textbooks: Hayduk (1988), Bollen (1989), Marcoulides and Schumacker (2001), Raykov and Marcoulides (2006), Kline (2015), and Brown (2015).

Researchers across many streams of science appreciate SEM's versatility as well as its ability to test common factor models. In particular, in the behavioral and social sciences, SEM enjoys wide popularity, e.g., in marketing (Bagozzi and Yi, 1988; Steenkamp and Baumgartner, 2000), psychology (MacCallum and Austin, 2000), communication science (Holbert and Stephenson, 2002), operations management (Shah and Goldstein, 2006), and information systems (Gefen et al., 2011), to name a few. Additionally, beyond the realm of behavioral and social sciences, researchers have acknowledged the capabilities of SEM, such as in construction research (Xiong et al., 2015) or neurosciences (McIntosh and Gonzalez-Lima, 1994).

Over the last decades, the operationalization of the theoretical concept and the common factor has become more and more conflated such that hardly any distinction is made between the terms (Rigdon, 2012). Although the common factor model has demonstrated its usefulness for concepts of behavioral research such as traits and attitudes, the limitation of SEM to the factor model is unfortunate because many disciplines besides and even within social and behavioral sciences do not exclusively deal with behavioral concepts, but also with design concepts (so-called artifacts) and their interplay with behavioral concepts. Psychiatry is one example: on the one hand, it examines clinically relevant behavior to understand mental disorders, but on the other hand, it also aims at developing treatments for mental disorders (Kirmayer and Crafa, 2014). Table 1 displays further examples of disciplines investigating behavioral concepts and artifacts.

TABLE 1 | Examples of behavioral concepts and artifacts across several disciplines.

Discipline          | Behavioral Concept                               | Design Concept (Artifact)
Criminology         | Criminal activity (Lussier et al., 2005)         | Prevention strategy (Crowley, 2013)
Ecology             | Sediment contamination (Malaeb et al., 2000)     | Abiotic stress (Grace et al., 2010)
Education           | Student's anxiety (Fong et al., 2016)            | Teacher development program (Lee, 2005)
Epidemiology        | Nutritional risk (Keller, 2006)                  | Public health intervention (Wight et al., 2015)
Information Systems | Perceived ease of use (Venkatesh et al., 2003)   | User-interface design (Vance et al., 2015)
Marketing           | Brand attitude (Spears and Singh, 2004)          | Marketing mix (Borden, 1964)

Typically, the common factor model is used to operationalize behavioral concepts, because it is well matched with the general understanding of measurement (Sobel, 1997). It assumes that each observable indicator is a manifestation of the underlying concept that is regarded as their common cause (Reichenbach, 1956), and therefore fully explains the covariation among its indicators. However, for artifacts the idea of measurement is unrewarding as they are rather constructed to fulfill a certain purpose. To account for the constructivist character of the artifact, the composite has been recently suggested for its operationalization in SEM (Henseler, 2017). A composite is a weighted linear combination of observable indicators, and therefore, in contrast to the common factor model, the indicators do not necessarily share a common cause.

At present, the validity of composite models cannot be systematically assessed. Current approaches are limited to assessing the indicators' collinearity (Diamantopoulos and Winklhofer, 2001) and their relations to other variables in the model (Bagozzi, 1994). A rigorous test of composite models in analogy to CFA does not exist so far. Not only does this situation limit the progress of composite models, it also represents an unnecessary weakness of SEM as its application is mainly limited to behavioral concepts. For this reason, we introduce confirmatory composite analysis (CCA) wherein the concept, i.e., the artifact, under investigation is modeled as a composite. In this way, we make SEM accessible to a broader audience. We show that the composite model relaxes some of the restrictions imposed by the common factor model. However, it still provides testable constraints, which makes CCA a full-fledged method for confirmatory purposes. In general, it involves the same steps as CFA or SEM, without assuming that the underlying concept is necessarily modeled as a common factor.

While there is no exact instruction on how to apply SEM, a general consensus exists that SEM and CFA comprise at least the following four steps: model specification, model identification, model estimation, and model assessment (e.g., Schumacker and Lomax, 2009, Chap. 4). In line with this procedure, the remainder of the paper is structured as follows: Section 2 introduces the composite model, which provides the theoretical foundation for CCA, and shows how it can be specified; Section 3 considers the issue of identification in CCA and states the assumptions necessary to guarantee the unique solvability of the composite model; Section 4 presents one approach that can be used to estimate the model parameters in the framework of CCA; Section 5 provides a test for the overall model fit to assess how well the estimated model fits the observed data; Section 6 assesses the performance of this test in terms of a Monte Carlo simulation and presents the results; and finally, the last section discusses the results and gives an outlook for future research. A brief example on how to estimate and assess a composite model within the statistical programming environment R is provided in the Supplementary Material.

2. SPECIFYING COMPOSITE MODELS

Composites have a long tradition in multivariate data analysis (Pearson, 1901). Originally, they are the outcome of dimension reduction techniques, i.e., the mapping of the data to a lower dimensional space. In this respect, they are designed to capture the most important characteristics of the data as efficiently as possible. Apart from dimension reduction, composites can serve as proxies for concepts (MacCallum and Browne, 1993). In marketing research, Fornell and Bookstein (1982) recognized that certain concepts like marketing mix or population change are not appropriately modeled by common factors and instead employed a composite to operationalize these concepts. In the recent past, more and more researchers recognized composites as a legitimate approach to operationalize concepts, e.g., in marketing science (Diamantopoulos and Winklhofer, 2001; Rossiter, 2002), business research (Diamantopoulos, 2008), environmental science (Grace and Bollen, 2008), and in design research (Henseler, 2017).

In social and behavioral sciences, concepts are often understood as ontological entities such as abilities or attitudes, which rests on the assumption that the concept of interest exists in nature, regardless of whether it is the subject of scientific examination. Researchers follow a positivist research paradigm assuming that existing concepts can be measured.

In contrast, design concepts can be conceived as artifacts, i.e., objects designed to serve explicit goal(s) (Simon, 1969). Hence, they are inextricably linked to purposefulness, i.e., teleology (Horvath, 2004; Baskerville and Pries-Heje, 2010; Møller et al., 2012). This way of thinking has its origin in constructivist epistemology. The epistemological distinction between the ontological and constructivist nature of concepts has important implications when modeling the causal relationships among the concepts and their relationships to the observable indicators.

To operationalize behavioral concepts, the common factor model is typically used. It seeks to explore whether a certain concept exists by testing if collected measures of a concept are consistent with the assumed nature of that concept. It is based on the principle of common cause (Reichenbach, 1956), and therefore assumes that all covariation within a block of indicators can be fully explained by the underlying concept. On the contrary, the composite model can be used to model artifacts as a linear combination of observable indicators. In doing so, it is more pragmatic in the sense that it examines whether a built artifact is useful at all. Figure 1 summarizes the differences between behavioral concepts and artifacts and their operationalization in SEM.

In the following part, we present the theoretical foundation of the composite model. Although the formal development of the composite model and the composite factor model (Henseler et al., 2014) was already laid out by Dijkstra (2013, 2015), it has not been put into a holistic framework yet. In the following, it is assumed that each artifact is modeled as a composite $c_j$ with $j = 1, \dots, J$. (In general, models containing common factors and composites are also conceivable but have not been considered here.) By definition, a composite is completely determined by a unique block of $K_j$ indicators, $x_j' = (x_{j1} \dots x_{jK_j})$, $c_j = w_j' x_j$. The weights of block $j$ are included in the column vector $w_j$ of length $K_j$. Usually, each weight vector is scaled to ensure that the composites have unit variance (see also Section 3). Here, we assume that each indicator is connected to only one composite. The theoretical covariance matrix $\Sigma$ of the indicators can be expressed as a partitioned matrix as follows:

$$\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} & \dots & \Sigma_{1J} \\ & \Sigma_{22} & \dots & \Sigma_{2J} \\ & & \ddots & \vdots \\ & & & \Sigma_{JJ} \end{pmatrix}. \qquad (1)$$

The intra-block covariance matrix $\Sigma_{jj}$ of dimension $K_j \times K_j$ is unconstrained and captures the covariation between the indicators of block $j$; thus, this effectively allows the indicators of one block to freely covary. Moreover, it can be shown that the indicator covariance matrix is positive-definite if and only if the following two conditions hold: (i) all intra-block covariance matrices are positive-definite, and (ii) the covariance matrix of the composites is positive-definite (Dijkstra, 2015, 2017). The covariances between the indicators of block $j$ and $l$ are captured in the inter-block covariance matrix $\Sigma_{jl}$, with $j \neq l$, of dimension $K_j \times K_l$. However, in contrast to the intra-block covariance matrix, the inter-block covariance matrix is constrained, since by assumption, the composites carry all information between the blocks:

$$\Sigma_{jl} = \rho_{jl} \Sigma_{jj} w_j w_l' \Sigma_{ll} = \rho_{jl} \lambda_j \lambda_l', \qquad (2)$$

where $\rho_{jl} = w_j' \Sigma_{jl} w_l$ equals the correlation between the composites $c_j$ and $c_l$. The vector $\lambda_j = \Sigma_{jj} w_j$ of length $K_j$ contains the composite loadings, which are defined as the covariances between the composite $c_j$ and the associated indicators $x_j$. Equation 2 is highly reminiscent of the corresponding equation where all concepts are modeled as common factors instead of composites. In a common factor model, the vector $\lambda_j$ captures the covariances between the indicators and its connected common factor, and $\rho_{jl}$ represents the correlation between common factor $j$ and $l$. Hence, both models show the rank-one structure for the covariance matrices between two indicator blocks.

Although the intra-block covariance matrices of the indicators $\Sigma_{jj}$ are not restricted, we emphasize that the composite model is still a model from the point of view of SEM. It assumes that all information between the indicators of two different blocks is conveyed by the composite(s), and therefore, it imposes rank-one restrictions on the inter-block covariance matrices of the indicators (see Equation 2). These restrictions can be exploited for testing the overall model fit (see Section 5). It is emphasized that the weights $w_j$ producing these matrices are the same across all inter-block covariance matrices $\Sigma_{jl}$ with $l = 1, \dots, J$ and $l \neq j$. Figure 2 illustrates an example of a composite model.

The artifact under investigation is modeled as the composite $c$, illustrated by a hexagon, and the observable indicators are represented by squares. The unconstrained covariance $\sigma_{12}$ between the indicators of block $x' = (x_1 \; x_2)$ forming the composite is highlighted by a double-headed arrow.

The observable variables $y$ and $z$ do not form the composite. They are allowed to freely covary among each other as well as with the composite. For example, they can be regarded as antecedents or consequences of the modeled artifact.

To emphasize the difference between the composite model and the common factor model typically used in CFA, we depict the composite model as composite factor model (Dijkstra, 2013; Henseler et al., 2014). The composite factor model has the same model-implied indicator covariance matrix as the composite model, but the deduction of the model-implied covariances and the comparison to the common factor is more straightforward. Figure 3 shows the same model as Figure 2 but in terms of a composite factor representation.
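As a rough illustration of Equations 1 and 2, the following sketch assembles the indicator covariance matrix implied by a composite model from the intra-block covariance matrices, the weights, and the composite correlations. It is written in Python with NumPy; the function names (`scale_weights`, `implied_covariance`) and the numbers of the two-block example are hypothetical and are not taken from the article or its Supplementary Material.

```python
import numpy as np


def scale_weights(w, sigma_jj):
    """Rescale a weight vector so that the composite w'x has unit variance."""
    w = np.asarray(w, dtype=float)
    return w / np.sqrt(w @ sigma_jj @ w)


def implied_covariance(intra, weights, rho):
    """Assemble Sigma block by block (Equations 1 and 2).

    intra   : list of J intra-block covariance matrices Sigma_jj (unconstrained)
    weights : list of J weight vectors w_j (rescaled here to unit composite variance)
    rho     : J x J matrix of composite correlations rho_jl
    """
    intra = [np.asarray(s, dtype=float) for s in intra]
    w = [scale_weights(wj, sjj) for wj, sjj in zip(weights, intra)]
    # Composite loadings lambda_j = Sigma_jj w_j
    lam = [sjj @ wj for sjj, wj in zip(intra, w)]
    J = len(intra)
    # Diagonal blocks stay unconstrained; off-diagonal blocks are rank one.
    blocks = [
        [intra[j] if j == l else rho[j][l] * np.outer(lam[j], lam[l]) for l in range(J)]
        for j in range(J)
    ]
    return np.block(blocks), w


# Hypothetical example: two blocks with two indicators each.
intra = [np.array([[1.0, 0.5], [0.5, 1.0]]), np.array([[1.0, 0.3], [0.3, 1.0]])]
weights = [[1.0, 1.0], [1.0, 1.0]]
rho = np.array([[1.0, 0.4], [0.4, 1.0]])

sigma, w = implied_covariance(intra, weights, rho)
sigma_12 = sigma[:2, 2:]  # inter-block covariance matrix Sigma_12

print(np.round(sigma, 3))
print("rank of Sigma_12:", np.linalg.matrix_rank(sigma_12))
print("w_1' Sigma_12 w_2:", w[0] @ sigma_12 @ w[1])
print("smallest eigenvalue of Sigma:", np.linalg.eigvalsh(sigma).min())
```

In this example the inter-block matrix has rank one and $w_1' \Sigma_{12} w_2$ recovers the composite correlation, which is what Equation 2 requires; the eigenvalue check reflects the positive-definiteness conditions stated above.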
CCA FIGURE 1 | Two types of concepts: behavioral concepts vs. artifacts. The composite loading λ , i = 1, 2 captures the covariance between the indicator x and the composite c. In general, the error terms are included in the vector ǫ, explaining the variance of the indicators and the covariances between the indicators of one block, which are not explained by the composite factor. As the composite model does not restrict the covariances between the indicators of one block, the error terms are allowed to freely covary. The covariations among the error terms as well as their variances are captured in matrix 2. The model-implied covariance matrix of the example composite model can be displayed as follows: x x z 1 2 yy λ σ σ 1 yc 11 6 = . (3) λ σ λ λ + θ σ 2 yc 1 2 12 22 FIGURE 2 | Example of a composite model. σ λ σ λ σ σ yz 1 cz 2 cz zz In comparison to the same model using a common factor instead of a composite, the composite model is less restrictive as it allows 3. IDENTIFYING COMPOSITE MODELS all error terms of one block to be correlated, which leads to a more general model (Henseler et al., 2014). In fact, the common Like in SEM and CFA, model identiﬁcation is an important factor model is always nested in the composite model since it uses issue in CCA. Since analysts can freely specify their models, it the same restriction as the composite model; but additionally, it needs to be ensured that the model parameters have a unique assumes that (some) covariances between the error terms of one solution (Bollen, 1989, Chap. 8). Therefore, model identiﬁcation block are restricted (usually to zero). Under certain conditions, is necessary to obtain consistent parameter estimates and to it is possible to rescale the intra- and inter-block covariances of reliably interpret them (Marcoulides and Chin, 2013). a composite model to match those of a common factor model In general, the following three states of model identiﬁcation (Dijkstra, 2013; Dijkstra and Henseler, 2015). 
can be distinguished: under-identified, just-identified, and over-identified. An under-identified model, also known as not-identified model, offers several sets of parameters that are consistent with the model constraints, and thus, no unique solution for the model parameters exists. Therefore, only questionable conclusions can be drawn. In contrast, a just-identified model provides a unique solution for the model parameters and has the same number of free parameters as non-redundant elements of the indicator covariance matrix (degrees of freedom (df) are 0). In empirical analysis, such models cannot be used to evaluate the overall model fit since they perfectly fit the data. An over-identified model also has a unique solution; however, it provides more non-redundant elements of the indicator covariance matrix than model parameters (df > 0). This can be exploited in empirical studies for assessing the overall model fit, as these constraints should hold for a sample within the limits of sampling error if the model is valid.

Frontiers in Psychology | www.frontiersin.org | December 2018 | Volume 9 | Article 2541 | Schuberth et al. | CCA

FIGURE 3 | Example of a composite model displayed as composite factor model.

Footnote: The existing literature sometimes mentions empirical (under-)identification in the context of model identification (Kenny, 1979). Since this expression refers to an issue of estimation rather than to the issue of model identification, this topic is not discussed in the following.

A necessary condition for ensuring identification is to normalize each weight vector. In doing so, we assume that all composites are scaled to have a unit variance, $w_j' \Sigma_{jj} w_j = 1$. Besides the scaling of the composite, each composite must be connected to at least one composite or one variable not forming a composite. As a result, at least one inter-block covariance matrix $\Sigma_{jl}$, $l = 1, \ldots, J$ with $l \neq j$ satisfies the rank-one condition. Along with the normalization of the weight vectors, all model parameters can be uniquely retrieved from the indicator covariance matrix since there is a non-zero inter-block covariance matrix for every loading vector. Otherwise, if a composite $c_j$ is isolated in the nomological network, all inter-block covariances $\Sigma_{jl}$, $l = 1, \ldots, J$ with $l \neq j$, belonging to this composite are of rank zero, and thus, the weights forming this composite cannot be uniquely retrieved. Although the non-isolation condition is required for identification, it also matches the idea of an artifact that is designed to fulfill a certain purpose. Without considering the artifact's antecedents and/or consequences, the artifact's purposefulness cannot be judged.

Footnote: Another way of normalization is to fix one weight of each block to a certain value.

Footnote: Furthermore, we ignore trivial regularity assumptions such as weight vectors consisting of zeros only; and similarly, we ignore cases where intra-block covariance matrices are singular.

In the following part, we give a description on how the number of degrees of freedom is counted in case of the composite model. It is given by the difference between the number of non-redundant elements of the indicator population covariance matrix $\Sigma$ and the number of free parameters in the model. The number of free model parameters is given by the number of covariances among the composites, the number of covariances between composites and indicators not forming a composite, the number of covariances among indicators not forming a composite, the number of non-redundant off-diagonal elements of each intra-block covariance matrix, and the number of weights. Since we fix composite variances to one, one weight of each block can be expressed by the remaining ones of this block. Hence, we regain as many degrees of freedom as fixed composite variances, i.e., as blocks in the model. Equation 4 summarizes the way of determining the number of degrees of freedom of a composite model.

$$
\begin{aligned}
df = {} & \text{number of non-redundant off-diagonal elements of the indicator covariance matrix} \\
& - \text{number of free correlations among the composites} \\
& - \text{number of free covariances between the composites and indicators not forming a composite} \\
& - \text{number of covariances among the indicators not forming a composite} \\
& - \text{number of free non-redundant off-diagonal elements of each intra-block covariance matrix} \\
& - \text{number of weights} \\
& + \text{number of blocks}
\end{aligned}
\tag{4}
$$

To illustrate our approach to calculating the number of degrees of freedom, we consider the composite model presented in Figure 2. As described above, the model consists of four (standardized) observable variables; thus, the indicator correlation matrix has six non-redundant off-diagonal elements. The number of free model parameters is counted as follows: no correlations among the composites as the model consists of only one composite, two correlations between the composite and the observable variables not forming a composite ($\sigma_{yc}$ and $\sigma_{cz}$), one correlation between the variables not forming a composite ($\sigma_{yz}$), one non-redundant off-diagonal of the intra-block correlation matrix ($\sigma_{12}$), and two weights ($w_1$ and $w_2$) minus one, the number of blocks. As a result, we obtain the number of degrees of freedom as follows: df = 6 − 0 − 2 − 1 − 1 − 2 + 1 = 1.

dimension J × J, is a block-diagonal matrix containing the intra-block correlation matrices $\Sigma_{jj}$, $j = 1, \ldots, J$ on its diagonal. To obtain the estimates of the weights, the composites, and their

Footnote: The number of degrees of freedom can be helpful at determining whether a model is identified since an identified model has a non-negative number of degrees of freedom.
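As a minimal sketch of the counting rule in Equation 4, the following Python function reproduces the arithmetic of the Figure 2 example. The function name `composite_model_df` and its parameter names are illustrative assumptions; they do not come from the original study, which reports using R.

```python
def composite_model_df(
    n_indicators: int,
    n_composite_correlations: int,
    n_composite_indicator_covariances: int,
    n_free_indicator_covariances: int,
    n_intra_block_offdiagonals: int,
    n_weights: int,
    n_blocks: int,
) -> int:
    """Degrees of freedom of a composite model, counted as in Equation 4."""
    # Non-redundant off-diagonal elements of a K x K correlation matrix.
    n_offdiagonal = n_indicators * (n_indicators - 1) // 2
    return (
        n_offdiagonal
        - n_composite_correlations
        - n_composite_indicator_covariances
        - n_free_indicator_covariances
        - n_intra_block_offdiagonals
        - n_weights
        + n_blocks  # one weight per block is fixed by the unit-variance scaling
    )


# Figure 2 example: four observed variables, one composite built from two
# indicators, and two further observed variables (y and z).
df_figure_2 = composite_model_df(
    n_indicators=4,
    n_composite_correlations=0,
    n_composite_indicator_covariances=2,
    n_free_indicator_covariances=1,
    n_intra_block_offdiagonals=1,
    n_weights=2,
    n_blocks=1,
)
print(df_figure_2)
```

Applying the same rule to a specification with two composites of three indicators each (one composite correlation, six intra-block correlations, six weights, two blocks) would give 15 − 1 − 0 − 0 − 6 − 6 + 2 = 4. This value only illustrates the rule and is not reported in the text.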
Once identification of the composite model is ensured, in a next step the model can be estimated.

4. ESTIMATING COMPOSITE MODELS

The existing literature provides various ways of constructing composites from blocks of indicators. The most common among them are principal component analysis (PCA, Pearson, 1901), linear discriminant analysis (LDA, Fisher, 1936), and (generalized) canonical correlation analysis ((G)CCA, Hotelling, 1936; Kettenring, 1971). All these approaches seek composites that “best” explain the data and can be regarded as prescriptions for dimension reduction (Dijkstra and Henseler, 2011). Further approaches are partial least squares path modeling (PLS-PM, Wold, 1975), regularized generalized canonical correlation analysis (RGCCA, Tenenhaus and Tenenhaus, 2011), and generalized structured component analysis (GSCA, Hwang and Takane, 2004). The use of predefined weights is also possible.

We follow Dijkstra (2010) and apply GCCA in a first step to estimate the correlation between the composites. In the following part, we give a brief description of GCCA. The vector of indicators $x$ of length K is split up into J subvectors $x_j$, so-called blocks, each of dimension (K × 1) with $j = 1, \ldots, J$. We assume that the indicators are standardized to have means of zero and unit variances. Moreover, each indicator is connected to one composite only. Hence, the correlation matrix of the indicators can be calculated as $\Sigma = E(xx')$ and the intra-block correlation matrix as $\Sigma_{jj} = E(x_j x_j')$. Moreover, the correlation matrix of the composites $c_j = x_j' w_j$ is calculated as follows: $\Sigma_c = E(cc')$. In general, GCCA chooses the weights to maximize the correlation between the composites. In doing so, GCCA offers the following options: sumcor, maxvar, ssqcor, minvar, and genvar.

Footnote: GCCA builds composites in a way that they are maximally correlated.

Footnote: For an overview we refer to Kettenring (1971).

In the following part, we use maxvar under the constraint that each composite has a unit variance, $w_j' \Sigma_{jj} w_j = 1$, to estimate the weights, the composites, and the resulting composite correlations. In doing so, the weights are chosen to maximize the largest eigenvalue of the composite correlation matrix. Thus, the total variation of the composites is explained as well as possible by one underlying “principal component,” and the weights to form the composite $c_j$ are calculated as follows (Kettenring, 1971):

$$w_j = \Sigma_{jj}^{-1/2} \tilde{a}_j / \sqrt{\tilde{a}_j' \tilde{a}_j}. \tag{5}$$

The subvector $\tilde{a}_j$, of length J, corresponds to the largest eigenvalue of the matrix $\Sigma_D^{-1/2} \Sigma \Sigma_D^{-1/2}$, where the matrix $\Sigma_D$, of

correlations, the population matrix $\Sigma$ is replaced by its empirical counterpart $S$.

Footnote: In general, GCCA offers several composites (canonical variates); but in our study, we have focused only on the canonical variates of the first stage.

5. ASSESSING COMPOSITE MODELS

5.1. Tests of Overall Model Fit

In CFA and factor-based SEM, a test for overall model fit has been naturally supplied by the maximum-likelihood estimation in the form of the chi-square test (Jöreskog, 1967), while maxvar lacks such a test. In the light of this, we propose a combination of a bootstrap procedure with several distance measures to statistically test how well the assumed composite model fits the collected data.

The existing literature provides several measures with which to assess the discrepancy between the perfect fit and the model fit. In fact, every distance measure known from CFA can be used to assess the overall fit of a composite model. They all capture the discrepancy between the sample covariance matrix $S$ and the estimated model-implied covariance matrix $\hat{\Sigma} = \Sigma(\hat{\theta})$ of the indicators. In our study, we consider the following three distance measures: squared Euclidean distance ($d_L$), geodesic distance ($d_G$), and standardized root mean square residual (SRMR).

The squared Euclidean distance between the sample and the estimated model-implied covariance matrix is calculated as follows:

$$d_L = \sum_{i=1}^{K} \sum_{j=1}^{K} (s_{ij} - \hat{\sigma}_{ij})^2, \tag{6}$$

where K is the total number of indicators, and $s_{ij}$ and $\hat{\sigma}_{ij}$ are the elements of the sample and the estimated model-implied covariance matrix, respectively. It is obvious that the squared Euclidean distance is zero for a perfectly fitting model, $\hat{\Sigma} = S$.

Moreover, the geodesic distance stemming from a class of distance functions proposed by Swain (1975) can be used to measure the discrepancy between the sample and estimated model-implied covariance matrix. It is given by the following:

$$d_G = \sum_{i=1}^{K} (\log(\varphi_i))^2, \tag{7}$$

where $\varphi_i$ is the i-th eigenvalue of the matrix $S^{-1} \hat{\Sigma}$ and K is the number of indicators. The geodesic distance is zero when and only when all eigenvalues equal one, i.e., when and only when the fit is perfect.

Finally, the SRMR (Hu and Bentler, 1999) can be used to assess the overall model fit. The SRMR is calculated as follows:

$$\mathrm{SRMR} = \sqrt{\left(2 \sum_{i=1}^{K} \sum_{j=1}^{i} \left((s_{ij} - \hat{\sigma}_{ij})/(s_{ii} s_{jj})\right)^2\right) / (K(K + 1))}, \tag{8}$$

where K is the number of indicators. It reflects the average discrepancy between the empirical and the estimated model-implied correlation matrix. Thus, for a perfectly fitting model, the SRMR is zero, as $\hat{\sigma}_{ij}$ equals $s_{ij}$.

Since all distance measures considered are functions of the sample covariance matrix, a procedure proposed by Beran and Srivastava (1985) can be used to test the overall model fit: $H_0: \Sigma = \Sigma(\theta)$. The reference distribution of the distance measures as well as the critical values are obtained from the transformed sample data as follows:

$$X S^{-1/2} \hat{\Sigma}^{1/2}, \tag{9}$$

where the data matrix $X$ of dimension (N × K) contains the N observations of all K indicators. This transformation ensures that the new dataset satisfies the null hypothesis; i.e., the sample covariance matrix of the transformed dataset equals the estimated model-implied covariance matrix. The reference distribution of the distance measures is obtained by bootstrapping from the transformed dataset. In doing so, the estimated distance based on the original dataset can be compared to the critical value from the reference distribution (typically the empirical 95% or 99% quantile) to decide whether the null hypothesis, $H_0: \Sigma = \Sigma(\theta)$, is rejected (Bollen and Stine, 1992).

Footnote: This procedure is known as the Bollen-Stine bootstrap (Bollen and Stine, 1992) in the factor-based SEM literature. The model must be over-identified for this test.

5.2. Fit Indices for Composite Models

In addition to the test of overall model fit, we provide some fit indices as measures of the overall model fit. In general, fit indices can indicate whether a model is misspecified by providing an absolute value of the misfit; however, we advise using them with caution as they are based on heuristic rules-of-thumb rather than statistical theory. Moreover, it is recommended to calculate the fit indices based on the indicator correlation matrix instead of the covariance matrix.

The standardized root mean square residual (SRMR) was already introduced as a measure of overall model fit (Henseler et al., 2014). As described above, it represents the average discrepancy between the sample and the model-implied indicator correlation matrix. Values below 0.10 and, following a more conservative view, below 0.08 indicate a good model fit (Hu and Bentler, 1998). However, these threshold values were proposed for common factor models and their usefulness for composite models needs to be investigated.

Furthermore, the normed fit index (NFI) is suggested as a measure of goodness of fit (Bentler and Bonett, 1980). It measures the relative discrepancy between the fit of the baseline model and the fit of the estimated model. In this context, a model where all indicators are assumed to be uncorrelated (the model-implied correlation matrix equals the unit matrix) can serve as a baseline model (Lohmöller, 1989, Chap. 2.4.4). To assess the fit of the baseline model and the estimated model, several measures can be used, e.g., the log likelihood function used in CFA or the geodesic distance. Values of the NFI close to one imply a good model fit. However, cut-off values still need to be determined.

Finally, we suggest considering the root mean square residual covariance of the outer residuals ($\mathrm{RMS}_{\theta}$) as a further fit index (Lohmöller, 1989). It is defined as the square root of the average residual correlations. Since the indicators of one block are allowed to be freely correlated, the residual correlations within a block should be excluded and only the residual correlations across the blocks should be taken into account during its calculation. Small values close to zero for the $\mathrm{RMS}_{\theta}$ indicate a good model fit. However, threshold values still need to be determined.

6. A MONTE CARLO SIMULATION

In order to assess our proposed procedure of statistically testing the overall model fit of composite models and to examine the behavior of the earlier presented discrepancy measures, we conduct a Monte Carlo simulation. In particular, we investigate the type I error rate (false positive rate) and the power, which are the most important characteristics of a statistical test. In designing the simulation, we choose a number of concepts used several times in the literature to examine the performance of fit indices and tests of overall model fit in CFA: a model containing two composites and a model containing three composites (Hu and Bentler, 1999; Heene et al., 2012). To investigate the power of the test procedure, we consider various misspecifications of these models. Figures 4 and 5 summarize the conditions investigated in our simulation study.

FIGURE 4 | Simulation design for the model containing two composites.

FIGURE 5 | Simulation design for the model containing three composites.

6.1. Model Containing Two Composites

All models containing two composites are estimated using the specification illustrated in the last column of Figure 4. The indicators $x_{11}$ to $x_{13}$ are specified to build composite $c_1$, while the remaining three indicators build composite $c_2$. Moreover, the composites are allowed to freely correlate. The parameters of interest are the correlation between the two composites, and the weights, $w_{11}$ to $w_{23}$. As column “Population model” of Figure 4 shows, we consider three types of population models with two composites.

6.1.1. Condition 1: No Misspecification

First, in order to examine whether the rejection rates of the test procedure are close to the predefined significance level in cases in which the null hypothesis is true, a population model is considered that has the same structure as the specified model. The correlation between the two composites is set to ρ = 0.3 and the composites are formed by their connected standardized indicators as follows: $c_i = x_i' w_i$ with i = 1, 2, where $w_1' = (0.6 \;\; 0.2 \;\; 0.4)$ and $w_2' = (0.4 \;\; 0.2 \;\; 0.6)$. All correlations between the indicators of one block are set to 0.5, which leads to the population correlation matrix given in Figure 4.

6.1.2. Condition 2: Confounded Indicators

The second condition is used to investigate whether the test procedure is capable of detecting misspecified models. It presents a situation where the researcher falsely assigns two indicators to wrong constructs. The correlation between the two composites and the weights are the same as in population model 1: ρ = 0.3, $w_1' = (0.6 \;\; 0.2 \;\; 0.4)$, and $w_2' = (0.4 \;\; 0.2 \;\; 0.6)$. However, in contrast to population model 1, the indicators $x_{13}$ and $x_{21}$ are interchanged. Moreover, the correlations among all indicators of one block are 0.5. The population correlation matrix of the second model is presented in Figure 4.

6.1.3. Condition 3: Unexplained Correlation

The third condition is chosen to further investigate the capabilities of the test procedure to detect misspecified models. It shows a situation where the correlation between the two indicators $x_{13}$ and $x_{21}$ is not fully explained by the two composites. As in the two previously presented population models, the two composites have a correlation of ρ = 0.3. The correlations among the indicators of one block are set to 0.5, and the weights for the construction of the composites are set to $w_1' = (0.6 \;\; 0.2 \;\; 0.4)$, and $w_2' = (0.4 \;\; 0.2 \;\; 0.6)$. The population correlation matrix of the indicators is presented in Figure 4.

Footnote: The model-implied correlation between the two indicators is calculated as follows, 0.8 · 0.3 · 0.8 ≠ 0.5.

6.2. Model Containing Three Composites

Furthermore, we investigate a more complex model consisting of three composites. Again, each composite is formed by three indicators, and the composites are allowed to freely covary. The column “Estimated model” of Figure 5 illustrates the specification to be estimated in case of three composites. We assume that the composites are built as follows: $c_1 = x_1' w_1$, $c_2 = x_2' w_2$, and $c_3 = x_3' w_3$. Again, we examine two different population models.

the specified model. All composites are assumed to be freely correlated. In the population, the composite correlations are set to $\rho_{12} = 0.3$, $\rho_{13} = 0.5$, and $\rho_{23} = 0.4$. Each composite is built by three indicators using the following population weights: $w_1' = (0.6 \;\; 0.4 \;\; 0.2)$, $w_2' = (0.3 \;\; 0.5 \;\; 0.6)$, and $w_3' = (0.4 \;\; 0.5 \;\; 0.5)$. The indicator correlations of each block can be read from Figure 5. The indicator correlation matrix of population model 4 is given in Figure 5.
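A minimal sketch of how a population correlation matrix such as the one for Condition 1 could be assembled from the stated ingredients (weights, intra-block correlations of 0.5, and ρ = 0.3). It assumes the rank-one structure of the inter-block correlations, $\Sigma_{jl} = \rho_{jl} \lambda_j \lambda_l'$ with loadings $\lambda_j = \Sigma_{jj} w_j$, which is consistent with the footnote's product 0.8 · 0.3 · 0.8. All function and variable names are hypothetical and are not taken from the authors' R code.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def equicorrelation(size: int, r: float) -> Matrix:
    """Correlation matrix with ones on the diagonal and r elsewhere."""
    return [[1.0 if a == b else r for b in range(size)] for a in range(size)]


def loadings(sigma_jj: Matrix, w: Sequence[float]) -> List[float]:
    """Loading vector lambda_j = Sigma_jj w_j (assumed rank-one structure)."""
    return [sum(s * wi for s, wi in zip(row, w)) for row in sigma_jj]


def composite_variance(sigma_jj: Matrix, w: Sequence[float]) -> float:
    """Variance w_j' Sigma_jj w_j of the composite built with weights w."""
    return sum(wi * li for wi, li in zip(w, loadings(sigma_jj, w)))


def population_correlation(
    weights: Sequence[Sequence[float]],
    intra_blocks: Sequence[Matrix],
    composite_corr: Matrix,
) -> Matrix:
    """Stack intra-block and model-implied inter-block correlations."""
    lams = [loadings(s, w) for s, w in zip(intra_blocks, weights)]
    rows: Matrix = []
    for j, block_j in enumerate(intra_blocks):
        for a in range(len(block_j)):
            row: List[float] = []
            for l, block_l in enumerate(intra_blocks):
                for b in range(len(block_l)):
                    if j == l:
                        row.append(block_j[a][b])
                    else:
                        row.append(composite_corr[j][l] * lams[j][a] * lams[l][b])
            rows.append(row)
    return rows


# Condition 1 ingredients as stated in the text.
w1 = [0.6, 0.2, 0.4]
w2 = [0.4, 0.2, 0.6]
block = equicorrelation(3, 0.5)
rho = [[1.0, 0.3], [0.3, 1.0]]

sigma = population_correlation([w1, w2], [block, block], rho)
implied_x13_x21 = sigma[2][3]  # correlation between x13 and x21
print(round(composite_variance(block, w1), 6), round(implied_x13_x21, 6))
```

Under these assumptions both weight vectors give composites with unit variance, and the implied correlation between $x_{13}$ and $x_{21}$ is 0.192, well below the value of 0.5 that Condition 3 uses for this pair.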
6.2.1. Condition 4: No Misspecification

The fourth condition is used to further investigate whether the rejection rates of the test procedure are close to the predefined significance level in cases in which the null hypothesis is true. Hence, the structure of the fourth population model matches

6.2.2. Condition 5: Unexplained Correlation

In the fifth condition, we investigate a situation where the correlation between two indicators is not fully explained by the underlying composites, similar to what is observed in Condition 3. Consequently, population model 5 does not match the model to be estimated and is used to investigate the power of the overall model test. It equals population model 4 with the exception that the correlation between the indicators $x_{13}$ and $x_{21}$ is only partly explained by the composites. Since the original correlation between these indicators is 0.084, a correlation of 0.25 presents only a weak violation. The remaining model stays untouched. The population correlation matrix is illustrated in Figure 5.

6.3. Further Simulation Conditions and Expectations

To assess the quality of the proposed test of the overall model fit, we generate 10,000 standardized samples from the multivariate normal distribution having zero means and a covariance matrix according to the respective population model. Moreover, we vary the sample size from 50 to 1,450 observations (with increments of 100) and the significance level α from 1% to 10%. To obtain the reference distribution of the discrepancy measures considered, 200 bootstrap samples are drawn from the transformed and standardized dataset. Each dataset is used in the maxvar procedure to estimate the model parameters.

All simulations are conducted in the statistical programming environment R (R Core Team, 2016). The samples are drawn from the multivariate normal distribution using the mvrnorm function of the MASS package (Venables and Ripley, 2002). The results for the test of overall model fit are obtained by user-written functions and the matrixpls package (Rönkkö, 2016).

Footnote: These functions are provided by the contact author upon request.

Since population models 1 and 4 fit the respective specification, we expect rejection rates close to the predefined levels of significance α. Additionally, we expect that for an increasing sample size, the predefined significance level is kept with more precision. For population model 2, 3, and 5, much larger rejection rates are expected as these population models do not match the respective specification. Moreover, we expect that the power of the test to detect misspecifications would increase along with a larger sample size. Regarding the different discrepancy measures, we have no expectations, only that the squared Euclidean distance and the SRMR should lead to identical results.

Footnote: For standardized datasets, the only difference is a constant factor that does not affect the order of the observations in the reference distribution and, therefore, does not affect the decision about the null hypothesis.

6.4. Results

Figure 6 illustrates the rejection rates for population model 1, i.e., no misspecification. Besides the rejection rates, the figure also depicts the 95% confidence intervals (shaded area) constructed around the rejection rates to clarify whether a rejection rate is significantly different from the predefined significance level.

FIGURE 6 | Rejection rates for population model 1.

Footnote: The limits of the 95% confidence interval are calculated as $\hat{p} \pm \Phi^{-1}(0.975) \sqrt{\hat{p}(1 - \hat{p})/10000}$, where $\hat{p}$ represents the rejection rate and $\Phi^{-1}()$ is the quantile function of the standard normal distribution.

First, as expected, the squared Euclidean distance ($d_L$) as well as the SRMR lead to identical results. The test using the squared Euclidean distance and the SRMR rejects the model somewhat too rarely in case of α = 10% and α = 5% respectively; however, for an increasing sample size, the rejection rates converge to the predefined significance level without reaching it. For the 1% significance level, a similar picture is observed; however, for larger sample sizes, the significance level is retained more often compared to the larger significance levels. In contrast, the test using the geodesic distance mostly rejects the model too often for the 5% and 10% significance level. However, the obtained rejection rates are less often significantly different from the predefined significance level compared to the same situation where the SRMR or the Euclidean distance is used. In case of α = 1% and sample sizes larger than n = 100, the test using the geodesic distance rejects the model significantly too often.

Figure 7 displays the rejection rates for population models 2 and 3. The horizontal line at 80% depicts the commonly recommended power for a statistical test (Cohen, 1988). For the two cases where the specification does not match the underlying data generating process, the test using the squared Euclidean distance as well as the SRMR has more power than the test using the geodesic distance, i.e., the test using the former discrepancy measures rejects the wrong model more often. For model 2 (confounded indicators) the test produces higher or equal rejection rates compared to model 3 (unexplained correlation). Furthermore, as expected, the power decreases for an increasing level of significance and increases with increasing sample sizes.

FIGURE 7 | Rejection rates for population model 2 and 3.

Figure 8 depicts the rejection rates for population model 4 and 5. Again, the 95% confidence intervals are illustrated for population model 4 (shaded area) matching the specification estimated. Considering population model 4 which matches the estimated model, the test leads to similar results for all three discrepancy measures. However, the rejection rate of the test using the geodesic distance converges faster to the predefined significance level, i.e., for smaller sample sizes n ≥ 100. Again, among the three discrepancy measures considered, the geodesic distance performs best in terms of keeping the significance level.

FIGURE 8 | Rejection rates for population model 4 and 5.

As the extent of misspecification in population model 5 is minor, the test struggles to detect the model misspecification up to sample sizes n = 350, regardless of the discrepancy measure used. However, for sample sizes larger than 350 observations, the test detects the model misspecification satisfactorily. For sample sizes larger than 1,050 observations, the misspecification was identified in almost all cases regardless of the significance level and the discrepancy measure used. Again, this confirms the anticipated relationship between sample size and statistical power.

7. DISCUSSION

We introduced the confirmatory composite analysis (CCA) as a full-fledged technique for confirmatory purposes that employs composites to model artifacts, i.e., design concepts. It overcomes current limitations in CFA and SEM and carries the spirit of CFA and SEM to research domains studying artifacts. Its application is appropriate in situations where the research goal is to examine whether an artifact is useful rather than to establish whether a certain concept exists. It follows the same steps usually applied in SEM and enables researchers to analyze a variety of situations, in particular, beyond the realm of social and behavioral sciences. Hence, CCA allows for dealing with research questions that could not be appropriately dealt with yet in the framework of CFA or more generally in SEM.

The results of the Monte Carlo simulation confirmed that CCA can be used for confirmatory purposes. They revealed that the bootstrap-based test, in combination with different discrepancy measures, can be used to statistically assess the overall model fit of the composite model. For specifications matching the population model, the rejection rates were in the acceptable range, i.e., close to the predefined significance level. Moreover, the results of the power analysis showed that the bootstrap-based test can reliably detect misspecified models. However, caution is needed in case of small sample sizes where the rejection rates were low, which means that misspecified models were not reliably detected.

In future research, the usefulness of the composite model in empirical studies needs to be examined, accompanied and enhanced by simulation studies. In particular, the extensions outlined by Dijkstra (2017), to wit, interdependent systems of equations for the composites estimated by classical econometric methods (like 2SLS and three-stage least squares), warrant further analysis and scrutiny. Robustness with respect to non-normality and misspecification also appears to be a relevant research topic. Additionally, devising ways to efficiently predict indicators and composites might be of particular interest (see for example the work by Shmueli et al., 2016).

Moreover, to contribute to the confirmatory character of CCA, we recommend further study of the performance and limitations of the proposed test procedure: consider more misspecifications and the ability of the test to reliably detect them, find further discrepancy measures and examine their performance, and investigate the behavior of the test under the violation of the normality assumption, similar to what Nevitt and Hancock (2001) did for CFA. Finally, cut-off values for the fit indices need to be determined for CCA.

AUTHOR CONTRIBUTIONS

FS conducted the literature review and wrote the majority of the paper (contribution: ca. 50%). JH initiated this paper and designed the simulation study (contribution: ca. 25%). TD proposed the composite model and developed the model fit test (contribution: ca. 25%).

SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2018.02541/full#supplementary-material

REFERENCES

Bagozzi, R. P. (1994). “Structural equation models in marketing research: basic principles,” in Principles of Marketing Research, ed R. P. Bagozzi (Oxford: Blackwell), 317–385.
Bagozzi, R. P., and Yi, Y. (1988). On the evaluation of structural equation models. J. Acad. Market. Sci. 16, 74–94. doi: 10.1007/BF02723327
Baskerville, R., and Pries-Heje, J. (2010). Explanatory design theory. Busin. Inform. Syst. Eng. 2, 271–282. doi: 10.1007/s12599-010-0118-4
Bentler, P. M., and Bonett, D. G. (1980). Significance tests and goodness of fit in the analysis of covariance structures. Psychol. Bull. 88, 588–606. doi: 10.1037/0033-2909.88.3.588
Beran, R., and Srivastava, M. S. (1985). Bootstrap tests and confidence regions for functions of a covariance matrix. Ann. Statist. 13, 95–115. doi: 10.1214/aos/1176346579
Bollen, K. A. (1989). Structural Equations with Latent Variables. New York, NY: John Wiley & Sons Inc.
Bollen, K. A. (2001). “Two-stage least squares and latent variable models: Simultaneous estimation and robustness to misspecifications,” in Structural Equation Modeling: Present and Future, A Festschrift in Honor of Karl Jöreskog, eds R. Cudeck, S. Du Toit, and D. Sörbom (Chicago: Scientific Software International), 119–138.
Bollen, K. A., and Stine, R. A. (1992). Bootstrapping goodness-of-fit measures in structural equation models. Sociol. Methods Res. 21, 205–229. doi: 10.1177/0049124192021002004
Borden, N. H. (1964). The concept of the marketing mix. J. Advert. Res. 4, 2–7.
Brown, T. A. (2015). Confirmatory Factor Analysis for Applied Research. New York, NY: Guilford Press.
Browne, M. W. (1984). Asymptotically distribution-free methods for the analysis of covariance structures. Br. J. Math. Statist. Psychol. 37, 62–83. doi: 10.1111/j.2044-8317.1984.tb00789.x
Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences, 2nd Edn. Hillsdale, NJ: Lawrence Erlbaum Associates.
Crowley, D. M. (2013). Building efficient crime prevention strategies. Criminol. Public Policy 12, 353–366. doi: 10.1111/1745-9133.12041
Diamantopoulos, A. (2008). Formative indicators: introduction to the special issue. J. Busin. Res. 61, 1201–1202. doi: 10.1016/j.jbusres.2008.01.008
Diamantopoulos, A., and Winklhofer, H. M. (2001). Index construction with formative indicators: an alternative to scale development. J. Market. Res. 38, 269–277. doi: 10.1509/jmkr.38.2.269.18845
Dijkstra, T. K. (2010). “Latent variables and indices: Herman Wold’s basic design and partial least squares,” in Handbook of Partial Least Squares (Berlin: Springer), 23–46.
Dijkstra, T. K. (2013). “Composites as factors: Canonical variables revisited,” in Working Paper. Groningen. Available online at: https://www.rug.nl/staff/t.k.dijkstra/composites-as-factors.pdf
Dijkstra, T. K. (2015). “All-inclusive versus single block composites,” in Working Paper. Groningen. Available online at: https://www.researchgate.net/profile/Theo_Dijkstra/publication/281443431_all-inclusive_and_single_block_composites/links/55e7509208ae65b63899564f/all-inclusive-and-single-block-composites.pdf
Dijkstra, T. K. (2017). “A perfect match between a model and a mode,” in Partial Least Squares Path Modeling, eds H. Latan, and R. Noonan (Cham: Springer), 55–80.
Dijkstra, T. K., and Henseler, J. (2011). Linear indices in nonlinear structural equation models: best fitting proper indices and other composites. Qual. Quant. 45, 1505–1518. doi: 10.1007/s11135-010-9359-z
Dijkstra, T. K., and Henseler, J. (2015). Consistent and asymptotically normal PLS estimators for linear structural equations. Computat. Statist. Data Anal. 81, 10–23. doi: 10.1016/j.csda.2014.07.008
Fisher, R. A. (1936). The use of multiple measurements in taxonomic problems. Ann. Eugen. 7, 179–188. doi: 10.1111/j.1469-1809.1936.tb02137.x
Fong, C. J., Davis, C. W., Kim, Y., Kim, Y. W., Marriott, L., and Kim, S. (2016). Psychosocial factors and community college student success. Rev. Educ. Res. 87, 388–424. doi: 10.3102/0034654316653479
Fornell, C., and Bookstein, F. L. (1982). Two structural equation models: LISREL and PLS applied to consumer exit-voice theory. J. Market. Res. 19, 440–452. doi: 10.2307/3151718
Gefen, D., Straub, D. W., and Rigdon, E. E. (2011). An update and extension to SEM guidelines for administrative and social science research. MIS Quart. 35, iii–xiv. doi: 10.2307/23044042
Grace, J. B., Anderson, T. M., Olff, H., and Scheiner, S. M. (2010). On the specification of structural equation models for ecological systems. Ecol. Monogr. 80, 67–87. doi: 10.1890/09-0464.1
Grace, J. B., and Bollen, K. A. (2008). Representing general theoretical concepts in structural equation models: the role of composite variables. Environ. Ecol. Statist. 15, 191–213. doi: 10.1007/s10651-007-0047-7
Hayduk, L. A. (1988). Structural Equation Modeling With LISREL: Essentials and Advances. Baltimore, MD: Johns Hopkins University Press.
Heene, M., Hilbert, S., Freudenthaler, H. H., and Buehner, M. (2012). Sensitivity of SEM fit indexes with respect to violations of uncorrelated errors. Struct. Equat. Model. 19, 36–50. doi: 10.1080/10705511.2012.634710
Henseler, J. (2017). Bridging design and behavioral research with variance-based structural equation modeling. J. Advert. 46, 178–192. doi: 10.1080/00913367.2017.1281780
Henseler, J., Dijkstra, T. K., Sarstedt, M., Ringle, C. M., Diamantopoulos, A., Straub, D. W., et al. (2014). Common beliefs and reality about PLS: comments on Rönkkö and Evermann (2013). Organ. Res. Methods 17, 182–209. doi: 10.1177/1094428114526928
Holbert, R. L., and Stephenson, M. T. (2002). Structural equation modeling in the communication sciences, 1995–2000. Hum. Commun. Res. 28, 531–551. doi: 10.1111/j.1468-2958.2002.tb00822.x
Horvath, I. (2004). A treatise on order in engineering design research. Res. Eng. Design 15, 155–181. doi: 10.1007/s00163-004-0052-x
Hotelling, H. (1936). Relations between two sets of variates. Biometrika 28, 321–377. doi: 10.1093/biomet/28.3-4.321
Hu, L., and Bentler, P. M. (1998). Fit indices in covariance structure modeling: sensitivity to underparameterized model misspecification. Psychol. Methods 3, 424–453. doi: 10.1037/1082-989X.3.4.424
Hu, L., and Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: conventional criteria versus new alternatives. Struc. Equat. Model. 6, 1–55. doi: 10.1080/10705519909540118
Hwang, H., and Takane, Y. (2004). Generalized structured component analysis. Psychometrika 69, 81–99. doi: 10.1007/BF02295841
Jöreskog, K. G. (1967). Some contributions to maximum likelihood factor analysis. Psychometrika 32, 443–482. doi: 10.1007/BF02289658
Keller, H. (2006). The SCREEN I (seniors in the community: risk evaluation for eating and nutrition) index adequately represents nutritional risk. J. Clin. Epidemiol. 59, 836–841. doi: 10.1016/j.jclinepi.2005.06.013
Kenny, D. A. (1979). Correlation and Causality. Hoboken, NJ: John Wiley & Sons Inc.
Kettenring, J. R. (1971). Canonical analysis of several sets of variables. Biometrika 58, 433–451. doi: 10.1093/biomet/58.3.433
Kirmayer, L. J., and Crafa, D. (2014). What kind of science for psychiatry? Front. Hum. Neurosci. 8:435. doi: 10.3389/fnhum.2014.00435
Klein, A., and Moosbrugger, H. (2000). Maximum likelihood estimation of latent interaction effects with the LMS method. Psychometrika 65, 457–474. doi: 10.1007/BF02296338
Kline, R. B. (2015). Principles and Practice of Structural Equation Modeling. New York, NY: Guilford Press.
Lee, H.-J. (2005). Developing a professional development program model based on teachers’ needs. Profess. Educ. 27, 39–49.
Little, T. D. (2013). Longitudinal Structural Equation Modeling. New York, NY: Guilford Press.
Lohmöller, J.-B. (1989). Latent Variable Path Modeling with Partial Least Squares. Heidelberg: Physica.
Lussier, P., LeBlanc, M., and Proulx, J. (2005). The generality of criminal behavior: a confirmatory factor analysis of the criminal activity of sex offenders in adulthood. J. Crim. Just. 33, 177–189. doi: 10.1016/j.jcrimjus.2004.12.009
MacCallum, R. C., and Austin, J. T. (2000). Applications of structural equation modeling in psychological research. Ann. Rev. Psychol. 51, 201–226. doi: 10.1146/annurev.psych.51.1.201
MacCallum, R. C., and Browne, M. W. (1993). The use of causal indicators in covariance structure models: Some practical issues. Psychol. Bull. 114, 533–541. doi: 10.1037/0033-2909.114.3.533
Malaeb, Z. A., Summers, J. K., and Pugesek, B. H. (2000). Using structural equation modeling to investigate relationships among ecological variables. Environ. Ecol. Statist. 7, 93–111. doi: 10.1023/A:1009662930292
Marcoulides, G. A., and Chin, W. W. (2013). “You write, but others read: common methodological misunderstandings in PLS and related methods,” in New Perspectives in Partial Least Squares and Related Methods, eds H. Abdi, V. E. Vinzi, G. Russolillo, and L. Trinchera (New York, NY: Springer), 31–64.
Marcoulides, G. A., and Schumacker, R. E., editors (2001). New Developments and Techniques in Structural Equation Modeling. Mahwah, NJ: Lawrence Erlbaum Associates.
Markus, K. A., and Borsboom, D. (2013). Frontiers of Test Validity Theory: Measurement, Causation, and Meaning. New York, NY: Routledge.
McIntosh, A., and Gonzalez-Lima, F. (1994). Structural equation modeling and its application to network analysis in functional brain imaging. Hum. Brain Mapp. 2, 2–22.
Møller, C., Brandt, C. J., and Carugati, A. (2012). “Deliberately by design, or? Enterprise architecture transformation at Arla Foods,” in Advances in Enterprise Information Systems II, eds C. Møller, and S. Chaudhry (Boca Raton, FL: CRC Press), 91–104.
Shmueli, G., Ray, S., Estrada, J. M. V., and Chatla, S. B. (2016). The elephant in the room: Predictive performance of PLS models. J. Busin. Res. 69, 4552–4564. doi: 10.1016/j.jbusres.2016.03.049
Simon, H. (1969). The Sciences of the Artificial. Cambridge: MIT Press.
Sobel, M. E. (1997). “Measurement, causation and local independence in latent variable models,” in Latent Variable Modeling and Applications to Causality, ed M. Berkane (New York, NY: Springer), 11–28.
Spears, N., and Singh, S. N. (2004). Measuring attitude toward the brand and purchase intentions. J. Curr. Iss. Res. Advert. 26, 53–66. doi: 10.1080/10641734.2004.10505164
Steenkamp, J.-B. E., and Baumgartner, H. (2000). On the use of structural equation models for marketing modeling. Int. J. Res. Market. 17, 195–202. doi: 10.1016/S0167-8116(00)00016-1
Swain, A. (1975). A class of factor analysis estimation procedures with common asymptotic sampling properties. Psychometrika 40, 315–335. doi: 10.1007/BF02291761
Tenenhaus, A., and Tenenhaus, M. (2011). Regularized generalized canonical correlation analysis. Psychometrika 76, 257–284. doi: 10.1007/s11336-011-9206-8
Van de Schoot, R., Lugtig, P., and Hox, J. (2012). A checklist for testing measurement invariance. Eur. J. Develop. Psychol. 9, 486–492. doi: 10.1080/17405629.2012.686740
Vance, A., Benjamin Lowry, P., and Eggett, D. (2015). Increasing accountability
Muthén, B. O. (1984).
A general structural equation model with dichotomous, through user-interface design artifacts: a new approach to addressing ordered categorical, and continuous latent variable indicators. Psychometrika the problem of access-policy violations. MIS Quart. 39, 345–366. 49, 115–132. doi: 10.25300/MISQ/2015/39.2.04 Muthén, B. O. (2002). Beyond SEM: general latent variable modeling. Venables, W. N., and Ripley, B. D. (2002). Modern Applied Statistics With S, 4th Behaviormetrika 29, 81–117.doi: 10.2333/bhmk.29.81 Edn. New York, NY: Springer. Nevitt, J., and Hancock, G. R. (2001). Performance of bootstrapping Venkatesh, V., Morris, M. G., Davis, G. B., and Davis, F. D. (2003). User approaches to model test statistics and parameter standard error estimation acceptance of information technology: toward a uniﬁed view. MIS Quart. in structural equation modeling. Struc. Equat. Model. 8, 353–377. 27:425. doi: 10.2307/30036540 doi: 10.1207/S15328007SEM0803_2 Wight, D., Wimbush, E., Jepson, R., and Doi, L. (2015). Six steps in quality Pearson, K. (1901). On lines and planes of closest ﬁt to systems of intervention development (6SQuID). J. Epidemiol. Commun. Health 70, 520– points in space. Philos. Magazine 6 2, 559–572. doi: 10.1080/1478644010 525. doi: 10.1136/jech-2015-205952 9462720 Wold, H. (1975). “Path models with latent variables: The NIPALS approach. in R Core Team (2016). R: A Language and Environment for Statistical Computing. Quantitative Sociology, eds H. Blalock, A. Aganbegian, F. Borodkin, R. Boudon, Version 3.3.1. Vienna: R Foundation for Statistical Computing. and V. Capecchi (New York, NY: Academic Press), 307–357. Raykov, T. and Marcoulides, G. A. (2006). A First Course in Structural Equation Xiong, B., Skitmore, M., and Xia, B. (2015). A critical review of structural equation Modeling, 2nd Edn. Mahaw: Lawrence Erlbaum Associates. modeling applications in construction research. Automat. Construct. 49 (Pt A), Reichenbach, H. (1956). The Direction of Time. 
Berkeley, CA: University of 59–70. doi: 10.1016/j.autcon.2014.09.006 California Press. Rigdon, E. E. (2012). Rethinking partial least squares path modeling: in praise Conﬂict of Interest Statement: JH acknowledges a ﬁnancial interest in ADANCO of simple methods. Long Range Plan. 45, 341–358. doi: 10.1016/j.lrp.2012. and its distributor, Composite Modeling. 09.010 Rönkkö, M. (2016). matrixpls: Matrix-based Partial Least Squares Estimation. The remaining authors declare that the research was conducted in the absence of R package version 1.0.0. Available online at: https://cran.r-project.org/web/ any commercial or ﬁnancial relationships that could be construed as a potential packages/matrixpls/vignettes/matrixpls-intro.pdf conﬂict of interest. Rossiter, J. R. (2002). The C-OAR-SE procedure for scale development in marketing. Int. J. Res. Market. 19, 305–335. Copyright © 2018 Schuberth, Henseler and Dijkstra. This is an open-access article doi: 10.1016/S0167-8116(02)00097-6 distributed under the terms of the Creative Commons Attribution License (CC BY). Schumacker, R. E., and Lomax, R. G. (2009). A Beginner’s Guide to Structural The use, distribution or reproduction in other forums is permitted, provided the Equation Modeling, 3rd Edn. New York, NY: Routledge. original author(s) and the copyright owner(s) are credited and that the original Shah, R., and Goldstein, S. M. (2006). Use of structural equation modeling in publication in this journal is cited, in accordance with accepted academic practice. operations management research: looking back and forward. J. Operat. Manag. No use, distribution or reproduction is permitted which does not comply with these 24, 148–169. doi: 10.1016/j.jom.2005.05.001 terms. Frontiers in Psychology | www.frontiersin.org 14 December 2018 | Volume 9 | Article 2541
Frontiers in Psychology – Unpaywall
Published: Dec 13, 2018