Psychometrika, Vol. 83, No. 1, 21–47
SIMULTANEOUS COMPONENT ANALYSIS BY MEANS OF TUCKER3
KU LEUVEN – KULAK
A new model for simultaneous component analysis (SCA) is introduced that contains the existing SCA
models with common loading matrix as special cases. The new SCA-T3 model is a multi-set generalization
of the Tucker3 model for component analysis of three-way data. For each mode (observational units,
variables, sets) a different number of components can be chosen and the obtained solution can be rotated
without loss of fit to facilitate interpretation. SCA-T3 can be fitted on centered multi-set data and also on
the corresponding covariance matrices. For this purpose, alternating least squares algorithms are derived.
SCA-T3 is evaluated in a simulation study, and its practical merits are demonstrated for several benchmark datasets.
Key words: simultaneous components analysis, multi-set data, Tucker, Parafac, rotation.
Simultaneous component analysis (SCA) aims to summarize observed (centered) scores of
variables in samples of several subpopulations into a small number of components for each
sample. Such data are also known as multi-set data, where each set consists of observations of
the same variables in a sample from one subpopulation. When no constraints are imposed in the
SCA problem, the best summary in terms of explained variance is given by principal component
analysis (PCA) for each sample separately. To facilitate the comparison of the components found
for each subpopulation, several constraints have been proposed for SCA. Requiring the component
weight matrices to be congruent (i.e., columnwise proportional) guarantees equal definitions of
the components as linear combinations of the observed variables for each subpopulation. This
method is referred to as SCA-W. Alternatively, one may impose that the structure matrices are
congruent (SCA-S) to establish equal interpretation of the components across subpopulations.
Yet another possibility is to impose pattern congruence (SCA-P) to obtain proportional regression
weights for optimally reconstructing the variables from the components. Relations between SCA-W,
SCA-S, and SCA-P are discussed in Kiers and Ten Berge (1994). The authors show that SCA-W
always has the largest explained variance, followed by SCA-P, and then SCA-S. A disadvantage of
SCA-W is that it discriminates poorly between subpopulations with clearly different correlation
structures. Conversely, there can be a large gap in explained variance between doing separate
PCAs and SCA-S when correlation structures are very similar across subpopulations. Hence, no single
SCA method seems to be preferable in all cases.
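
The trade-off between unconstrained and constrained summaries can be illustrated numerically. The following is a minimal sketch, not the SCA-T3 algorithm of this paper: it compares separate PCAs per set with a common-loading fit in the spirit of SCA-P, here assumed to be obtainable from a truncated SVD of the vertically stacked centered sets with orthonormal loadings as the identification choice. The function names, the synthetic data, and the choice of R = 2 components are hypothetical and serve illustration only.

```python
import numpy as np


def center(X):
    """Column-center one data set (rows = observational units)."""
    return X - X.mean(axis=0)


def separate_pca_fit(sets, R):
    """Explained sum of squares when each set gets its own rank-R PCA."""
    explained = 0.0
    for X in sets:
        s = np.linalg.svd(X, compute_uv=False)  # singular values, descending
        explained += np.sum(s[:R] ** 2)
    return explained


def common_loading_fit(sets, R):
    """Rank-R fit with one loading matrix B shared by all sets.

    Assumption: B is taken as the first R right singular vectors of the
    stacked data, and the scores of set k are F_k = X_k B.
    """
    stacked = np.vstack(sets)
    _, _, Vt = np.linalg.svd(stacked, full_matrices=False)
    B = Vt[:R].T                      # J x R, orthonormal columns
    scores = [X @ B for X in sets]    # one score matrix per set
    explained = sum(np.sum(F ** 2) for F in scores)
    return explained, B, scores


# Synthetic multi-set data: three samples observed on the same J variables.
rng = np.random.default_rng(0)
J, R = 6, 2
sets = [
    center(rng.standard_normal((n, J)) @ rng.standard_normal((J, J)))
    for n in (40, 60, 50)
]

total = sum(np.sum(X ** 2) for X in sets)
pca_ev = separate_pca_fit(sets, R) / total
sca_ss, B, scores = common_loading_fit(sets, R)
sca_ev = sca_ss / total

print(f"separate PCAs:  {pca_ev:.3f} of total variance explained")
print(f"common loading: {sca_ev:.3f} of total variance explained")
```

Because every set's reconstruction under the common loading matrix has rank at most R, its explained variance cannot exceed that of the separate PCAs; the size of the gap indicates how much the sets differ in structure.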
Working within the framework of SCA-P, Timmerman and Kiers (2003) consider three other
variants of SCA. The SCA-PF2 model is based on the multi-set Parafac2 model (Harshman,
1972) and is equal to SCA-P with identical component correlations across subpopulations. The
Electronic supplementary material: The online version of this article (doi:10.1007/s11336-017-9568-7) contains
supplementary material, which is available to authorized users.
Correspondence should be made to Alwin Stegeman, Group Science, Engineering and Technology, KU Leuven –
Kulak, E. Sabbelaan 53, 8500 Kortrijk, Belgium. Email: email@example.com; http://www.alwinstegeman.nl
© 2017 The Psychometric Society