Examining the Latent Structure of the Delis–Kaplan Executive Function System

Abstract

Objective: The current study aimed to determine whether the Delis–Kaplan Executive Function System (D-KEFS) taps into three executive function factors (inhibition, shifting, fluency) and to assess the relationship between these factors and tests of executive-related constructs less often measured in latent variable research: reasoning, abstraction, and problem solving.

Method: Participants included 425 adults from the D-KEFS standardization sample (20–49 years old; 50.1% female; 70.1% White). Eight alternative measurement models were compared based on model fit, with test scores assigned a priori to three factors: inhibition (Color-Word Interference, Tower), shifting (Trail Making, Sorting, Design Fluency), and fluency (Verbal/Design Fluency). The Twenty Questions, Word Context, and Proverb Tests were predicted in separate structural models.

Results: The three-factor model fit the data well (CFI = 0.938; RMSEA = 0.047), although a two-factor model, with shifting and fluency merged, fit similarly well (CFI = 0.929; RMSEA = 0.048). A bifactor model fit best (CFI = 0.977; RMSEA = 0.032) and explained the most variance in shifting indicators, but rarely converged among 5,000 bootstrapped samples. When the three first-order factors simultaneously predicted the criterion variables, only shifting was uniquely predictive (p < .05; R² = 0.246–0.408). The bifactor significantly predicted all three criterion variables (p < .001; R² = 0.141–0.242).

Conclusions: Results supported a three-factor D-KEFS model (i.e., inhibition, shifting, and fluency), although shifting and fluency were highly related (r = 0.696). The bifactor showed superior fit, but converged less often than other models. Shifting best predicted tests of reasoning, abstraction, and problem solving. These findings support the validity of D-KEFS scores for measuring executive-related constructs and provide a framework through which clinicians can interpret D-KEFS results.

Keywords: Assessment, Executive functions, Test construction

Introduction

A seminal article in executive function research reported a confirmatory factor analysis that supported three latent variables based on the common construct variance in a battery of experimental tasks: updating, shifting, and inhibition (Miyake et al., 2000). Following this article, numerous subsequent latent variable studies have examined the structure of executive functions throughout the lifespan (e.g., de Frias, Dixon, & Strauss, 2009; Friedman, Corley, Hewitt, & Wright, 2009; Lehto, Juujärvi, Kooistra, & Pulkkinen, 2003; Fournier-Vicente, Larigauderie, & Gaonac'h, 2008). Tests of executive functions are inherently impure, and confirmatory factor analysis helps control for this impurity (Miyake & Friedman, 2012). Executive functions operate through lower-level abilities as control processes involved in the self-regulation of cognition and behavior (Jurado & Rosselli, 2007), meaning that non-executive cognitive functions contribute to performances on executive-related neuropsychological measures (Burgess, 1997; Duggan & Garcia-Barrera, 2015; Phillips, 1997). A confirmatory factor analytic approach results in latent variables based on shared variance between tests of executive functions, providing a purer estimate of these higher-order cognitive abilities than any individual test (Miyake et al., 2000).
Since the first confirmatory factor analysis on executive functions, researchers have recommended the evaluation of latent factors in clinical settings (Miyake, Emerson, & Friedman, 2000), but clinicians do not commonly use composite scores based on evidence-based factors in clinical practice. The three most heavily researched factors of executive function – the updating of working memory, the shifting of mental sets, and the inhibition of prepotent responses – have garnered substantial empirical support through conceptual replications of the first measurement model (Miyake et al., 2000) across different age groups (e.g., de Frias et al., 2009; Friedman et al., 2009; Klauer, Schmitz, Teige-Mocigemba, & Voss, 2010; Lehto et al., 2003). More recently, inhibition has been reconceptualized as a more unitary dimension of the executive system (Miyake & Friedman, 2012). However, despite the evidence for these factors, they have been evaluated only in experimental contexts, with no published applications of these factors to clinical test batteries.

Although executive-related composite scores remain uncommon in clinical settings (Rabin, Paolillo, & Barr, 2016), clinicians have moved towards broadband assessments of executive functions by developing batteries that evaluate diverse higher-order abilities through multiple tests (Jurado & Rosselli, 2007). Increasingly common in clinical practice, the Delis–Kaplan Executive Function System (D-KEFS; Delis, Kaplan, & Kramer, 2001) is a nine-test battery composed of traditional and newly developed measures of executive functions. Past researchers have evaluated its latent structure, identifying some evidence of diverse latent abilities explaining test performances (Floyd, Bergeron, Hamilton, & Parra, 2010; Latzman & Markon, 2010). However, these researchers used largely exploratory approaches, although confirmatory approaches – specifying a priori measurement models that consider task impurity and previous empirical findings – have since become the standard of the field when examining the latent structure of executive functions.

Analyzing the correlation matrices published in the D-KEFS technical manual (Delis et al., 2001), Latzman and Markon (2010) used an exploratory factor analysis to identify a three-factor solution for the D-KEFS Total Achievement scores (for all tests except the Proverb Test) across three age spans (i.e., 8–19, 20–49, and 50–89 years). These researchers found that performances on D-KEFS tests grouped to form test-specific factors rather than construct-specific factors: only Sorting outcomes loaded on the first factor, only Verbal Fluency outcomes loaded on the second factor, and two Color-Word Interference Test outcomes plus one Trail Making Test outcome (i.e., all timed tests) loaded on the third factor. The researchers labeled these factors cognitive flexibility, monitoring, and inhibition, respectively.

A second research team administered 25 tests from the D-KEFS and the third edition of the Woodcock-Johnson Tests of Cognitive Abilities (WJ-III) to a sample of 100 children, evaluating whether the latent structure of a combined test battery aligned with the Cattell-Horn-Carroll (CHC) theory of cognitive abilities (Floyd et al., 2010). These researchers found the D-KEFS tasks dispersed across different CHC factors in the exploratory factor analysis, while also identifying support for a second-order general factor.
Taking these findings, the researchers ran a confirmatory model of their exploratory factor structure and identified acceptable fit. However, many D-KEFS tests loaded on factors conceptually disparate from the constructs the tests were purported to measure. The D-KEFS Color-Word Interference Test, a version of the Stroop test (i.e., a measure used as an indicator of inhibition in past measurement models; Klauer et al., 2010; Miyake et al., 2000), loaded on the Processing Speed (Gs) factor, likely because its primary outcome is time-to-completion. Three other D-KEFS tests (i.e., Free Sorting, Word Context, and Twenty Questions) loaded on a Comprehension-Knowledge (Gc) factor. These three tests tap into higher-order cognition, but likely share variance with Gc tasks because of their reliance on crystallized knowledge. When these tests co-loaded on a broad Executive Function factor, the overall model fit improved, indicating unexplained executive function variance remaining in the model; however, these additional paths were rejected due to a lack of parsimony.

Both past factor analyses of D-KEFS data ignore the task impurity problem endemic to the measurement of executive functions (Burgess, 1997; Duggan & Garcia-Barrera, 2015; Phillips, 1997) and identify factors based largely on common method variance rather than the variance of executive function constructs. Floyd and colleagues (2010) concluded that "if there are measures of abilities associated with executive functions, they are contaminated by the general factor and more specific ability factors, so that there is probably little unique about them" (p. 734). Notably, in a second-order confirmatory factor model, the relationship between the second-order factor and the manifest variables is fully mediated by the first-order factors (Reise, Moore, & Haviland, 2010). In turn, the executive function variance not accounted for by first-order factors, composed largely of method variance, remains unexplained error variance in the model. The primary issue with the approach of these researchers is their limited consideration of past theory and research on executive functions. A latent variable analysis of the D-KEFS guided by past empirical evidence would provide an outcome more useful to clinicians using this broadband measure in clinical settings.

Although flawed, these past factor analyses evidence (a) the task impurity of the D-KEFS measures – as is common to all tests of executive functions (Burgess, 1997; Duggan & Garcia-Barrera, 2015; Phillips, 1997) – and (b) the latent interrelatedness between these measures. In turn, interpreting D-KEFS tests independently of one another provides a biased estimate of executive functions, whereas an interpretation that considers performance patterns in aggregate will align more closely with current research on the construct, so long as the interpretation accounts for the influence of common method variance. With its multidimensional nature, the D-KEFS is an appropriate measure for the development of composite scores and the translation of evidence-based factors into clinical practice. However, to develop composite scores, a confirmatory factor analysis must first demonstrate that these evidence-based factors explain performance patterns across D-KEFS tests.
As noted earlier, three factors of diverse executive functions (i.e., updating, shifting, and inhibition) have garnered substantial empirical support; however, they do not constitute an exhaustive list of executive functions (Miyake & Friedman, 2012; Miyake et al., 2000). Five terms for executive functions occur most often in the literature: planning, working memory, fluency, set-shifting, and inhibition (Packwood, Hodgetts, & Tremblay, 2011). The D-KEFS battery offers a series of tests that tap into many of these established constructs and provides a method for evaluating theoretical executive functions via a norm-referenced and clinically validated battery of tests. Among the executive function constructs measured by the D-KEFS, those with multiple tests available to serve as indicators in a factor analytic model include inhibition, shifting, and fluency. Both inhibition (i.e., the volitional restriction of a dominant or prepotent response pattern in reaction to a change in task demands) and shifting (i.e., the flexible switching between mental sets) are consistent with those more basic functions (Miyake & Friedman, 2012), but the third factor of updating (i.e., the monitoring of working memory content, along with the active addition and deletion of said content) is not directly measured by any D-KEFS test. However, the D-KEFS does include fluency tasks, and the two mechanisms possibly underlying updating are the "effective gating of information and controlled retrieval from long-term memory" (Miyake & Friedman, 2012, p. 11). Verbal fluency performances require the strategic retrieval of information from long-term memory, involving both working memory and the lexicon (Shao, Janse, Visser, & Meyer, 2014; Unsworth, Spillers, & Brewer, 2011). Within the D-KEFS framework, fluency may therefore serve as the best proxy for updating.

Complementing these three factors, higher-order dimensions of executive function may be present in the D-KEFS data. Recently published statistical models have shown a common executive function dimension fully explaining the variance in inhibition, while also explaining a significant amount of variance in shifting and updating (Fleming, Heintzelman, & Bartholow, 2016; Friedman et al., 2008; Friedman, Miyake, Robinson, & Hewitt, 2011; Ito et al., 2015). The D-KEFS may therefore demonstrate a similar structure, whereby a common executive function explains performances across inhibition, shifting, and fluency.

In addition to these three basic factors, additional tasks embedded in the D-KEFS, such as the Twenty Questions, Word Context, and Proverb Tests, may tap into constructs that have not been extensively evaluated in past confirmatory factor analyses (e.g., abstraction, reasoning, and problem solving; Baron, 2004; Delis et al., 2001; Shunk, Davis, & Dean, 2006). These constructs are semantically related to each other, but based on a latent semantic analysis (Packwood et al., 2011), they were unrelated to the five terms most frequently used to describe executive functions. The D-KEFS offers a method for evaluating whether inhibition, shifting, or fluency is substantially related to these constructs, and these tests may best serve as outcomes in structural models rather than as separable factors in measurement models.
The D-KEFS is a widely disseminated, multidimensional clinical instrument, and the first goal of the current study was to use the normative D-KEFS data to link research on executive functions with clinical practice by deriving evidence-based factors of executive functions from the D-KEFS test battery. Achieving this goal involves two research aims that could guide the future development of evidence-based composite scores for executive functions in clinical settings. Specifically, we aimed to (a) derive a three-factor statistical model of executive functions (i.e., inhibition, shifting, and fluency) from D-KEFS tests, and (b) compare this three-factor model to alternative first-order models (e.g., one-factor and two-factor models) and a bifactor model (Reise, 2012), to determine any support for a common executive function composite. The second goal of this study was to evaluate how well these factors explained performance on tasks related to abstraction, reasoning, and problem solving, using an analytical framework similar to the structural models tested by Miyake and colleagues (2000). To achieve this goal, we aimed to evaluate the relative contribution of each factor in predicting complex task performances through (a) each factor separately serving as a single predictor, (b) all three factors simultaneously serving as predictors, and (c) a common executive function bifactor serving as the sole predictor.

Method

Participants

Participant data came from the D-KEFS normative sample. The D-KEFS norming procedure involved a standardized sampling of 1,750 participants (8–89 years old), with representation of sex, age, race/ethnicity, education, and geographic region consistent with the 2000 U.S. Census data. This large-scale systematic sampling procedure served to minimize sampling bias and improve the generalizability of inferences drawn from the normative data. The data were received anonymized from Pearson, and an institutional Human Research Ethics Board approved the secondary analyses conducted herein. For the current study, an adult sub-sample was selected for analysis (N = 425; 20–49 years old; 49.9% male; 70.1% White, 13.6% African-American, 12.2% Hispanic, 4% other races/ethnicities). This age range was selected because three-factor and incomplete bifactor models have been consistently observed in previous studies examining similarly aged samples (Fleming et al., 2016; Ito et al., 2015; Klauer et al., 2010; Miyake et al., 2000). An a priori power analysis also estimated the sample size for this age span to be sufficient (estimated power π̂ ≈ 0.85) based on the approximate degrees of freedom for the evaluated models (df ≈ 25; Hancock, 2006). The D-KEFS technical manual (Delis et al., 2001) offers further information on the normative sample. A random sub-sample of 358 participants also completed the Wechsler Abbreviated Scale of Intelligence (WASI; Wechsler, 1999).

Materials

D-KEFS test scores were assigned a priori to three factors: inhibition, shifting, and fluency. Additional test scores were used as control or criterion variables.

Inhibition

Two D-KEFS tests tap into inhibition: the Tower and Color-Word Interference Tests. The Tower Test involves participants re-organizing disks on three pegs to match a target design. The primary Tower Test outcome is the Total Achievement score, which is a sum of points awarded in each trial, weighted by moves-to-completion. One structural model found an alternative version of the tower test to be related to inhibition (Miyake et al., 2000).
The Color-Word Interference Test involves two conditions requiring inhibition, both with a primary score of time-to-completion: Inhibition and Inhibition/Switching. The Inhibition trial consists of an incongruent Stroop condition, where participants must name the color of the ink rather than read the word. The Inhibition/Switching trial requires participants to name either the color of the ink or read the word, depending on whether the word is displayed inside a box.

Shifting

The first shifting indicator was time-to-completion on the Trail Making Test Number-Letter Switching trial. This test requires participants to connect labeled dots in both alphabetical and numerical order while actively switching between number and letter sets. The second indicator was the Sorting Test: Confirmed Correct Sorts, which has correlated significantly (r = 0.59) with performance on the Wisconsin Card Sorting Test (Delis et al., 2001), another test requiring shifting for effective performance (Miyake et al., 2000). Participants arrange cards based on verbal and visual-spatial characteristics without being told how to sort the cards, shifting from previous sorting rules to new rules to attain a greater number of accurate sorts. The last indicator was the Switching–Total Correct score (i.e., the number of unique designs accurately drawn) from the Design Fluency Test, where participants draw novel abstract designs while switching between connecting filled and empty dots.

Fluency

For the Verbal Fluency Test, participants complete multiple one-minute sub-trials, where they must generate as many words as possible that begin with a given letter or fit within a given category. The sums of the total correct words generated across the respective sub-trials served as fluency indicators: Letter Fluency–Total Correct and Category Fluency–Total Correct. The Design Fluency Test involves two trials where participants connect empty or filled dots to draw novel abstract designs. The total number of unique designs drawn across these two trials served as the third fluency indicator.

Control variables

The D-KEFS includes contrast scores through which scaled scores on control conditions are subtracted from scaled scores on conditions involving a greater executive demand (e.g., for the Color-Word Interference Test, the scaled score for combined performance on the Word Reading and Color Naming conditions is subtracted from the scaled score for the Inhibition condition to obtain a contrast score). Conceptually, these contrast scores would result in a purer score of executive function, controlling for lower-level cognitive abilities. However, raw difference scores have been criticized for poor psychometric properties (Furr, 2011), and an alternative residualized difference score was used in analyses (Tucker, Damarin, & Messick, 1965). Previous confirmatory factor analyses on executive function test batteries have used similar scores as dependent variables (Bettcher et al., 2016; Frazier et al., 2015; Pettigrew & Martin, 2014). To calculate a residualized difference score, an ordinary least squares regression was conducted, where the control condition (e.g., combined Word Reading and Color Naming performance) predicts the condition with higher cognitive demands (e.g., the Inhibition condition). The residual from this equation is saved and used as a residualized difference score.
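To make the residualization procedure concrete, the sketch below implements it with ordinary least squares in plain NumPy. This is an illustration of the approach described above rather than the authors' code, and the variable names are placeholders.

```python
import numpy as np

def residualized_score(control, executive):
    """Residualize an executive-demand score on its control condition.

    Returns the OLS residuals (orthogonal to the control score) and the
    share of the executive score's variance that the residuals retain,
    i.e., 1 - R^2 from the control regression.
    """
    control = np.asarray(control, dtype=float)
    executive = np.asarray(executive, dtype=float)
    slope, intercept = np.polyfit(control, executive, deg=1)
    residual = executive - (intercept + slope * control)
    r_squared = 1.0 - residual.var() / executive.var()
    return residual, 1.0 - r_squared

# Illustration with simulated scores: residualize Color-Word Interference
# Inhibition time on combined Word Reading + Color Naming performance.
rng = np.random.default_rng(0)
speed = rng.normal(10, 3, size=425)
inhibition_time = 0.5 * speed + rng.normal(0, 2.5, size=425)
score, variance_retained = residualized_score(speed, inhibition_time)
```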
The residualized difference score is orthogonal to the score from the control condition, and it represents the variance in the executive function task that is not attributable to performance on the control condition. The percentage of variance in the executive function task's scaled score represented by the residualized difference score can be calculated as 1 minus the R² value from the regression equation used to acquire the residualized score (i.e., the variance not attributable to the control condition).

To control for processing speed, the two Color-Word Interference scores were made orthogonal to the summed performance on the Word Reading and Color Naming trials. For the Color-Word Interference Test, 70.90% of the variance in the scaled score for the Inhibition condition and 76.20% of the variance in the scaled score for the Inhibition/Switching condition were represented in the residualized scores. Also controlling for processing speed, the Trail Making Number-Letter Switching score was made orthogonal to the summed performance on the Number and Letter Sequencing trials. This residualized score represented 71.00% of the variance in the original Number-Letter Switching scaled score. The Design Fluency Switching score that loaded on the shifting factor was made orthogonal to the Design Fluency score that loaded on the fluency factor. This residualized score represented 83.50% of the variance in the original Design Fluency Switching score. Lastly, the Verbal Fluency scores were made orthogonal to the WASI Vocabulary subtest, controlling for the impact of language functioning on these outcomes. A total of 79.00% of the variance in the scaled score for the Letter Fluency condition and 84.10% of the variance in the scaled score for the Category Fluency condition were represented in the residualized scores.

Criterion variables

Three measures from the D-KEFS (i.e., the Twenty Questions, Word Context, and Proverb Tests) served as criterion variables in structural models, predicted by the latent variables included in the measurement model. These D-KEFS tests require abstraction, reasoning, and problem solving for effective performance (Baron, 2004; Delis et al., 2001; Shunk et al., 2006), constructs semantically distinct from the latent variables included in the measurement model (Packwood et al., 2011). For the Twenty Questions Test, participants ask a series of questions to identify which picture the examiner pre-selected from a page of images, attempting to identify the correct image with the fewest questions possible. The Twenty Questions outcome used in the analysis was the Total Weighted Achievement Score. For the Word Context Test, participants are provided clues about the definitions of a set of neologisms, attempting to infer their meaning with as few clues as possible. The Word Context criterion score was the Total Consecutively Correct. During the Proverb Test, participants describe the meaning of a series of proverbs, with their accuracy and level of abstraction summed. The Proverb Test Total Achievement Score: Free Inquiry was used in analysis.

Statistical Analysis

The normative data are age-corrected and standardized for all D-KEFS variables (M = 10, SD = 3) and WASI variables (M = 50, SD = 10), with higher values indicating better performance. All structural equation modeling was conducted in Mplus v7.3 (Muthén & Muthén, 2014), using maximum likelihood to estimate model parameters based on the covariance matrix of the selected D-KEFS outcomes.
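The authors fit all models in Mplus. Purely as an illustration of the hypothesized specification, the same three-factor model could be written in lavaan-style syntax with the open-source semopy package; the use of semopy and the indicator column names below are our assumptions, not part of the original analysis.

```python
import pandas as pd
from semopy import Model, calc_stats  # assumption: semopy is installed

# Hypothesized three-factor measurement model; "=~" defines a factor by
# its indicators. Column names are hypothetical placeholders.
THREE_FACTOR = """
inhibition =~ cwit_inhibition + cwit_inhibition_switching + tower_total
shifting   =~ tmt_switching + sorting_confirmed + dft_switching
fluency    =~ vf_letter + vf_category + dft_total
"""

def fit_cfa(description: str, data: pd.DataFrame) -> pd.DataFrame:
    """Fit a CFA by maximum likelihood and return fit statistics
    (chi-square, CFI, TLI, RMSEA, AIC, BIC, and so on)."""
    model = Model(description)
    model.fit(data)
    return calc_stats(model)
```

The correlated Verbal Fluency errors added later (see Results) would appear as one extra line in the description, `vf_letter ~~ vf_category`.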
Fit indices

All models were evaluated in relation to statistical fit, with optimally fitting models selected to guide interpretation. The reported fit indices include the χ² Test of Model Fit, the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC), the Root Mean Square Error of Approximation (RMSEA), the Comparative Fit Index (CFI), and the Tucker–Lewis Index (TLI). With a large sample size, the χ² test has increased power to reject the null hypothesis (Tanaka, 1987), so alternative fit indices served as the primary metrics for interpretation. The AIC represents a modification of the χ² statistic that penalizes for model complexity, with values closer to zero representing better fit (Akaike, 1974). Closely related to the AIC, the BIC can be interpreted in the same manner (Schwarz, 1978). For the CFI and TLI, values closer to 1.0 indicate better fit (Bentler, 1990; Cheung & Rensvold, 2002), while for the RMSEA, values closer to zero are desirable. Past researchers have posited fit cutoffs of ≥.95 for the CFI/TLI and ≤.06 for the RMSEA (Hu & Bentler, 1999). Some alternative recommendations provide greater leniency on these cutoffs, including a lower-bound cutoff of ≥.90 for the CFI (Bentler & Bonett, 1980) and an upper-bound cutoff of ≤.07 for the RMSEA (Steiger, 2007). The ∆CFI served as the primary metric for model comparison, using the recommended cutoff of ∆CFI ≥ .01 for significantly improved fit (Cheung & Rensvold, 2002). The ∆BIC was also used in model comparison, where BIC values closer to zero were preferable. Typical cutoffs for the ∆BIC include 0–2 for weak, 2–6 for positive, 6–10 for strong, and >10 for very strong evidence of one model being preferable over another (Raftery, 1995).

Measurement models

The hypothesized three-factor model (i.e., inhibition, shifting, and fluency) was evaluated first for statistical fit, and alternative measurement models were evaluated thereafter. These alternative models included a one-factor model and a set of three two-factor models merging two of the first-order factors: inhibition=fluency, fluency=shifting, and inhibition=shifting. A bifactor model was also evaluated, where indicators co-loaded onto their specific factor (i.e., inhibition, shifting, or fluency) and a general factor (i.e., common executive function). Aligning with past research, an incomplete bifactor model (Chen, West, & Sousa, 2006) was also evaluated, where a specific inhibition factor was not included in the model (Miyake & Friedman, 2012). For all models evaluated, 5,000 samples were bootstrapped to calculate how often among those samples the factor model "properly converged," meaning the model both converged and provided an admissible solution (e.g., no correlations above 1.0, no negative residual variances).

Reliability

The factors included in each model were evaluated for their reliability using omega (ω) as an estimate, which represents the ratio of true-score variance to the total variance among indicators loading onto a given factor (McDonald, 1999). In the context of the bifactor model, the reliability of the general and specific factors was calculated using omega-hierarchical (ωH) and omega-subscale (ωS) values, which provide estimates of the variance accounted for by the bifactor and the specific factors (i.e., inhibition, shifting, fluency), respectively (Reise, 2012). There are no commonly used cutoff values to guide interpretation of ω values, but the values can be interpreted as the amount of variance in the indicators attributable to the factor and not error.
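As a hedged illustration of how these reliability estimates are computed, the sketch below derives omega for a single factor and omega-hierarchical for a bifactor model from standardized loadings, assuming uncorrelated residuals so that each indicator's residual variance is 1 − λ²:

```python
import numpy as np

def omega(loadings):
    """Coefficient omega for one factor from standardized loadings
    (McDonald, 1999), assuming uncorrelated residuals."""
    lam = np.asarray(loadings, dtype=float)
    common = lam.sum() ** 2
    error = np.sum(1.0 - lam ** 2)
    return common / (common + error)

def omega_hierarchical(general, specifics):
    """Omega-hierarchical for the general factor of a bifactor model
    (Reise, 2012). `specifics` is a list of loading arrays, one per
    specific factor, jointly covering every indicator exactly once."""
    g = np.asarray(general, dtype=float)
    spec = [np.asarray(s, dtype=float) for s in specifics]
    common_general = g.sum() ** 2
    common_specific = sum(s.sum() ** 2 for s in spec)
    error = len(g) - np.sum(g ** 2) - sum(np.sum(s ** 2) for s in spec)
    return common_general / (common_general + common_specific + error)
```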
Structural Model

The best-fitting first-order factor model – and the bifactor model, if it showed acceptable fit – was treated as accepted, with its latent variable(s) used as predictors in structural models. In a series of models, either the diverse factors (i.e., inhibition, shifting, fluency) or the unitary factor (i.e., common executive function) predicted performances on three complex tasks of executive functions (i.e., the Twenty Questions, Word Context, and Proverb Tests). The path coefficient from each factor to each complex task was evaluated for its unique significance, along with the R² value for each model, to determine the amount of variance accounted for in each task.
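Continuing the hypothetical semopy illustration above, the structural models add regression paths (denoted `~` in this syntax) from the latent factors to a criterion score; the criterion column name is again a placeholder.

```python
# Simultaneous prediction of the Twenty Questions criterion by all three
# first-order factors; single-predictor variants would instead use, e.g.,
# "twenty_questions ~ shifting".
SIMULTANEOUS_20Q = THREE_FACTOR + "twenty_questions ~ inhibition + shifting + fluency\n"
```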
Results

The descriptive statistics and bivariate correlations for all variables are provided in Tables 1 and 2, respectively. Skewness and kurtosis were within normal limits, indicating an approximate univariate normal distribution for each variable. One multivariate outlier was identified via Mahalanobis distance, but the exclusion of the outlier did not change the results, and the outlier's data were included in analyses.

Table 1. Descriptive statistics for variables included in measurement and structural models

Indicator | n | Mean | SD | Skewness | Kurtosis | Min. | Max.
Color-Word Interference Test: Inhibition Time* | 421 | 0.09 | 0.96 | −0.32 | 1.10 | −3.57 | 4.04
Color-Word Interference Test: Inhibition/Switching Time* | 421 | 0.00 | 0.96 | −0.84 | 1.17 | −3.25 | 2.77
Tower Test: Total Achievement | 367 | 10.26 | 3.02 | −0.09 | 0.42 | 1.00 | 19.00
Trail Making Test: Switch Time* | 424 | 0.04 | 0.95 | −1.09 | 1.82 | −3.70 | 2.71
Design Fluency Test: Total Designs - Switch Dots* | 425 | 0.04 | 0.98 | −0.65 | 1.22 | −3.47 | 2.72
Sorting Test: Total Confirmed Sorts | 425 | 9.96 | 3.23 | −0.60 | 0.79 | 1.00 | 18.00
Verbal Fluency Test: Letters - Total Correct Words* | 357 | 0.11 | 1.04 | 0.41 | 0.20 | −2.52 | 3.48
Verbal Fluency Test: Category - Total Correct Words* | 358 | 0.09 | 1.02 | −0.06 | 0.03 | −3.00 | 2.98
Design Fluency Test: Filled + Empty | 425 | 9.92 | 3.26 | −0.02 | −0.61 | 1.00 | 19.00
Twenty Questions Test: Total Weighted Achievement | 417 | 10.02 | 3.09 | −0.80 | 0.32 | 1.00 | 16.00
Word Context Test: Total Achievement | 423 | 10.18 | 2.94 | −0.58 | 0.02 | 1.00 | 18.00
Proverb Test: Total Achievement | 417 | 10.08 | 2.69 | −0.75 | 0.29 | 1.00 | 14.00

Note: *Indicates a value that was residualized of variance attributable to a control variable.
Table 2. Correlation matrix for variables included in measurement and structural models (columns follow the row numbering)

1. Color-Word Interference Test: Inhibition Time+ | 1
2. Color-Word Interference Test: Inhibition/Switching Time+ | .467** | 1
3. Tower Test: Total Achievement | .136** | .224** | 1
4. Trail Making Test: Switch Time+ | .153** | .266** | .073 | 1
5. Design Fluency Test: Switch Dots+ | .124* | .159** | .104* | .032 | 1
6. Sorting Test: Total Confirmed Sorts | .121* | .195** | .223** | .184** | .250** | 1
7. Verbal Fluency Test: Letters - Total Correct Words+ | .072 | −.001 | .068 | .047 | .029 | .088 | 1
8. Verbal Fluency Test: Category - Total Correct Words+ | .026 | .094 | .029 | .132* | .039 | .149** | .443** | 1
9. Design Fluency Test: Filled + Empty | .134** | .193** | .069 | .166** | .032 | .271** | .180** | .202** | 1
10. Twenty Questions Test: Total Weighted Achievement | .052 | .170** | .131* | .169** | .137** | .226** | −.031 | .102 | .097* | 1
11. Word Context Test: Total Achievement | .108* | .152** | .181** | .226** | .127** | .337** | .060 | .165** | .178** | .273** | 1
12. Proverb Test: Total Achievement | .124* | .143** | .168** | .181** | .153** | .333** | .103 | .197** | .177** | .240** | .386** | 1

Note: *p < .05; **p < .01; +Indicates a value that was residualized of variance attributable to a control variable.

Measurement Models

Table 3 provides the fit indices for each model evaluated. The hypothesized three-factor model did not meet full criteria for adequate fit (CFI = 0.871; RMSEA = 0.066). Modification indices recommended correlated errors between the two Verbal Fluency scores, and the three-factor model with this modification had good statistical fit (CFI = 0.938; RMSEA = 0.047).
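For readers who want to connect the reported indices to the χ² values in Table 3, the sketch below computes the RMSEA point estimate from a model's χ², degrees of freedom, and sample size; the CFI additionally requires the baseline-model χ², which the table does not report.

```python
import math

def rmsea(chi2, df, n):
    """RMSEA point estimate from the maximum likelihood chi-square."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

def cfi(chi2, df, chi2_baseline, df_baseline):
    """CFI from the model and baseline (independence-model) chi-squares."""
    d_model = max(chi2 - df, 0.0)
    d_baseline = max(chi2_baseline - df_baseline, d_model)
    return 1.0 - d_model / d_baseline

print(round(rmsea(44.542, 23, 425), 3))  # 0.047, matching the value above
```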
The fit of the three-factor model surpassed the fit of the unidimensional model (CFI = 0.838; RMSEA = 0.072) and all models with merged factors. Although fit was much better for the three-factor model than for the models that merged inhibition with shifting (CFI = 0.871; RMSEA = 0.065) or with fluency (CFI = 0.854; RMSEA = 0.069), the model that merged fluency and shifting showed good fit (CFI = 0.929; RMSEA = 0.048) that was not significantly different from the fit of the three-factor model (∆CFI = 0.009), although it came extremely close to the ∆CFI ≥ .01 threshold for a significant difference. The ∆BIC between these two models was 6.803, with a preference for the merged-factor model; however, all other fit indices showed better fit for the three-factor model. In turn, the three-factor model was the accepted first-order factor model, displayed graphically in Fig. 1. Based on the bootstrapping analysis, this model properly converged at a frequency of 97.98%, with convergence frequencies for the other models presented in Table 3. Reliability estimates were calculated for each latent variable included in the three-factor model, resulting in ω values of 0.59 for inhibition, 0.38 for shifting, and 0.38 for fluency.

Table 3. Measurement model fit indices

Model | χ² (p) | df | AIC | BIC | CFI | TLI | RMSEA (90% C.I.) | Percent convergence
Three Factors (Inh., Shi., Flu.) | 69.017 (.0000) | 24 | 12733.811 | 12855.374 | 0.871 | 0.807 | 0.066 (0.048–0.085) | 99.88%
Three Factors (Inh., Shi., Flu.), VFT corr. | 44.542 (.0045) | 23 | 12711.336 | 12836.951 | 0.938 | 0.904 | 0.047 (0.026–0.067) | 97.98%
Two Factors (Flu., Inh.=Shi.), VFT corr. | 70.203 (.0000) | 25 | 12732.997 | 12850.508 | 0.871 | 0.814 | 0.065 (0.047–0.084) | 93.10%
Two Factors (Shi., Inh.=Flu.), VFT corr. | 76.030 (.0000) | 25 | 12738.825 | 12856.335 | 0.854 | 0.790 | 0.069 (0.052–0.087) | 100%
Two Factors (Inh., Shi.=Flu.), VFT corr. | 49.804 (.0023) | 25 | 12712.598 | 12830.108 | 0.929 | 0.898 | 0.048 (0.028–0.068) | 100%
Unidimensional (Inh.=Shi.=Flu.), VFT corr. | 82.683 (.0000) | 26 | 12743.477 | 12856.935 | 0.838 | 0.776 | 0.072 (0.055–0.089) | 100%
Bifactor | 25.963 (.1006) | 18 | 12702.757 | 12848.633 | 0.977 | 0.954 | 0.032 (0.000–0.058) | 65.86%
Bifactor without Inh. | Did not converge.

Note: χ² = Chi-Square Test of Model Fit; AIC = Akaike Information Criterion; BIC = Bayesian Information Criterion; CFI = Comparative Fit Index; Flu. = Fluency; Inh. = Inhibition; RMSEA = Root Mean Square Error of Approximation; Shi. = Shifting; TLI = Tucker–Lewis Index; VFT corr. = Verbal Fluency Test score errors correlated.
Fig. 1. Three-factor first-order measurement model. ST = Sorting Test; CWIT = Color-Word Interference Test; DFT = Design Fluency Test; TMT = Trail Making Test; TWT = Tower Test; VFT = Verbal Fluency Test. *Indicates a value that was residualized of variance attributable to a control variable. Errors are not displayed above, but were as follows: CWIT Inhibition = 0.682; CWIT Inhibition/Switching = 0.326; TWT Total Achievement = 0.917; TMT Switch Time = 0.843; DFT Switch Dots = 0.918; ST Total Confirmed Sorts = 0.710; VFT Letters Total Correct Words = 0.936; VFT Categories Total Correct Words = 0.884; DFT Filled + Empty = 0.607.

The bifactor model, displayed graphically in Fig. 2, was also evaluated and showed the best fit based on some indices (CFI = 0.977; RMSEA = 0.032).
It was the only model to provide a non-significant χ² value (p = .1006) despite the large sample size, but it had a ∆BIC of 11.682 when compared to the three-factor model. The bifactor model also properly converged among only 65.86% of 5,000 bootstrapped samples. Some loadings in the bifactor model were non-significant, including Trail Making on the shifting-specific factor and the Verbal Fluency phonemic trial on the bifactor. Although non-significant, their removal did not improve statistical fit, and they were included in all structural models. The bifactor model without a specific inhibition factor did not converge. Reliability estimates were calculated for the general and specific factors: ωH came to 0.49 for the bifactor, while the ωS values were 0.39 for inhibition, 0.03 for shifting, and 0.48 for fluency. Despite its low rate of convergence, the bifactor model was used in structural models in addition to the three-factor model, considering its excellent fit and the parsimony of interpreting the general factor.

Fig. 2. Bifactor measurement model. ST = Sorting Test; CWIT = Color-Word Interference Test; DFT = Design Fluency Test; TMT = Trail Making Test; TWT = Tower Test; VFT = Verbal Fluency Test. *Indicates a value that was residualized of variance attributable to a control variable. Dashed lines designate non-significant loadings. Errors are not displayed above, but were as follows: CWIT Inhibition = 0.709; CWIT Inhibition/Switching = 0.248; TWT Total Achievement = 0.909; TMT Switch Time = 0.644; DFT Switch Dots = 0.845; ST Total Confirmed Sorts = 0.547; VFT Letters Total Correct Words = 0.455; VFT Categories Total Correct Words = 0.614; DFT Filled + Empty = 0.809.

Structural Models

Two series of structural models were conducted, involving the prediction of the Twenty Questions, Word Context, and Proverb Tests by latent dimensions of executive function. The first series included three models in which all three first-order factors simultaneously predicted each task, and follow-up models in which each diverse factor predicted each task one at a time. The second series consisted of three models, where the common executive function factor predicted each task separately. Table 4 provides results for all structural models. The bifactor significantly predicted all complex tasks, accounting for a significant amount of variance in each outcome. The models involving the simultaneous prediction of each task by inhibition, shifting, and fluency accounted for a significant amount of variance in all tasks except for the Twenty Questions Test, although the R² value for this model approached significance (p = .057). Notably, in all models involving simultaneous prediction, shifting was the only uniquely significant predictor.
In models where only inhibition predicted the criterion, inhibition was a significant predictor; and in models where only shifting predicted the criterion, shifting was also a significant predictor. However, in models where only fluency predicted the criterion, the models did not properly converge due to a correlation above 1.0 between fluency and shifting.

Table 4. Structural model results

Criterion | Model | df | χ² | AIC | BIC | CFI | TLI | RMSEA (90% CI) | Bifactor | Inh. | Shi. | Flu. | R²
20Q | Bifactor | 26 | 32.332 | 14796.58 | 14954.61 | 0.984 | 0.971 | 0.024 (0.000–0.048) | 0.376*** | – | – | – | 0.141**
20Q | Three paths | 29 | 51.587** | 14809.83 | 14955.71 | 0.941 | 0.909 | 0.043 (0.023–0.062) | – | −0.123 | 0.725* | −0.302 | 0.246
20Q | Path from Inh. | 31 | 71.395*** | 14825.64 | 14963.41 | 0.895 | 0.847 | 0.055 (0.039–0.072) | – | 0.237*** | – | – | 0.056*
20Q | Path from Shi. | 31 | 54.405** | 14808.65 | 14946.42 | 0.939 | 0.911 | 0.042 (0.023–0.060) | – | – | 0.379*** | – | 0.143**
20Q | Path from Flu. | Non-positive definite covariance matrix
WCT | Bifactor | 26 | 32.741 | 14748.09 | 14906.12 | 0.984 | 0.972 | 0.025 (0.000–0.048) | 0.492*** | – | – | – | 0.242***
WCT | Three paths | 29 | 52.880** | 14762.23 | 14908.11 | 0.943 | 0.911 | 0.044 (0.024–0.063) | – | −0.248 | 0.915* | −0.254 | 0.408*
WCT | Path from Inh. | 31 | 99.240*** | 14804.59 | 14942.36 | 0.837 | 0.763 | 0.072 (0.056–0.088) | – | 0.309*** | – | – | 0.095*
WCT | Path from Shi. | 31 | 56.602** | 14761.95 | 14899.72 | 0.939 | 0.911 | 0.044 (0.025–0.062) | – | – | 0.511*** | – | 0.261***
WCT | Path from Flu. | Non-positive definite covariance matrix
PVT | Bifactor | 26 | 34.580 | 14651.97 | 14810 | 0.979 | 0.964 | 0.028 (0.000–0.050) | 0.478*** | – | – | – | 0.229***
PVT | Three paths | 29 | 51.771*** | 14663.16 | 14809.03 | 0.945 | 0.915 | 0.043 (0.023–0.062) | – | −0.187 | 0.742* | −0.114 | 0.333**
PVT | Path from Inh. | 31 | 94.468*** | 14701.86 | 14839.63 | 0.846 | 0.777 | 0.069 (0.054–0.086) | – | 0.305*** | – | – | 0.093*
PVT | Path from Shi. | 31 | 54.27** | 14661.66 | 14799.43 | 0.944 | 0.918 | 0.042 (0.022–0.060) | – | – | 0.501*** | – | 0.251***
PVT | Path from Flu. | Non-positive definite covariance matrix

Note: *p < .05; **p < .01; ***p < .001; path coefficients (Bifactor, Inh., Shi., Flu.) are standardized; – = path not included in the model. χ² = Chi-Square Test of Model Fit; 20Q = Twenty Questions Test; AIC = Akaike Information Criterion; BIC = Bayesian Information Criterion; CFI = Comparative Fit Index; Flu. = Fluency; Inh. = Inhibition; PVT = Proverb Test; RMSEA = Root Mean Square Error of Approximation; Shi. = Shifting; TLI = Tucker–Lewis Index; WCT = Word Context Test.
Discussion

A series of confirmatory factor analyses on a subset of D-KEFS tests identified a three-factor solution as the best-fitting measurement model. This model included three executive functions posited by past researchers (Miyake et al., 2000; Packwood et al., 2011): inhibition, shifting, and fluency. Inter-factor correlations were high between shifting and inhibition (r = 0.591) and between shifting and fluency (r = 0.696), although much lower between inhibition and fluency (r = 0.360). The relatively high correlations between shifting and the other factors likely derive from the close relationship between shifting and the common executive function dimension. A bifactor model showed that a general factor, representative of a common executive function, explained nearly all variance in the shifting indicators, leaving virtually no unique variance for a shifting-specific factor (ωS = 0.03). Unlike previous measurement models showing inhibition as fully explained by a common executive function factor (Fleming et al., 2016; Friedman et al., 2016; Ito et al., 2015), a bifactor model without inhibition failed to converge for the D-KEFS battery.

Although the bifactor model fit the data better than the three-factor model, it properly converged among only 65.86% of 5,000 bootstrapped samples; and while this model was informative for explicating the relationship between shifting and a general executive function dimension, its replicability remains questionable. The low convergence rate for this model likely derives from the limited construct-specific variance represented in the manifest variables. Small inter-test correlations are typical for executive function test batteries (Willoughby et al., 2014), and they were also small in the current study. These low inter-test correlations correspond to limited amounts of shared variance between indicators, which equates to low construct reliability of the factors in the model (Gagne & Hancock, 2006), as shown by the modest ωH and ωS reliability estimates. Limited construct reliability could explain the low rates of convergence for the bifactor model, where there was insufficient shared variance in the bifactor and the specific factors for many of the bootstrapped samples.

Despite issues of low convergence and construct reliability, a robust general factor is consistent with previous research (Fleming et al., 2016; Friedman et al., 2008, 2016; Ito et al., 2015; Klauer et al., 2010), but the relationship between shifting and a common executive function dimension was novel and can be interpreted in multiple ways. A first interpretation could consider shifting as higher-order, which aligns with previous claims that, in early development, shifting requires the establishment of more basic executive abilities prior to successful performance (Müller & Kerns, 2015). There is evidence for a superordinate fronto-cingulo-parietal network being activated during tasks of inhibition, working memory, and flexibility (Niendam et al., 2012), which supports a common executive function ability but does not indicate one ability as more central than another. In contrast to past research supporting inhibition as closely related to common executive function (Fleming et al., 2016; Fournier-Vicente et al., 2008; Ito et al., 2015; Klauer et al., 2010), a substitute model with a central shifting ability could accurately represent the true nature of executive function.
Alternatively, other possible interpretations for the near-perfect relationship between common executive function and shifting may consider the nature of the tests assigned to its measurement. The tests that loaded onto the shifting factor in the accepted model conceptually and empirically align with general executive function abilities. The Sorting Test score loaded highest onto the shifting factor, and a previous analysis has shown a strong relationship between the Sorting Test and the Wisconsin Card Sorting Test (i.e., r = 0.59; Delis et al., 2001), which is related to both shifting ability (Miyake et al., 2000) and general executive function (Greve, Stickle, Love, Bianchini, & Stanford, 2005). The Trail Making Test had the second-highest loading on the shifting factor, but its shifting-specific loading was non-significant in the bifactor model, indicating that Trail Making Test performance was explained largely by common executive function. A construct validity analysis of the Trail Making Test found that both working memory and task-switching explained performance on the switching condition (Sanchez-Cubillo et al., 2009). The tasks used as indicators for shifting likely tapped into shifting, but also other executive-related constructs, making the variance within this factor more general than specific.

Compared with the shifting factor, the more modest relationship between the general factor and the specific inhibition and fluency factors may also derive from the nature of the tests assigned to these factors. Method variance influenced the findings of a previous factor analysis on the D-KEFS (Latzman & Markon, 2010), and has potentially impacted the current findings as well. For the bifactor model, the specific inhibition (ωS = 0.39) and fluency (ωS = 0.48) factors had far more unique variance than the shifting factor (ωS = 0.03); however, the co-loadings for some of their respective indicators decreased, suggesting the bifactor explained some variance in those indicators. The inhibition-specific loading for the Tower Test decreased from the three-factor (λ = 0.289) to the bifactor model (λ = 0.144), and the fluency-specific loading for Design Fluency likewise decreased from the three-factor (λ = 0.627) to the bifactor model (λ = 0.187; see the worked decomposition below). The indicators loading highest on these specific factors were related based on method variance: two Verbal Fluency scores loaded highly on the fluency-specific factor (i.e., λ = 0.572 and 0.730), and two Color-Word Interference scores loaded highly on the inhibition-specific factor (i.e., λ = 0.466 and 0.745). The strong relationship between shifting and the general factor could have resulted from a lack of common method variance between shifting indicators in comparison to the other specific factors, because shifting was the only diverse factor with indicators from three different D-KEFS tests.
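Because the model indicators are standardized, each indicator's variance decomposes additively across the general factor, its specific factor, and error, which makes these loading changes straightforward to quantify:

```latex
% Variance decomposition of a standardized indicator y_i in the bifactor model
\operatorname{Var}(y_i) = \lambda_{g,i}^{2} + \lambda_{s,i}^{2} + \theta_{i} = 1
```

On this metric, the Tower Test's inhibition-specific variance share falls from 0.289^2 ≈ 0.084 (about 8% of indicator variance) in the three-factor model to 0.144^2 ≈ 0.021 (about 2%) in the bifactor model, quantifying how little inhibition-specific variance remains in that indicator once the general factor is modeled.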
Aside from examining the latent structure of the D-KEFS, the current study also evaluated the relationship between executive function factors and tests purported to measure abstraction, reasoning, and problem solving. In a series of models, the bifactor or the three diverse factors (i.e., inhibition, shifting, and fluency) predicted a set of complex executive function tasks not often evaluated in latent variable research: the Twenty Questions, Word Context, and Proverb Tests. The bifactor significantly predicted performance on all tasks. In comparison, when all diverse factors predicted these tasks simultaneously, shifting was the only factor to significantly predict performance. In models including just a single path from one diverse factor to the criterion variable, shifting and inhibition predicted all tasks significantly; however, when fluency served as the sole predictor, the models did not converge, due to high collinearity between shifting and fluency. The models involving shifting as the sole predictor yielded higher R2 values than models with inhibition or the bifactor as sole predictors, while models including three paths from inhibition, shifting, and fluency had higher absolute R2 values than any of the single-path models. These findings indicate that, for the three-factor model, shifting explained the most variance in the criterion outcomes, likely due to its association with the common executive function factor. In turn, the constructs measured by the Twenty Questions, Word Context, and Proverb Tests are closely related to either shifting or general executive function ability.

The results discussed above achieved the primary aims of this study: determining the latent structure of the D-KEFS and which constructs explain performances on a set of complex tasks. An overarching goal that guided these two aims was the linkage between the current findings and clinical practice; however, the analysis used residualized variables, which were free of common method variance attributable to speed and language ability, but are not normed through the D-KEFS. The residualization process has the benefit of dealing with the issue of task impurity that characterizes executive function measurement; however, it causes a disconnect between the scores within this analysis and those applied in clinical practice. Summed scaled scores from the D-KEFS would not fully correspond to the factors identified in the accepted model, so the findings cannot be directly translated into clinical practice. Researchers have tried various methods to control for lower-level abilities when calculating scores in clinical practice. For example, different derived Trail Making Test scores have become the subject of research evaluation (Drane, Yuspeh, Huthwaite, & Klingler, 2002; Perianez et al., 2007), including a difference score (i.e., B − A), a ratio score (i.e., B/A), and a proportional score (i.e., (B − A)/A). These derived scores attempt to control for processing speed to provide a purer estimate of executive function. The residualization process follows the same intent as these approaches, but offers a more statistically rigorous method for controlling lower-level abilities than a simple difference score (see the sketch below). Considering the issue of task impurity in executive function measurement, future editions of the D-KEFS could provide norms for residualized scores, using regression-based norming procedures to control for lower-order abilities that influence test scores. Previous researchers have discussed the relationship between processing speed and executive functions (Albinet, Boucard, Bouquet, & Audiffren, 2012; Salthouse, 2005), and the residualization approach adjusts for individual differences in processing speed on task performance. Had speed not been residualized from certain scores, the shared variance between tasks could have been largely attributable to speed rather than the three hypothesized constructs ultimately represented as factors in the model.
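To make this logic concrete, the following Python sketch computes the three derived Trail Making Test scores and a residualized difference score from simulated completion times. The simulated data and the single-predictor regression are illustrative assumptions; the current study residualized age-corrected scaled scores, not raw times.

```python
import numpy as np

# Illustrative (hypothetical) Trail Making completion times in seconds:
# condition A proxies lower-level speed; condition B adds the switching demand.
rng = np.random.default_rng(0)
tmt_a = np.clip(rng.normal(30, 8, size=200), 10, None)   # control condition
tmt_b = tmt_a * 2.2 + rng.normal(0, 15, size=200)        # switching condition

# Derived scores discussed above:
difference = tmt_b - tmt_a               # B - A
ratio = tmt_b / tmt_a                    # B / A
proportional = (tmt_b - tmt_a) / tmt_a   # (B - A) / A

# Residualized difference score: regress the switching condition on the
# control condition and keep the residual, which is orthogonal to speed.
slope, intercept = np.polyfit(tmt_a, tmt_b, 1)
residualized = tmt_b - (intercept + slope * tmt_a)

# The residual shares essentially zero variance with the control condition...
print(np.corrcoef(tmt_a, residualized)[0, 1])  # ~0

# ...and retains 1 - R^2 of the switching-condition variance, mirroring the
# "percent of variance represented in the residualized score" reported here.
r2 = np.corrcoef(tmt_a, tmt_b)[0, 1] ** 2
print(f"variance retained in residualized score: {1 - r2:.2%}")
```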
In a previous analysis of the latent structure of the D-KEFS, an exploratory model resulted in a factor composed of only time-based outcomes from the Trail Making and Color-Word Interference Tests (Latzman & Markon, 2010); had scores not been residualized, the current analysis could have resulted in a model inclusive of a similar speed-based factor. While other batteries include more basic measurements of processing speed (e.g., the WAIS-IV Processing Speed Index; Wechsler, 2008), the goal of this study was to identify factors that aligned best with constructs reported previously in the executive function literature (Miyake et al., 2000; Packwood et al., 2011). In turn, although the factor model is not directly translatable, it is based on less confounded indicators that are more uniquely associated with their respective constructs.

Another notable limitation was the inclusion of scores from the same D-KEFS test on either the same or different factors. The D-KEFS offers different conditions within each test that tap into different constructs, but these conditions often rely on a common method of measurement. Although the residualization procedure attempted to control for method variance, the use of multiple indicators from the same tests allowed shared method variance between the Color-Word Interference scores and between the Verbal Fluency scores to infiltrate the inhibition and fluency factors, respectively. As well, the two orthogonalized Design Fluency scores loaded on the shifting and fluency factors; and while this approach controlled for shared method variance between indicators, it may have biased the model towards a multidimensional solution by orthogonalizing indicators assigned to separate, but correlated, factors.

Considering these issues of multiple indicators from the same test loading onto the same or different factors, the findings also identify potential considerations for future editions of the D-KEFS, where the latent structure of the battery may be hypothesized and evaluated during test development. Future batteries could include a greater diversity of tests selected with the aim of having multiple indicators tapping into specific executive function constructs. Many experimental tasks often used in executive function research have not made their way into common clinical practice, likely due to limited use of computerized tests in applied settings (Rabin et al., 2014) and minimal change in assessment practices by active practitioners (Rabin et al., 2016). The D-KEFS offers the first co-normed battery of executive function tests, which likely contributes to its frequency of use, and the incorporation of computerized tests in future editions may increase their dissemination, offering a linkage between research and clinical practice. The translation of experimental tasks of executive functions into clinical practice would increase the diversity of tasks available to measure different constructs, which could result in evidence-based composite scores in future D-KEFS editions. By design, the D-KEFS does not provide composite scores (Delis, Jacobson, Bondi, Hamilton, & Salmon, 2003), but instead offers multiple scores evaluating many proposed dimensions of executive function. Delis and colleagues questioned the utility of composite scores among clinical samples, because unitary constructs dissociate among clinical populations.
Despite these concerns, past researchers have provided methods for the calculation of an atheoretical global D-KEFS composite score (Crawford, Garthwaite, Sutherland, & Borland, 2011), although this composite does not align with the current findings or any previous research on executive functions. Considering the improved purity of factor-based estimates of executive functions (Miyake & Friedman, 2012), composite scores analogous to evidence-based factors may provide a more precise estimate of executive functions in clinical practice. Without composite scores, the current assessment practices for executive functions appear inconsistent compared with the assessment practices for other cognitive domains. Composite scores have become particularly common in clinical practice due to their ability to simplify psychometric interpretation by summarizing performances across tests. They have become almost ubiquitous in the assessment of intellectual functioning, where composite scores of intelligence present with strong reliability and predictive validity (Kamphaus, 2005; Lubinski, 2004; Sternberg, Grigorenko, & Bundy, 2001). Although composite scores allow clinicians to reduce the dimensionality of their test findings, they remain unavailable for the D-KEFS; they are, however, not the only useful multivariate method for simplifying complex test profiles in neuropsychological assessment. As an alternative to composite scores, multivariate base rates for the D-KEFS, which quantify the normal frequency of low scores when multiple tests are administered and interpreted, offer an additional method for aggregating test information (Karr, Garcia-Barrera, Holdnack, & Iverson, 2017a, 2017b), but this method also does not consider the latent structure of the D-KEFS. Efforts to provide D-KEFS composite scores in future editions of the test battery would benefit from considering the theoretical structure of executive functions in their calculation, ensuring the highest possible level of reliability and construct validity.
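The multivariate base-rate logic mentioned above can be illustrated with a short simulation: given several correlated test scores, some low scores are expected in healthy examinees purely by chance. In the Python sketch below, the cutoff, inter-test correlation, and number of scores are illustrative assumptions, not the published D-KEFS base-rate values (Karr et al., 2017a, 2017b).

```python
import numpy as np

# Simulate scaled scores (M = 10, SD = 3) for a healthy sample, with a
# common factor inducing modest inter-test correlations.
rng = np.random.default_rng(1)
n_people, n_scores = 100_000, 9
common = rng.normal(0, 1, (n_people, 1))
noise = rng.normal(0, 1, (n_people, n_scores))
scores = 10 + 3 * (0.4 * common + np.sqrt(1 - 0.4**2) * noise)

# Illustrative "low score" cutoff: one SD below the mean on this metric.
low_cutoff = 7
n_low = (scores <= low_cutoff).sum(axis=1)

# Base rates: how often a healthy examinee obtains k or more low scores
# by chance alone when nine scores are interpreted together.
for k in range(1, 5):
    print(f"P({k}+ low scores) = {(n_low >= k).mean():.1%}")
```

Even under these simple assumptions, a nontrivial proportion of healthy examinees obtain one or more low scores, which is the interpretive problem the base-rate tables address.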
Although the residualization process prevents the calculation of composite scores in clinical practice, the latent structure identified herein offers an empirical framework through which clinicians can interpret their assessment findings. The modeling results illustrate that the D-KEFS taps into three core executive functions along with more global executive-related abilities. This approach may enhance the interpretability of test scores and provide a better understanding of the effect of psychopathology and brain damage on executive functions (Snyder, Miyake, & Hankin, 2015). Neuropsychologists often discuss their constructs in groupings in assessment reports (e.g., attention, memory), and often infer the constructs underlying specific task performances (e.g., attention span, concentration, encoding, retrieval). The accuracy of these groupings depends on the construct validity of the administered tasks. In turn, neuropsychologists must know which cognitive abilities are tapped into by their specific tests to inform their case conceptualization and recommendations. Previous research has involved numerous posited executive function constructs based on face-valid interpretations of task performance. This study offers the first attempt to explain performances on D-KEFS tests through hypothesized constructs supported by previous research (Miyake et al., 2000).

Authors' Note
Justin E. Karr is a Vanier Canada Graduate Scholar and thanks the Natural Sciences and Engineering Research Council of Canada (NSERC) for its support of his graduate studies. This study was completed in partial fulfillment of the requirements for his dissertation. Data used for the analyses reported within this manuscript were provided by Pearson, Inc. (2001). Standardization data from the Delis–Kaplan Executive Function System (D-KEFS). Copyright © 2001 NCS Pearson, Inc. Used with permission. All rights reserved. San Antonio: Pearson, Inc.

Conflict of Interest
None of the authors have conflicts of interest to declare, but the following general disclosures are offered for the consideration of the readers. Mauricio A. Garcia-Barrera has served in the past as a consultant for Pearson. Grant L. Iverson has received research support from test publishing companies in the past, including PAR, Inc., ImPACT Applications, Inc., and CNS Vital Signs. He acknowledges current unrestricted philanthropic support from ImPACT Applications, Inc. He receives royalties for one neuropsychological test (Wisconsin Card Sorting Test-64 Card Version).

References
Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19, 716–723. http://dx.doi.org/10.1109/TAC.1974.1100705
Albinet, C. T., Boucard, G., Bouquet, C. A., & Audiffren, M. (2012). Processing speed and executive functions in cognitive aging: How to disentangle their mutual relationship? Brain and Cognition, 79(1), 1–11. http://doi.org/10.1016/j.bandc.2012.02.001
Baron, I. S. (2004). Delis-Kaplan Executive Function System. Child Neuropsychology, 10, 147–152. http://dx.doi.org/10.1080/09297040490911140
Bentler, P. M. (1990). Comparative fit indexes in structural models. Psychological Bulletin, 107, 238–246. http://dx.doi.org/10.1037/0033-2909.107.2.238
Bentler, P. M., & Bonett, D. G. (1980). Significance tests and goodness of fit in the analysis of covariance structures. Psychological Bulletin, 88, 588–606. http://dx.doi.org/10.1037/0033-2909.88.3.588
Bettcher, B. M., Mungas, D., Patel, N., Elofson, J., Dutt, S., Wynn, M., et al. (2016). Neuroanatomical substrates of executive functions: Beyond prefrontal structures. Neuropsychologia, 85, 100–109. http://dx.doi.org/10.1016/j.neuropsychologia.2016.03.001
Burgess, P. W. (1997). Theory and methodology in executive function research. In P. Rabbitt (Ed.), Methodology of frontal and executive function (pp. 81–116). Hove, UK: Psychology Press.
Chen, F. F., West, S. G., & Sousa, K. H. (2006). A comparison of bifactor and second-order models of quality of life. Multivariate Behavioral Research, 41, 189–225. http://dx.doi.org/10.1207/s15327906mbr4102_5
Cheung, G. W., & Rensvold, R. B. (2002). Evaluating goodness-of-fit indexes for testing measurement invariance. Structural Equation Modeling, 9, 233–255. http://dx.doi.org/10.1207/S15328007SEM0902_5
Crawford, J. R., Garthwaite, P. H., Sutherland, D., & Borland, N. (2011). Some supplementary methods for the analysis of the Delis–Kaplan Executive Function System. Psychological Assessment, 23, 888–898. http://dx.doi.org/10.1037/a0023712
de Frias, C. M., Dixon, R. A., & Strauss, E. (2009). Characterizing executive functioning in older special populations: From cognitively elite to cognitively impaired. Neuropsychology, 23, 778–791. http://dx.doi.org/10.1037/a0016743
Delis, D. C., Jacobson, M., Bondi, M. W., Hamilton, J. M., & Salmon, D. P. (2003). The myth of testing construct validity using shared variance techniques with normal or mixed clinical populations: Lessons from memory assessment. Journal of the International Neuropsychological Society, 9, 936–946.
Delis, D. C., Kaplan, E., & Kramer, J. H. (2001). Delis–Kaplan Executive Function System. Odessa, FL: Psychological Assessment Resources.
Drane, D. L., Yuspeh, R. L., Huthwaite, J. S., & Klingler, L. K. (2002). Demographic characteristics and normative observations for derived-trail making test indices. Cognitive and Behavioral Neurology, 15, 39–43.
Duggan, E. C., & Garcia-Barrera, M. A. (2015). Executive functioning and intelligence. In S. Goldstein, J. A. Naglieri, & D. Princiotta (Eds.), Handbook of intelligence: Evolutionary theory, historical perspective, and current concepts (pp. 435–458). New York: Springer Publishing Co. http://dx.doi.org/10.1007/978-1-4939-1562-0_27
Fleming, K. A., Heintzelman, S. J., & Bartholow, B. D. (2016). Specifying associations between conscientiousness and executive functioning: Mental set shifting, not prepotent response inhibition or working memory updating. Journal of Personality, 84, 348–360. http://dx.doi.org/10.1111/jopy.12163
Floyd, R. G., Bergeron, R., Hamilton, G., & Parra, G. R. (2010). How do executive functions fit with the Cattell–Horn–Carroll model? Some evidence from a joint factor analysis of the Delis–Kaplan executive function system and the Woodcock–Johnson III tests of cognitive abilities. Psychology in the Schools, 47, 721–738. http://dx.doi.org/10.1002/pits.20500
Fournier-Vicente, S., Larigauderie, P., & Gaonac'h, D. (2008). More dissociations and interactions within central executive functioning: A comprehensive latent-variable analysis. Acta Psychologica, 129, 32–48. http://dx.doi.org/10.1016/j.actpsy.2008.04.004
Frazier, D. T., Bettcher, B. M., Dutt, S., Patel, N., Mungas, D., Miller, J., et al. (2015). Relationship between insulin-resistance processing speed and specific executive function profiles in neurologically intact older adults. Journal of the International Neuropsychological Society, 21, 622–628. http://dx.doi.org/10.1017/S1355617715000624
Friedman, N. P., Corley, R. P., Hewitt, J. K., & Wright, K. P. (2009). Individual differences in childhood sleep problems predict later cognitive executive control. Sleep, 32, 323–333. http://doi.org/10.1093/sleep/32.3.323
Friedman, N. P., Miyake, A., Altamirano, L. J., Corley, R. P., Young, S. E., Rhea, S. A., et al. (2016). Stability and change in executive function abilities from late adolescence to early adulthood: A longitudinal twin study. Developmental Psychology, 52, 326–340. http://dx.doi.org/10.1037/dev0000075
Friedman, N. P., Miyake, A., Robinson, J. L., & Hewitt, J. K. (2011). Developmental trajectories in toddlers' self-restraint predict individual differences in executive functions 14 years later: A behavioral genetic analysis. Developmental Psychology, 47, 1410–1430. http://dx.doi.org/10.1037/a0023750
Friedman, N. P., Miyake, A., Young, S. E., DeFries, J. C., Corley, R. P., & Hewitt, J. K. (2008). Individual differences in executive functions are almost entirely genetic in origin. Journal of Experimental Psychology: General, 137, 201–225. http://dx.doi.org/10.1037/0096-3445.137.2.201
Furr, R. M. (2011). Scale construction and psychometrics for social and personality psychology. London, England: SAGE Publications Ltd.
Gagne, P., & Hancock, G. R. (2006). Measurement model quality, sample size, and solution propriety in confirmatory factor models. Multivariate Behavioral Research, 41, 65–83. http://dx.doi.org/10.1207/s15327906mbr4101_5
Greve, K. W., Stickle, T. R., Love, J. M., Bianchini, K. J., & Stanford, M. S. (2005). Latent structure of the Wisconsin Card Sorting Test: A confirmatory factor analytic study. Archives of Clinical Neuropsychology, 20, 355–364. http://dx.doi.org/10.1016/j.acn.2004.09.004
Hancock, G. R. (2006). Power analysis in covariance structure modeling. In G. R. Hancock & R. O. Mueller (Eds.), Structural equation modeling: A second course (pp. 69–115). Greenwich, Connecticut: Information Age Publishing.
Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6(1), 1–55. http://dx.doi.org/10.1080/10705519909540118
Ito, T. A., Friedman, N. P., Bartholow, B. D., Correll, J., Loersch, C., Altamirano, L. J., et al. (2015). Toward a comprehensive understanding of executive cognitive function in implicit racial bias. Journal of Personality and Social Psychology, 108, 187–218. http://dx.doi.org/10.1037/a0038557
Jurado, M. B., & Rosselli, M. (2007). The elusive nature of executive functions: A review of our current understanding. Neuropsychology Review, 17, 213–233. http://dx.doi.org/10.1007/s11065-007-9040-z
Kamphaus, R. W. (2005). Clinical assessment of child and adolescent intelligence (2nd ed.). New York: Springer.
Karr, J. E., Garcia-Barrera, M. A., Holdnack, J. A., & Iverson, G. L. (2017a). Advanced clinical interpretation of the Delis-Kaplan Executive Function System: Multivariate base rates of low scores. The Clinical Neuropsychologist, 32, 42–53. http://dx.doi.org/10.1080/13854046.2017.1334828
Karr, J. E., Garcia-Barrera, M. A., Holdnack, J. A., & Iverson, G. L. (2017b). Using multivariate base rates to interpret low scores on an abbreviated battery of the Delis-Kaplan Executive Function System. Archives of Clinical Neuropsychology, 32, 297–305. http://dx.doi.org/10.1093/arclin/acw105
Klauer, K. C., Schmitz, F., Teige-Mocigemba, S., & Voss, A. (2010). Understanding the role of executive control in the Implicit Association Test: Why flexible people have small IAT effects. The Quarterly Journal of Experimental Psychology, 63, 595–619. http://dx.doi.org/10.1080/17470210903076826
Latzman, R. D., & Markon, K. E. (2010). The factor structure and age-related factorial invariance of the Delis-Kaplan Executive Function System (D-KEFS). Assessment, 17, 172–184. http://dx.doi.org/10.1177/1073191109356254
Lehto, J. E., Juujärvi, P., Kooistra, L., & Pulkkinen, L. (2003). Dimensions of executive functioning: Evidence from children. British Journal of Developmental Psychology, 21, 59–80. http://dx.doi.org/10.1348/026151003321164627
Lubinski, D. (2004). Introduction to the special section on cognitive abilities: 100 years after Spearman's (1904) "'General intelligence,' objectively determined and measured". Journal of Personality and Social Psychology, 86, 96–111. http://dx.doi.org/10.1037/0022-3514.86.1.96
McDonald, R. P. (1999). Test theory: A unified treatment. Mahwah, New Jersey: Lawrence Erlbaum Associates.
Miyake, A., Emerson, M. J., & Friedman, N. P. (2000a). Assessment of executive functions in clinical settings: Problems and recommendations. Seminars in Speech and Language, 21, 169–183. http://dx.doi.org/10.1055/s-2000-7563
Miyake, A., & Friedman, N. P. (2012). The nature and organization of individual differences in executive functions: Four general conclusions. Current Directions in Psychological Science, 21, 8–14. http://dx.doi.org/10.1177/0963721411429458
Miyake, A., Friedman, N. P., Emerson, M. J., Witzki, A. H., Howerter, A., & Wager, T. D. (2000b). The unity and diversity of executive functions and their contributions to complex "Frontal Lobe" tasks: A latent variable analysis. Cognitive Psychology, 41, 49–100. http://dx.doi.org/10.1006/cogp.1999.0734
Müller, U., & Kerns, K. (2015). The development of executive function. In L. S. Liben, U. Müller, & R. M. Lerner (Eds.), Handbook of child psychology and developmental science, Vol. 2: Cognitive processes (7th ed., pp. 571–623). Hoboken, NJ: Wiley. http://dx.doi.org/10.1002/9781118963418.childpsy214
Muthén, L., & Muthén, B. (2014). MPlus (Version 7.3) [Computer software]. Los Angeles: Muthén & Muthén.
Niendam, T. A., Laird, A. R., Ray, K. L., Dean, Y. M., Glahn, D. C., & Carter, C. S. (2012). Meta-analytic evidence for a superordinate cognitive control network subserving diverse executive functions. Cognitive, Affective, & Behavioral Neuroscience, 12, 241–268. http://dx.doi.org/10.3758/s13415-011-0083-5
Packwood, S., Hodgetts, H. M., & Tremblay, S. (2011). A multiperspective approach to the conceptualization of executive functions. Journal of Clinical and Experimental Neuropsychology, 33, 456–470. http://dx.doi.org/10.1080/13803395.2010.533157
Perianez, J. A., Rios-Lago, M., Rodriguez-Sanchez, J. M., Adrover-Roig, D., Sanchez-Cubillo, I., Crespo-Facorro, B., et al. (2007). Trail Making Test in traumatic brain injury, schizophrenia, and normal ageing: Sample comparisons and normative data. Archives of Clinical Neuropsychology, 22, 433–447. http://dx.doi.org/10.1016/j.acn.2007.01.022
Pettigrew, C., & Martin, R. C. (2014). Cognitive declines in healthy aging: Evidence from multiple aspects of interference resolution. Psychology and Aging, 29, 187–204. http://dx.doi.org/10.1037/a0036085
Phillips, L. H. (1997). Do "frontal tests" measure executive function? Issues of assessment and evidence from fluency tests. In P. Rabbitt (Ed.), Methodology of frontal and executive function (pp. 191–213). Hove, UK: Psychology Press.
Rabin, L. A., Paolillo, E., & Barr, W. B. (2016). Stability in test-usage practices of clinical neuropsychologists in the United States and Canada over a 10-year period: A follow-up survey of INS and NAN members. Archives of Clinical Neuropsychology, 31, 206–230. http://dx.doi.org/10.1093/arclin/acw007
Rabin, L. A., Spadaccini, A. T., Brodale, D. L., Grant, K. S., Elbulok-Charcape, M. M., & Barr, W. B. (2014). Utilization rates of computerized tests and test batteries among clinical neuropsychologists in the United States and Canada. Professional Psychology: Research and Practice, 45, 368–377. http://dx.doi.org/10.1037/a0037987
Raftery, A. E. (1995). Bayesian model selection in social research. Sociological Methodology, 25, 111–163. http://dx.doi.org/10.2307/271063
Reise, S. P. (2012). The rediscovery of bifactor measurement models. Multivariate Behavioral Research, 47, 667–696. http://doi.org/10.1080/00273171.2012.715555
Salthouse, T. A. (2005). Relations between cognitive abilities and measures of executive functioning. Neuropsychology, 19, 532–545. http://dx.doi.org/10.1037/0894-4105.19.4.532
Sanchez-Cubillo, I., Perianez, J. A., Adrover-Roig, D., Rodriguez-Sanchez, J. M., Rios-Lago, M., Tirapu, J., et al. (2009). Construct validity of the Trail Making Test: Role of task-switching, working memory, inhibition/interference control, and visuomotor abilities. Journal of the International Neuropsychological Society, 15, 438–450. http://dx.doi.org/10.1017/S1355617709090626
Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics, 6, 461–464. http://dx.doi.org/10.1214/aos/1176344136
Shao, Z., Janse, E., Visser, K., & Meyer, A. S. (2014). What do verbal fluency tasks measure? Predictors of verbal fluency performance in older adults. Frontiers in Psychology, 5, 772. http://dx.doi.org/10.3389/fpsyg.2014.00772
Shunk, A. W., Davis, A. S., & Dean, R. S. (2006). Review of Delis-Kaplan Executive Function System (D-KEFS). Applied Neuropsychology, 13, 275–279. http://dx.doi.org/10.1207/s15324826an1304_9
Snyder, H. R., Miyake, A., & Hankin, B. L. (2015). Advancing understanding of executive function impairments and psychopathology: Bridging the gap between clinical and cognitive approaches. Frontiers in Psychology, 6, 1–24. http://doi.org/10.3389/fpsyg.2015.00328
Steiger, J. H. (2007). Understanding the limitations of global fit assessment in structural equation modeling. Personality and Individual Differences, 42, 893–898. http://dx.doi.org/10.1016/j.paid.2006.09.017
Sternberg, R. J., Grigorenko, E., & Bundy, D. A. (2001). The predictive value of IQ. Merrill-Palmer Quarterly, 47(1), 1–41. http://doi.org/10.1353/mpq.2001.0005
Tanaka, J. S. (1987). 'How big is big enough?': Sample size and goodness of fit in structural equation models with latent variables. Child Development, 58, 134–146. http://dx.doi.org/10.1111/1467-8624.ep7264172
Tucker, L. R., Damarin, F., & Messick, S. (1965). A base-free measure of change. ETS Research Report Series, 1965(1), 1–29. http://dx.doi.org/10.1002/j.2333-8504.1965.tb00900.x
Unsworth, N., Spillers, G. J., & Brewer, G. A. (2011). Variation in verbal fluency: A latent variable analysis of clustering, switching, and overall performance. The Quarterly Journal of Experimental Psychology, 64, 447–466. http://dx.doi.org/10.1080/17470218.2010.505292
Wechsler, D. (1999). Wechsler Abbreviated Scale of Intelligence. San Antonio, TX: Psychological Corporation.
Wechsler, D. (2008). Wechsler Adult Intelligence Scale (4th ed.). San Antonio, TX: Pearson.
Willoughby, M., Holochwost, S. J., Blanton, Z. E., & Blair, C. B. (2014). Executive functions: Formative versus reflective measurement. Measurement: Interdisciplinary Research & Perspectives, 12, 69–95. http://dx.doi.org/10.1080/15366367.2014.929453

© The Author(s) 2018. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com. This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/about_us/legal/notices).

Taking these findings, the researchers ran a confirmatory model of their exploratory factor structure and identified acceptable fit for their model. However, many D-KEFS tests loaded on factors conceptually disparate from the construct that the D-KEFS tests were purported to measure. The D-KEFS Color-Word Interference Test, a version of the Stroop test (i.e., a measure used as an indicator for inhibition in past measurement models; Klauer et al., 2010; Miyake et al., 2000), loaded on the Processing Speed (Gs) factor, likely due to a primary outcome of time-to-completion. Three other D-KEFS tests (i.e., Free Sorting, Word Context, and Twenty Questions) loaded on a Comprehension-Knowledge (Gc) factor. These three tests tap into higher-order cognition, but likely share variance with Gc tasks because of their reliance on crystallized knowledge. When these tests co-loaded on a broad Executive Function factor, the overall model fit improved, which indicates unexplained executive function variance remaining in the model; however, these additional paths were rejected due to a lack of parsimony. Both past factor analyses on D-KEFS data ignore the task impurity problem endemic in the measurement of executive functions (Burgess, 1997; Duggan & Garcia-Barrera, 2015; Phillips, 1997) and identify factors based largely on common method variance, rather than the variance of executive function constructs. Floyd and colleagues (2010) concluded that “if there are measures of abilities associated with executive functions, they are contaminated by the general factor and more specific ability factors, so that there is probably little unique about them” (p. 734). Notably, in a second-order confirmatory factor model, the relationship between the second-order factor and the manifest variable is fully mediated by the first-order factors (Reise, Moore, & Haviland, 2010). In turn, the executive function variance not accounted for by first-order factors, composed largely of method variance, remains unexplained error variance in the model. The primary issue with the approach of these researchers is their limited consideration for past theory and research on executive functions. A latent variable analysis of the D-KEFS guided by past empirical evidence would provide an outcome more useful to clinicians using the broadband measure in clinical settings. Although flawed, these past factor analyses evidence (a) the task impurity of the D-KEFS measures – as is common to all tests of executive functions (Burgess, 1997; Duggan & Garcia-Barrera, 2015; Phillips, 1997) – and (b) the latent interrelatedness between these measures. In turn, an interpretation of D-KEFS tests independently from one another provides a biased estimate of executive functions; however, an interpretation that understands performance patterns in aggregate will provide an evaluation of executive functions more in alignment with current research on the construct, so long as the interpretation considers the influence of common method variance. With its multidimensional nature, the D-KEFS is an appropriate measure for the development of composite scores and the translation of evidence-based factors into clinical practice. However, to develop composite scores, a confirmatory factor analysis must first demonstrate that these evidence-based factors explain performance patterns across D-KEFS tests. 
As noted earlier, three factors of diverse executive functions (i.e., updating, shifting, and inhibition) have garnered substantial empirical support; however, they do not present an exhaustive list of executive functions (Miyake & Friedman, 2012; Miyake et al., 2000). Among the most frequently used terms for executive functions in the literature, five occur most often: planning, working memory, fluency, set-shifting, and inhibition (Packwood, Hodgetts, & Tremblay, 2011). The D-KEFS battery offers a series of tests that tap into many of these established constructs and provides a method for evaluating theoretical executive functions via a norm-referenced and clinically validated battery of tests. Among the executive function constructs measured by the D-KEFS, those with multiple tests available to serve as indicators in a factor analytic model include inhibition, shifting, and fluency. Both inhibition (i.e., the volitional restriction of a dominant or prepotent response pattern in reaction to a change in task demands) and shifting (i.e., the flexible switching between mental sets) are consistent with those more basic functions (Miyake & Friedman, 2012), but the third factor of updating (i.e., the monitoring of working memory content, along with the active addition and deletion of said content) is not directly measured by any D-KEFS tests. However, the D-KEFS does include fluency tasks, and the two mechanisms possibly underlying updating include the “effective gating of information and controlled retrieval from long-term memory” (Miyake & Friedman, 2012, p. 11). Verbal fluency performances require the strategic retrieval of information from long-term memory, involving both working memory and the lexicon (Shao, Janse, Visser, & Meyer, 2014; Unsworth, Spillers, & Brewer, 2011). Within the D-KEFS framework, fluency may serve as the best proxy for updating. Complementing these three factors, higher-order dimensions of executive function may be present in the D-KEFS data. Recently published statistical models have shown a common executive function dimension fully explaining the variance in inhibition, while also explaining a significant amount of variance in shifting and updating (Fleming, Heintzelman, & Bartholow, 2016; Friedman et al., 2008; Friedman, Miyake, Robinson, & Hewitt, 2011; Ito et al., 2015). Therefore, the D-KEFS may demonstrate a similar structure, whereby a common executive function explains performances across inhibition, shifting, and fluency. In addition to these three basic factors, additional tasks embedded in the D-KEFS, such as the Twenty Questions, Word Context, and Proverb Tests, may tap into constructs that have not been extensively evaluated in past confirmatory factor analyses (e.g., abstraction, reasoning, and problem solving; Baron, 2004; Delis et al., 2001; Shunk, Davis, & Dean, 2006). These constructs are semantically related to each other, but based on a latent semantic analysis (Packwood et al., 2011), they were unrelated to the five terms most frequently used to describe executive functions: planning, working memory, fluency, set-shifting, and inhibition. The D-KEFS offers a method for evaluating whether inhibition, shifting, or fluency are substantially related to these constructs, and these tests may best serve as outcomes in structural models rather than separable factors in measurement models. 
The D-KEFS is a widely disseminated, multidimensional clinical instrument, and the first goal of the proposed study was to use the normative D-KEFS data to link research on executive functions with clinical practice by deriving evidence-based factors of executive functions from the D-KEFS test battery. Achieving this goal involves two research aims that could guide the future development of evidence-based composite scores for executive functions in clinical settings. Specifically, we aimed to (a) derive a three-factor statistical model of executive functions (i.e., inhibition, shifting, and fluency) from D-KEFS tests, and (b) compare this three-factor model to alternative first-order models (e.g., one-factor, two-factor models) and a bifactor model (Reise, 2012), to determine any support for a common executive function composite. The second goal of this study was to evaluate how well these factors explained performance on tasks related to abstraction, reasoning, and problem solving using an analytical framework similar to the structural models tested by Miyake and colleagues (2000). To achieve this goal, we aimed to evaluate the relative contribution of each factor at predicting complex task performances through (a) each factor separately serving as a single predictor, (b) all three factors simultaneously serving as predictors, and (c) a common executive function bifactor serving as a sole predictor. Method Participants Participant data came from the D-KEFS normative sample. The D-KEFS norming procedure involved a standardized sampling of 1,750 participants (8–89 years old), with representation of sex, age, race/ethnicity, education, and geographic region consistent with the 2000 U.S. Census data. The large-scale systematic sampling procedure served to minimize sampling bias and improve the generalizability of the inference drawn from the normative data. The data were received anonymized from Pearson, and an institutional Human Research Ethics Board approved the secondary analyses conducted herein. For the current study, an adult sub-sample was selected for analysis (N = 425; 20–49 years old; 49.9% male; 70.1% White, 13.6% African-American, 12.2% Hispanic, 4% other races/ethnicities). This age range was selected because three-factor and incomplete bifactor models have been consistently observed in previous studies examining similarly aged samples (Fleming et al., 2016; Ito et al., 2015; Klauer et al., 2010; Miyake et al., 2000). An a priori power analysis also estimated the sample size for this age span to be sufficient (i.e., πˆ ~ 0.85) based on the approximate degrees of freedom for the evaluated models (df = ~25; Hancock, 2006). The D-KEFS technical manual (Delis et al., 2001) offers further information on the normative sample. A random sub-sample of 358 participants also completed the Wechsler Abbreviated Scale of Intelligence (WASI; Wechsler, 1999). Materials D-KEFS test scores were assigned a priori to three factors: inhibition, shifting, and fluency. Additional test scores were used as control or criterion variables. Inhibition Two D-KEFS tests tap into inhibition: the Tower and Color-Word Interference Tests. The Tower Test involves participants re-organizing disks on three pegs to match a target design. The primary Tower Test outcome is the Total Achievement score, which is a sum of points awarded in each trial, weighted by moves-to-completion. One structural model found an alternative version of the tower test related to inhibition (Miyake et al., 2000). 
The Color-Word Interference Test involves two conditions requiring inhibition, both with a primary score of time-to-completion: Inhibition and Inhibition/Switching. The Inhibition trial consists of an incongruent Stroop condition, where participants read the color of the ink and not the word written. The Inhibition/Switching trial requires participants read either the color of the ink or the word written depending on whether the word is displayed inside a box. Shifting The first shifting indicator was time-to-completion on the Trail Making Test Number-Letter Switching trial. This test requires participants to connect labeled dots in both alphabetical and numerical order while actively switching between number and letter sets. The second indicator was the Sorting Test: Confirmed Correct Sorts, which has correlated significantly (r = 0.59) with performance on the Wisconsin Card Sorting Test (Delis et al., 2001), another test requiring shifting for effective performance (Miyake et al., 2000). Participants arrange cards based on verbal and visual-spatial characteristics without being told how to sort the cards, shifting from previous sorting rules to new rules to attain a greater number of accurate sorts. The last indicator was the Switching–Total Correct score (i.e., number of unique designs accurately drawn) from the Design Fluency Test, where participants had to draw novel abstract designs, while switching between connecting black and empty dots. Fluency For the Verbal Fluency Test, participants complete multiple one-minute sub-trials, where they must generate as many possible words beginning with either a letter or fitting within a category. The sum of the total correct words generated across all sub-trials served as fluency indicators: Letter Fluency–Total Correct and Category Fluency–Total Correct. The Design Fluency Test involves two trials where participants connected empty or filled dots to draw novel abstract designs. The total unique designs drawn across these two trials served as the third fluency indicator. Control variables The D-KEFS includes contrast scores through which scaled scores on control conditions are subtracted from scaled scores on conditions involving a greater executive demand (e.g., for the Color-Word Interference Test, the scaled score for combined performance on the Word Reading and Color Naming conditions is subtracted from the scaled score for the Inhibition condition to obtain a contrast score). Conceptually, these contrast scores would result in a purer score of executive function, controlling for lower-level cognitive abilities. However, raw difference scores have been criticized for poor psychometric properties (Furr, 2011) and an alternative residualized difference score was used in analyses (Tucker, Damarin, & Messick, 1965). Previous confirmatory factor analyses on executive function test batteries have used similar scores as dependent variables (Bettcher et al., 2016; Frazier et al., 2015; Pettigrew and Martin, 2014). To calculate a residualized difference score, an ordinary least squares regression was conducted, where the control condition (e.g., combined Word Reading and Color Naming performance) predicts the condition with higher cognitive demands (e.g., the Inhibition condition). Through this equation, the residual value (i.e., error) is saved and used as a residualized difference score. 
The residualized difference score is orthogonal to the score from the control condition, and it represents the variance in the executive function task that is not attributable to performance on the control condition. The percent of variance in the scaled score for the executive function task accounted for by the residualized difference score can be calculated through 1 minus the R2 value from the regression equation used to acquire the residualized score (i.e., the variance not attributable to the control condition). To control for processing speed, the two Color-Word Interference scores were made orthogonal to the summed performance on the Word Reading and Color Naming trials. For the Color-Word Interference Test, 70.90% of the variance in the scaled score for the Inhibition condition and 76.20% of the variance in the scaled score for the Inhibition/Switching condition were represented in the residualized scores. Also controlling for processing speed, the Trail Making Number-Letter Switching score was made orthogonal to the summed performance on the Number and Letter Sequencing trials. This residualized score represented 71.00% of the variance in the original Number-Letter Switching scaled score. The Design Fluency Switching score that loaded on the shifting factor was made orthogonal to the Design Fluency score that loaded on the fluency factor. This residualized score represented 83.50% of the variance in the original Design Fluency Switching score. Lastly, the Verbal Fluency scores were made orthogonal to the WASI Vocabulary subtest, controlling for the impact of language functioning on these outcomes. A total of 79.00% of the variance in the scaled score for the Letter Fluency condition and 84.10% of the variance in the scaled score for the Category Fluency condition were represented in the residualized scores. Criterion variables Three measures from the D-KEFS (i.e., Twenty Questions, Word Context, and Proverb Tests) will serve as criterion variables in structural models, predicted by the latent variables included in the measurement model. These D-KEFS tests require abstraction, reasoning, and problem solving for effective performance (Baron, 2004; Delis et al., 2001; Shunk et al., 2006), which represent semantically unique constructs from the latent variables included in the measurement model (Packwood et al., 2011). For the Twenty Questions Test, participants ask a series of questions to identify which picture the examiner pre-selected from a page of images, attempting to identify the correct image with the fewest questions possible. The Twenty Questions outcome used in the analysis was the Total Weighted Achievement Score. For the Word Context Test, participants are provided clues about the definitions of a set of neologisms, attempting to infer their meaning with as few clues as possible. The Word Context criterion score was the Total Consecutively Correct. During the Proverb Test, participants describe the meaning of a series of proverbs, with their accuracy and level of abstraction summed. The Proverb Test Total Achievement Score: Free Inquiry was used in analysis. Statistical Analysis The normative data are age-corrected and standardized for all D-KEFS variables (M = 10, SD = 3) and WASI variables (M = 50, SD = 10), with a higher value indicating a greater performance. All structural equation modeling was conducted in MPlus v.7.3 (Muthén & Muthén, 2014), using maximum likelihood to estimate model parameters based on the covariance matrix of the selected D-KEFS outcomes. 
Fit indices
All models were evaluated in relation to statistical fit, with optimally fitting models selected to guide interpretation. The reported fit indices include the χ2 Test of Model Fit, the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC), the Root Mean Square Error of Approximation (RMSEA), the Comparative Fit Index (CFI), and the Tucker–Lewis Index (TLI). With a large sample size, the χ2 test has increased power to reject the null hypothesis (Tanaka, 1987), so alternative fit indices served as the primary metrics for interpretation. The AIC modifies the χ2 statistic to penalize model complexity, with lower values representing better fit (Akaike, 1974). The closely related BIC can be interpreted in the same manner (Schwarz, 1978). For the CFI and TLI, values closer to 1.0 indicate better fit (Bentler, 1990; Cheung & Rensvold, 2002), while for the RMSEA, values closer to zero are desirable. Past researchers have posited fit cutoffs of ≥ .95 for the CFI/TLI and ≤ .06 for the RMSEA (Hu & Bentler, 1999). Some alternative recommendations provide greater leniency, including a lower-bound cutoff of ≥ .90 for the CFI (Bentler & Bonett, 1980) and an upper-bound cutoff of ≤ .07 for the RMSEA (Steiger, 2007). The ∆CFI served as the primary metric for model comparison, using the recommended cutoff of ∆CFI ≥ .01 for significantly improved fit (Cheung & Rensvold, 2002). The ∆BIC was also used in model comparison, with lower BIC values preferable. Typical cutoffs for the ∆BIC include 0–2 for weak, 2–6 for positive, 6–10 for strong, and >10 for very strong evidence of one model being preferable over another (Raftery, 1995).

Measurement models
The hypothesized three-factor model (i.e., inhibition, shifting, and fluency) was evaluated first for statistical fit, and alternative measurement models were evaluated thereafter. These alternative models included a one-factor model and a set of three two-factor models merging two of the first-order factors: inhibition=fluency, fluency=shifting, and inhibition=shifting. A bifactor model was also evaluated, in which indicators co-loaded onto their specific factor (i.e., inhibition, shifting, or fluency) and a general factor (i.e., common executive function). Aligning with past research, an incomplete bifactor model (Chen, West, & Sousa, 2006) was also evaluated, in which no specific inhibition factor was included (Miyake & Friedman, 2012). For each model evaluated, 5,000 samples were bootstrapped to calculate how often among those samples the factor model "properly converged," meaning the model both converged and provided an admissible solution (e.g., no correlation above 1.0, no negative residual variances).

Reliability
The factors included in each model were evaluated for their reliability using omega (ω) as an estimate, which represents the ratio of true-score variance to the total variance among indicators loading onto a given factor (McDonald, 1999). In the context of the bifactor model, the reliability of the general and specific factors was calculated using omega-hierarchical (ωH) and omega-subscale (ωS) values, which estimate the variance accounted for by the bifactor and the specific factors (i.e., inhibition, shifting, fluency), respectively (Reise, 2012). There are no commonly used cutoff values to guide interpretation of ω, but the values can be read as the amount of variance in the indicators attributable to the factor rather than to error.
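For a single factor, ω can be computed directly from standardized loadings; the brief sketch below assumes standardized indicators with residual variances of 1 − λ² and no correlated errors, which is a simplification relative to the accepted model reported later.

```python
import numpy as np

def mcdonalds_omega(standardized_loadings):
    """McDonald's omega for one factor: (sum of loadings)^2 divided by that
    quantity plus the summed residual variances (1 - loading^2 for each)."""
    lam = np.asarray(standardized_loadings, dtype=float)
    common = lam.sum() ** 2
    error = (1.0 - lam ** 2).sum()
    return common / (common + error)

# Hypothetical standardized loadings for a three-indicator factor
print(round(mcdonalds_omega([0.55, 0.60, 0.40]), 2))  # 0.52
```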
Structural Model
The best-fitting first-order factor model (and the bifactor model, if it showed acceptable fit) was treated as the accepted model, with its latent variables used as predictors in structural models. In a series of models, either the diverse factors (i.e., inhibition, shifting, fluency) or the unitary factor (i.e., common executive function) predicted performances on three complex tasks of executive functions (i.e., the Twenty Questions, Word Context, and Proverb Tests). The path coefficient from each factor to each complex task was evaluated for its unique significance, along with the R2 value for each model to determine the amount of variance accounted for in each task.

Results
The descriptive statistics and bivariate correlations for all variables are provided in Tables 1 and 2, respectively. Skewness and kurtosis were within normal limits, indicating an approximately univariate normal distribution for each variable. One multivariate outlier was identified via Mahalanobis distance, but excluding it did not change the results, and the outlier's data were retained in analyses.
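The outlier screen can be reproduced in a few lines; the sketch below flags cases whose squared Mahalanobis distance exceeds a chi-square critical value, using a conventional alpha of .001 (the study does not report its exact cutoff, so that value is an assumption).

```python
import numpy as np
from scipy import stats

def flag_multivariate_outliers(X, alpha=0.001):
    """Flag rows whose squared Mahalanobis distance from the sample centroid
    exceeds the chi-square critical value with df = number of variables."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(X, rowvar=False))
    d2 = np.einsum("ij,jk,ik->i", centered, cov_inv, centered)
    return d2 > stats.chi2.ppf(1.0 - alpha, df=X.shape[1])

# Synthetic data with the study's dimensions (425 cases, 12 variables)
rng = np.random.default_rng(0)
X = rng.normal(10, 3, size=(425, 12))
print(flag_multivariate_outliers(X).sum(), "case(s) flagged")
```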
Table 1. Descriptive statistics for variables included in measurement and structural models

Indicator | n | Mean | SD | Skewness | Kurtosis | Min. | Max.
Color-Word Interference Test: Inhibition Time* | 421 | 0.09 | 0.96 | −0.32 | 1.10 | −3.57 | 4.04
Color-Word Interference Test: Inhibition/Switching Time* | 421 | 0.00 | 0.96 | −0.84 | 1.17 | −3.25 | 2.77
Tower Test: Total Achievement | 367 | 10.26 | 3.02 | −0.09 | 0.42 | 1.00 | 19.00
Trail Making Test: Switch Time* | 424 | 0.04 | 0.95 | −1.09 | 1.82 | −3.70 | 2.71
Design Fluency Test: Total Designs - Switch Dots* | 425 | 0.04 | 0.98 | −0.65 | 1.22 | −3.47 | 2.72
Sorting Test: Total Confirmed Sorts | 425 | 9.96 | 3.23 | −0.60 | 0.79 | 1.00 | 18.00
Verbal Fluency Test: Letters - Total Correct Words* | 357 | 0.11 | 1.04 | 0.41 | 0.20 | −2.52 | 3.48
Verbal Fluency Test: Category - Total Correct Words* | 358 | 0.09 | 1.02 | −0.06 | 0.03 | −3.00 | 2.98
Design Fluency Test: Filled + Empty | 425 | 9.92 | 3.26 | −0.02 | −0.61 | 1.00 | 19.00
Twenty Questions Test: Total Weighted Achievement | 417 | 10.02 | 3.09 | −0.80 | 0.32 | 1.00 | 16.00
Word Context Test: Total Achievement | 423 | 10.18 | 2.94 | −0.58 | 0.02 | 1.00 | 18.00
Proverb Test: Total Achievement | 417 | 10.08 | 2.69 | −0.75 | 0.29 | 1.00 | 14.00

Note: *Indicates a value that was residualized of variance attributable to a control variable.

Table 2. Correlation matrix for variables included in measurement and structural models

1. Color-Word Interference Test: Inhibition Time+ | 1
2. Color-Word Interference Test: Inhibition/Switching Time+ | .467** | 1
3. Tower Test: Total Achievement | .136** | .224** | 1
4. Trail Making Test: Switch Time+ | .153** | .266** | .073 | 1
5. Design Fluency Test: Switch Dots+ | .124* | .159** | .104* | .032 | 1
6. Sorting Test: Total Confirmed Sorts | .121* | .195** | .223** | .184** | .250** | 1
7. Verbal Fluency Test: Letters - Total Correct Words+ | .072 | −.001 | .068 | .047 | .029 | .088 | 1
8. Verbal Fluency Test: Category - Total Correct Words+ | .026 | .094 | .029 | .132* | .039 | .149** | .443** | 1
9. Design Fluency Test: Filled + Empty | .134** | .193** | .069 | .166** | .032 | .271** | .180** | .202** | 1
10. Twenty Questions Test: Total Weighted Achievement | .052 | .170** | .131* | .169** | .137** | .226** | −.031 | .102 | .097* | 1
11. Word Context Test: Total Achievement | .108* | .152** | .181** | .226** | .127** | .337** | .060 | .165** | .178** | .273** | 1
12. Proverb Test: Total Achievement | .124* | .143** | .168** | .181** | .153** | .333** | .103 | .197** | .177** | .240** | .386** | 1

Note: *p < .05; **p < .01; +Indicates a value that was residualized of variance attributable to a control variable.
Measurement Models
Table 3 provides the fit indices for each model evaluated. The hypothesized three-factor model did not meet full criteria for adequate fit (CFI = 0.871; RMSEA = 0.066). Modification indices recommended correlated errors between the two Verbal Fluency scores, and the three-factor model with this modification showed good statistical fit (CFI = 0.938; RMSEA = 0.047).
The fit of the three-factor model surpassed that of the unidimensional model (CFI = 0.838; RMSEA = 0.072) and all models with merged factors. Although fit was much better for the three-factor model than for the models merging inhibition with shifting (CFI = 0.871; RMSEA = 0.065) or with fluency (CFI = 0.854; RMSEA = 0.069), the model merging fluency and shifting showed good fit (CFI = 0.929; RMSEA = 0.048) that did not differ significantly from the three-factor model (∆CFI = 0.009), although it fell just short of the ∆CFI ≥ .01 threshold for a significant difference. The ∆BIC between these two models was 6.803, favoring the merged-factor model; however, all other fit indices favored the three-factor model. In turn, the three-factor model was accepted as the first-order factor model, displayed graphically in Fig. 1. Based on the bootstrapping analysis, this model properly converged in 97.98% of samples, with convergence frequencies for other models presented in Table 3. Reliability estimates were calculated for each latent variable in the three-factor model, resulting in ω values of 0.59 for inhibition, 0.38 for shifting, and 0.38 for fluency.

Table 3. Measurement model fit indices

Model | χ2 (p) | df | AIC | BIC | CFI | TLI | RMSEA (90% C.I.) | Percent convergence
Three Factors (Inh., Shi., Flu.) | 69.017 (.0000) | 24 | 12733.811 | 12855.374 | 0.871 | 0.807 | 0.066 (0.048–0.085) | 99.88%
Three Factors (Inh., Shi., Flu.), VFT corr. | 44.542 (.0045) | 23 | 12711.336 | 12836.951 | 0.938 | 0.904 | 0.047 (0.026–0.067) | 97.98%
Two Factors (Flu., Inh.=Shi.), VFT corr. | 70.203 (.0000) | 25 | 12732.997 | 12850.508 | 0.871 | 0.814 | 0.065 (0.047–0.084) | 93.10%
Two Factors (Shi., Inh.=Flu.), VFT corr. | 76.030 (.0000) | 25 | 12738.825 | 12856.335 | 0.854 | 0.790 | 0.069 (0.052–0.087) | 100%
Two Factors (Inh., Shi.=Flu.), VFT corr. | 49.804 (.0023) | 25 | 12712.598 | 12830.108 | 0.929 | 0.898 | 0.048 (0.028–0.068) | 100%
Unidimensional (Inh.=Shi.=Flu.), VFT corr. | 82.683 (.0000) | 26 | 12743.477 | 12856.935 | 0.838 | 0.776 | 0.072 (0.055–0.089) | 100%
Bifactor | 25.963 (.1006) | 18 | 12702.757 | 12848.633 | 0.977 | 0.954 | 0.032 (0.000–0.058) | 65.86%
Bifactor without Inh. | Did not converge.

Note. χ2 = Chi-Square Test of Model Fit; AIC = Akaike Information Criterion; BIC = Bayesian Information Criterion; CFI = Comparative Fit Index; Flu. = Fluency; Inh. = Inhibition; RMSEA = Root Mean Square Error of Approximation; Shi. = Shifting; TLI = Tucker–Lewis Index; VFT corr. = Verbal Fluency Test score errors correlated.
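The proper-convergence rates in Table 3 can be framed as a simple resampling loop; in the schematic below, `fit_model` is a hypothetical stand-in for the actual Mplus fitting routine, returning None on non-convergence and a dictionary of estimates otherwise.

```python
import numpy as np

def proper_convergence_rate(data, fit_model, n_boot=5000, seed=1):
    """Refit the model to bootstrap resamples and return the proportion of
    solutions that both converge and are admissible (no factor correlation
    above 1.0 and no negative residual variances)."""
    rng = np.random.default_rng(seed)
    n, proper = len(data), 0
    for _ in range(n_boot):
        sample = data[rng.integers(0, n, size=n)]  # resample cases with replacement
        solution = fit_model(sample)               # hypothetical helper; None if non-converged
        if solution is None:
            continue
        if (np.all(np.abs(solution["factor_correlations"]) <= 1.0)
                and np.all(solution["residual_variances"] >= 0.0)):
            proper += 1
    return proper / n_boot
```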
Fig. 1. Three-factor first-order measurement model. ST = Sorting Test; CWIT = Color-Word Interference Test; DFT = Design Fluency Test; TMT = Trail Making Test; TWT = Tower Test; VFT = Verbal Fluency Test. *Indicates a value that was residualized of variance attributable to a control variable. Errors are not displayed in the figure, but were as follows: CWIT Inhibition = 0.682; CWIT Inhibition/Switching = 0.326; TWT Total Achievement = 0.917; TMT Switch Time = 0.843; DFT Switch Dots = 0.918; ST Total Confirmed Sorts = 0.710; VFT Letters Total Correct Words = 0.936; VFT Categories Total Correct Words = 0.884; DFT Filled + Empty = 0.607.

The bifactor model, displayed graphically in Fig. 2, was also evaluated and showed the best fit based on some indices (CFI = 0.977; RMSEA = 0.032).
It was the only model to provide a non-significant χ2 value (p = .1006) despite the large sample size, but it had a ∆BIC of 11.682 relative to the three-factor model, constituting very strong evidence for the three-factor model by that criterion (Raftery, 1995). The bifactor model also properly converged in only 65.86% of 5,000 bootstrapped samples. Some loadings in the bifactor model were non-significant, including Trail Making on the shifting-specific factor and the Verbal Fluency phonemic trial on the bifactor. Although these loadings were non-significant, their removal did not improve statistical fit, and they were retained in all structural models. The bifactor model without a specific inhibition factor did not converge. Reliability estimates were calculated for the general and specific factors: ωH was 0.49 for the bifactor, while the ωS values were 0.39 for inhibition, 0.03 for shifting, and 0.48 for fluency. Despite its low rate of convergence, the bifactor model was used in structural models in addition to the three-factor model, considering its excellent fit and the parsimony of interpreting the general factor.

Fig. 2. Bifactor measurement model. ST = Sorting Test; CWIT = Color-Word Interference Test; DFT = Design Fluency Test; TMT = Trail Making Test; TWT = Tower Test; VFT = Verbal Fluency Test. *Indicates a value that was residualized of variance attributable to a control variable. Dashed lines designate non-significant loadings. Errors are not displayed in the figure, but were as follows: CWIT Inhibition = 0.709; CWIT Inhibition/Switching = 0.248; TWT Total Achievement = 0.909; TMT Switch Time = 0.644; DFT Switch Dots = 0.845; ST Total Confirmed Sorts = 0.547; VFT Letters Total Correct Words = 0.455; VFT Categories Total Correct Words = 0.614; DFT Filled + Empty = 0.809.

Structural Models
Two series of structural models were conducted, predicting the Twenty Questions, Word Context, and Proverb Tests from latent dimensions of executive function. The first series included three models in which all three first-order factors simultaneously predicted each task, plus follow-up models in which each diverse factor predicted each task one at a time. The second series consisted of three models in which the common executive function factor predicted each task separately. Table 4 provides results for all structural models. The bifactor significantly predicted all complex tasks, accounting for a significant amount of variance in each outcome. The models involving the simultaneous prediction of each task by inhibition, shifting, and fluency accounted for a significant amount of variance in all tasks except the Twenty Questions Test, although the R2 value for this model approached significance (p = .057). Notably, in all models involving simultaneous prediction, shifting was the only uniquely significant predictor.
When inhibition alone predicted the criterion, it was a significant predictor, as was shifting when it alone predicted the criterion. However, models in which fluency alone predicted the criterion did not properly converge due to a correlation above 1.0 between fluency and shifting.

Table 4. Structural model results

Criterion | Model | df | χ2 | AIC | BIC | CFI | TLI | RMSEA (90% CI) | Bifactor | Inh. | Shi. | Flu. | R2
20Q | Bifactor | 26 | 32.332 | 14796.58 | 14954.61 | 0.984 | 0.971 | 0.024 (0.000–0.048) | 0.376*** | | | | 0.141**
20Q | Three paths | 29 | 51.587** | 14809.83 | 14955.71 | 0.941 | 0.909 | 0.043 (0.023–0.062) | | −0.123 | 0.725* | −0.302 | 0.246
20Q | Path from Inh. | 31 | 71.395*** | 14825.64 | 14963.41 | 0.895 | 0.847 | 0.055 (0.039–0.072) | | 0.237*** | | | 0.056*
20Q | Path from Shi. | 31 | 54.405** | 14808.65 | 14946.42 | 0.939 | 0.911 | 0.042 (0.023–0.060) | | | 0.379*** | | 0.143**
20Q | Path from Flu. | Non-Positive Definite Covariance Matrix
WCT | Bifactor | 26 | 32.741 | 14748.09 | 14906.12 | 0.984 | 0.972 | 0.025 (0.000–0.048) | 0.492*** | | | | 0.242***
WCT | Three paths | 29 | 52.880** | 14762.23 | 14908.11 | 0.943 | 0.911 | 0.044 (0.024–0.063) | | −0.248 | 0.915* | −0.254 | 0.408*
WCT | Path from Inh. | 31 | 99.240*** | 14804.59 | 14942.36 | 0.837 | 0.763 | 0.072 (0.056–0.088) | | 0.309*** | | | 0.095*
WCT | Path from Shi. | 31 | 56.602** | 14761.95 | 14899.72 | 0.939 | 0.911 | 0.044 (0.025–0.062) | | | 0.511*** | | 0.261***
WCT | Path from Flu. | Non-Positive Definite Covariance Matrix
PVT | Bifactor | 26 | 34.580 | 14651.97 | 14810.00 | 0.979 | 0.964 | 0.028 (0.000–0.050) | 0.478*** | | | | 0.229***
PVT | Three paths | 29 | 51.771*** | 14663.16 | 14809.03 | 0.945 | 0.915 | 0.043 (0.023–0.062) | | −0.187 | 0.742* | −0.114 | 0.333**
PVT | Path from Inh. | 31 | 94.468*** | 14701.86 | 14839.63 | 0.846 | 0.777 | 0.069 (0.054–0.086) | | 0.305*** | | | 0.093*
PVT | Path from Shi. | 31 | 54.27** | 14661.66 | 14799.43 | 0.944 | 0.918 | 0.042 (0.022–0.060) | | | 0.501*** | | 0.251***
PVT | Path from Flu. | Non-Positive Definite Covariance Matrix

Note: *p < .05, **p < .01, ***p < .001; standardized path coefficients are reported in the Bifactor, Inh., Shi., and Flu. columns; χ2 = Chi-Square Test of Model Fit; 20Q = Twenty Questions Test; AIC = Akaike Information Criterion; BIC = Bayesian Information Criterion; CFI = Comparative Fit Index; Flu. = Fluency; Inh. = Inhibition; PVT = Proverb Test; RMSEA = Root Mean Square Error of Approximation; Shi. = Shifting; TLI = Tucker–Lewis Index; VFT corr. = Verbal Fluency Test score errors correlated; WCT = Word Context Test.
Discussion
A series of confirmatory factor analyses on a subset of D-KEFS tests identified a three-factor solution as the best-fitting measurement model. This model included three executive functions posited by past researchers (Miyake et al., 2000; Packwood et al., 2011): inhibition, shifting, and fluency. Inter-factor correlations were high between shifting and inhibition (r = 0.591) and between shifting and fluency (r = 0.696), although much lower between inhibition and fluency (r = 0.360). The relatively high correlations between shifting and the other factors likely derive from the close relationship between shifting and the common executive function dimension. A bifactor model showed that a general factor, representative of a common executive function, explained nearly all variance in the shifting indicators, leaving virtually no unique variance for a shifting-specific factor (ωS = 0.03). Unlike previous measurement models showing inhibition as fully explained by a common executive function factor (Fleming et al., 2016; Friedman et al., 2016; Ito et al., 2015), a bifactor model without inhibition failed to converge for the D-KEFS battery. Although the bifactor model fit the data better than the three-factor model, it properly converged in only 65.86% of 5,000 bootstrapped samples; and while this model was informative for explicating the relationship between shifting and a general executive function dimension, its replicability remains questionable. The low convergence rate likely derives from the limited construct-specific variance represented in the manifest variables. Small inter-test correlations are typical for executive function test batteries (Willoughby et al., 2014), and they were also small in the current study. These low inter-test correlations correspond to limited shared variance between indicators, which equates to low construct reliability of factors in the model (Gagne & Hancock, 2006), as shown by the modest ωH and ωS reliability estimates. Limited construct reliability could explain the low rates of convergence for the bifactor model, with insufficient shared variance available for the bifactor and all specific factors in many of the bootstrapped samples. Despite issues of low convergence and construct reliability, a robust general factor is consistent with previous research (Fleming et al., 2016; Friedman et al., 2008, 2016; Ito et al., 2015; Klauer et al., 2010), but the relationship between shifting and a common executive function dimension was novel and can be interpreted in multiple ways. A first interpretation considers shifting a higher-order ability, which aligns with previous claims that, in early development, shifting requires the establishment of more basic executive abilities before successful performance (Müller & Kerns, 2015). There is evidence for a superordinate fronto-cingulo-parietal network being activated during tasks of inhibition, working memory, and flexibility (Niendam et al., 2012), which supports a common executive function ability but does not indicate one ability as more central than another. In contrast to past research supporting inhibition as closely related to common executive function (Fleming et al., 2016; Fournier-Vicente et al., 2008; Ito et al., 2015; Klauer et al., 2010), an alternative model with a central shifting ability may more accurately represent the true nature of executive function.
Alternatively, other interpretations of the near-perfect relationship between common executive function and shifting may consider the nature of the tests assigned to its measurement. The tests that loaded onto the shifting factor in the accepted model conceptually and empirically align with general executive function abilities. The Sorting Test score loaded highest onto the shifting factor, and a previous analysis showed a strong relationship between the Sorting Test and the Wisconsin Card Sorting Test (r = 0.59; Delis et al., 2001), which is related to both shifting ability (Miyake et al., 2000) and general executive function (Greve, Stickle, Love, Bianchini, & Stanford, 2005). The Trail Making Test had the second highest loading on the shifting factor, but this loading was non-significant in the bifactor model, indicating that Trail Making Test performance was highly related to common executive function. A construct validity analysis of the Trail Making Test found that both working memory and task-switching explained performance on the switching condition (Sanchez-Cubillo et al., 2009). The tasks used as indicators for shifting likely tapped into shifting, but also into other executive-related constructs, making the variance within this factor more general than specific. Compared to the shifting factor, the more modest relationship between the general factor and the specific inhibition and fluency factors may also derive from the nature of the tests assigned to these factors. Method variance influenced the findings of a previous factor analysis of the D-KEFS (Latzman & Markon, 2010) and may have influenced the current findings as well. In the bifactor model, the specific inhibition (ωS = 0.39) and fluency (ωS = 0.48) factors retained far more unique variance than the shifting factor (ωS = 0.03); however, the co-loadings for some of their respective indicators decreased, suggesting that the bifactor explained some variance in those indicators. The inhibition-specific loading for the Tower Test decreased from the three-factor model (λ = 0.289) to the bifactor model (λ = 0.144), and the fluency-specific loading for Design Fluency likewise decreased from the three-factor model (λ = 0.627) to the bifactor model (λ = 0.187). The indicators loading highest on these specific factors were related through shared method variance: two Verbal Fluency scores loaded highly on the fluency-specific factor (λ = 0.572 and 0.730), and two Color-Word Interference scores loaded highly on the inhibition-specific factor (λ = 0.466 and 0.745). The strong relationship between shifting and the general factor could therefore reflect the lack of common method variance among the shifting indicators relative to the other specific factors, because shifting was the only diverse factor with indicators from three different D-KEFS tests. Aside from examining the latent structure of the D-KEFS, the current study also evaluated the relationship between executive function factors and tests purported to measure abstraction, reasoning, and problem solving. In a series of models, the bifactor or the three diverse factors (i.e., inhibition, shifting, and fluency) predicted a set of complex executive function tasks not often evaluated in latent variable research: the Twenty Questions, Word Context, and Proverb Tests. The bifactor significantly predicted performance on all tasks.
In comparison, when all diverse factors predicted these tasks simultaneously, shifting was the only factor to significantly predict performance. In models including just a single path from one diverse factor to the criterion variable, shifting and inhibition each predicted all tasks significantly; however, when fluency served as the sole predictor, the models did not converge due to high collinearity between shifting and fluency. The models involving shifting as the sole predictor yielded higher R2 values than models with inhibition or the bifactor as sole predictors, while models including three paths from inhibition, shifting, and fluency had higher absolute R2 values than any single-path models. These findings indicate that, within the three-factor model, shifting explained the most variance in the criterion outcomes, likely due to its association with the common executive function factor. In turn, the constructs measured by the Twenty Questions, Word Context, and Proverb Tests are closely related to either shifting or general executive function ability. The results discussed above achieved the primary aims of this study: determining the latent structure of the D-KEFS and identifying which constructs explain performances on a set of complex tasks. An overarching goal that guided these two aims was linking the current findings to clinical practice; however, the analysis used residualized variables, which were free of variance attributable to speed and language ability but are not normed through the D-KEFS. The residualization process has the benefit of addressing the task impurity that characterizes executive function measurement; however, it creates a disconnect between the scores in this analysis and those applied in clinical practice. Summed scaled scores from the D-KEFS would not fully correspond to the factors identified in the accepted model, so the findings cannot be directly translated into clinical practice. Researchers have tried various methods to control for lower-level abilities when calculating scores in clinical practice. For example, several derived Trail Making Test scores have been evaluated (Drane, Yuspeh, Huthwaite, & Klingler, 2002; Perianez et al., 2007), including a difference score (B − A), a ratio score (B/A), and a proportional score ((B − A)/A), as illustrated in the sketch below. These derived scores attempt to control for processing speed to provide a purer estimate of executive function. The residualization process follows the same intent, but offers a more statistically rigorous method for controlling lower-level abilities than a simple difference score. Considering the issue of task impurity in executive function measurement, future editions of the D-KEFS could provide norms for residualized scores, using regression-based norming procedures to control for lower-order abilities that influence test scores. Previous researchers have discussed the relationship between processing speed and executive functions (Albinet, Boucard, Bouquet, & Audiffren, 2012; Salthouse, 2005), and the residualization approach adjusts for individual differences in processing speed. Had speed not been residualized from certain scores, the shared variance between tasks could have been largely attributable to speed rather than the three hypothesized constructs ultimately represented as factors in the model.
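The derived Trail Making Test scores mentioned above reduce to simple arithmetic; the sketch below computes all three from illustrative (made-up) completion times, which makes plain why the residualized alternative, which removes rather than merely rescales control-condition variance, is more statistically rigorous.

```python
# Illustrative completion times in seconds; values are made up for demonstration.
time_a, time_b = 28.0, 71.0   # A = sequencing condition, B = switching condition

difference_score = time_b - time_a                # B - A
ratio_score = time_b / time_a                     # B / A
proportional_score = (time_b - time_a) / time_a   # (B - A) / A

print(difference_score, round(ratio_score, 2), round(proportional_score, 2))
```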
In a previous analysis of the latent structure of the D-KEFS, an exploratory model resulted in a factor composed of only time-based outcomes from the Trail Making and Color-Word Interference Tests (Latzman & Markon, 2010); without residualized scores, the current analysis could likewise have yielded a speed-based factor. While other batteries include more basic measurements of processing speed (e.g., the WAIS-IV Processing Speed Index; Wechsler, 2008), the goal of this study was to identify factors that aligned best with constructs reported previously in the executive function literature (Miyake et al., 2000; Packwood et al., 2011). In turn, although the factor model is not directly translatable to clinical scores, it is based on less confounded indicators that are more uniquely associated with their respective constructs. Another notable limitation was the inclusion of scores from the same D-KEFS test on either the same or different factors. The D-KEFS offers different conditions within each test that tap into different constructs, but these conditions often rely on a common method of measurement. Although the residualization procedure attempted to control for method variance, the use of multiple indicators from the same tests allowed shared method variance to infiltrate the inhibition and fluency factors through the Color-Word Interference and Verbal Fluency scores, respectively. As well, the two orthogonalized Design Fluency scores loaded on the shifting and fluency factors; while this approach controlled for shared method variance between indicators, it may have biased the model towards a multidimensional solution by orthogonalizing indicators assigned to separate, but correlated, factors. These issues also identify potential considerations for future editions of the D-KEFS, where the latent structure of the battery could be hypothesized and evaluated during test development. Future batteries could include a greater diversity of tests selected with the aim of having multiple indicators tap into specific executive function constructs. Many experimental tasks used in executive function research have not made their way into common clinical practice, likely due to the limited use of computerized tests in applied settings (Rabin et al., 2014) and minimal change in assessment practices by active practitioners (Rabin et al., 2016). The D-KEFS offers the first co-normed battery of executive function tests, which likely contributes to its frequency of use, and the incorporation of computerized tests into future editions may increase their dissemination, offering a link between research and clinical practice. The translation of experimental executive function tasks into clinical practice would increase the diversity of tasks available to measure different constructs, which could enable evidence-based composite scores in future D-KEFS editions. By design, the D-KEFS does not provide composite scores (Delis, Jacobson, Bondi, Hamilton, & Salmon, 2003), but instead offers multiple scores evaluating many proposed dimensions of executive function. Delis and colleagues questioned the utility of composite scores among clinical samples, because unitary constructs dissociate among clinical populations.
Despite these concerns, past researchers have provided methods for calculating an atheoretical global D-KEFS composite score (Crawford, Garthwaite, Sutherland, & Borland, 2011), although this composite does not align with the current findings or any previous research on executive functions. Considering the improved purity of factor estimates of executive functions (Miyake & Friedman, 2012), composite scores analogous to evidence-based factors may provide a more precise estimate of executive functions in clinical practice. Without composite scores, current assessment practices for executive functions appear inconsistent with the assessment practices for other cognitive domains. Composite scores have become particularly common in clinical practice due to their ability to simplify psychometric interpretation by summarizing performances across tests. They are almost ubiquitous in the assessment of intellectual functioning, where composite scores of intelligence show strong reliability and predictive validity (Kamphaus, 2005; Lubinski, 2004; Sternberg, Grigorenko, & Bundy, 2001). Although composite scores allow clinicians to reduce the dimensions of their test findings, they are unavailable for the D-KEFS; however, they are not the only useful multivariate method for simplifying complex test profiles in neuropsychological assessment. As an alternative to composite scores, multivariate base rates for the D-KEFS, which quantify the normal frequency of low scores when multiple tests are administered and interpreted, offer an additional method for aggregating test information (Karr, Garcia-Barrera, Holdnack, & Iverson, 2017a, 2017b), but this method also does not consider the latent structure of the D-KEFS. Efforts to provide D-KEFS composite scores in future editions of the test battery would benefit from considering the theoretical structure of executive functions in their calculation, ensuring the highest possible level of reliability and construct validity. Although the residualization process prevents the calculation of composite scores in clinical practice, the latent structure identified herein offers an empirical framework through which clinicians can interpret their assessment findings. The modeling results illustrate that the D-KEFS taps into three core executive functions along with more global executive-related abilities. This framework may enhance the interpretability of test scores and provide a better understanding of the effects of psychopathology and brain damage on executive functions (Snyder, Miyake, & Hankin, 2015). Neuropsychologists often discuss constructs in groupings in assessment reports (e.g., attention, memory), and often infer the constructs underlying specific task performances (e.g., attention span, concentration, encoding, retrieval). The accuracy of these groupings rests on the construct validity of the administered tasks. In turn, neuropsychologists must know which cognitive abilities their specific tests tap to inform case conceptualization and recommendations. Previous research has posited numerous executive function constructs based on face-valid interpretations of task performance. This study offers the first attempt to explain performances on D-KEFS tests through hypothesized constructs supported by previous research (Miyake et al., 2000).

Authors' Note
Justin E. Karr is a Vanier Canada Graduate Scholar, and he thanks the Natural Sciences and Engineering Research Council of Canada (NSERC) for its support of his graduate studies. This study was completed in partial fulfillment of the requirements for his dissertation. Data used for the analyses reported within this manuscript were provided by Pearson, Inc.: standardization data from the Delis–Kaplan Executive Function System (D-KEFS). Copyright © 2001 NCS Pearson, Inc. Used with permission. All rights reserved. San Antonio: Pearson, Inc.

Conflict of Interest
None of the authors have conflicts of interest to declare, but the following general disclosures are offered for the consideration of readers. Mauricio A. Garcia-Barrera has served in the past as a consultant for Pearson. Grant L. Iverson has received research support from test publishing companies in the past, including PAR, Inc., ImPACT Applications, Inc., and CNS Vital Signs. He acknowledges current unrestricted philanthropic support from ImPACT Applications, Inc. He receives royalties for one neuropsychological test (Wisconsin Card Sorting Test-64 Card Version).

References
Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19, 716–723. http://dx.doi.org/10.1109/TAC.1974.1100705
Albinet, C. T., Boucard, G., Bouquet, C. A., & Audiffren, M. (2012). Processing speed and executive functions in cognitive aging: How to disentangle their mutual relationship? Brain and Cognition, 79(1), 1–11. http://doi.org/10.1016/j.bandc.2012.02.001
Baron, I. S. (2004). Delis-Kaplan Executive Function System. Child Neuropsychology, 10, 147–152. http://dx.doi.org/10.1080/09297040490911140
Bentler, P. M. (1990). Comparative fit indexes in structural models. Psychological Bulletin, 107, 238–246. http://dx.doi.org/10.1037/0033-2909.107.2.238
Bentler, P. M., & Bonett, D. G. (1980). Significance tests and goodness of fit in the analysis of covariance structures. Psychological Bulletin, 88, 588–606. http://dx.doi.org/10.1037/0033-2909.88.3.588
Bettcher, B. M., Mungas, D., Patel, N., Elofson, J., Dutt, S., Wynn, M., et al. (2016). Neuroanatomical substrates of executive functions: Beyond prefrontal structures. Neuropsychologia, 85, 100–109. http://dx.doi.org/10.1016/j.neuropsychologia.2016.03.001
Burgess, P. W. (1997). Theory and methodology in executive function research. In P. Rabbitt (Ed.), Methodology of frontal and executive function (pp. 81–116). Hove, UK: Psychology Press.
Chen, F. F., West, S. G., & Sousa, K. H. (2006). A comparison of bifactor and second-order models of quality of life. Multivariate Behavioral Research, 41, 189–225. http://dx.doi.org/10.1207/s15327906mbr4102_5
Cheung, G. W., & Rensvold, R. B. (2002). Evaluating goodness-of-fit indexes for testing measurement invariance. Structural Equation Modeling, 9, 233–255. http://dx.doi.org/10.1207/S15328007SEM0902_5
Crawford, J. R., Garthwaite, P. H., Sutherland, D., & Borland, N. (2011). Some supplementary methods for the analysis of the Delis–Kaplan Executive Function System. Psychological Assessment, 23, 888–898. http://dx.doi.org/10.1037/a0023712
de Frias, C. M., Dixon, R. A., & Strauss, E. (2009). Characterizing executive functioning in older special populations: From cognitively elite to cognitively impaired. Neuropsychology, 23, 778–791. http://dx.doi.org/10.1037/a0016743
Delis, D. C., Jacobson, M., Bondi, M. W., Hamilton, J. M., & Salmon, D. P. (2003). The myth of testing construct validity using shared variance techniques with normal or mixed clinical populations: Lessons from memory assessment. Journal of the International Neuropsychological Society, 9, 936–946.
Delis, D. C., Kaplan, E., & Kramer, J. H. (2001). Delis–Kaplan Executive Function System. Odessa, FL: Psychological Assessment Resources.
Drane, D. L., Yuspeh, R. L., Huthwaite, J. S., & Klingler, L. K. (2002). Demographic characteristics and normative observations for derived-trail making test indices. Cognitive and Behavioral Neurology, 15, 39–43.
Duggan, E. C., & Garcia-Barrera, M. A. (2015). Executive functioning and intelligence. In S. Goldstein, J. A. Naglieri, & D. Princiotta (Eds.), Handbook of intelligence: Evolutionary theory, historical perspective, and current concepts (pp. 435–458). New York: Springer Publishing Co. http://dx.doi.org/10.1007/978-1-4939-1562-0_27
Fleming, K. A., Heintzelman, S. J., & Bartholow, B. D. (2016). Specifying associations between conscientiousness and executive functioning: Mental set shifting, not prepotent response inhibition or working memory updating. Journal of Personality, 84, 348–360. http://dx.doi.org/10.1111/jopy.12163
Floyd, R. G., Bergeron, R., Hamilton, G., & Parra, G. R. (2010). How do executive functions fit with the Cattell–Horn–Carroll model? Some evidence from a joint factor analysis of the Delis–Kaplan executive function system and the Woodcock–Johnson III tests of cognitive abilities. Psychology in the Schools, 47, 721–738. http://dx.doi.org/10.1002/pits.20500
Fournier-Vicente, S., Larigauderie, P., & Gaonac'h, D. (2008). More dissociations and interactions within central executive functioning: A comprehensive latent-variable analysis. Acta Psychologica, 129, 32–48. http://dx.doi.org/10.1016/j.actpsy.2008.04.004
Frazier, D. T., Bettcher, B. M., Dutt, S., Patel, N., Mungas, D., Miller, J., et al. (2015). Relationship between insulin-resistance processing speed and specific executive function profiles in neurologically intact older adults. Journal of the International Neuropsychological Society, 21, 622–628. http://dx.doi.org/10.1017/S1355617715000624
Friedman, N. P., Corley, R. P., Hewitt, J. K., & Wright, K. P. (2009). Individual differences in childhood sleep problems predict later cognitive executive control. Sleep, 32, 323–333. http://doi.org/10.1093/sleep/32.3.323
Friedman, N. P., Miyake, A., Altamirano, L. J., Corley, R. P., Young, S. E., Rhea, S. A., et al. (2016). Stability and change in executive function abilities from late adolescence to early adulthood: A longitudinal twin study. Developmental Psychology, 52, 326–340. http://dx.doi.org/10.1037/dev0000075
Friedman, N. P., Miyake, A., Robinson, J. L., & Hewitt, J. K. (2011). Developmental trajectories in toddlers' self-restraint predict individual differences in executive functions 14 years later: A behavioral genetic analysis. Developmental Psychology, 47, 1410–1430. http://dx.doi.org/10.1037/a0023750
Friedman, N. P., Miyake, A., Young, S. E., DeFries, J. C., Corley, R. P., & Hewitt, J. K. (2008). Individual differences in executive functions are almost entirely genetic in origin. Journal of Experimental Psychology: General, 137, 201–225. http://dx.doi.org/10.1037/0096-3445.137.2.201
Furr, R. M. (2011). Scale construction and psychometrics for social and personality psychology. London, England: SAGE Publications Ltd.
Gagne, P., & Hancock, G. R. (2006). Measurement model quality, sample size, and solution propriety in confirmatory factor models. Multivariate Behavioral Research, 41, 65–83. http://dx.doi.org/10.1207/s15327906mbr4101_5
Greve, K. W., Stickle, T. R., Love, J. M., Bianchini, K. J., & Stanford, M. S. (2005). Latent structure of the Wisconsin Card Sorting Test: A confirmatory factor analytic study. Archives of Clinical Neuropsychology, 20, 355–364. http://dx.doi.org/10.1016/j.acn.2004.09.004
Hancock, G. R. (2006). Power analysis in covariance structure modeling. In G. R. Hancock & R. O. Mueller (Eds.), Structural equation modeling: A second course (pp. 69–115). Greenwich, CT: Information Age Publishing.
Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6(1), 1–55. http://dx.doi.org/10.1080/10705519909540118
Ito, T. A., Friedman, N. P., Bartholow, B. D., Correll, J., Loersch, C., Altamirano, L. J., et al. (2015). Toward a comprehensive understanding of executive cognitive function in implicit racial bias. Journal of Personality and Social Psychology, 108, 187–218. http://dx.doi.org/10.1037/a0038557
Jurado, M. B., & Rosselli, M. (2007). The elusive nature of executive functions: A review of our current understanding. Neuropsychology Review, 17, 213–233. http://dx.doi.org/10.1007/s11065-007-9040-z
Kamphaus, R. W. (2005). Clinical assessment of child and adolescent intelligence (2nd ed.). New York: Springer.
Karr, J. E., Garcia-Barrera, M. A., Holdnack, J. A., & Iverson, G. L. (2017a). Advanced clinical interpretation of the Delis-Kaplan Executive Function System: Multivariate base rates of low scores. The Clinical Neuropsychologist, 32, 42–53. http://dx.doi.org/10.1080/13854046.2017.1334828
Karr, J. E., Garcia-Barrera, M. A., Holdnack, J. A., & Iverson, G. L. (2017b). Using multivariate base rates to interpret low scores on an abbreviated battery of the Delis-Kaplan Executive Function System. Archives of Clinical Neuropsychology, 32, 297–305. http://dx.doi.org/10.1093/arclin/acw105
Klauer, K. C., Schmitz, F., Teige-Mocigemba, S., & Voss, A. (2010). Understanding the role of executive control in the Implicit Association Test: Why flexible people have small IAT effects. The Quarterly Journal of Experimental Psychology, 63, 595–619. http://dx.doi.org/10.1080/17470210903076826
Latzman, R. D., & Markon, K. E. (2010). The factor structure and age-related factorial invariance of the Delis-Kaplan Executive Function System (D-KEFS). Assessment, 17, 172–184. http://dx.doi.org/10.1177/1073191109356254
Lehto, J. E., Juujärvi, P., Kooistra, L., & Pulkkinen, L. (2003). Dimensions of executive functioning: Evidence from children. British Journal of Developmental Psychology, 21, 59–80. http://dx.doi.org/10.1348/026151003321164627
Lubinski, D. (2004). Introduction to the special section on cognitive abilities: 100 years after Spearman's (1904) "'General intelligence,' objectively determined and measured". Journal of Personality and Social Psychology, 86, 96–111. http://dx.doi.org/10.1037/0022-3514.86.1.96
McDonald, R. P. (1999). Test theory: A unified treatment. Mahwah, NJ: Lawrence Erlbaum Associates.
Miyake, A., Emerson, M. J., & Friedman, N. P. (2000a). Assessment of executive functions in clinical settings: Problems and recommendations. Seminars in Speech and Language, 21, 169–183. http://dx.doi.org/10.1055/s-2000-7563
Miyake, A., & Friedman, N. P. (2012). The nature and organization of individual differences in executive functions: Four general conclusions. Current Directions in Psychological Science, 21, 8–14. http://dx.doi.org/10.1177/0963721411429458
Miyake, A., Friedman, N. P., Emerson, M. J., Witzki, A. H., Howerter, A., & Wager, T. D. (2000b). The unity and diversity of executive functions and their contributions to complex "Frontal Lobe" tasks: A latent variable analysis. Cognitive Psychology, 41, 49–100. http://dx.doi.org/10.1006/cogp.1999.0734
Müller, U., & Kerns, K. (2015). The development of executive function. In L. S. Liben, U. Müller, & R. M. Lerner (Eds.), Handbook of child psychology and developmental science, Vol. 2: Cognitive processes (7th ed., pp. 571–623). Hoboken, NJ: Wiley. http://dx.doi.org/10.1002/9781118963418.childpsy214
Muthén, L., & Muthén, B. (2014). MPlus (Version 7.3) [Computer software]. Los Angeles: Muthén & Muthén.
Niendam, T. A., Laird, A. R., Ray, K. L., Dean, Y. M., Glahn, D. C., & Carter, C. S. (2012). Meta-analytic evidence for a superordinate cognitive control network subserving diverse executive functions. Cognitive, Affective, & Behavioral Neuroscience, 12, 241–268. http://dx.doi.org/10.3758/s13415-011-0083-5
Packwood, S., Hodgetts, H. M., & Tremblay, S. (2011). A multiperspective approach to the conceptualization of executive functions. Journal of Clinical and Experimental Neuropsychology, 33, 456–470. http://dx.doi.org/10.1080/13803395.2010.533157
Perianez, J. A., Rios-Lago, M., Rodriguez-Sanchez, J. M., Adrover-Roig, D., Sanchez-Cubillo, I., Crespo-Facorro, B., et al. (2007). Trail Making Test in traumatic brain injury, schizophrenia, and normal ageing: Sample comparisons and normative data. Archives of Clinical Neuropsychology, 22, 433–447. http://dx.doi.org/10.1016/j.acn.2007.01.022
Pettigrew, C., & Martin, R. C. (2014). Cognitive declines in healthy aging: Evidence from multiple aspects of interference resolution. Psychology and Aging, 29, 187–204. http://dx.doi.org/10.1037/a0036085
Phillips, L. H. (1997). Do "frontal tests" measure executive function? Issues of assessment and evidence from fluency tests. In P. Rabbitt (Ed.), Methodology of frontal and executive function (pp. 191–213). Hove, UK: Psychology Press.
Rabin, L. A., Paolillo, E., & Barr, W. B. (2016). Stability in test-usage practices of clinical neuropsychologists in the United States and Canada over a 10-year period: A follow-up survey of INS and NAN members. Archives of Clinical Neuropsychology, 31, 206–230. http://dx.doi.org/10.1093/arclin/acw007
Rabin, L. A., Spadaccini, A. T., Brodale, D. L., Grant, K. S., Elbulok-Charcape, M. M., & Barr, W. B. (2014). Utilization rates of computerized tests and test batteries among clinical neuropsychologists in the United States and Canada. Professional Psychology: Research and Practice, 45, 368–377. http://dx.doi.org/10.1037/a0037987
Raftery, A. E. (1995). Bayesian model selection in social research. Sociological Methodology, 25, 111–163. http://dx.doi.org/10.2307/271063
Reise, S. P. (2012). The rediscovery of bifactor measurement models. Multivariate Behavioral Research, 47, 667–696. http://doi.org/10.1080/00273171.2012.715555
Salthouse, T. A. (2005). Relations between cognitive abilities and measures of executive functioning. Neuropsychology, 19, 532–545. http://dx.doi.org/10.1037/0894-4105.19.4.532
Sanchez-Cubillo, I., Perianez, J. A., Adrover-Roig, D., Rodriguez-Sanchez, J. M., Rios-Lago, M., Tirapu, J., et al. (2009). Construct validity of the Trail Making Test: Role of task-switching, working memory, inhibition/interference control, and visuomotor abilities. Journal of the International Neuropsychological Society, 15, 438–450. http://dx.doi.org/10.1017/S1355617709090626
Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics, 6, 461–464. http://dx.doi.org/10.1214/aos/1176344136
Shao, Z., Janse, E., Visser, K., & Meyer, A. S. (2014). What do verbal fluency tasks measure? Predictors of verbal fluency performance in older adults. Frontiers in Psychology, 5, 772. http://dx.doi.org/10.3389/fpsyg.2014.00772
Shunk, A. W., Davis, A. S., & Dean, R. S. (2006). Review of Delis-Kaplan Executive Function System (D-KEFS). Applied Neuropsychology, 13, 275–279. http://dx.doi.org/10.1207/s15324826an1304_9
Snyder, H. R., Miyake, A., & Hankin, B. L. (2015). Advancing understanding of executive function impairments and psychopathology: Bridging the gap between clinical and cognitive approaches. Frontiers in Psychology, 6, 1–24. http://doi.org/10.3389/fpsyg.2015.00328
Steiger, J. H. (2007). Understanding the limitations of global fit assessment in structural equation modeling. Personality and Individual Differences, 42, 893–898.
http://dx.doi.org/10.1016/j.paid.2006.09.017 . Google Scholar CrossRef Search ADS Sternberg , R. J. , Grigorenko , E. , & Bundy , D. A. ( 2001 ). The predictive value of IQ . Merrill-Palmer Quarterly , 47 ( 1 ), 1 – 41 . http://doi.org/10.1353/mpq.2001.0005 . Google Scholar CrossRef Search ADS Tanaka , J. S. ( 1987 ). ‘How big is big enough?’: Sample size and goodness of fit in structural equation models with latent variables . Child Development , 58 , 134 – 146 . http://dx.doi.org/10.1111/1467-8624.ep7264172 . Google Scholar CrossRef Search ADS Tucker , L. R. , Damarin , F. , & Messick , S. ( 1965 ). A base‐free measure of change . ETS Research Report Series , 1965 ( 1 ), 1 – 29 . http://dx.doi.org/10.1002/j.2333-8504.1965.tb00900.x . Unsworth , N. , Spillers , G. J. , & Brewer , G. A. ( 2011 ). Variation in verbal fluency: A latent variable analysis of clustering, switching, and overall performance . The Quarterly Journal of Experimental Psychology , 64 , 447 – 466 . http://dx.doi.org/10.1080/17470218.2010.505292 . Google Scholar CrossRef Search ADS Wechsler , D. ( 1999 ). Wechsler Abbreviated Scale of Intelligence . San Antonio, TX : Psychological Corporation . Wechsler , D. ( 2008 ). Wechsler adult intelligence scale ( 4th ed. ). San Antonio, TX : Pearson . Willoughby , M. , Holochwost , S. J. , Blanton , Z. E. , & Blair , C. B. ( 2014 ). Executive functions: Formative versus reflective measurement . Measurement: Interdisciplinary Research & Perspectives , 12 , 69 – 95 . http://dx.doi.org/10.1080/15366367.2014.929453 . Google Scholar CrossRef Search ADS © The Author(s) 2018. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com. This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/about_us/legal/notices)

Archives of Clinical Neuropsychology, Oxford University Press. Published: May 4, 2018.
