A Single Error Is One Too Many: The Forced Choice Recognition Trial of the CVLT-II as a Measure of Performance Validity in Adults with TBI

Abstract

Objective: The Forced Choice Recognition (FCR) trial of the California Verbal Learning Test—Second Edition (CVLT-II) was designed to serve as a performance validity test (PVT). The present study was designed to compare the classification accuracy of a more liberal alternative (≤15) to the de facto FCR cutoff (≤14).

Method: The classification accuracy of the two cutoffs was computed in reference to psychometrically defined invalid performance, across various criterion measures, in a sample of 104 adults with TBI clinically referred for neuropsychological assessment.

Results: The FCR was highly predictive (AUC: .71–.83) of Pass/Fail status on reference PVTs, but unrelated to performance on measures known to be sensitive to TBI. On average, FCR ≤15 correctly identified an additional 6% of invalid response sets compared to FCR ≤14, while maintaining .92 specificity. Patients who failed the FCR reported higher levels of emotional distress.

Conclusions: Results suggest that even a single error on the FCR is a reliable indicator of invalid responding. Further research is needed to investigate the clinical significance of the relationship between failing the FCR and level of self-reported psychiatric symptoms.

Keywords: CVLT-II, Forced choice recognition, Traumatic brain injury, Performance validity assessment, Alternative cutoffs

Introduction

The interpretation of neuropsychological tests rests on the assumption that examinees are able and willing to fully engage with the tasks presented to them and therefore, demonstrate their maximal ability level (Delis, Kramer, Kaplan, & Ober, 2000). There is a growing consensus within the field of neuropsychology that valid performance cannot be assumed by default, but should be objectively evaluated (Boone, 2009; Chafetz et al., 2015; Heilbronner, Sweet, Morgan, Larrabee, & Millis, 2009).
Some even consider assessments that omit formal measures of test-taking effort incomplete (Iverson, 2003). Along with free-standing performance validity tests [PVTs; Test of Memory Malingering (TOMM; Tombaugh, 1996); Word Memory Test (WMT; Green, 2003); Validity Indicator Profile (VIP; Frederick, 2003)] that represent the traditional approach to performance validity assessment, a growing array of embedded validity indicators (EVIs) have also been developed to help clinicians determine the credibility of a given response set (Arnold et al., 2005; Erdodi, Sagar, et al., 2017; Greiffenstein, Baker, & Gola, 1994; Larrabee, 2003). The Forced Choice Recognition (FCR) task of the California Verbal Learning Test—Second Edition (CVLT-II; Delis et al., 2000) falls somewhere in between these two categories of validity measures. It was introduced as an optional module with the explicit purpose of evaluating test-taking effort, and is administered 10 min after the original recall and recognition trials are completed. These features are consistent with a free-standing PVT. However, FCR is dependent on the prior administration of the rest of the CVLT-II, which makes it an EVI. The technical manual references a study by Connor, Drake, Bondi and Delis (1997) on an early experimental version of FCR administered in conjunction with the original CVLT. On this instrument, a cutoff of ≤13 produced impressive classification accuracy (.80 sensitivity at .97 specificity) separating credible and simulated memory deficits. Although the authors refrained from endorsing this or any other cutoff, they reported that over 90% of the CVLT-II normative sample obtained a perfect score on FCR (16/16), with ≤1% scoring ≤14. Nobody scored ≤13. They suggested that its pronounced ceiling effect in neurologically healthy individuals makes FCR a viable instrument for detecting non-credible responding in unsophisticated examinees who blatantly exaggerate their memory problems. 
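Sensitivity and specificity figures such as those quoted above can also be expressed as likelihood ratios, which summarize how strongly a failing or passing score shifts the odds of invalid performance. The following is a minimal sketch using the standard definitions (+LR = sensitivity / (1 − specificity); −LR = (1 − sensitivity) / specificity). The function name `likelihood_ratios` is illustrative only; the input values are the .80 sensitivity and .97 specificity reported for the experimental ≤13 cutoff.

```python
from typing import Tuple


def likelihood_ratios(sensitivity: float, specificity: float) -> Tuple[float, float]:
    """Return (+LR, -LR) for a cutoff with the given sensitivity and specificity."""
    if not 0.0 <= sensitivity <= 1.0:
        raise ValueError("sensitivity must be between 0 and 1")
    if not 0.0 < specificity < 1.0:
        # specificity of exactly 0 or 1 would make one of the ratios undefined
        raise ValueError("specificity must be strictly between 0 and 1")
    positive_lr = sensitivity / (1.0 - specificity)
    negative_lr = (1.0 - sensitivity) / specificity
    return positive_lr, negative_lr


# Values reported for the experimental FCR cutoff of <=13 (Connor et al., 1997)
plr, nlr = likelihood_ratios(0.80, 0.97)
print(f"+LR = {plr:.1f}, -LR = {nlr:.2f}")  # prints: +LR = 26.7, -LR = 0.21
```

Under these definitions, a failing score at that cutoff would be roughly 27 times more likely in an invalid than in a valid response set, whereas a passing score would reduce the odds of invalid performance to about one fifth of their prior value.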
Early studies on FCR in clinical samples provided indirect support for this claim. Baldo, Delis, Kramer and Shimamura (2002) reported that all of the 11 patients with neuroradiologically confirmed focal frontal lesions and notable impairment in acquisition, recall and recognition on the CVLT-II obtained perfect scores on FCR. Demonstrating that performance on FCR is unrelated to brain lesions or credibly poor performance on the rest of the CVLT-II was an essential first step in gaining acceptance as a validity indicator. The other requirement for validation was examining whether FCR can correctly identify individuals who fail other established PVTs. Moore and Donders (2004) conducted the first large scale concordance rate analysis against the TOMM in 132 clinically referred adults with TBI. Most of the sample was male (62.1%) and classified as mild TBI (54.5%). Mean age was 35.8 (SD = 14.2) and mean level of education was 12.3 (SD = 2.6). The majority (81.1%) obtained a perfect score on FCR; 6.8% scored 15 and 11.4% scored ≤14. The authors turned the base rate argument advanced by the test developers into an explicit diagnostic threshold, defining failure on FCR as ≤14. The reference PVT was TOMM Trial 2 ≤44 resulting in an 8.3% base rate of failure (BRFail). FCR ≤ 14 produced a respectable combination of sensitivity (.55) and specificity (.93) against this criterion. No alternative cutoff was considered. The authors expressed concerns that both the TOMM and FCR might be too transparent as PVTs and thus, highly specific, but not very sensitive to invalid responding. Bauer, Yantz, Ryan, Warden and McCaffrey (2005) examined the relationship between FCR and the WMT in a military sample of 120 patients with TBI. Mean age was 28.4 (SD = 9.2) and mean level of education was 13.4 (SD = 2.3). The BRFail, defined by the WMT at standard cutoffs, was 24.2%. 
Although the authors did not provide enough detail to compute classification accuracy, the mean FCR value in the “invalid” group (14.9) was significantly lower than in the “valid” group (15.7). Also, there was a positive linear relationship between WMT performance as a continuous variable (average of the IR, DR and CNS subtests) and FCR: those with a mean WMT score ≥91% produced a mean FCR of 15.8, while those with a mean WMT score of 61–70% produced a mean FCR of 14.2. The authors concluded that while FCR was effective at discriminating those who passed and those who failed the WMT, the mean difference was smaller (0.8) than what was observed on Yes/No recognition hits raw scores (2.0). They attributed this to the inherently lower cognitive demands of the FCR paradigm compared to the Yes/No recognition trial, which has a 3:1 distractor-to-target ratio. They also emphasized that FCR has high specificity, but low sensitivity to invalid performance. Root, Robbins, Chang and van Gorp (2006) investigated the relationship among FCR scores, memory impairment and performance validity across three groups: a mixed clinical sample (n = 25), a forensic sample with adequate effort (n = 27) and a forensic sample with inadequate effort (n = 25). Performance validity was operationalized as passing or failing the TOMM and/or the VIP among forensic patients, resulting in an overall BRFail of 48%. Performance validity was not formally assessed in the clinically referred patients; instead, they were assumed to have valid neuropsychological profiles based on the lack of apparent secondary gain.
Given the emerging evidence that even university students with no incentive to appear impaired fail PVTs (An, Zakzanis, & Joordens, 2012; An, Kaploun, Erdodi, & Abeare, 2017; Ross et al., 2016; Santos, Kazakov, Reamer, Park, & Osmon, 2014), this logic (“lack of apparent motivation to perform poorly = evidence of valid performance”) seems flawed by current methodological standards that emphasize the importance of objective evidence from multiple independent sources in establishing the credibility of a given neurocognitive profile (Boone, 2009, 2013; Larrabee, 2012). Nevertheless, Root et al. (2006) found that FCR scores were unrelated to delayed free recall performance. An FCR cutoff of ≤15 produced .60 sensitivity at .81 specificity. Lowering the cutoff to ≤14 resulted in improved specificity (.93), but decreased sensitivity (.44). The authors endorsed FCR as a “brief screen of effort” given its quick and easy administration and strong positive predictive power. At the same time, they cautioned against using a passing score on FCR as evidence for credible performance. They also acknowledged the modality specificity as a potential confound in establishing the optimal cutoff on FCR: the TOMM is a visually based PVT, while the VIP is non-memory based. As such, they may not be ideal reference PVTs to cross-validate FCR. Once FCR’s ability to separate valid and invalid response sets had been established, later studies used it as an EVI. Some of these reports provide indirect evidence that further consolidates the evidence base supporting its diagnostic utility. For example, the investigation by Donders and Strong (2011) based on 100 clinically referred adults with TBI found that the majority (85%) of the patients obtained a perfect score on FCR, 6% scored 15 and 9% scored ≤14. Although concordance rates were not provided, 24% of the sample failed the WMT. 
The authors noted that FCR and WMT were unrelated to injury severity, while other CVLT-II measures (Trials 1–5, LD-FR, d’, Total Recall Discriminability) covaried with duration of coma. Another method for assessing FCR’s ability to differentiate invalid responding from credible impairment is to examine its distribution in clinical populations with severe, credible neurological deficits. Extremely low intellectual functioning (FSIQ < 70) and dementia are two conditions that have been identified as being at risk for high false positive rates on PVTs (Boone, 2009, 2013). Marshall and Happe (2007) examined the BRFail in several PVTs in a sample of 100 adults with intellectual disability (52% male; mean age = 36.6; mean FSIQ = 63). Mean FCR score was 15.1 (SD = 1.9). A frequency distribution for a subset of 71 participants for which FCR data were available revealed that the majority (66.2%) obtained a perfect score, 18.3% of the sample scored 15 and 15.5% scored ≤14. Clark and colleagues (2012) demonstrated that FCR performance is a useful clinical marker of anterograde amnesia in later stages of Alzheimer’s disease (AD). As such, in conjunction with other CVLT-II measures, it can aid the subtyping of late life memory disorders and track disease progression in individuals diagnosed with a neurodegenerative disorder. Mean FCR was 13.9 in the Alzheimer’s sample (n = 37), 15.8 in the amnestic MCI sample (n = 18), 15.7 in the non-amnestic MCI sample (n = 19) and a near-perfect score in the control sample (n = 35). Research on FCR appears to converge in a few areas. First, the exceptionally good signal detection profile of the ≤13 cutoff in the original experimental version has not been replicated. Second, the ≤14 cutoff performed well against established PVTs, with classification accuracy hovering around the “Larrabee limit”: .50 sensitivity at .90 specificity (Lichtenstein, Erdodi, & Linnea, 2017).
Third, no alternative cutoff has been systematically evaluated, despite accumulating evidence that the vast majority of credible individuals produce perfect scores on FCR, making ≤15 a logical candidate for a more liberal cutoff. A recent systematic review of the literature on the FCR’s classification accuracy found that the ≤14 cutoff tended to sacrifice sensitivity for specificity, and identified investigating the more liberal alternative cutoff (≤15) as a direction for future research (Schwartz et al., 2016). The implication of these findings is that a score of 15 on FCR is more likely to be a Fail than a Pass. Even if it might not constitute strong enough evidence to render the whole profile invalid, FCR = 15 should be considered at least a red flag in the evaluation of performance validity (D. Delis, personal communication, 10 May 2012). In fact, some researchers started treating an FCR score of 15 as the first level of invalid performance (Erdodi, Kirsch, Lajiness-O’Neill, Vingilis & Medoff, 2014; Erdodi et al., 2016). Similarly, the authors of the newly introduced FCR trial to the child version of CVLT suggested that even one incorrect response is indicative of suboptimal effort (Lichtenstein et al., 2017). The present study was designed to investigate two psychometric issues involving FCR. First, we proposed to compare the de facto FCR cutoff of ≤14 to its more liberal alternative (≤15) in a sample of clinically referred adults with TBI. We hypothesized that raising the cutoff to ≤15 would improve the sensitivity of FCR, while maintaining acceptable specificity, as reported in the child version of CVLT. Second, based on earlier reports that active psychiatric conditions increase the BRFail on PVTs (Moore & Donders, 2004), we also hypothesized that performance on FCR would be related to self-ratings of emotional distress. 
Although previous research suggests that failing one type of validity indicator is predictive of failing other types of validity measures (Constantinou, Bauer, Ashendorf, Fisher, & McCaffrey, 2005), most clinicians seem to agree that the credibility of symptom report and performance on cognitive tests are conceptually distinct and hence, should be assessed separately (van Dyke, Millis, Axelrod, & Hanks, 2013). Overall, the link between performance validity and psychiatric functioning remains controversial. Some investigators found that PVT failure was unrelated to depression (Considine et al., 2011; Egeland et al., 2005; Rees, Tombaugh, & Boulay, 2001), while others reported an increase in the BRFail in patients with psychiatric disorders (Erdodi et al., 2016). Recent research suggests that while PVT failure is unrelated to self-reported depression and anxiety, it has a strong relationship with somatic symptoms (Erdodi, Sagar et al., 2017).

Method

Participants

The sample consisted of 104 outpatients medically referred for neuropsychological assessment after TBI. The majority were male (55.8%) and right-handed (90.4%). Mean age was 38.8 years (SD = 16.7) and mean level of education was 13.7 years (SD = 2.6). Mean WAIS-IV FSIQ was 92.6 (SD = 15.9). Using data on injury severity indices gathered through clinical interview and the review of medical records (presence/length of loss of consciousness, neuroradiological findings, peri-traumatic amnesia, GCS score), 75.0% were classified as mild (mTBI). The rest were classified as moderate-to-severe. All patients were in the post-acute stage of recovery (>3 months post mTBI and >1 year post moderate-to-severe TBI). As the assessments were conducted for clinical purposes, no data were available on litigation status.
Materials

A fixed battery of commonly used, standardized measures of general intelligence, learning, memory, attention, executive functions, language, visuoperceptual and motor skills was administered to all patients (Table 1). Emotional functioning was assessed with self-report inventories. Performance validity was evaluated using a combination of stand-alone and embedded PVTs. As a free-standing PVT based on multiple trials separated by time-delay, the WMT represented the traditional approach to performance validity research.

Table 1. List of Tests Administered: Abbreviations, Scales and Norms

Test Name | Abbreviation | Norms
Beck Depression Inventory, 2nd edition | BDI-II | —
Booklet Category Test | BCT | Heaton
California Verbal Learning Test, 2nd edition | CVLT-II | Manual
Conners’ Continuous Performance Test, 2nd edition [a] | CPT-II | Manual
Letter and Category Fluency Test | FAS & Animals | Heaton
Finger Tapping Test | FTT | Heaton
Green’s Word Memory Test [a] | WMT | Manual
Peabody Picture Vocabulary Test, 4th edition | PPVT-4 | Manual
Symptom Checklist-90-Revised [a] | SCL-90-R | Manual
Tactual Performance Test | TPT | Heaton
Trail Making Test | TMT | Heaton
Wechsler Adult Intelligence Scale, 4th edition | WAIS-IV | Manual
Wechsler Memory Scale, 4th edition | WMS-IV | Manual
Wide Range Achievement Test, 4th edition | WRAT-4 | Manual
Wisconsin Card Sorting Test | WCST | Manual
Word Choice Test | WCT | Manual

Note: Heaton: Demographically adjusted norms published by Heaton, Miller, Taylor, & Grant (2004); Manual: Normative data published in the technical manual. [a] Administered and scored on a computer.

To address concerns about instrumentation artifacts as a threat to the internal validity of signal detection analyses (Root et al., 2006), we developed two composites using five independent validity measures to complement the WMT. The first one consists of PVTs based on recognition memory, labeled “Erdodi Index Five—Recognition” (EI-5REC). The second consists of EVIs that are not based on recognition memory, labeled “Erdodi Index Five—Non-Recognition” (EI-5NR). Each component of the EI-5s was recoded into a four-point scale: the performance reflecting an incontrovertible Pass was assigned a value of zero, while the most extreme level of failure was assigned the value of three, with intermediate levels of failure coded as one and two, following the methodology described by Erdodi (2017). Table 2 provides the details of the re-scaling process and references to the cutoffs used.

Table 2.
The Components of the EI-5s and Base Rates of Failure at Given Cutoffs

EI-5REC Components | EI-5 = 0 | EI-5 = 1 | EI-5 = 2 | EI-5 = 3
WMT | 0 | 1 | 2 | 3
Base Rate | 60.6 | 4.8 | 11.5 | 23.1
WCT | >47 | 44–47 | 40–43 | ≤39
Base Rate | 74.0 | 10.6 | 8.7 | 6.7
VR Recognition | >4 | 4 | 3 | ≤2
Base Rate | 68.3 | 16.3 | 8.7 | 6.7
LM Recognition | >20 | 20–21 | 18–19 | ≤17
Base Rate | 80.8 | 13.5 | 1.0 | 4.8
VPA Recognition | >35 | 32–35 | 28–31 | ≤27
Base Rate | 71.2 | 16.3 | 6.7 | 5.8

EI-5NR Components | EI-5 = 0 | EI-5 = 1 | EI-5 = 2 | EI-5 = 3
RDS | ≥8 | 7 | 6 | ≤5
Base Rate | 76.0 | 13.5 | 4.8 | 5.8
WCST FMS | ≤1 | 2 | 3 | ≥4
Base Rate | 87.5 | 7.7 | 1.9 | 2.9
FTT | None | One | Both | —
Base Rate | 85.6 | 8.7 | 5.8 | —
Animals | >13 | 12–13 | 10–11 | ≤9
Base Rate | 86.5 | 6.7 | 2.9 | 3.8
CPT-II OMI | ≤65 | 66–80 | 81–100 | >100
Base Rate | 74.0 | 4.8 | 6.7 | 14.4

Note: WMT (IR, DR & CNS): Word Memory Test, number of failures on Immediate Recall, Delayed Recall & Consistency trials at standard cutoffs; WCT: Word Choice Test (Pearson, 2009); VR: WMS-IV Visual Reproduction (Pearson, 2009); LM: WMS-IV Logical Memory (Bortnik et al., 2010; Pearson, 2009); VPA: WMS-IV Verbal Paired Associates (Pearson, 2009); RDS: Reliable Digit Span (Greiffenstein et al., 1994; Pearson, 2009); WCST FMS: Wisconsin Card Sorting Test Failures to Maintain Set (Greve, Bianchini, Mathias, Houston & Crouch, 2002; Larrabee, 2003; Suhr & Boyer, 1999); FTT: Finger Tapping Test, number of cutoffs failed of dominant hand raw score ≤28/35 and combined raw scores ≤58/66 (Arnold et al., 2005); Animals: Animal fluency raw score (Sugarman & Axelrod, 2015); CPT-II OMI: Conners’ Continuous Performance Test, 2nd edition Omissions T-scores (Erdodi et al., 2014; Lange et al., 2013; Ord, Boettcher, Greve, & Bianchini, 2010). The Base Rate rows give the percent of the sample that scored within a given range of cutoffs.

In addition to aggregating multiple independent validity indicators and thus, increasing the overall diagnostic power of the measurement model (Larrabee, 2003), an essential feature of the EI-5s is that they recapture the underlying continuity in performance validity, distinguishing between near-passes and clear failures. An EI-5 score provides a summary of both the “number” and “extent” of PVT failures. Since the practical demands of validity assessment require a dichotomous outcome, the first two levels (EI-5 values 0–1) were considered a Pass, and values of ≥4 were considered a Fail. EI-5 values 2–3 were considered borderline (Table 3), and excluded from further analyses involving the EI-5 to ensure the purity of the criterion groups (Pass/Fail), a methodological standard in calibrating new PVTs (Erdodi, Tyson, Abeare et al., 2017; Greve & Bianchini, 2004; Sugarman & Axelrod, 2015).

Table 3.
Frequency, Cumulative Frequency and Classification Range for the First Eight Levels of the EI-5s

EI-5 | EI-5REC f | EI-5REC Cumulative % | EI-5NR f | EI-5NR Cumulative % | Classification
0 | 42 | 40.4 | 49 | 47.1 | PASS
1 | 12 | 51.9 | 15 | 61.5 | Pass
2 | 12 | 63.5 | 10 | 71.2 | Borderline
3 | 6 | 69.2 | 16 | 86.5 | Borderline
4 | 2 | 71.2 | 4 | 90.4 | Fail
5 | 7 | 77.9 | 2 | 92.3 | Fail
6 | 6 | 83.7 | 2 | 94.2 | FAIL
7 | 7 | 90.4 | 1 | 95.2 | FAIL
8 | 1 | 91.3 | 3 | 98.1 | FAIL

Note: EI-5REC: Erdodi Index Five—Recognition memory based; EI-5NR: Erdodi Index Five—Non-recognition memory based.

To complement the WMT and the EI-5s, several other validity measures were used as reference PVTs to provide a more representative sample of sensory modalities, testing paradigms and sensitivity to invalid responding. Including a variety of independent PVTs is essential to keep multicollinearity at a minimum and thus, maximize the predictive power of the multivariate model of performance validity assessment (Boone, 2013; Larrabee, 2012).
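The EI-5 re-scaling and classification rules described above (Tables 2 and 3) can be illustrated with a short sketch. It assumes that all five components are available for an examinee, since handling of missing components is not described here. The function names and the example profile are hypothetical; only the Reliable Digit Span bands and the Pass/Borderline/Fail ranges are taken from the tables.

```python
from typing import Dict

# Each EI-5 component is recoded to 0 (clear Pass) through 3 (most extreme failure).
VALID_LEVELS = range(0, 4)


def recode_rds(rds: int) -> int:
    """Map a Reliable Digit Span score to an EI-5 value using the Table 2 bands."""
    if rds >= 8:
        return 0
    if rds == 7:
        return 1
    if rds == 6:
        return 2
    return 3  # RDS <= 5


def ei5_total(components: Dict[str, int]) -> int:
    """Sum five recoded components into a single EI-5 score."""
    if len(components) != 5:
        raise ValueError("an EI-5 composite needs exactly five components")
    for name, value in components.items():
        if value not in VALID_LEVELS:
            raise ValueError(f"{name} must be recoded to 0-3, got {value}")
    return sum(components.values())


def classify_ei5(total: int) -> str:
    """Apply the classification ranges: 0-1 Pass, 2-3 Borderline, >=4 Fail."""
    if total <= 1:
        return "Pass"
    if total <= 3:
        return "Borderline"
    return "Fail"


# Hypothetical EI-5NR profile (not a case from the study), already recoded
# except for RDS, which is recoded from a raw score of 6.
profile = {
    "RDS": recode_rds(6),
    "WCST FMS": 1,
    "FTT": 0,
    "Animals": 1,
    "CPT-II OMI": 0,
}
total = ei5_total(profile)
print(total, classify_ei5(total))  # prints: 4 Fail
```

In this illustration, three marginal failures add up to a Fail even though no single component reaches the most extreme level, which is how the composite captures both the number and the extent of PVT failures.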
The Word Choice Test (WCT) is a single-trial free-standing PVT based on the FCR paradigm (Pearson, 2009). Number of hits on the Yes/No recognition trial of the CVLT-II (RHCVLT-II) was selected because it is nested within the same test as FCR and there are previous comparisons between the two tasks. The logistic regression equation developed by Wolfe and colleagues (2010; LREWolfe) was the alternative CVLT-II-based reference PVT. Given reports of high false positive rates associated with the original cutoff (≥.50), the more conservative alternative (≥.625) was used in cross-validation analyses (Donders & Strong, 2011). The WAIS-IV Digit Span age-corrected scaled score (DSACSS) is a measure of auditory attention and working memory that has been shown to be effective at detecting invalid responding (Axelrod, Fichtenberg, Millis & Wertheimer, 2006; Reese, Suhr, & Riffle, 2012; Spencer et al., 2013). Self-reported emotional functioning was assessed using the Beck Depression Inventory—Second Edition (BDI-II) and the Symptom Checklist-90-Revised (SCL-90-R). The BDI-II was designed to measure depressive symptoms consistent with the DSM-IV (APA, 1996) diagnostic criteria (Beck, Steer, & Brown, 1996). Its brevity (21 items rated on a 4-point ordinal scale [0–3]) combined with excellent psychometric properties and discriminant validity in both healthy controls and psychiatric patients makes the BDI-II a popular screening tool for depression (Sprinkle et al., 2002; Storch, Roberti, & Roth, 2004). The SCL-90-R is a widely used screening tool for a broad range of psychiatric symptoms in clinical populations of varied etiology (Derogatis, 1994) and in patients with TBI specifically (Hoofien, Barak, Vakil, & Gilboa, 2005).
As the name indicates, it contains 90 items self-rated on a 5-point ordinal scale [0–4] that converge into nine scales: somatization (SOM), obsessive-compulsive symptoms (O-C), interpersonal sensitivity (I-S), depression (DEP), anxiety (ANX), hostility (HOS), phobic anxiety (PHO), paranoid ideation (PAR) and psychotic symptoms (PSY). In addition, a Global Severity Index (GSI) is computed to reflect the mean of all items. The GSI has been found to be the most sensitive of the SCL-90-R indicators to disruptions in emotional and social functioning post TBI (Baker, Schmidt, Heinemann, Langley & Miranti, 1998; Marschark, Richtsmeier, Richardson, Crovitz, & Henry, 2000; Westcott & Alfano, 2005). Clinical elevations (T ≥ 63; Derogatis, 1994) were also commonly observed on the O-C, I-S, DEP and PHO scales (Baker et al., 1998; Marschark et al., 2000; Palav, Ortega, & McCaffrey, 2001; Westcott & Alfano, 2005).

Procedure

Participants were assessed in two half-day appointments through the neurorehabilitation service of a Midwestern academic medical center. Psychometric testing was completed in an outpatient setting by trained psychometricians. A staff neuropsychologist conducted the clinical interview and review of medical records, wrote the integrative report and provided feedback to patients. Data were collected through an archival retrospective chart review of a consecutive series of TBI referrals. The study was approved by the Institutional Review Board. Ethical guidelines regulating research with human participants were followed throughout the project.

Data Analysis

Descriptive statistics (frequency, percentage and cumulative percentage; mean, standard deviation) were computed for the key variables. Significance testing was performed using the F- and t-tests as well as χ². ANOVAs were followed up with uncorrected t-tests. Since all post hoc contrasts were a priori planned comparisons, no statistical correction was applied (Rothman, 1990; Perneger, 1998).
In addition, the tension between statistical and clinical significance was resolved by consistently reporting effect size estimates associated with each relevant contrast: partial eta squared (ηp²), Cohen’s d and Φ². Receiver operating characteristics (ROC) analyses [area under the curve (AUC) with 95% CI] were performed using SPSS 22.0. The rest of the classification accuracy parameters [sensitivity, specificity, positive and negative likelihood ratio (+LR and −LR)] were computed using standard formulas.

Results

Mean scores on tests of cognitive ability ranged from Low Average to Average (Table 4). The mean FCR score in the sample was 15.4 (SD = 1.4; range: 9–16). The median value was 16. The distribution was negatively skewed (−2.48) and had a strong positive kurtosis (+6.47). The majority of the sample (75.0%) obtained a perfect score on FCR; 6.7% scored 15 and 18.3% scored ≤14.

Table 4. Group-Level Performance on the Tests Administered

| Test Name | Measure | M | SD | Descriptive Range |
|---|---|---|---|---|
| Animals | T-score | 42.6 | 12.0 | Low Average |
| BDI-II | Total Raw Score | 15.3 | 11.5 | Mild Depression |
| BCT | Total Errors T-score | 41.7 | 13.8 | Low Average |
| CVLT-II | Trials 1–5 T-score | 45.1 | 14.5 | Average |
| | LD-FR z-score | −1.03 | 1.54 | Low Average |
| CPT-II | Omissions T-score | 73.4 | 61.5 | Elevated |
| | Commissions T-score | 52.2 | 11.2 | Within Normal Limits |
| | Hit RT T-score | 53.7 | 14.6 | Within Normal Limits |
| FTT | Dominant Hand T-score | 45.4 | 12.5 | Average |
| WMT | % Fail | 38.5 | N/A | |
| PPVT-4 | Standard Score | 98.1 | 13.8 | Average |
| SCL-90-R | GSI T-score | 62.5 | 12.7 | Within Normal Limits |
| TPT | Total Time T-score | 45.0 | 13.7 | Average |
| TMT | Trails A T-score | 43.0 | 13.5 | Low Average |
| | Trails B T-score | 43.3 | 13.9 | Low Average |
| WAIS-IV | VCI Standard Score | 95.1 | 15.3 | Average |
| | PRI Standard Score | 96.2 | 16.4 | Average |
| | WMI Standard Score | 92.7 | 15.6 | Average |
| | PSI Standard Score | 89.4 | 16.4 | Low Average |
| WMS-IV | LM I Age-Corrected Scaled Score | 8.1 | 3.5 | Low Average |
| | LM II Age-Corrected Scaled Score | 7.6 | 3.4 | Low Average |
| | VPA I Age-Corrected Scaled Score | 8.5 | 3.5 | Low Average |
| | VPA II Age-Corrected Scaled Score | 8.6 | 3.7 | Average |
| | VR I Age-Corrected Scaled Score | 8.6 | 3.7 | Average |
| | VR II Age-Corrected Scaled Score | 7.9 | 3.1 | Low Average |
| WRAT-4 | Word Reading Standard Score | 93.9 | 12.5 | Average |
| WCST | Perseverative Errors T-score | 46.9 | 11.1 | Average |
| WCT | Total Accuracy Raw Score | 47.6 | 3.7 | Pass |

Note: LD-FR: Long-delay free recall; RT: Reaction Time; GSI: Global Severity Index; VCI: Verbal Comprehension Index; PRI: Perceptual Reasoning Index; WMI: Working Memory Index; PSI: Processing Speed Index; LM: Logical Memory; I: Immediate Recall; II: Delayed Recall; VPA: Verbal Paired Associates; VR: Visual Reproduction.

The Effect of Age, Education, Cognitive Functioning and Injury Severity on FCR

As the study focused on comparing the discriminant power of two cutoffs (FCR ≤14 and ≤15) against a perfect score, the sample was divided into three groups: FCR = 16, FCR = 15 and FCR ≤14. This trichotomy was used as the independent variable (IV) for a series of ANOVAs with age, education and cognitive functioning as dependent variables (DVs). There was no difference in age among groups. However, there was a significant overall effect on level of education, driven by the higher mean of the FCR = 16 subsample. A medium effect was observed on word knowledge, picture vocabulary and single word reading performance. ANOVAs were not significant for BCT (Total Errors), TPT (Total Time) and TMT-B T-scores (Table 5).

Table 5.
Age, Education, Injury Severity and Performance on Select Neuropsychological Tests as a Function of Trichotomized FCR Scores

| FCR | Statistic | Age | ED | VC | PPVT | WRAT Reading | BCT Total | TPT Total | TMT B | % mTBI | +NR |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 16 | M | 37.5 | 14.0 | 9.8 | 100.1 | 95.7 | 42.6 | 46.1 | 44.8 | 71.8 | 67.7 |
| | SD | 15.6 | 2.8 | 3.0 | 14.1 | 12.0 | 13.9 | 14.1 | 13.3 | | |
| 15 | M | 47.0 | 12.0 | 9.0 | 96.4 | 91.4 | 43.2 | 44.7 | 41.1 | 71.4 | 66.7 |
| | SD | 12.1 | 1.3 | 3.1 | 14.1 | 13.8 | 13.2 | 8.2 | 7.9 | | |
| ≤14 | M | 41.4 | 12.8 | 7.9 | 90.1 | 87.5 | 38.1 | 39.2 | 37.7 | 89.5 | 71.4 |
| | SD | 10.2 | 1.9 | 2.2 | 9.4 | 12.4 | 13.6 | 12.2 | 17.0 | | |
| | p | .18 | .05 | <.05 | <.05 | <.05 | .43 | .28 | .12 | .27 | .96 |
| | η² | .03 | .06 | .06 | .08 | .07 | .02 | .03 | .04 | | |
| | Φ² | | | | | | | | | .03 | .00 |
| | Sig. post hocs | None | 0–1 | 0–2 | 0–2 | 0–2 | None | None | 0–2 | — | — |
| | d | — | .92 | .72 | .83 | .67 | — | — | .47 | — | — |

Note. FCR: CVLT-II Forced Choice Recognition trial raw score; ED: years of formal education; VC: WAIS-IV Vocabulary age-corrected scaled score; PPVT: Peabody Picture Vocabulary Test, 4th edition; WRAT-4 Reading: Wide Range Achievement Test, 4th edition, Reading subtest standard score; BCT: Booklet Category Test Total Errors T-score; TPT: Tactual Performance Test Total Time T-score; TMT-B: Trail Making Test B T-score; % mTBI: % patients with mild traumatic brain injury (vs. those with moderate-to-severe TBI); +NR: positive neuroradiological findings; Sig. post hocs: pairwise comparisons with p < .05; 0: FCR = 16 (n = 78); 1: FCR = 15 (n = 7); 2: FCR ≤14 (n = 19).
Likewise, the three groups did not differ in TBI severity (percentage of mTBI patients and those with positive neuroradiological findings). In addition, the mTBI subsample was almost three times more likely to fail the old FCR cutoff (≤14; BRFail = 21.8%) than the subsample with moderate-to-severe TBI (BRFail = 7.7%). Similarly, patients with mTBI were almost twice as likely to fail the alternative FCR cutoff (≤15; BRFail = 28.2%) as the subsample with moderate-to-severe TBI (BRFail = 15.4%).

The Classification Accuracy of FCR Against Reference PVTs

All ROC models evaluating the level of agreement between FCR and reference PVTs were statistically significant (p < .01). AUC values ranged from .71 (DSACSS) to .83 (RHCVLT-II). The most stable AUC estimates were obtained against the WMT (95% CI: .65–.85), while the least stable estimates were observed on EI-5NR (95% CI: .61–.93).

ROC analyses were followed up with direct comparisons between the classification accuracy of the old FCR cutoff (≤14) and the proposed alternative (≤15) against the reference PVTs. All cross-validation analyses met the minimum standard of specificity (.84; Larrabee, 2003), with values ranging from .85 to .98. Sensitivity was more variable, fluctuating between .40 and .72. The BRFail in reference PVTs ranged from 10.6% (TMT-A) to 38.5% (WMT). FCR ≤14 produced a sensitivity of .40 against the WMT, at .95 specificity. The switch to ≤15 increased sensitivity to .47, while preserving the same specificity. Classification accuracy was comparable between the two cutoffs against WCT (.48–.50 sensitivity at .93 specificity). The new cutoff outperformed the old one against EI-5REC in sensitivity (.52 vs. .44) while both maintained very high specificity (.98). A similar pattern of increased sensitivity (from .50 to .58) and steady specificity (.91 and .90) was observed against EI-5NR as the analyses shifted from the old to the new cutoff.
Sensitivity spiked against RHCVLT-II with both cutoffs (.65/.72) in the backdrop of good specificity (.93/.92). Again, the new cutoff outperformed the old one against DSACSS in sensitivity (.45/.53) while producing the same specificity (.89). Overall, the new cutoff increased sensitivity from .50 to .56 compared to the old one, while preserving the same specificity (.92). This pattern of consistently higher sensitivity and essentially unchanged specificity associated with the new cutoff was also observed at the level of LRs (Table 6). With the exception of WCT, FCR ≤15 produced higher +LRs than FCR ≤14 against the reference PVTs. The new cutoff had consistently lower −LRs against all the reference PVTs than the old cutoff, suggesting superior classification accuracy.

Table 6. A Direct Comparison between the Classification Accuracy of the Two FCR Cutoffs against Reference PVTs

| FCR cutoff (BRFail) | Statistic | WMT | WCT | EI-5REC | EI-5NR | RHCVLT-II | LREWolfe | DSACSS |
|---|---|---|---|---|---|---|---|---|
| | Cutoff | Standard | ≤47 | ≥4 | ≥4 | ≤10 | ≥.625 | ≤6 |
| | BRFail | 38.5 | 27.0 | 37.2 | 17.9 | 19.2 | 18.0 | 21.2 |
| | AUC | .75 | .72 | .78 | .77 | .83 | .74 | .71 |
| | p | <.001 | <.01 | <.001 | <.001 | <.001 | <.01 | <.01 |
| | 95% CI | .65–.85 | .59–.85 | .67–.90 | .61–.93 | .71–.95 | .60–.89 | .59–.85 |
| ≤15 (25.0) | SENS | .47 | .50 | .52 | .58 | .72 | .59 | .53 |
| | SPEC | .95 | .93 | .98 | .90 | .92 | .88 | .89 |
| | +LR | 9.88 | 6.70 | 27.5 | 6.13 | 9.51 | 5.03 | 4.56 |
| | −LR | 0.56 | 0.54 | 0.49 | 0.46 | 0.30 | 0.47 | 0.54 |
| ≤14 (18.3) | SENS | .40 | .48 | .44 | .50 | .65 | .56 | .45 |
| | SPEC | .95 | .93 | .98 | .91 | .93 | .89 | .89 |
| | +LR | 8.53 | 7.03 | 23.6 | 5.33 | 9.10 | 5.06 | 4.14 |
| | −LR | 0.63 | 0.57 | 0.57 | 0.55 | 0.38 | 0.50 | 0.61 |

Note: WMT: Word Memory Test (Green, 2003); WCT: Word Choice Test (Pearson, 2009); EI-5REC: Erdodi Index Five—Recognition based; EI-5NR: Erdodi Index Five—Non-recognition based; RHCVLT-II: CVLT-II Yes/No Recognition hits raw score (Wolfe et al., 2010); LREWolfe: Logistic regression equation based on a combination of three CVLT-II scores: long-delay free recall raw score, total recall discriminability z-score and d’ raw score (Donders & Strong, 2011; Wolfe et al., 2010); DSACSS: Digit Span age-corrected scaled score (Axelrod et al., 2006; Spencer et al., 2013); BRFail: Base rate of failure (%); AUC: Area under the curve; FCR: CVLT-II forced choice recognition; SENS: Sensitivity; SPEC: Specificity; +LR: Positive likelihood ratio; −LR: Negative likelihood ratio; Number of participants with FCR ≤ 15 is 26; Number of participants with FCR ≤ 14 is 19. Values in parentheses after each FCR cutoff represent base rates of failure (%).

The Relationship Between FCR and Emotional Functioning

The majority of the sample (54.1%) scored in the non-clinical range on the SCL-90-R using a GSI T-score ≥63 as the cutoff. However, only 38.5% had fewer than two elevations (T ≥63) on the nine clinical scales, the other criterion for establishing the presence of clinically significant distress (Derogatis, 1994). The number of clinical elevations (M = 3.6, SD = 3.3) produced a bimodal distribution with two distinct clusters: patients with either zero (25.0%) or nine (14.6%) scores ≥63. ANOVAs using the trichotomized FCR (16, 15 and ≤14) as IV and the SCL-90-R scales as DVs produced significant main effects for all SCL-90-R scales except ANX and PHO. Effect sizes (ηp²) ranged from .08 (medium) on HOS to .18 (large) on PSY.
All post hoc contrasts were significant between FCR = 16 and FCR = 15 subsamples except ANX and PHO. Effect sizes (d) ranged from .87 (large) on SOM to 1.67 (very large) on O-C. All post hoc contrasts were significant between FCR = 16 and FCR ≤ 14 subsamples except HOS. Effect sizes (d) ranged from .62 (medium) on PHO to .83 (large) on PSY. When SCL-90-R scores were dichotomized around the T ≥ 63 cutoff into “clinical” versus “non-clinical”, non-parametric contrasts produced essentially the same results (Table 7). One comparison (PAR) became non-significant. All effect size estimates (Φ²) were within .02 of ηp² values produced by ANOVAs with the exception of the GSI.

Table 7. SCL-90-R Scores as a Function of FCR Performance

| FCR | Statistic | SOM | O-C | I-S | DEP | ANX | HOS | PHO | PAR | PSY | GSI | Σ 63 | BDI-II |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 16 | M | 58.3 | 63.6 | 54.3 | 60.2 | 55.5 | 54.3 | 55.5 | 53.0 | 56.4 | 59.8 | 2.9 | 12.5 |
| | SD | 12.4 | 11.8 | 13.2 | 11.5 | 11.9 | 11.3 | 11.9 | 12.8 | 11.7 | 12.2 | 3.2 | 10.5 |
| | %CLIN | 34.2 | 47.9 | 29.2 | 39.7 | 31.5 | 23.3 | 31.9 | 21.9 | 35.6 | 32.9 | 49.3 | 19.2 |
| 15 | M | 67.7 | 78.9 | 70.4 | 73.3 | 63.5 | 65.1 | 63.4 | 66.3 | 72.3 | 75.0 | 6.9 | 24.9 |
| | SD | 9.0 | 3.5 | 9.7 | 5.6 | 15.6 | 7.9 | 15.6 | 11.6 | 7.1 | 6.6 | 1.9 | 6.9 |
| | %CLIN | 71.4 | 100 | 85.7 | 100 | 57.1 | 57.1 | 57.1 | 57.1 | 100 | 100 | 100 | 57.1 |
| ≤14 | M | 66.4 | 71.6 | 63.6 | 67.9 | 62.1 | 58.8 | 62.1 | 61.2 | 65.9 | 68.9 | 5.4 | 22.9 |
| | SD | 12.5 | 12.3 | 12.7 | 10.2 | 12.7 | 10.5 | 13.4 | 10.6 | 11.1 | 11.7 | 3.1 | 11.8 |
| | %CLIN | 72.2 | 83.3 | 55.6 | 72.2 | 50.0 | 44.4 | 50.0 | 38.9 | 72.2 | 77.8 | 94.4 | 61.1 |
| | ANOVA p | <.05 | <.01 | <.01 | <.01 | NS | <.05 | NS | <.01 | <.01 | <.01 | <.01 | <.01 |
| | ηp² | .09 | .15 | .14 | .13 | — | .08 | — | .11 | .18 | .15 | .16 | .17 |
| | Sig. post hocs | 0–1 | 0–1 | 0–1 | 0–1 | NS | 0–1 | NS | 0–1 | 0–1 | 0–1 | 0–1 | 0–1 |
| | | 0–2 | 0–2 | 0–2 | 0–2 | 0–2 | NS | 0–2 | 0–2 | 0–2 | 0–2 | 0–2 | 0–2 |
| | d for 0–1 | .87 | 1.67 | 1.39 | 1.45 | — | 1.11 | — | 1.09 | 1.64 | 1.55 | 1.55 | 1.40 |
| | d for 0–2 | .65 | .66 | .74 | .71 | .61 | — | .52 | .63 | .83 | .76 | .79 | .93 |
| | χ² p | <.01 | <.01 | <.01 | <.01 | NS | .05 | NS | NS | <.01 | <.01 | <.01 | <.01 |
| | Φ² | .11 | .13 | .12 | .14 | — | .06 | — | — | .17 | .21 | .18 | .15 |

Note. All SCL-90-R scales are in T-scores (M = 50, SD = 10); FCR: CVLT-II Forced Choice Recognition trial raw score; SCL-90-R: Symptom Checklist-90-Revised; SOM: somatic distress; O-C: obsessive-compulsive symptoms; I-S: interpersonal sensitivity; DEP: depression; ANX: anxiety; HOS: hostility; PHO: phobic anxiety; PAR: paranoia; PSY: psychotic symptoms; GSI: Global Severity Index; Σ 63: Number of T-scores ≥63 on the SCL-90-R clinical scales; BDI-II: Beck Depression Inventory—Second Edition; %CLIN: Percent of the subsample scoring T ≥ 63 on the SCL-90-R clinical scales; percent of the subsample with two or more scores T ≥ 63 on Σ 63; and percent of the subsample with BDI-II raw score ≥20 (cutoff for Moderate Depression); Sig. post hocs: pairwise comparisons with p < .05; 0: FCR = 16 (n = 78); 1: FCR = 15 (n = 7); 2: FCR ≤ 14 (n = 19).

All three groups produced saw-tooth profiles, with distinct spikes on O-C, DEP and PSY (Fig. 1). The FCR = 16 subsample had only one mean ≥63 (on O-C), and on average had 2.9 elevations (SD = 3.2). The FCR = 15 subsample produced mean T ≥63 on all scales, and on average had 6.9 elevations (SD = 1.9). The FCR ≤14 subsample produced mean T ≥63 on SOM, O-C, DEP, PSY and the GSI, and on average had 5.4 elevations (SD = 3.1).

Fig. 1. SCL-90-R profiles associated with three levels of FCR performance; number of participants with perfect score on the FCR is 78; number of participants with FCR = 15 is 7; number of participants with FCR ≤ 14 is 19.
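The Cohen's d values reported for the FCR contrasts can be approximated from the published means and standard deviations. The sketch below assumes that d was computed by dividing the mean difference by the root mean square of the two group SDs (ignoring the unequal group sizes); the article does not state the formula, and the function name `cohens_d` is introduced here for illustration only. Under that assumption, the SOM, PSY, GSI and BDI-II contrasts come out within rounding of the tabled values, although some entries (e.g., O-C for the 16 vs. 15 contrast) do not reproduce as closely from the rounded inputs.

```python
import math


def cohens_d(m1, sd1, m2, sd2):
    """Cohen's d with an unweighted pooled SD (assumed formula, not confirmed by the source)."""
    pooled_sd = math.sqrt((sd1 ** 2 + sd2 ** 2) / 2)
    return (m2 - m1) / pooled_sd


# (mean, SD) per FCR group, copied from the reported SCL-90-R and BDI-II results
scores = {
    "SOM":    {"fcr16": (58.3, 12.4), "fcr15": (67.7, 9.0), "fcr14": (66.4, 12.5)},
    "PSY":    {"fcr16": (56.4, 11.7), "fcr15": (72.3, 7.1), "fcr14": (65.9, 11.1)},
    "GSI":    {"fcr16": (59.8, 12.2), "fcr15": (75.0, 6.6), "fcr14": (68.9, 11.7)},
    "BDI-II": {"fcr16": (12.5, 10.5), "fcr15": (24.9, 6.9), "fcr14": (22.9, 11.8)},
}

for scale, groups in scores.items():
    # Contrast 0-1: FCR = 16 vs. FCR = 15; contrast 0-2: FCR = 16 vs. FCR <= 14
    d_01 = cohens_d(*groups["fcr16"], *groups["fcr15"])
    d_02 = cohens_d(*groups["fcr16"], *groups["fcr14"])
    print(f"{scale}: d(0-1) = {d_01:.2f}, d(0-2) = {d_02:.2f}")
```

Because the FCR = 15 subsample contains only seven patients, a sample-size-weighted pooled SD would give noticeably different estimates, which is one reason the formula matters when comparing these effect sizes across studies.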
ANOVAs were repeated on the BDI-II, producing a large effect (η2p = .17), driven by the non-clinical range score of the FCR = 16 group (M = 12.5, SD = 10.5). FCR = 15 group (M = 24.9, SD = 6.9) did not differ from the FCR ≤14 group (M = 22.9, SD = 11.8). Both of these means were in the range of moderate clinical depression, and significantly higher than FCR = 16 mean (d = .93 and 1.40, large). To investigate whether these findings would generalize to other PVTs, a series of independent t-tests were performed between patients who passed and those who failed the WMT on SCL-90-R and BDI-II scores. All contrasts were significant, with the Fail group reporting higher levels of symptoms. Effect size estimates ranged from .46 (medium) to 1.01 (large). The analyses were repeated using a series of ANOVAs with the EI-5REC as trichotomous independent variable (Pass/Borderline/Fail) and the SCL-90-R and BDI-II scores as dependent variables. All ANOVAs were significant (η2p: .06–.12; medium-large effects) with the exception of the SOM scale (Table 8). The only post hoc contrast that consistently reached significance was between the Pass and Fail groups, with effect sizes ranging from .43 (medium) to .87 (large). Unlike with FCR, there was a linear relationship between level of PVT failure and self-reported emotional distress, with the Pass group reporting the least, the Fail group reporting the most emotional distress, with the Borderline group in the middle (Fig. 2). Table 8. 
SCL-90-R and BDI-II Scores as a Function of Passing or Failing the WMT and the EI-5REC SOM O-C I-S DEP ANX HOS PHO PAR PSY GSI Σ 63 BDI-II WMT  Pass M 57.1 62.0 54.1 60.0 54.2 53.4 53.7 53.2 56.2 58.8 2.6 12.3 SD 12.5 11.8 13.1 11.2 13.5 11.2 10.9 12.6 11.9 12.2 3.0 10.4  Fail M 65.3 73.0 62.4 66.7 62.2 60.0 63.1 59.1 64.2 68.7 5.3 20.3 SD 11.6 10.0 13.4 11.3 13.6 10.3 13.4 12.9 11.7 11.0 3.2 11.7 p <.01 <.01 <.01 <.01 <.01 <.01 <.01 <.05 <.01 <.01 <.01 <.01 d .68 1.01 .63 .60 .59 .61 .77 .46 .68 .85 .87 .72 EI-5REC  Pass M 58.8 62.3 53.8 59.9 53.4 52.2 54.1 52.8 56.1 58.7 2.6 12.3  n = 51 SD 12.7 11.4 13.3 11.5 13.9 11.2 11.1 13.0 12.1 12.7 3.0 10.7  Borderline M 61.1 67.6 56.9 63.8 60.4 58.1 58.6 56.7 59.0 63.7 4.4 17.9  n = 18 SD 11.1 13.3 13.1 10.5 10.4 11.7 12.3 11.9 10.6 11.0 3.2 12.3  Fail M 64.8 72.1 63.3 66.3 62.0 59.3 62.0 59.3 64.9 68.6 5.0 19.0  n = 29 SD 12.7 11.0 13.3 11.8 14.5 10.3 14.2 13.0 12.4 11.6 3.4 11.2 p .06 <.01 <.01 .06 <.05 <.05 <.05 .09 <.01 <.01 <.01 <.05 η2p .06 .13 .09 .06 .09 .06 .08 .05 .10 .12 .12 .08 dP-F .43 .87 .72 .55 .61 .66 .62 .50 .72 .81 .75 .61 SOM O-C I-S DEP ANX HOS PHO PAR PSY GSI Σ 63 BDI-II WMT  Pass M 57.1 62.0 54.1 60.0 54.2 53.4 53.7 53.2 56.2 58.8 2.6 12.3 SD 12.5 11.8 13.1 11.2 13.5 11.2 10.9 12.6 11.9 12.2 3.0 10.4  Fail M 65.3 73.0 62.4 66.7 62.2 60.0 63.1 59.1 64.2 68.7 5.3 20.3 SD 11.6 10.0 13.4 11.3 13.6 10.3 13.4 12.9 11.7 11.0 3.2 11.7 p <.01 <.01 <.01 <.01 <.01 <.01 <.01 <.05 <.01 <.01 <.01 <.01 d .68 1.01 .63 .60 .59 .61 .77 .46 .68 .85 .87 .72 EI-5REC  Pass M 58.8 62.3 53.8 59.9 53.4 52.2 54.1 52.8 56.1 58.7 2.6 12.3  n = 51 SD 12.7 11.4 13.3 11.5 13.9 11.2 11.1 13.0 12.1 12.7 3.0 10.7  Borderline M 61.1 67.6 56.9 63.8 60.4 58.1 58.6 56.7 59.0 63.7 4.4 17.9  n = 18 SD 11.1 13.3 13.1 10.5 10.4 11.7 12.3 11.9 10.6 11.0 3.2 12.3  Fail M 64.8 72.1 63.3 66.3 62.0 59.3 62.0 59.3 64.9 68.6 5.0 19.0  n = 29 SD 12.7 11.0 13.3 11.8 14.5 10.3 14.2 13.0 12.4 11.6 3.4 11.2 p .06 <.01 <.01 .06 <.05 <.05 <.05 .09 <.01 
Table 8. SCL-90-R and BDI-II Scores as a Function of Passing or Failing the WMT and the EI-5REC

| Criterion | Group | Statistic | SOM | O-C | I-S | DEP | ANX | HOS | PHO | PAR | PSY | GSI | Σ 63 | BDI-II |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| WMT | Pass | M | 57.1 | 62.0 | 54.1 | 60.0 | 54.2 | 53.4 | 53.7 | 53.2 | 56.2 | 58.8 | 2.6 | 12.3 |
| | | SD | 12.5 | 11.8 | 13.1 | 11.2 | 13.5 | 11.2 | 10.9 | 12.6 | 11.9 | 12.2 | 3.0 | 10.4 |
| | Fail | M | 65.3 | 73.0 | 62.4 | 66.7 | 62.2 | 60.0 | 63.1 | 59.1 | 64.2 | 68.7 | 5.3 | 20.3 |
| | | SD | 11.6 | 10.0 | 13.4 | 11.3 | 13.6 | 10.3 | 13.4 | 12.9 | 11.7 | 11.0 | 3.2 | 11.7 |
| | | p | <.01 | <.01 | <.01 | <.01 | <.01 | <.01 | <.01 | <.05 | <.01 | <.01 | <.01 | <.01 |
| | | d | .68 | 1.01 | .63 | .60 | .59 | .61 | .77 | .46 | .68 | .85 | .87 | .72 |
| EI-5REC | Pass (n = 51) | M | 58.8 | 62.3 | 53.8 | 59.9 | 53.4 | 52.2 | 54.1 | 52.8 | 56.1 | 58.7 | 2.6 | 12.3 |
| | | SD | 12.7 | 11.4 | 13.3 | 11.5 | 13.9 | 11.2 | 11.1 | 13.0 | 12.1 | 12.7 | 3.0 | 10.7 |
| | Borderline (n = 18) | M | 61.1 | 67.6 | 56.9 | 63.8 | 60.4 | 58.1 | 58.6 | 56.7 | 59.0 | 63.7 | 4.4 | 17.9 |
| | | SD | 11.1 | 13.3 | 13.1 | 10.5 | 10.4 | 11.7 | 12.3 | 11.9 | 10.6 | 11.0 | 3.2 | 12.3 |
| | Fail (n = 29) | M | 64.8 | 72.1 | 63.3 | 66.3 | 62.0 | 59.3 | 62.0 | 59.3 | 64.9 | 68.6 | 5.0 | 19.0 |
| | | SD | 12.7 | 11.0 | 13.3 | 11.8 | 14.5 | 10.3 | 14.2 | 13.0 | 12.4 | 11.6 | 3.4 | 11.2 |
| | | p | .06 | <.01 | <.01 | .06 | <.05 | <.05 | <.05 | .09 | <.01 | <.01 | <.01 | <.05 |
| | | η2p | .06 | .13 | .09 | .06 | .09 | .06 | .08 | .05 | .10 | .12 | .12 | .08 |
| | | dP-F | .43 | .87 | .72 | .55 | .61 | .66 | .62 | .50 | .72 | .81 | .75 | .61 |

Note: All SCL-90-R scales are in T-scores (M = 50, SD = 10); SCL-90-R: Symptom Checklist-90-Revised; SOM: somatic distress; O-C: obsessive-compulsive symptoms; I-S: interpersonal sensitivity; DEP: depression; ANX: anxiety; HOS: hostility; PHO: phobic anxiety; PAR: paranoia; PSY: psychotic symptoms; GSI: Global Severity Index; Σ 63: Sum of T-scores ≥63 on the SCL-90-R clinical scales; BDI-II: Beck Depression Inventory—Second Edition; WMT: Word Memory Test; EI-5REC: Erdodi Index Five—Recognition based; dP-F: Cohen’s d for the Pass vs. Fail post hoc contrast. Italic and bold values represent standard deviations and Cohen’s d, respectively.

Fig. 2. SCL-90-R profiles associated with the three levels of EI-5REC performance; number of participants in the Pass range (0–1) is 51; number of participants in the Borderline range (2–3) is 18; number of participants in the Fail range (≥4) is 29.

Discussion

The present study was designed to compare the de facto FCR cutoff (≤14) to a more liberal alternative (≤15) in a sample of clinically referred patients with TBI. Both cutoffs performed around the “Larrabee limit” (.50 sensitivity at .90 specificity). The hypothesis that increasing the cutoff will improve sensitivity while maintaining specificity was supported by the data. On average, FCR ≤15 correctly classified an additional 6% of the invalid response sets, while maintaining a false positive rate of <10%. Likewise, the alternative cutoff produced comparable or better classification accuracy at the level of likelihood ratios. This pattern of findings was remarkably consistent across a wide range of reference PVTs, including auditory and visual, univariate and multivariate criteria, free-standing and embedded PVTs, indicators based on the FCR paradigm and those derived from tests of attention. The replicable superiority of the new cutoff against a variety of criterion measures addresses previous concerns about modality specificity (Erdodi, Tyson, Abeare et al., 2017; Erdodi, Tyson, Shahein et al., 2017), and provides empirical validation to earlier predictions that even a single error on FCR may indicate invalid responding (D. Delis, personal communication, 10 May 2012). Our results are also consistent with research on the child version of FCR (Lichtenstein et al., 2017). In addition, the consistently high specificity and +LR of the new cutoff against multiple reference PVTs suggest that the more liberal FCR cutoff does not inflate false positive rates.
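As a rough illustration of the quantities discussed above, the following sketch computes sensitivity, specificity, and likelihood ratios for the two FCR cutoffs against a single reference PVT. The scores and criterion outcomes are invented for illustration and are not data from this study; the function, class, and variable names are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Accuracy:
    sensitivity: float
    specificity: float
    positive_lr: float
    negative_lr: float


def classification_accuracy(fcr_scores, criterion_invalid, cutoff):
    """Treat FCR <= cutoff as a failure and compare it with a reference PVT.

    Assumes the sample contains at least one valid and one invalid profile.
    """
    pairs = list(zip(fcr_scores, criterion_invalid))
    tp = sum(1 for score, invalid in pairs if score <= cutoff and invalid)
    fn = sum(1 for score, invalid in pairs if score > cutoff and invalid)
    fp = sum(1 for score, invalid in pairs if score <= cutoff and not invalid)
    tn = sum(1 for score, invalid in pairs if score > cutoff and not invalid)

    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # +LR = sensitivity / false positive rate; -LR = miss rate / specificity.
    positive_lr = sensitivity / (1 - specificity) if specificity < 1 else float("inf")
    negative_lr = (1 - sensitivity) / specificity if specificity > 0 else float("inf")
    return Accuracy(sensitivity, specificity, positive_lr, negative_lr)


# Hypothetical sample: 10 profiles classified invalid and 20 classified valid
# by the reference PVT (FCR scores range from 0 to 16).
invalid_scores = [16, 16, 16, 16, 15, 14, 14, 13, 12, 10]
valid_scores = [16] * 19 + [14]
fcr_scores = invalid_scores + valid_scores
criterion_invalid = [True] * len(invalid_scores) + [False] * len(valid_scores)

for cutoff in (14, 15):
    acc = classification_accuracy(fcr_scores, criterion_invalid, cutoff)
    print(
        f"FCR <= {cutoff}: sensitivity = {acc.sensitivity:.2f}, "
        f"specificity = {acc.specificity:.2f}, +LR = {acc.positive_lr:.1f}"
    )
```

In this made-up sample, moving the cutoff from ≤14 to ≤15 raises sensitivity from .50 to .60 while specificity stays at .95. For reference, a test performing exactly at the “Larrabee limit” has +LR = .50 / (1 − .90) = 5.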
Equally importantly, subsamples with FCR scores 16, 15 and ≤14 did not differ from each other in injury severity, neuroradiological findings, or on the measures known to be sensitive to TBI (Booklet Category Test, Tactual Performance Test and Trails B; Grant & Adams, 1996). These findings suggest that FCR is independent of objective measures of impairment, consistent with previous reports (Baldo et al., 2002; Donders & Strong, 2011). The fact that, paradoxically, a significant difference emerged on “hold” tests (Boone, 2013) known to be resistant to the deleterious effects of TBI (i.e., word knowledge, picture vocabulary and single word reading) provides further evidence that FCR is unrelated to cognitive impairment subsequent to TBI. In fact, internally inconsistent patterns of test scores have been identified as emergent markers of invalid responding (Boone, 2013; Larrabee, 2012; Slick, Sherman & Iverson, 1999).

Furthermore, there was a “reverse injury severity effect” on FCR. In other words, patients with mTBI were two to three times more likely to fail FCR cutoffs compared to patients with moderate-to-severe TBI. Although counterintuitive, this phenomenon is well-replicated in the research literature (Carone, 2008; Erdodi & Rai, 2017; Green, Iverson, & Allen, 1999; Green, Flaro, & Courtney, 2009; Sweet, Goldman, & Guidotti Breting, 2013). In the broader context of this well-established apparent paradox of elevated BRFail in mTBI, the current results should alleviate concerns about false positive errors on FCR due to genuine neurological impairment.

The second hypothesis that performance on FCR would be related to self-reported emotional distress was also supported. Patients who obtained a perfect score on FCR had the lowest level of depression on SCL-90-R and BDI-II, both as continuous scales and as percentage in the clinical range. Those who made any error on FCR reported more severe psychiatric symptoms globally, with large to very large effect sizes.
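For readers who want to see how a Cohen’s d effect size can be obtained from summary statistics, the sketch below computes it from two group means and standard deviations. It assumes the pooled SD is the root mean square of the two group SDs, which ignores group sizes; this is one common convention and may not be the exact formula the authors used. The inputs are the WMT Pass and Fail values for SOM and O-C from Table 8, and the function name is hypothetical.

```python
import math


def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Cohen's d for two groups, using the root mean square of the two SDs.

    Assumption: group sizes are ignored, so both groups get equal weight.
    """
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_b - mean_a) / pooled_sd


# WMT Pass vs. Fail summary statistics taken from Table 8.
som_d = cohens_d(57.1, 12.5, 65.3, 11.6)  # somatic distress
oc_d = cohens_d(62.0, 11.8, 73.0, 10.0)   # obsessive-compulsive symptoms

print(f"SOM: d = {som_d:.2f}")  # 0.68
print(f"O-C: d = {oc_d:.2f}")   # 1.01
```

Under this assumption, the computed values agree with the d values reported for the WMT contrast on these two scales in Table 8 (.68 and 1.01).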
These findings are consistent with some of the existing literature that documents a link between psychiatric history and invalid performance on neurocognitive testing (Martens, Donders, & Millis, 2001; Moore & Donders, 2004), but contradict other reports that anxiety and depression are unrelated to PVT failure (Ashendorf, Constantinou & McCaffrey, 2004; Considine et al., 2011; Egeland et al., 2005; Rees et al., 2001).

The divergence between our study and some previous investigations on PVTs and psychological distress may be driven by two main factors. First, many of them operationalized performance validity using a single criterion measure, such as the TOMM or the Rey 15-item test at traditional cutoffs (Trial 2 <45 and free recall <9, respectively), which are known to have limited sensitivity to invalid responding (Green, 2007; Reznek, 2005). Therefore, those negative findings may reflect undetected invalid profiles. Second, those studies focused on psychiatric disorders, whereas our sample consisted of patients with TBI, some of whom also reported emotional problems. As such, our positive findings could be due to the additive effect of neuropsychological deficits subsequent to TBI, pre-existing or emerging deficits in emotional regulation, or other contextual factors uniquely related to TBI and post-TBI depression and anxiety.

While the evidence linking depression and memory deficits is mixed both within and between studies (Bearden et al., 2006; Ilsley, Moffoot, & O’Carroll, 1995; Keiski, Shore, & Hamilton, 2007; Kessels, Ruis, & Kappelle, 2007; Langenecker et al., 2005; Raskin, Mateer, & Tweeten, 1998), there is growing evidence that memory tests are impacted more by invalid responding than psychiatric disorders (Boone, 2013; Coleman, Rapport, Millis, Ricker, & Farchione, 1998; Larrabee, 2012; Suhr, Tranel, Wefel, & Barrash, 1997; Trueblood, 1994).
In fact, Rohling, Green, Allen and Iverson (2002) argue that a meaningful investigation of the interaction between depression and cognitive functioning must exclude individuals who fail PVTs. Our findings are congruent with this line of research on co-existing TBI, self-reported emotional problems and PVT failures.

As FCR performance correlates with scores on both PVTs and psychiatric symptom inventories, the clinical interpretation of failing this validity indicator is a challenge. The group-level pattern of scores observed in this sample fits several criteria of “Cogniform Disorder” introduced by Delis and Wetter (2007): internally inconsistent neurocognitive profiles, combination of test scores that are rare in patients with genuine neurological impairment, and objective evidence of poor effort. Although the observational data presented in this study do not allow for causal attributions, they raise some important questions. Does genuine emotional distress increase vulnerability to PVT failures? Do patients with non-credible presentation exaggerate both emotional distress and cognitive deficits? Are certain PVTs more sensitive than others to both forms of invalid responding?

Although there is an emerging consensus that symptom and performance validity are distinct constructs and, therefore, should be evaluated separately (van Dyke et al., 2013), it is plausible that they share part of their etiology. If the link between FCR and psychiatric symptoms is replicated in future studies, failing FCR might become a marker of not only invalid performance, but perhaps also of “psychogenic interference”: a failure to demonstrate one’s true ability level on cognitive testing due to acute psychiatric symptoms.

It is interesting that the FCR = 15 group reported more severe psychiatric symptomatology than the FCR ≤ 14 group.
Also, the FCR = 16 group produced a pattern of performance that is consistent with the bona fide cognitive sequelae subsequent to TBI (i.e., intact performance on “hold” tests, and mild deficits on measures known to be sensitive to head injury). In contrast, the FCR ≤ 14 group demonstrated uniformly low performance across both types of tests, with the FCR = 15 group in between. It is possible that there are group-level differences in the etiology of PVT failures, with the more heavily psychogenic influences having a milder impact than other factors that are known to have strong effects on PVT performance, such as external incentives to appear impaired (Boone, 2013; Larrabee, 2012). However, this cannot be determined with the current sample, given the absence of data on litigation status.

While previous research found that certain PVTs appear to be uniquely sensitive to emotional distress (Erdodi et al., 2016), it failed to disentangle the relative contribution of psychogenic interference and volitional suppression of performance on cognitive testing. The cumulative clinical evidence suggests that the etiology of invalid performance is likely multifactorial. A PVT failure can be the expression of several contributing and potentially interacting factors and hence, does not automatically mean deliberate suppression of cognitive ability (i.e., malingering). A full consideration of alternative explanations to non-credible presentation is instrumental in providing an accurate, nuanced and clinically helpful interpretation of neurocognitive profiles (Bigler, 2012, 2015).

Developing a conceptually sound and empirically supported model for subtyping non-credible responding has important forensic and clinical applications. For example, in personal injury litigation, multiple unequivocal PVT failures raise the possibility of malingering and thus, have obvious implications for the legitimacy of the lawsuit.
In contrast, a neuropsychologist’s conclusion that a plaintiff failed to put forth adequate effort, but not deliberately so, may shift the focus to exploring other plausible clinical issues that may or may not be related to the accident (depression, unresolved developmental trauma, exacerbation of a pre-existing psychological vulnerability, righteous anger towards the perpetrator of the injury, etc.). In those cases, the assessor’s responsibility is to (a) determine whether the data are consistent with an alternative accident-related etiology; (b) render an opinion that even if psychogenic factors are operative, they cannot account for the level of impairment demonstrated during testing; or (c) conclude that regardless of the reason behind unexpectedly low scores, they cannot be attributed to accident-related factors.

Even in clinical settings and in the absence of apparent external incentives to appear impaired, assessors often face the complex task of interpreting co-existing PVT failures and medically verified neurological problems (Erdodi et al., 2016). In such cases, it is the neuropsychologist’s responsibility to determine whether (a) low scores are a manifestation of a legitimate disease process; (b) even in the context of documented severe impairments the low scores are still not credible; or (c) independent of neurological manifestations, ancillary issues are contributing to low performance, such as when living with a debilitating neurological impairment for many years has resulted in unremitting dependence or chronic resignation in the face of cognitive demands.

These considerations are important for optimizing the clinical management of the patient. If an evaluation is deemed valid (i.e., PVT failures are attributable to despondent resignation that deflated scores throughout testing), certain aspects of the patient’s impairment might be reversible.
In such cases, referral for psychotherapy or cognitive rehabilitation has the potential to restore some of the cognitive functioning. For example, in the present sample elevations on SCL-90-R were related to errors on FCR and failures on other PVTs. If self-reported psychopathology is causally related to invalid responding, treating the psychiatric symptoms could conceivably improve cognitive performance. Although speculations about the reasons behind poor effort are epistemologically risky, providing a sound, albeit tentative, explanation could be important, as the clinical outcome hinges on the correct interpretation of non-credible presentation. Beyond the simple “valid/invalid” dichotomy, the assessor carries the responsibility of determining whether a meaningful intervention is feasible. Erring on either side can be costly. Dismissing a patient as non-credible may deprive the individual of the opportunity to recover lost function. Recommending therapy for a malingerer may allocate limited health care services to an individual who is invested in appearing impaired and thus, is unlikely to benefit from the intervention.

In conclusion, FCR scores should be interpreted in the larger context of injury severity, clinical and psycho-social history, incentive status as well as the rest of the neurocognitive profile. Marginal failures (FCR = 15) likely have a different clinical meaning in patients with medically verified severe pathology and those with mild or questionable TBI. In the former group, a single error may be a direct manifestation of the injury. Conversely, in the latter group it may raise concerns about non-neurological factors contributing to the presentation.

The present study has several strengths. It provided a direct comparison of the classification accuracy of two different FCR cutoffs across a wide range of reference PVTs in a clinically referred sample with mild and moderate-to-severe TBI.
We also examined the link between FCR failures and self-reported emotional functioning. Inevitably, the study has a number of limitations, too: the sample is relatively small and geographically restricted. More importantly, the FCR = 15 subsample was too small to draw definite conclusions about the neurocognitive profile of patients who only failed the liberal cutoff on FCR. In addition, as psychiatric symptoms were assessed using face-valid self-report inventories without built-in validity scales, the veracity of these data is unknown, which is a considerable limitation of our measurement model. However, given the limited research on the link between emotional functioning and performance validity, documenting a systematic difference in the level of self-reported psychiatric symptoms as a function of passing or failing PVTs is a meaningful initial step towards a better understanding of this complex relationship. The fact that previous research that controlled for response bias in self-report produced similar results (Erdodi, Sagar et al., 2017; Erdodi, Seke et al., 2017) suggests that the shared variance between elevated symptom report and PVT failure cannot be attributed to a common “malingering factor” (i.e., the same people fabricate/exaggerate both psychiatric problems and cognitive deficits). More importantly, the nature of the data (archival/observational) precludes causal modeling of the main effects. Prospective experimental and longitudinal studies that can separate invalid performance from psychiatric history by design are needed to determine the clinical meaning of FCR failures: evidence of non-credible responding, emotional distress, or both?

Conclusion

Even a single error on the FCR is a reliable marker of invalid responding. Based on its superior classification accuracy, ≤15 should replace the current de facto FCR cutoff of ≤14. Failing the FCR was associated with elevated self-reported psychiatric symptoms.
Given that the link between invalid performance and emotional distress is poorly understood, further research is needed to explore the underlying causal mechanisms.

Funding

None.

Conflict of Interest

None declared.

Acknowledgments

The authors would like to thank Drs. Donders and Marshall for providing additional data on the clinical samples used in their studies that were not included in the original publications.

References

American Psychiatric Association. (1996). Diagnostic and statistical manual of mental disorders (4th ed.). Washington, DC: Author.
An, K. Y., Kaploun, K., Erdodi, L. A., & Abeare, C. A. (2017). Performance validity in undergraduate research participants: A comparison of failure rates across tests and cutoffs. The Clinical Neuropsychologist, 31, 193–206. doi:10.1080/13854046.2016.1217046
An, K. Y., Zakzanis, K. K., & Joordens, S. (2012). Conducting research with non-clinical healthy undergraduates: Does effort play a role in neuropsychological test performance? Archives of Clinical Neuropsychology, 27, 849–857.
Arnold, G., Boone, K. B., Lu, P., Dean, A., Wen, J., Nitch, S., et al. (2005). Sensitivity and specificity of finger tapping test scores for the detection of suspect effort. The Clinical Neuropsychologist, 19, 105–120.
Ashendorf, L., Constantinou, M., & McCaffrey, R. J. (2004). The effect of depression and anxiety on the TOMM in community-dwelling older adults. Archives of Clinical Neuropsychology, 19, 125–130.
Axelrod, B. N., Fichteberg, N. L., Millis, S. R., & Wertheimer, J. C. (2006). Detecting incomplete effort with Digit Span from the Wechsler Adult Intelligence Scale—Third Edition. The Clinical Neuropsychologist, 10, 513–523.
Baker, K. A., Schmidt, M. F., Heinemann, A. W., Langley, M., & Miranti, S. V. (1998). The validity of the Katz Adjustment Scale among people with traumatic brain injury. Rehabilitation Psychology, 43, 30–40.
Baldo, J. V., Delis, D., Kramer, J., & Shimamura, A. (2002). Memory performance on the California Verbal Learning Test-II: Findings from patients with focal frontal lesions. Journal of the International Neuropsychological Society, 8, 539–546.
Bauer, L., Yantz, C. L., Ryan, L. M., Warned, D. L., & McCaffrey, R. J. (2005). An examination of the California Verbal Learning Test II to detect incomplete effort in a traumatic brain injury sample. Applied Neuropsychology, 12, 202–207.
Bearden, C. E., Glahn, D. C., Monkul, E. S., Barrett, J., Najt, P., Villarreal, V., et al. (2006). Patterns of memory impairment in bipolar disorder and unipolar major depression. Psychiatry Research, 142, 139–150.
Beck, A. T., Steer, R. A., & Brown, G. K. (1996). Beck Depression Inventory (2nd ed.). San Antonio, TX: Psychological Corporation.
Bigler, E. D. (2012). Symptom validity testing, effort and neuropsychological assessment. Journal of the International Neuropsychological Society, 18, 632–642.
Bigler, E. D. (2015). Neuroimaging as a biomarker in symptom validity and performance validity testing. Brain Imaging and Behavior, 9, 421–444.
Boone, K. B. (2013). Clinical practice of forensic neuropsychology. New York: Guilford.
Boone, K. B. (2009). The need for continuous and comprehensive sampling of effort/response bias during neuropsychological examinations. The Clinical Neuropsychologist, 23, 729–741.
Bortnik, K. E., Boone, K. B., Marion, S. D., Amano, S., Ziegler, E., Victor, T. L., et al. (2010). Examination of various WMS-III logical memory scores in the assessment of response bias. The Clinical Neuropsychologist, 24, 344–357.
Carone, D. A. (2008). Children with moderate/severe brain damage/dysfunction outperform adults with mild-to-no brain damage on the Medical Symptom Validity Test. Brain Injury, 22, 960–971.
Chafetz, M. D., Williams, M. A., Ben-Porath, Y. S., Bianchini, K. J., Boone, K. B., Kirkwood, M. W., et al. (2015). Official position of the American Academy of Clinical Neuropsychology Social Security Administration policy on validity testing: Guidance and recommendations for change. The Clinical Neuropsychologist, 29, 723–740.
Connor, D. J., Drake, A. I., Bondi, M. W., & Delis, D. C. (1997). Detection of feigned cognitive impairments in patients with a history of mild to severe closed head injury. Paper presented at the American Academy of Neurology, Boston.
Clark, L. R., Stricker, N. H., Libon, D. J., Delano-Wood, L., Salmon, D. P., Delis, D. C., et al. (2012). Yes/No versus forced-choice recognition memory in mild cognitive impairment and Alzheimer’s disease: Patterns of impairment and associations with dementia severity. The Clinical Neuropsychologist, 16, 1201–1216.
Coleman, R. D., Rapport, L. J., Millis, S. R., Ricker, J. H., & Farchione, T. J. (1998). Effects of coaching on the California Verbal Learning Test. Journal of Clinical and Experimental Neuropsychology, 20, 201–210.
Considine, C., Weisenbach, S. L., Walker, S. J., McFadden, E. M., Franti, L. M., Bieliauskas, L. A., et al. (2011). Auditory memory decrements, without dissimulation, among patients with major depressive disorder. Archives of Clinical Neuropsychology, 26, 445–453.
Constantinou, M., Bauer, L., Ashendorf, L., Fisher, J. M., & McCaffrey, R. J. (2005). Is poor performance on recognition memory effort measures indicative of generalized poor performance on neuropsychological tests? Archives of Clinical Neuropsychology, 20, 191–198.
Delis, D. C., Kramer, J. H., Kaplan, E., & Ober, B. A. (2000). California Verbal Learning Test—Second edition. San Antonio, TX: Psychological Corporation.
Delis, D., & Wetter, S. R. (2007). Cogniform disorder and cogniform condition: Proposed diagnoses for excessive cognitive symptoms. Archives of Clinical Neuropsychology, 22, 589–604.
Derogatis, L. R. (1994). SCL-90-R: Administration, scoring, and procedures manual (3rd ed.). Minneapolis, MN: National Computer Systems.
Donders, J., & Strong, C. A. (2011). Embedded effort indicators on the California Verbal Learning Test—Second Edition: An attempted cross-validation. The Clinical Neuropsychologist, 25, 173–184.
Egeland, J., Lund, A., Landro, N. I., Rund, B. R., Sudet, K., Asbjornsen, A., et al. (2005). Cortisol level predicts executive and memory function in depression, symptom level predicts psychomotor speed. Acta Psychiatrica Scandinavica, 112, 434–441.
Erdodi, L. A. (2017). Aggregating validity indicators: The salience of domain specificity and the indeterminate range in multivariate models of performance validity assessment. Applied Neuropsychology: Adult. Advance online publication. doi:10.1080/23279095.2017.1384925
Erdodi, L. A., & Rai, J. K. (2017). A single error is one too many: Examining alternative cutoffs on Trial 2 on the TOMM. Brain Injury, 31, 1362–1368. doi:10.1080/02699052.2017.1332386
Erdodi, L. A., Kirsch, N. L., Lajiness-O’Neill, R., Vingilis, E., & Medoff, B. (2014). Comparing the Recognition Memory Test and the Word Choice Test in a mixed clinical sample: Are they equivalent? Psychological Injury and Law, 7, 255–263. doi:10.1007/s12207-014-9197-8
Erdodi, L. A., Roth, R. M., Kirsch, N. L., Lajiness-O’Neill, R., & Medoff, B. (2014). Aggregating validity indicators embedded in Conners’ CPT-II outperforms individual cutoffs at separating valid from invalid performance in adults with traumatic brain injury. Archives of Clinical Neuropsychology, 29, 456–466.
Erdodi, L. A., Sagar, S., Seke, K., Zuccato, B. G., Schwartz, E. S., & Roth, R. M. (2017). The Stroop Test as a measure of performance validity in adults clinically referred for neuropsychological assessment. Psychological Assessment. doi:10.1037/pas0000525
Erdodi, L. A., Seke, K. R., Shahein, A., Tyson, B. T., Sagar, S., & Roth, R. M. (2017). Low scores on the Grooved Pegboard Test are associated with invalid responding and psychiatric symptoms. Psychology and Neuroscience, 10, 325–344. doi:10.1037/pne0000103
Erdodi, L. A., Tyson, B., Abeare, T., Lichtenstein, C. A., Pelletier, J. D., Rai, C. L., et al. (2016). The BDAE Complex Ideational Material—A measure of receptive language or performance validity? Psychological Injury and Law, 9, 112–120. doi:10.1007/s12207-016-9254-6
Erdodi, L. A., Tyson, B. T., Abeare, C. A., Zuccato, B. G., Rai, J. K., Seke, K. R., et al. (2017). Utility of critical items within the Recognition Memory Test and Word Choice Test. Applied Neuropsychology: Adult. Advance online publication. doi:10.1080/23279095.2017.1298600
Erdodi, L. A., Tyson, B. T., Shahein, A., Lichtenstein, J. D., Abeare, C. A., Pelletiere, C. L., et al. (2017). The power of timing: Adding a time-to-completion cutoff to the Word Choice Test and Recognition Memory Test improves classification accuracy. Journal of Clinical and Experimental Neuropsychology, 39, 369–383. doi:10.1080/13803395.2016.1230181
Frederick, R. I. (2003). VIP: Validity indicator profile. Manual (2nd ed.). Minneapolis, MN: NCS Pearson.
Grant, I., & Adams, K. M. (Eds.) (1996). Neuropsychological assessment of neuropsychiatric disorders. New York: Oxford University Press.
Greve, K. W., & Bianchini, K. J. (2004). Setting empirical cut-offs on psychometric indicators of negative response bias: A methodological commentary with recommendations. Archives of Clinical Neuropsychology, 19, 533–541.
Greve, K. W., Bianchini, K. J., Mathias, C. W., Houston, R. J., & Crouch, J. A. (2002). Detecting malingered neurocognitive dysfunction with the Wisconsin Card Sorting Test: A preliminary investigation in traumatic brain injury. The Clinical Neuropsychologist, 16, 179–191.
Green, P. (2003). Green’s Word Memory Test. Edmonton, Canada: Green’s Publishing.
Green, P. (2007). Spoiled for choice: Making comparisons between forced-choice effort tests. In Boone, K. B. (Ed.), Assessment of feigned cognitive impairment (pp. 50–77). New York, NY: Guilford.
Green, P., Iverson, G., & Allen, L. (1999). Detecting malingering in head injury litigation with the Word Memory Test. Brain Injury, 13, 813–819.
Green, P., Flaro, L., & Courtney, J. (2009). Examining false positives on the word memory test in adults with mild traumatic brain injury. Brain Injury, 23, 741–750.
Greiffenstein, M. F., Baker, W. J., & Gola, T. (1994). Validation of malingered amnesic measures with a large clinical sample. Psychological Assessment, 6, 218–224.
Heaton, R. K., Miller, S. W., Taylor, M. J., & Grant, L. (2004). Revised comprehensive norms for an expanded Halstead-Reitan battery: Demographically adjusted neuropsychological norms for African American and Caucasian adults. Lutz, Fla.: PAR.
Heilbronner, R. L., Sweet, J. J., Morgan, J. E., Larrabee, G. J., & Millis, S. R. (2009). American Academy of Neuropsychology Consensus Conference Statement on the neuropsychological assessment of effort, response bias, and malingering. The Clinical Neuropsychologist, 23, 1093–1129.
Hoofien, D., Barak, O., Vakil, E., & Gilboa, A. (2005). Symptom Checklist 90 Revised scores in persons with traumatic brain injury: Affective reactions of neurobehavioral outcomes of the injury? Applied Neuropsychology, 12, 30–39.
Ilsley, J. E., Moffoot, A. P. R., & O’Carroll, R. E. (1995). An analysis of memory dysfunction in major depression. Journal of Affective Disorders, 35(1-2), 1–9.
Iverson, G. L. (2003). Detecting malingering in civil forensic evaluations. In Horton, A. M., & Hartlage, L. C. (Eds.), Handbook of forensic neuropsychology (pp. 137–177). New York: Springer Publishing Company.
Keiski, M. A., Shore, D. L., & Hamilton, J. M. (2007). The role of depression in verbal memory following traumatic brain injury. The Clinical Neuropsychologist, 21, 744–761.
Kessels, R. P. C., Ruis, C., & Kappelle, L. J. (2007). The impact of self-reported depressive symptoms on memory function in neurological outpatients. Clinical Neurology and Neurosurgery, 109, 323–326.
Lange, R. T., Iverson, G. L., Brickell, T. A., Staver, T., Pancholi, S., Bhagwat, A., et al. (2013). Clinical utility of the Conners’ Continuous Performance Test-II to detect poor effort in U.S. military personnel following traumatic brain injury. Psychological Assessment, 25, 339–352.
Langenecker, S. A., Bieliauskas, L. A., Rapport, L. J., Zubieta, J. K., Wilde, E. A., & Berent, S. (2005). Face emotion perception and executive functioning deficits in depression. Journal of Clinical and Experimental Psychology, 27, 320–333.
Larrabee, G. J. (2003). Detection of malingering using atypical performance on standard neuropsychological tests. The Clinical Neuropsychologist, 17, 410–425.
Larrabee, G. J. (2012). Assessment of malingering. In Larrabee, G. J. (Ed.), Forensic neuropsychology: A scientific approach. NY: Oxford University Press.
Leighton, A., Weinborn, M., & Maybery, M. (2014). Bridging the gap between neurocognitive processing theory and performance validity assessment among the cognitively impaired: A review and methodological approach. Journal of the International Neuropsychological Society, 20, 873–886. doi:10.1017/S135561771400085X
Lichtenstein, J. D., Erdodi, L. A., & Linnea, K. S. (2017). Introducing a forced-choice recognition task to the California Verbal Learning Test—Children’s Version. Child Neuropsychology, 23, 284–299. doi:10.1080/09297049.2015.1135422
Marschark, M., Richtsmeier, L. M., Richardson, J. T. E., Crovitz, H. F., & Henry, J. (2000). Intellectual and emotional functioning in college students following mild traumatic brain injury in childhood and adolescence. Journal of Head Trauma Rehabilitation, 15, 1227–1245.
Marshall, P., & Happe, M. (2007). The performance of individuals with mental retardation on cognitive tests assessing effort and motivation. The Clinical Neuropsychologist, 21, 826–840.
Martens, M., Donders, J., & Millis, S. R. (2001). Evaluation of invalid response sets after traumatic head injury. Journal of Forensic Neuropsychology, 2(1), 1–18.
Moore, B. A., & Donders, J. (2004). Predictors of invalid neuropsychological test performance after traumatic brain injury. Brain Injury, 18, 975–984.
Ord, J. S., Boettcher, A. C., Greve, K. J., & Bianchini, K. J. (2010). Detection of malingering in mild traumatic brain injury with the Conners’ Continuous Performance Test-II. Journal of Clinical and Experimental Neuropsychology, 32, 380–387.
Palav, A., Ortega, A., & McCaffrey, R. J. (2001). Incremental validity of the MMPI-2 content scales: A preliminary study with brain-injured patients. Journal of Head Trauma Rehabilitation, 16, 275–283.
Pearson (2009). Advanced clinical solutions for the WAIS-IV and WMS-IV—Technical manual. San Antonio, TX: Author.
Perneger, T. V. (1998). What’s wrong with Bonferroni adjustments. BMJ (Clinical research ed.), 316, 1236–1238.
Raskin, S. A., Mateer, C. A., & Tweeten, R. (1998). Neuropsychological assessment of individuals with mild traumatic brain injury. The Clinical Neuropsychologist, 12, 21–30.
Rees, L. M., Tombaugh, T. N., & Boulay, L. (2001). Depression and the Test of Memory Malingering. Archives of Clinical Neuropsychology, 16, 501–506.
Reese, C. S., Suhr, J. A., & Riddle, T. L. (2012). Exploration of malingering indices in the Wechsler Adult Intelligence Scale—Fourth Edition Digit Span subtest. Archives of Clinical Neuropsychology, 27, 176–181.
Reznek, L. (2005). The Rey 15-item memory test for malingering: A meta-analysis. Brain Injury, 19, 539–543. doi:10.1080/02699050400005242
Rohling, M. L., Green, P., Allen, L. M., & Iverson, G. L. (2002). Depressive symptoms and neurocognitive test scores in patients passing symptom validity tests. Archives of Clinical Neuropsychology, 17, 205–222.
Root, J. C., Robbins, R. N., Chang, L., & van Gorp, W. (2006). Detection of inadequate effort on the California Verbal Learning Test-Second edition: Forced choice recognition and critical item analysis. Journal of the International Neuropsychological Society, 12, 688–696.
Ross, T. P., Poston, A. M., Rein, P. A., Salvatore, A. N., Wills, N. L., & York, T. M. (2016). Performance invalidity base rates among healthy undergraduate research participants. Archives of Clinical Neuropsychology, 31, 97–104.
Rothman, K. J. (1990). No adjustments are needed for multiple comparisons. Epidemiology (Cambridge, Mass.), 1, 43–46.
Santos, O. A., Kazakov, D., Reamer, M. K., Park, S. E., & Osmon, D. C. (2014). Effort in college undergraduate is sufficient on the Word Memory Test. Archives of Clinical Neuropsychology, 29, 609–613.
Schwartz, E. S., Erdodi, L., Rodriguez, N., Jyotsna, J. G.
, Curtain , J. R. , Flashman , L. A. , et al. . ( 2016 ). CVLT-II forced choice recognition trial as an embedded validity indicator: A systematic review of the evidence . Journal of the International Neuropsychological Society , 22 , 851 – 858 . doi:10.1017/S1355617716000746 . Google Scholar CrossRef Search ADS PubMed Slick , D. J. , Sherman , E. M. S. , Grant , L. , & Iverson , G. L. ( 1999 ). Diagnostic criteria for malingered neurocognitive dysfunction: Proposed standards for clinical practice and research . The Clinical Neuropsychologist , 13 , 545 – 561 . Google Scholar CrossRef Search ADS PubMed Spencer , R. J. , Axelrod , B. N. , Drag , L. L. , Waldron-Perrine , B. , Pangilinan , P. H. , & Bieliauskas , L. A. ( 2013 ). WAIS-IV reliable digit span is no more accurate than age corrected scaled score as an indicator of invalid performance in a veteran sample undergoing evaluation for mTBI . The Clinical Neuropsychologist , 27 , 1362 – 1372 . Google Scholar CrossRef Search ADS PubMed Sprinkle , S. D. , Lurie , D. , Insko , S. L. , Atkinson , G. , Jones , G. L. , Logan , A. R. , et al. . ( 2002 ). Criterion validity, severity cut scores, and test-retest reliability of the Beck Depression Inventory-II in a university counseling center sample . Journal of Counseling Psychology , 49 , 381 . Google Scholar CrossRef Search ADS Storch , E. A. , Roberti , J. W. , & Roth , D. A. ( 2004 ). Factor structure, concurrent validity, and internal consistency of the Beck Depression Inventory—Second edition in a sample of college students . Depression and Anxiety , 19 , 187 – 189 . Google Scholar CrossRef Search ADS PubMed Suhr , J. A. , & Boyer , D. ( 1999 ). Use of the Wisconsin Card Sorting Test in the detection of malingering in student simulator and patient samples . Journal of Clinical and Experimental Psychology , 21 , 701 – 708 . doi:10.1076/jcen.21.5.701.868 . Suhr , J. , Tranel , D. , Wefel , J. , & Barrash , J. ( 1997 ). 
Memory performance after head injury: Contributions of malingering, litigation status, psychological factors, and medication use . Journal of Clinical and Experimental Psychology , 19 , 500 – 514 . Sugarman , M. A. , & Axelrod , B. N. ( 2014 ). Embedded measures of performance validity using verbal fluency tests in a clinical sample . Applied Neuropsychology: Adult . DOI:10.1080/23279095.2013.873439 . Sugarman , M. A. , & Axelrod , B. N. ( 2015 ). Embedded measures of performance validity using verbal fluency tests in a clinical sample . Applied Neuropsychology: Adult , 22 , 141 – 146 . Google Scholar CrossRef Search ADS PubMed Sweet , J. J. , Goldman , D. J. , & Guidotti Breting , L. M. ( 2013 ). Traumatic brain injury: Guidance in a forensic context from outcome, dose-response, and response bias research . Behavioral Sciences and the Law , 31 , 756 – 778 . Google Scholar CrossRef Search ADS PubMed Tombaugh , T. N. ( 1996 ). Test of Memory Malingering . New York : Multi-Health Systems . Trueblood , W. ( 1994 ). Qualitative and quantitative characteristics of malingered and other invalid WAIS-R and clinical memory data . Journal of Clinical and Experimental Neuropsychology , 14 , 597 – 607 . Google Scholar CrossRef Search ADS van Dyke , S. A. , Millis , S. R. , Axelrod , B. N. , & Hanks , R. A. ( 2013 ). Assessing effort: Differentiating performance and symptom validity . The Clinical Neuropsychologist , 27 , 1234 – 1246 . Google Scholar CrossRef Search ADS PubMed Westcott , M. C. , & Alfano , D. P. ( 2005 ). The Symptom Checklist-90-Revised and mild traumatic brain injury . Brain Injury , 19 , 1261 – 1267 . Google Scholar CrossRef Search ADS PubMed Wolfe , P. L. , Millis , S. R. , Hanks , R. , Fichtenberg , N. , Larrabee , G. J. , & Sweet , J. J. ( 2010 ). Effort indicators within the California Verbal Learning Test-II (CVLT-II) . The Clinical Neuropsychologist , 24 , 153 – 168 . Google Scholar CrossRef Search ADS PubMed © The Author 2017. 
Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com. http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png Archives of Clinical Neuropsychology Oxford University Press

Publisher
Oxford University Press
ISSN
0887-6177
eISSN
1873-5843
D.O.I.
10.1093/acn/acx110

Abstract

Objective: The Forced Choice Recognition (FCR) trial of the California Verbal Learning Test—Second Edition (CVLT-II) was designed to serve as a performance validity test (PVT). The present study was designed to compare the classification accuracy of a more liberal alternative (≤15) to the de facto FCR cutoff (≤14).
Method: The classification accuracy of the two cutoffs was computed in reference to psychometrically defined invalid performance, across various criterion measures, in a sample of 104 adults with TBI clinically referred for neuropsychological assessment.
Results: The FCR was highly predictive (AUC: .71–.83) of Pass/Fail status on reference PVTs, but unrelated to performance on measures known to be sensitive to TBI. On average, FCR ≤15 correctly identified an additional 6% of invalid response sets compared to FCR ≤14, while maintaining .92 specificity. Patients who failed the FCR reported higher levels of emotional distress.
Conclusions: Results suggest that even a single error on the FCR is a reliable indicator of invalid responding. Further research is needed to investigate the clinical significance of the relationship between failing the FCR and level of self-reported psychiatric symptoms.
Keywords: CVLT-II, Forced choice recognition, Traumatic brain injury, Performance validity assessment, Alternative cutoffs
Introduction
The interpretation of neuropsychological tests rests on the assumption that examinees are able and willing to fully engage with the tasks presented to them and therefore demonstrate their maximal ability level (Delis, Kramer, Kaplan, & Ober, 2000). There is a growing consensus within the field of neuropsychology that valid performance cannot be assumed by default, but should be objectively evaluated (Boone, 2009; Chafetz et al., 2015; Heilbronner, Sweet, Morgan, Larrabee, & Millis, 2009). Some even consider assessments that omit formal measures of test-taking effort incomplete (Iverson, 2003).
Along with free-standing performance validity tests [PVTs; Test of Memory Malingering (TOMM; Tombaugh, 1996); Word Memory Test (WMT; Green, 2003); Validity Indicator Profile (VIP; Frederick, 2003)] that represent the traditional approach to performance validity assessment, a growing array of embedded validity indicators (EVIs) have also been developed to help clinicians determine the credibility of a given response set (Arnold et al., 2005; Erdodi, Sagar, et al., 2017; Greiffenstein, Baker, & Gola, 1994; Larrabee, 2003). The Forced Choice Recognition (FCR) task of the California Verbal Learning Test—Second Edition (CVLT-II; Delis et al., 2000) falls somewhere in between these two categories of validity measures. It was introduced as an optional module with the explicit purpose of evaluating test-taking effort, and is administered 10 min after the original recall and recognition trials are completed. These features are consistent with a free-standing PVT. However, FCR is dependent on the prior administration of the rest of the CVLT-II, which makes it an EVI. The technical manual references a study by Connor, Drake, Bondi and Delis (1997) on an early experimental version of FCR administered in conjunction with the original CVLT. On this instrument, a cutoff of ≤13 produced impressive classification accuracy (.80 sensitivity at .97 specificity) separating credible and simulated memory deficits. Although the authors refrained from endorsing this or any other cutoff, they reported that over 90% of the CVLT-II normative sample obtained a perfect score on FCR (16/16), with ≤1% scoring ≤14. Nobody scored ≤13. They suggested that its pronounced ceiling effect in neurologically healthy individuals makes FCR a viable instrument for detecting non-credible responding in unsophisticated examinees who blatantly exaggerate their memory problems. Early studies on FCR in clinical samples provided indirect support for this claim. 
Baldo, Delis, Kramer and Shimamura (2002) reported that all of the 11 patients with neuroradiologically confirmed focal frontal lesions and notable impairment in acquisition, recall and recognition on the CVLT-II obtained perfect scores on FCR. Demonstrating that performance on FCR is unrelated to brain lesions or credibly poor performance on the rest of the CVLT-II was an essential first step in gaining acceptance as a validity indicator. The other requirement for validation was examining whether FCR can correctly identify individuals who fail other established PVTs. Moore and Donders (2004) conducted the first large scale concordance rate analysis against the TOMM in 132 clinically referred adults with TBI. Most of the sample was male (62.1%) and classified as mild TBI (54.5%). Mean age was 35.8 (SD = 14.2) and mean level of education was 12.3 (SD = 2.6). The majority (81.1%) obtained a perfect score on FCR; 6.8% scored 15 and 11.4% scored ≤14. The authors turned the base rate argument advanced by the test developers into an explicit diagnostic threshold, defining failure on FCR as ≤14. The reference PVT was TOMM Trial 2 ≤44 resulting in an 8.3% base rate of failure (BRFail). FCR ≤ 14 produced a respectable combination of sensitivity (.55) and specificity (.93) against this criterion. No alternative cutoff was considered. The authors expressed concerns that both the TOMM and FCR might be too transparent as PVTs and thus, highly specific, but not very sensitive to invalid responding. Bauer, Yantz, Ryan, Warden and McCaffrey (2005) examined the relationship between FCR and the WMT in a military sample of 120 patients with TBI. Mean age was 28.4 (SD = 9.2) and mean level of education was 13.4 (SD = 2.3). The BRFail, defined by the WMT at standard cutoffs, was 24.2%. Although the authors did not provide enough detail to compute classification accuracy, the mean FCR value in the “invalid” group (14.9) was significantly lower than in the “valid” group (15.7). 
Also, there was a positive linear relationship between WMT performance as a continuous variable (average of the IR, DR and CNS subtests) and FCR: those with a mean WMT score ≥91% produced a mean FCR of 15.8, while those with a mean WMT score of 61–70% produced a mean FCR of 14.2. The authors concluded that while FCR was effective at discriminating those who passed and those who failed the WMT, the mean difference was smaller (0.8) than what was observed on Yes/No recognition hits raw scores (2.0). They attributed this to the inherently lower cognitive demands of the FCR paradigm compared to the Yes/No recognition trial, which has a 3:1 distractor-to-target ratio. They also emphasized that FCR has high specificity, but low sensitivity to invalid performance.
Root, Robbins, Chang and van Gorp (2006) investigated the relationship among FCR scores, memory impairment and performance validity across three groups: a mixed clinical sample (n = 25), a forensic sample with adequate effort (n = 27) and a forensic sample with inadequate effort (n = 25). Performance validity was operationalized as passing or failing the TOMM and/or the VIP among forensic patients, resulting in an overall BRFail of 48%. Performance validity was not formally assessed in the clinically referred patients; instead, they were assumed to have valid neuropsychological profiles based on the lack of apparent secondary gain. Given the emerging evidence that even university students with no incentive to appear impaired fail PVTs (An, Zakzanis, & Joordens, 2012; An, Kaploun, Erdodi, & Abeare, 2017; Ross et al., 2016; Santos, Kazakov, Reamer, Park, & Osmon, 2014), this logic (“lack of apparent motivation to perform poorly = evidence of valid performance”) seems flawed by current methodological standards that emphasize the importance of objective evidence from multiple independent sources in establishing the credibility of a given neurocognitive profile (Boone, 2009, 2013; Larrabee, 2012). Nevertheless, Root et al. (2006) found that FCR scores were unrelated to delayed free recall performance. An FCR cutoff of ≤15 produced .60 sensitivity at .81 specificity. Lowering the cutoff to ≤14 resulted in improved specificity (.93), but decreased sensitivity (.44). The authors endorsed FCR as a “brief screen of effort” given its quick and easy administration and strong positive predictive power. At the same time, they cautioned against using a passing score on FCR as evidence for credible performance. They also acknowledged the modality specificity as a potential confound in establishing the optimal cutoff on FCR: the TOMM is a visually based PVT, while the VIP is non-memory based. As such, they may not be ideal reference PVTs to cross-validate FCR.
Once FCR’s ability to separate valid and invalid response sets had been established, later studies used it as an EVI. Some of these reports provide indirect evidence that further consolidates the evidence base supporting its diagnostic utility. For example, the investigation by Donders and Strong (2011) based on 100 clinically referred adults with TBI found that the majority (85%) of the patients obtained a perfect score on FCR, 6% scored 15 and 9% scored ≤14. Although concordance rates were not provided, 24% of the sample failed the WMT. The authors noted that FCR and WMT were unrelated to injury severity, while other CVLT-II measures (Trials 1–5, LD-FR, d’, Total Recall Discriminability) covaried with duration of coma.
Another method for assessing FCR’s ability to differentiate invalid responding from credible impairment is to examine its distribution in clinical populations with severe, credible neurological deficits. Extremely low intellectual functioning (FSIQ < 70) and dementia are two conditions that have been identified as being at risk for high false positive rates on PVTs (Boone, 2009, 2013).
Marshall and Happe (2007) examined the BRFail in several PVTs in a sample of 100 adults with intellectual disability (52% male; mean age = 36.6; mean FSIQ = 63). Mean FCR score was 15.1 (SD = 1.9). A frequency distribution for a subset of 71 participants for which FCR data were available revealed that the majority (66.2%) obtained a perfect score, 18.3% of the sample scored 15 and 15.5% scored ≤14.
Clark and colleagues (2012) demonstrated that FCR performance is a useful clinical marker of anterograde amnesia in later stages of Alzheimer’s disease (AD). As such, in conjunction with other CVLT-II measures, it can aid the subtyping of late life memory disorders and track disease progression in individuals diagnosed with a neurodegenerative disorder. Mean FCR was 13.9 in the Alzheimer’s sample (n = 37), 15.8 in the amnestic MCI sample (n = 18), 15.7 in the non-amnestic MCI sample (n = 19) and a near-perfect score in the control sample (n = 35).
Research on FCR appears to converge in a few areas. First, the exceptionally good signal detection profile of the ≤13 cutoff in the original experimental version has not been replicated. Second, the ≤14 cutoff performed well against established PVTs, with classification accuracy hovering around the “Larrabee limit”: .50 sensitivity at .90 specificity (Lichtenstein, Erdodi, & Linnea, 2017). Third, no alternative cutoff has been systematically evaluated, despite accumulating evidence that the vast majority of credible individuals produce perfect scores on FCR, making ≤15 a logical candidate for a more liberal cutoff. A recent systematic review of the literature on the FCR’s classification accuracy found that the ≤14 cutoff tended to sacrifice sensitivity for specificity, and identified investigating the more liberal alternative cutoff (≤15) as a direction for future research (Schwartz et al., 2016). The implication of these findings is that a score of 15 on FCR is more likely to be a Fail than a Pass.
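The trade-off between the ≤14 and ≤15 cutoffs discussed above can be made concrete with a minimal sketch of how sensitivity and specificity are computed against a reference PVT. The function name `classification_accuracy` and the ten score/criterion pairs are hypothetical illustrations, not data from any of the studies cited; the sketch only assumes that a "Fail" on FCR means scoring at or below the cutoff.

```python
from typing import Sequence, Tuple


def classification_accuracy(
    scores: Sequence[int],
    failed_criterion: Sequence[bool],
    cutoff: int,
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of 'score <= cutoff' against a criterion PVT."""
    true_pos = false_neg = true_neg = false_pos = 0
    for score, failed in zip(scores, failed_criterion):
        flagged = score <= cutoff  # a Fail on the test being evaluated
        if failed and flagged:
            true_pos += 1
        elif failed:
            false_neg += 1
        elif flagged:
            false_pos += 1
        else:
            true_neg += 1
    if true_pos + false_neg == 0 or true_neg + false_pos == 0:
        raise ValueError("Both criterion groups (Pass and Fail) must be non-empty.")
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical FCR scores (0-16) and Pass/Fail status on a reference PVT.
fcr_scores = [16, 16, 15, 14, 16, 12, 15, 16, 16, 13]
failed_reference_pvt = [False, False, True, True, False, True, False, False, False, True]

for cutoff in (14, 15):
    sens, spec = classification_accuracy(fcr_scores, failed_reference_pvt, cutoff)
    print(f"FCR <= {cutoff}: sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

In this made-up sample, moving the cutoff from ≤14 to ≤15 raises sensitivity at the cost of one false positive, which is the general pattern the literature reviewed above describes.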
Even if it might not constitute strong enough evidence to render the whole profile invalid, FCR = 15 should be considered at least a red flag in the evaluation of performance validity (D. Delis, personal communication, 10 May 2012). In fact, some researchers started treating an FCR score of 15 as the first level of invalid performance (Erdodi, Kirsch, Lajiness-O’Neill, Vingilis & Medoff, 2014; Erdodi et al., 2016). Similarly, the authors of the newly introduced FCR trial to the child version of CVLT suggested that even one incorrect response is indicative of suboptimal effort (Lichtenstein et al., 2017). The present study was designed to investigate two psychometric issues involving FCR. First, we proposed to compare the de facto FCR cutoff of ≤14 to its more liberal alternative (≤15) in a sample of clinically referred adults with TBI. We hypothesized that raising the cutoff to ≤15 would improve the sensitivity of FCR, while maintaining acceptable specificity, as reported in the child version of CVLT. Second, based on earlier reports that active psychiatric conditions increase the BRFail on PVTs (Moore & Donders, 2004), we also hypothesized that performance on FCR would be related to self-ratings of emotional distress. Although previous research suggests that failing one type of validity indicator is predictive of failing other types of validity measures (Constantinou, Bauer, Ashendorf, Fisher, & McCaffrey, 2005), most clinicians seem to agree that the credibility of symptom report and performance on cognitive tests are conceptually distinct and hence, should be assessed separately (van Dyke, Millis, Axelrod, & Hanks, 2013). Overall, the link between performance validity and psychiatric functioning remains controversial. 
Some investigators found that PVT failure was unrelated to depression (Considine et al., 2011; Egeland et al., 2005; Rees, Tombaugh, & Boulay, 2001), while others reported an increase in the BRFail in patients with psychiatric disorders (Erdodi et al., 2016). Recent research suggests that while PVT failure is unrelated to self-reported depression and anxiety, it has a strong relationship with somatic symptoms (Erdodi, Sagar et al., 2017).
Method
Participants
The sample consisted of 104 outpatients medically referred for neuropsychological assessment after TBI. The majority were males (55.8%) and right-handed (90.4%). Mean age was 38.8 years (SD = 16.7) and mean level of education was 13.7 years (SD = 2.6). Mean WAIS-IV FSIQ was 92.6 (SD = 15.9). Using data on injury severity indices gathered through clinical interview and the review of medical records (presence/length of loss of consciousness, neuroradiological findings, peri-traumatic amnesia, GCS score), 75.0% were classified as mild (mTBI). The rest were classified as moderate-to-severe. All patients were in the post-acute stage of recovery (>3 months post mTBI and >1 year post moderate-to-severe TBI). As the assessments were conducted for clinical purposes, no data were available on litigation status.
Materials
A fixed battery of commonly used, standardized measures of general intelligence, learning, memory, attention, executive functions, language, visuoperceptual and motor skills was administered to all patients (Table 1). Emotional functioning was assessed with self-report inventories. Performance validity was evaluated using a combination of stand-alone and embedded PVTs. As a free-standing PVT based on multiple trials separated by time-delay, the WMT represented the traditional approach to performance validity research.
Table 1. List of Tests Administered: Abbreviations, Scales and Norms
Test Name | Abbreviation | Norms
Beck Depression Inventory, 2nd edition | BDI-II | —
Booklet Category Test | BCT | Heaton
California Verbal Learning Test, 2nd edition | CVLT-II | Manual
Conners’ Continuous Performance Test, 2nd edition [a] | CPT-II | Manual
Letter and Category Fluency Test | FAS & Animals | Heaton
Finger Tapping Test | FTT | Heaton
Green’s Word Memory Test [a] | WMT | Manual
Peabody Picture Vocabulary Test, 4th edition | PPVT-4 | Manual
Symptom Checklist-90-Revised [a] | SCL-90-R | Manual
Tactual Performance Test | TPT | Heaton
Trail Making Test | TMT | Heaton
Wechsler Adult Intelligence Scale, 4th edition | WAIS-IV | Manual
Wechsler Memory Scale, 4th edition | WMS-IV | Manual
Wide Range Achievement Test, 4th edition | WRAT-4 | Manual
Wisconsin Card Sorting Test | WCST | Manual
Word Choice Test | WCT | Manual
Note: Heaton: Demographically adjusted norms published by Heaton, Miller, Taylor, & Grant (2004); Manual: Normative data published in the technical manual. [a] Administered and scored on a computer.
To address concerns about instrumentation artifacts as a threat to the internal validity of signal detection analyses (Root et al., 2006), we developed two composites using five independent validity measures to complement the WMT. The first one consists of PVTs based on recognition memory, labeled “Erdodi Index Five—Recognition” (EI-5REC). The second consists of EVIs that are not based on recognition memory, labeled “Erdodi Index Five—Non-Recognition” (EI-5NR). Each component of the EI-5s was recoded into a four-point scale: the performance reflecting an incontrovertible Pass was assigned a value of zero, while the most extreme level of failure was assigned the value of three, with intermediate levels of failure coded as one and two, following the methodology described by Erdodi (2017). Table 2 provides the details of the re-scaling process and references to the cutoffs used. Table 2. The Components of the EI-5s and Base Rates of Failure at Given Cutoffs EI-5REC Components EI-5 Values EI-5NR Components EI-5 Values 0 1 2 3 0 1 2 3 WMT 0 1 2 3 RDS ≥ 8 7 6 ≤ 5 Base Rate 60.6 4.8 11.5 23.1 Base Rate 76.0 13.5 4.8 5.8 WCT >47 44–47 40–43 ≤39 WCST FMS ≤1 2 3 ≥4 Base Rate 74.0 10.6 8.7 6.7 Base Rate 87.5 7.7 1.9 2.9 VR Recognition >4 4 3 ≤2 FTT None One Both — Base Rate 68.3 16.3 8.7 6.7 Base Rate 85.6 8.7 5.8 — LM Recognition >20 20–21 18–19 ≤17 Animals >13 12–13 10–11 ≤9 Base Rate 80.8 13.5 1.0 4.8 Base Rate 86.5 6.7 2.9 3.8 VPA Recognition >35 32–35 28–31 ≤27 CPT-II OMI ≤65 66–80 81–100 >100 Base Rate 71.2 16.3 6.7 5.8 Base Rate 74.0 4.8 6.7 14.4 EI-5REC Components EI-5 Values EI-5NR Components EI-5 Values 0 1 2 3 0 1 2 3 WMT 0 1 2 3 RDS ≥ 8 7 6 ≤ 5 Base Rate 60.6 4.8 11.5 23.1 Base Rate 76.0 13.5 4.8 5.8 WCT >47 44–47 40–43 ≤39 WCST FMS ≤1 2 3 ≥4 Base Rate 74.0 10.6 8.7 6.7 Base Rate 87.5 7.7 1.9 2.9 VR Recognition >4 4 3 ≤2 FTT None One Both — Base Rate 68.3 16.3 8.7 6.7 Base Rate 85.6 8.7 5.8 — LM Recognition >20 20–21 18–19 ≤17 Animals >13 12–13 10–11 ≤9 Base Rate 
80.8 13.5 1.0 4.8 Base Rate 86.5 6.7 2.9 3.8 VPA Recognition >35 32–35 28–31 ≤27 CPT-II OMI ≤65 66–80 81–100 >100 Base Rate 71.2 16.3 6.7 5.8 Base Rate 74.0 4.8 6.7 14.4 Note: WMT (IR, DR & CNS): Word Memory Test—Number of failures on Immediate Recall, Delayed Recall & Consistency trials at standard cutoffs; WCT: Word Choice Test (Pearson, 2009); VR: WMS-IV Visual Reproduction (Pearson, 2009); LM: WMS-IV Logical Memory (Bortnik et al., 2010; Pearson, 2009); VPA: WMS-IV Verbal Paired Associates (Pearson, 2009); RDS: Reliable Digit Span (Greiffenstein et al., 1994; Pearson, 2009); WCST FMS: Wisconsin Card Sorting Test Failures to Maintain Set (Greve, Bianchini, Mathias, Houston & Crouch, 2002; Larrabee, 2003; Suhr & Boyer, 1999); FTT: Finger Tapping Test, number of cutoffs failed of dominant hand raw score ≤28/35 and combined raw scores ≤58/66 (Arnold et al., 2005); Animals: Animal fluency raw score (Sugarman & Axelrod, 2015); CPT-II OMI: Conners’ Continuous Performance Test, 2nd edition Omissions T-scores (Erdodi et al., 2014; Lange et al., 2013; Ord, Boettcher, Greve, & Bianchini, 2010). The italic values represent the percent of the sample that scored within a given range of cutoffs. Table 2. 
The Components of the EI-5s and Base Rates of Failure at Given Cutoffs

| EI-5REC Components | EI-5 = 0 | EI-5 = 1 | EI-5 = 2 | EI-5 = 3 | EI-5NR Components | EI-5 = 0 | EI-5 = 1 | EI-5 = 2 | EI-5 = 3 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| WMT | 0 | 1 | 2 | 3 | RDS | ≥8 | 7 | 6 | ≤5 |
| Base Rate | 60.6 | 4.8 | 11.5 | 23.1 | Base Rate | 76.0 | 13.5 | 4.8 | 5.8 |
| WCT | >47 | 44–47 | 40–43 | ≤39 | WCST FMS | ≤1 | 2 | 3 | ≥4 |
| Base Rate | 74.0 | 10.6 | 8.7 | 6.7 | Base Rate | 87.5 | 7.7 | 1.9 | 2.9 |
| VR Recognition | >4 | 4 | 3 | ≤2 | FTT | None | One | Both | — |
| Base Rate | 68.3 | 16.3 | 8.7 | 6.7 | Base Rate | 85.6 | 8.7 | 5.8 | — |
| LM Recognition | >20 | 20–21 | 18–19 | ≤17 | Animals | >13 | 12–13 | 10–11 | ≤9 |
| Base Rate | 80.8 | 13.5 | 1.0 | 4.8 | Base Rate | 86.5 | 6.7 | 2.9 | 3.8 |
| VPA Recognition | >35 | 32–35 | 28–31 | ≤27 | CPT-II OMI | ≤65 | 66–80 | 81–100 | >100 |
| Base Rate | 71.2 | 16.3 | 6.7 | 5.8 | Base Rate | 74.0 | 4.8 | 6.7 | 14.4 |

Note: WMT (IR, DR & CNS): Word Memory Test—Number of failures on Immediate Recall, Delayed Recall & Consistency trials at standard cutoffs; WCT: Word Choice Test (Pearson, 2009); VR: WMS-IV Visual Reproduction (Pearson, 2009); LM: WMS-IV Logical Memory (Bortnik et al., 2010; Pearson, 2009); VPA: WMS-IV Verbal Paired Associates (Pearson, 2009); RDS: Reliable Digit Span (Greiffenstein et al., 1994; Pearson, 2009); WCST FMS: Wisconsin Card Sorting Test Failures to Maintain Set (Greve, Bianchini, Mathias, Houston & Crouch, 2002; Larrabee, 2003; Suhr & Boyer, 1999); FTT: Finger Tapping Test, number of cutoffs failed of dominant hand raw score ≤28/35 and combined raw scores ≤58/66 (Arnold et al., 2005); Animals: Animal fluency raw score (Sugarman & Axelrod, 2015); CPT-II OMI: Conners’ Continuous Performance Test, 2nd edition Omissions T-scores (Erdodi et al., 2014; Lange et al., 2013; Ord, Boettcher, Greve, & Bianchini, 2010). The Base Rate rows give the percent of the sample that scored within a given range of cutoffs.

In addition to aggregating multiple independent validity indicators and thus, increasing the overall diagnostic power of the measurement model (Larrabee, 2003), an essential feature of the EI-5s is that they recapture the underlying continuity in performance validity, distinguishing between near-passes and clear failures. An EI-5 score provides a summary of both the “number” and “extent” of PVT failures. Since the practical demands of validity assessment require a dichotomous outcome, the first two levels were considered a Pass, and values of ≥4 were considered a Fail. EI-5 values 2–3 were considered borderline (Table 3), and excluded from further analyses involving the EI-5 to ensure the purity of the criterion groups (Pass/Fail), a methodological standard in calibrating new PVTs (Erdodi, Tyson, Abeare et al., 2017; Greve & Bianchini, 2004; Sugarman & Axelrod, 2015).

Table 3. Frequency, Cumulative Frequency and Classification Range for the First Eight Levels of the EI-5s

| EI-5 | EI-5REC f | EI-5REC Cumulative % | EI-5NR f | EI-5NR Cumulative % | Classification |
| --- | --- | --- | --- | --- | --- |
| 0 | 42 | 40.4 | 49 | 47.1 | Pass |
| 1 | 12 | 51.9 | 15 | 61.5 | Pass |
| 2 | 12 | 63.5 | 10 | 71.2 | Borderline |
| 3 | 6 | 69.2 | 16 | 86.5 | Borderline |
| 4 | 2 | 71.2 | 4 | 90.4 | Fail |
| 5 | 7 | 77.9 | 2 | 92.3 | Fail |
| 6 | 6 | 83.7 | 2 | 94.2 | Fail |
| 7 | 7 | 90.4 | 1 | 95.2 | Fail |
| 8 | 1 | 91.3 | 3 | 98.1 | Fail |

Note: EI-5REC: Erdodi Index Five—Recognition memory based; EI-5NR: Erdodi Index Five—Non-recognition memory based.

To complement the WMT and the EI-5s, several other validity measures were used as reference PVTs to provide a more representative sample of sensory modalities, testing paradigms and sensitivity to invalid responding. Including a variety of independent PVTs is essential to keep multicollinearity at a minimum and thus, maximize the predictive power of the multivariate model of performance validity assessment (Boone, 2013; Larrabee, 2012). The Word Choice Test (WCT) is a single-trial free-standing PVT based on the FCR paradigm (Pearson, 2009). Number of hits on the Yes/No recognition trial of the CVLT-II (RHCVLT-II) was selected because it is nested within the same test as FCR and there are previous comparisons between the two tasks. The logistic regression equation developed by Wolfe and colleagues (2010; LREWolfe) was the alternative CVLT-II-based reference PVT. Given reports of high false positive rates associated with the original cutoff (≥.50), the more conservative alternative (≥.625) was used in cross-validation analyses (Donders & Strong, 2011).
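The EI-5 scoring logic described above (each component recoded to a 0–3 level, then classified as Pass, Borderline or Fail) can be sketched roughly as follows. This is an illustration only: the function names are hypothetical, only the RDS bins from Table 2 are coded, and treating the EI-5 score as the simple sum of the five component levels is an assumption rather than a formula stated in the text.

```python
def recode_rds(rds: int) -> int:
    """Recode a Reliable Digit Span score to an EI-5 level (bins from Table 2)."""
    if rds >= 8:
        return 0
    if rds == 7:
        return 1
    if rds == 6:
        return 2
    return 3  # RDS <= 5


def classify_ei5(component_levels):
    """Sum five component levels (assumed) and apply the Table 3 ranges."""
    total = sum(component_levels)
    if total <= 1:
        label = "Pass"
    elif total <= 3:
        label = "Borderline"  # excluded from the Pass/Fail criterion groups
    else:
        label = "Fail"
    return total, label


# Hypothetical examinee: one component at level 3, one at level 1, rest at 0.
print(classify_ei5([3, 0, 1, 0, 0]))
```

Under these assumptions, a single near-pass (one component at level 1) still yields a Pass, whereas one clear failure plus one near-pass is enough to reach the Fail range.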
The WAIS-IV Digit Span age-corrected scaled score (DSACSS) is a measure of auditory attention and working memory that has been shown to be effective at detecting invalid responding (Axelrod, Fichtenberg, Millis & Wertheimer, 2006; Reese, Suhr, & Riffle, 2012; Spencer et al., 2013).

Self-reported emotional functioning was assessed using the Beck Depression Inventory—Second Edition (BDI-II) and the Symptom Checklist-90-Revised (SCL-90-R). The BDI-II was designed to measure depressive symptoms consistent with the DSM-IV (APA, 1996) diagnostic criteria (Beck, Steer, & Brown, 1996). Its brevity (21 items rated on a 4-point ordinal scale [0–3]) combined with excellent psychometric properties and discriminant validity in both healthy controls and psychiatric patients makes the BDI-II a popular screening tool for depression (Sprinkle et al., 2002; Storch, Roberti, & Roth, 2004). The SCL-90-R is a widely used screening tool for a broad range of psychiatric symptoms in clinical populations with varied etiologies (Derogatis, 1994) and in patients with TBI specifically (Hoofien, Barak, Vakil, & Gilboa, 2005). As the name indicates, it contains 90 items self-rated on a 5-point ordinal scale [0–4] that converge into nine scales: somatization (SOM), obsessive-compulsive symptoms (O-C), interpersonal sensitivity (I-S), depression (DEP), anxiety (ANX), hostility (HOS), phobic anxiety (PHO), paranoid ideation (PAR) and psychotic symptoms (PSY). In addition, a Global Severity Index (GSI) is computed to reflect the mean of all items. The GSI has been found to be the most sensitive of the SCL-90-R indicators to disruptions in emotional and social functioning post TBI (Baker, Schmidt, Heinemann, Langley & Miranti, 1998; Marschark, Richtsmeier, Richardson, Crovitz, & Henry, 2000; Westcott & Alfano, 2005).
Clinical elevations (T ≥ 63; Derogatis, 1994) were also commonly observed on the O-C, I-S, DEP and PHO scales (Baker et al., 1998; Marschark et al., 2000; Palav, Ortega, & McCaffrey, 2001; Westcott & Alfano, 2005).

Procedure

Participants were assessed in two half-day appointments through the neurorehabilitation service of a Midwestern academic medical center. Psychometric testing was completed in an outpatient setting by trained psychometricians. A staff neuropsychologist conducted the clinical interview and review of medical records, wrote the integrative report and provided feedback to patients. Data were collected through an archival retrospective chart review of a consecutive series of TBI referrals. The study was approved by the Institutional Review Board. Ethical guidelines regulating research with human participants were followed throughout the project.

Data Analysis

Descriptive statistics (frequency, percentage and cumulative percentage; mean, standard deviation) were computed for the key variables. Significance testing was performed using the F- and t-tests as well as χ2. ANOVAs were followed up with uncorrected t-tests. Since all post hoc contrasts were a priori planned comparisons, no statistical correction was applied (Rothman, 1990; Perneger, 1998). In addition, the tension between statistical versus clinical significance was resolved by consistently reporting effect size estimates associated with each relevant contrast: partial eta squared (η2p), Cohen’s d and Φ2. Receiver operating characteristics (ROC) analyses [area under the curve (AUC) with 95% CI] were performed using SPSS 22.0. The rest of the classification accuracy parameters [sensitivity, specificity, positive and negative likelihood ratio (+LR and −LR)] were computed using standard formulas.

Results

Mean scores on tests of cognitive ability ranged from Low Average to Average (Table 4). The mean FCR score in the sample was 15.4 (SD = 1.4; range: 9–16). The median value was 16.
The distribution was negatively skewed (−2.48) and had a strong positive kurtosis (+6.47). The majority of the sample (75.0%) obtained a perfect score on FCR; 6.7% scored 15 and 18.3% scored ≤14.

Table 4. Group-Level Performance on the Tests Administered

| Test Name | Measure | M | SD | Descriptive Range |
| --- | --- | --- | --- | --- |
| Animals | T-score | 42.6 | 12.0 | Low Average |
| BDI-II | Total Raw Score | 15.3 | 11.5 | Mild Depression |
| BCT | Total Errors T-score | 41.7 | 13.8 | Low Average |
| CVLT-II | Trials 1–5 T-score | 45.1 | 14.5 | Average |
| | LD-FR z-score | −1.03 | 1.54 | Low Average |
| CPT-II | Omissions T-score | 73.4 | 61.5 | Elevated |
| | Commissions T-score | 52.2 | 11.2 | Within Normal Limits |
| | Hit RT T-score | 53.7 | 14.6 | Within Normal Limits |
| FTT | Dominant Hand T-score | 45.4 | 12.5 | Average |
| WMT | % Fail | 38.5 | N/A | |
| PPVT-4 | Standard Score | 98.1 | 13.8 | Average |
| SCL-90-R | GSI T-score | 62.5 | 12.7 | Within Normal Limits |
| TPT | Total Time T-score | 45.0 | 13.7 | Average |
| TMT | Trails A T-score | 43.0 | 13.5 | Low Average |
| | Trails B T-score | 43.3 | 13.9 | Low Average |
| WAIS-IV | VCI Standard Score | 95.1 | 15.3 | Average |
| | PRI Standard Score | 96.2 | 16.4 | Average |
| | WMI Standard Score | 92.7 | 15.6 | Average |
| | PSI Standard Score | 89.4 | 16.4 | Low Average |
| WMS-IV | LM I Age-Corrected Scaled Score | 8.1 | 3.5 | Low Average |
| | LM II Age-Corrected Scaled Score | 7.6 | 3.4 | Low Average |
| | VPA I Age-Corrected Scaled Score | 8.5 | 3.5 | Low Average |
| | VPA II Age-Corrected Scaled Score | 8.6 | 3.7 | Average |
| | VR I Age-Corrected Scaled Score | 8.6 | 3.7 | Average |
| | VR II Age-Corrected Scaled Score | 7.9 | 3.1 | Low Average |
| WRAT-4 | Word Reading Standard Score | 93.9 | 12.5 | Average |
| WCST | Perseverative Errors T-score | 46.9 | 11.1 | Average |
| WCT | Total Accuracy Raw Score | 47.6 | 3.7 | Pass |

Note: LD-FR: Long-delay free recall; RT: Reaction Time; GSI: Global Severity Index; VCI: Verbal Comprehension Index; PRI: Perceptual Reasoning Index; WMI: Working Memory Index; PSI: Processing Speed Index; LM: Logical Memory; I: Immediate Recall; II: Delayed Recall; VPA: Verbal Paired Associates; VR: Visual Reproduction.

The Effect of Age, Education, Cognitive Functioning and Injury Severity on FCR

As the study focused on comparing the discriminant power of two cutoffs (FCR ≤14 and ≤15) against a perfect score, the sample was divided into three groups: FCR = 16, FCR = 15 and FCR ≤14. This trichotomy was used as the independent variable (IV) for a series of ANOVAs with age, education and cognitive functioning as dependent variables (DVs). There was no difference in age among groups. However, there was a significant overall effect on level of education, driven by the higher mean of the FCR = 16 subsample. A medium effect was observed on word knowledge, picture vocabulary and single word reading performance. ANOVAs were not significant for BCT (Total Errors), TPT (Total Time) and TMT-B T-scores (Table 5).

Table 5. Age, Education, Injury Severity and Performance on Select Neuropsychological Tests as a Function of Trichotomized FCR Scores

| FCR | Statistic | Age | ED | VC | PPVT | WRAT Reading | BCT Total | TPT Total | TMT B | % mTBI | +NR |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 16 | M | 37.5 | 14.0 | 9.8 | 100.1 | 95.7 | 42.6 | 46.1 | 44.8 | 71.8 | 67.7 |
| | SD | 15.6 | 2.8 | 3.0 | 14.1 | 12.0 | 13.9 | 14.1 | 13.3 | | |
| 15 | M | 47.0 | 12.0 | 9.0 | 96.4 | 91.4 | 43.2 | 44.7 | 41.1 | 71.4 | 66.7 |
| | SD | 12.1 | 1.3 | 3.1 | 14.1 | 13.8 | 13.2 | 8.2 | 7.9 | | |
| ≤14 | M | 41.4 | 12.8 | 7.9 | 90.1 | 87.5 | 38.1 | 39.2 | 37.7 | 89.5 | 71.4 |
| | SD | 10.2 | 1.9 | 2.2 | 9.4 | 12.4 | 13.6 | 12.2 | 17.0 | | |
| | p | .18 | .05 | <.05 | <.05 | <.05 | .43 | .28 | .12 | .27 | .96 |
| | η2 | .03 | .06 | .06 | .08 | .07 | .02 | .03 | .04 | | |
| | Φ2 | | | | | | | | | .03 | .00 |
| | Sig. post hocs | None | 0–1 | 0–2 | 0–2 | 0–2 | None | None | 0–2 | — | — |
| | d | — | .92 | .72 | .83 | .67 | — | — | .47 | — | — |

Note. FCR: CVLT-II Forced Choice Recognition trial raw score; ED: years of formal education; VC: WAIS-IV Vocabulary age-corrected scaled score; PPVT: Peabody Picture Vocabulary Test, 4th edition; WRAT-4 Reading: Wide Range Achievement Test, 4th edition, Reading subtest standard score; BCT: Booklet Category Test Total Errors T-score; TPT: Tactual Performance Test Total Time T-score; TMT-B: Trail Making Test B T-score; % mTBI: % patients with mild traumatic brain injury (vs. those with moderate-to-severe TBI); +NR: positive neuroradiological findings; Sig. post hocs: pairwise comparisons with p < .05; 0: FCR = 16 (n = 78); 1: FCR = 15 (n = 7); 2: FCR ≤14 (n = 19).
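The Cohen's d values in Table 5 can be approximated from the reported group means and standard deviations. The sketch below assumes the pooled standard deviation is the root mean square of the two group SDs (unweighted by group size); this is an inference from the tabled values, not a formula stated by the authors, and the function name is hypothetical.

```python
import math


def cohens_d(m1: float, sd1: float, m2: float, sd2: float) -> float:
    """Cohen's d using the root-mean-square of the two SDs as the pooled SD.

    Assumption: group sizes are ignored when pooling the SDs.
    """
    pooled_sd = math.sqrt((sd1 ** 2 + sd2 ** 2) / 2)
    return abs(m1 - m2) / pooled_sd


# Education, FCR = 16 (M = 14.0, SD = 2.8) vs. FCR = 15 (M = 12.0, SD = 1.3)
print(round(cohens_d(14.0, 2.8, 12.0, 1.3), 2))  # 0.92, matching Table 5

# Vocabulary, FCR = 16 (M = 9.8, SD = 3.0) vs. FCR <= 14 (M = 7.9, SD = 2.2)
print(round(cohens_d(9.8, 3.0, 7.9, 2.2), 2))  # 0.72, matching Table 5
```

A size-weighted pooled SD would give somewhat different values here because the three groups differ markedly in size (n = 78, 7 and 19).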
Likewise, the three groups did not differ in TBI severity (percentage of mTBI patients and those with positive neuroradiological findings). In addition, the mTBI subsample was almost three times more likely to fail the old FCR cutoff (≤14; BRFail = 21.8%) than the subsample with moderate-to-severe TBI (BRFail = 7.7%). Similarly, patients with mTBI were nearly twice as likely to fail the alternative FCR cutoff (≤15; BRFail = 28.2%) as the subsample with moderate-to-severe TBI (BRFail = 15.4%).

The Classification Accuracy of FCR Against Reference PVTs

All ROC models evaluating the level of agreement between FCR and reference PVTs were statistically significant (p < .01). AUC values ranged from .71 (DSACSS) to .83 (RHCVLT-II). The most stable AUC estimates were obtained against the WMT (95% CI: .65–.85), while the least stable estimates were observed on EI-5NR (95% CI: .61–.93). ROC analyses were followed up with direct comparisons between the classification accuracy of the old FCR cutoff (≤14) and the proposed alternative (≤15) against the reference PVTs. All cross-validation analyses met the minimum standard of specificity (.84; Larrabee, 2003), with values ranging from .85 to .98. Sensitivity was more variable, fluctuating between .40 and .72. The BRFail in reference PVTs ranged from 10.6% (TMT-A) to 38.5% (WMT).

FCR ≤14 produced a sensitivity of .40 against the WMT, at .95 specificity. The switch to ≤15 increased sensitivity to .47, while preserving the same specificity. Classification accuracy was comparable between the two cutoffs against WCT (.48–.50 sensitivity at .93 specificity). The new cutoff outperformed the old one against EI-5REC in sensitivity (.52 vs. .44) while both maintained very high specificity (.98). A similar pattern of increased sensitivity (from .50 to .58) and steady specificity (.91 and .90) was observed against EI-5NR as the analyses shifted from the old to the new cutoff.
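For readers who want to trace how sensitivity, specificity and the likelihood ratios relate to one another, a minimal sketch of the standard formulas is given below. The 2×2 cell counts are hypothetical and were chosen only for round numbers; they are not the study's data, and the function name is an illustrative assumption.

```python
def classification_accuracy(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Standard classification accuracy parameters from 2x2 cell counts.

    tp/fn: criterion-invalid cases that fail/pass the PVT under study.
    tn/fp: criterion-valid cases that pass/fail the PVT under study.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    false_positive_rate = fp / (tn + fp)
    # +LR is undefined when there are no false positives; report infinity.
    pos_lr = sensitivity / false_positive_rate if fp else float("inf")
    neg_lr = (1 - sensitivity) / specificity
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "+LR": pos_lr,
        "-LR": neg_lr,
    }


# Hypothetical counts: 40 invalid and 60 valid profiles on the criterion PVT.
result = classification_accuracy(tp=20, fn=20, tn=57, fp=3)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

Because +LR divides by the false positive rate, small changes in a high specificity produce large swings in +LR, which is why likelihood ratios recomputed from rounded sensitivity and specificity values may not match tabled values exactly.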
Sensitivity was highest against RHCVLT-II with both cutoffs (.65 for ≤14 and .72 for ≤15), in the context of good specificity (.93 and .92, respectively). Again, the new cutoff outperformed the old one against DSACSS in sensitivity (.53 vs. .45) while producing the same specificity (.89). Overall, the new cutoff increased sensitivity from .50 to .56 compared to the old one, while preserving the same specificity (.92). This pattern of consistently higher sensitivity and essentially unchanged specificity associated with the new cutoff was also observed at the level of LRs (Table 6). With the exception of WCT, FCR ≤15 produced higher +LRs than FCR ≤14 against the reference PVTs. The new cutoff had consistently lower −LRs against all reference PVTs than the old cutoff, suggesting superior classification accuracy.

Table 6. A Direct Comparison between the Classification Accuracy of the Two FCR Cutoffs against Reference PVTs

| FCR cutoff | BRFail | Statistic | WMT | WCT | EI-5REC | EI-5NR | RHCVLT-II | LREWolfe | DSACSS |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | | Cutoff | Standard | ≤47 | ≥4 | ≥4 | ≤10 | ≥.625 | ≤6 |
| | | BRFail | 38.5 | 27.0 | 37.2 | 17.9 | 19.2 | 18.0 | 21.2 |
| | | AUC | .75 | .72 | .78 | .77 | .83 | .74 | .71 |
| | | p | <.001 | <.01 | <.001 | <.001 | <.001 | <.01 | <.01 |
| | | 95% CI | .65–.85 | .59–.85 | .67–.90 | .61–.93 | .71–.95 | .60–.89 | .59–.85 |
| ≤15 | 25.0 | SENS | .47 | .50 | .52 | .58 | .72 | .59 | .53 |
| | | SPEC | .95 | .93 | .98 | .90 | .92 | .88 | .89 |
| | | +LR | 9.88 | 6.70 | 27.5 | 6.13 | 9.51 | 5.03 | 4.56 |
| | | −LR | 0.56 | 0.54 | 0.49 | 0.46 | 0.30 | 0.47 | 0.54 |
| ≤14 | 18.3 | SENS | .40 | .48 | .44 | .50 | .65 | .56 | .45 |
| | | SPEC | .95 | .93 | .98 | .91 | .93 | .89 | .89 |
| | | +LR | 8.53 | 7.03 | 23.6 | 5.33 | 9.10 | 5.06 | 4.14 |
| | | −LR | 0.63 | 0.57 | 0.57 | 0.55 | 0.38 | 0.50 | 0.61 |

Note: WMT: Word Memory Test (Green, 2003); WCT: Word Choice Test (Pearson, 2009); EI-5REC: Erdodi Index Five—Recognition based; EI-5NR: Erdodi Index Five—Non-recognition based; RHCVLT-II: CVLT-II Yes/No Recognition hits raw score (Wolfe et al., 2010); LREWolfe: Logistic regression equation based on a combination of three CVLT-II scores: long-delay free recall raw score, total recall discriminability z-score and d’ raw score (Donders & Strong, 2011; Wolfe et al., 2010); DSACSS: Digit Span age-corrected scaled score (Axelrod et al., 2006; Spencer et al., 2013); BRFail: Base rate of failure (%); AUC: Area under the curve; FCR: CVLT-II forced choice recognition; SENS: Sensitivity; SPEC: Specificity; +LR: Positive likelihood ratio; −LR: Negative likelihood ratio; Number of participants with FCR ≤ 15 is 26; Number of participants with FCR ≤ 14 is 19.

The Relationship Between FCR and Emotional Functioning

The majority of the sample (54.1%) scored in the non-clinical range on the SCL-90-R using a GSI T-score ≥63 as the cutoff. However, only 38.5% had fewer than two elevations (T ≥63) on the nine clinical scales, the other criterion for establishing the presence of clinically significant distress (Derogatis, 1994). The number of clinical elevations (M = 3.6, SD = 3.3) produced a bimodal distribution with two distinct clusters: patients with either zero (25.0%) or nine (14.6%) scores ≥63. ANOVAs using the trichotomized FCR (16, 15 and ≤14) as IV and the SCL-90-R scales as DVs produced significant main effects for all SCL-90-R scales except ANX and PHO. Effect sizes (η2p) ranged from .08 (medium) on HOS to .18 (large) on PSY.
All post hoc contrasts were significant between FCR = 16 and FCR = 15 subsamples except ANX and PHO. Effect sizes (d) ranged from .87 (large) on SOM to 1.67 (very large) on O-C. All post hoc contrasts were significant between FCR = 16 and FCR ≤ 14 subsamples except HOS. Effect sizes (d) ranged from .52 (medium) on PHO to .83 (large) on PSY. When SCL-90-R scores were dichotomized around the T ≥ 63 cutoff into “clinical” versus “non-clinical”, non-parametric contrasts produced essentially the same results (Table 7). One comparison (PAR) became non-significant. All effect size estimates (Φ2) were within .02 of η2p values produced by ANOVAs with the exception of the GSI.

Table 7. SCL-90-R Scores as a Function of FCR Performance

| FCR | Statistic | SOM | O-C | I-S | DEP | ANX | HOS | PHO | PAR | PSY | GSI | Σ 63 | BDI-II |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 16 | M | 58.3 | 63.6 | 54.3 | 60.2 | 55.5 | 54.3 | 55.5 | 53.0 | 56.4 | 59.8 | 2.9 | 12.5 |
| | SD | 12.4 | 11.8 | 13.2 | 11.5 | 11.9 | 11.3 | 11.9 | 12.8 | 11.7 | 12.2 | 3.2 | 10.5 |
| | %CLIN | 34.2 | 47.9 | 29.2 | 39.7 | 31.5 | 23.3 | 31.9 | 21.9 | 35.6 | 32.9 | 49.3 | 19.2 |
| 15 | M | 67.7 | 78.9 | 70.4 | 73.3 | 63.5 | 65.1 | 63.4 | 66.3 | 72.3 | 75.0 | 6.9 | 24.9 |
| | SD | 9.0 | 3.5 | 9.7 | 5.6 | 15.6 | 7.9 | 15.6 | 11.6 | 7.1 | 6.6 | 1.9 | 6.9 |
| | %CLIN | 71.4 | 100 | 85.7 | 100 | 57.1 | 57.1 | 57.1 | 57.1 | 100 | 100 | 100 | 57.1 |
| ≤14 | M | 66.4 | 71.6 | 63.6 | 67.9 | 62.1 | 58.8 | 62.1 | 61.2 | 65.9 | 68.9 | 5.4 | 22.9 |
| | SD | 12.5 | 12.3 | 12.7 | 10.2 | 12.7 | 10.5 | 13.4 | 10.6 | 11.1 | 11.7 | 3.1 | 11.8 |
| | %CLIN | 72.2 | 83.3 | 55.6 | 72.2 | 50.0 | 44.4 | 50.0 | 38.9 | 72.2 | 77.8 | 94.4 | 61.1 |
| ANOVA | p | <.05 | <.01 | <.01 | <.01 | NS | <.05 | NS | <.01 | <.01 | <.01 | <.01 | <.01 |
| | η2p | .09 | .15 | .14 | .13 | — | .08 | — | .11 | .18 | .15 | .16 | .17 |
| Sig. post hocs | 0–1 | 0–1 | 0–1 | 0–1 | 0–1 | NS | 0–1 | NS | 0–1 | 0–1 | 0–1 | 0–1 | 0–1 |
| | 0–2 | 0–2 | 0–2 | 0–2 | 0–2 | 0–2 | NS | 0–2 | 0–2 | 0–2 | 0–2 | 0–2 | 0–2 |
| | d for 0–1 | .87 | 1.67 | 1.39 | 1.45 | — | 1.11 | — | 1.09 | 1.64 | 1.55 | 1.55 | 1.40 |
| | d for 0–2 | .65 | .66 | .74 | .71 | .61 | — | .52 | .63 | .83 | .76 | .79 | .93 |
| χ2 | p | <.01 | <.01 | <.01 | <.01 | NS | .05 | NS | NS | <.01 | <.01 | <.01 | <.01 |
| | Φ2 | .11 | .13 | .12 | .14 | — | .06 | — | — | .17 | .21 | .18 | .15 |

Note. All SCL-90-R scales are in T-scores (M = 50, SD = 10); FCR: CVLT-II Forced Choice Recognition trial raw score; SCL-90-R: Symptom Checklist-90-Revised; SOM: somatic distress; O-C: obsessive-compulsive symptoms; I-S: interpersonal sensitivity; DEP: depression; ANX: anxiety; HOS: hostility; PHO: phobic anxiety; PAR: paranoia; PSY: psychotic symptoms; GSI: Global Severity Index; Σ 63: Sum of T-scores ≥63 on the SCL-90-R clinical scales; BDI-II: Beck Depression Inventory—Second Edition; %CLIN: Percent of the subsample scoring T ≥ 63 on the SCL-90-R clinical scales; percent of the subsample with two or more scores T ≥ 63 on Σ 63; and percent of the subsample with BDI-II raw score ≥20 (cutoff for Moderate Depression); Sig. post hocs: pairwise comparisons with p < .05; 0: FCR = 16 (n = 78); 1: FCR = 15 (n = 7); 2: FCR ≤ 14 (n = 19).

All three groups produced saw-tooth profiles, with distinct spikes on O-C, DEP and PSY (Fig. 1). The FCR = 16 subsample had only one mean ≥63 (on O-C), and on average had 2.9 elevations (SD = 3.2). The FCR = 15 subsample produced mean T ≥63 on all scales, and on average had 6.9 elevations (SD = 1.9). The FCR ≤14 subsample produced mean T ≥63 on SOM, O-C, DEP, PSY and the GSI, and on average had 5.4 elevations (SD = 3.1).

Fig. 1. SCL-90-R profiles associated with three levels of FCR performance; number of participants with perfect score on the FCR is 78; number of participants with FCR = 15 is 7; number of participants with FCR ≤ 14 is 19.
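The elevation counts discussed above can be illustrated with a short sketch that tallies how many of the nine SCL-90-R clinical scales reach T ≥ 63 for a single profile and then applies the two distress criteria mentioned earlier (GSI T ≥ 63, two or more clinical elevations). The profile below is hypothetical, the function names are illustrative, and combining the two criteria with a logical OR is an assumption about how they are applied.

```python
CLINICAL_T = 63  # clinical elevation threshold (Derogatis, 1994)
SCALES = ("SOM", "O-C", "I-S", "DEP", "ANX", "HOS", "PHO", "PAR", "PSY")


def count_elevations(profile: dict) -> int:
    """Number of the nine clinical scales with T >= 63."""
    return sum(1 for scale in SCALES if profile[scale] >= CLINICAL_T)


def meets_distress_criteria(profile: dict, gsi: float) -> bool:
    """Assumed rule: GSI T >= 63 OR at least two clinical-scale elevations."""
    return gsi >= CLINICAL_T or count_elevations(profile) >= 2


# Hypothetical patient profile (T-scores), not drawn from the study data.
patient = {
    "SOM": 58, "O-C": 66, "I-S": 55, "DEP": 64, "ANX": 60,
    "HOS": 52, "PHO": 50, "PAR": 54, "PSY": 61,
}
print(count_elevations(patient))                  # 2 (O-C and DEP)
print(meets_distress_criteria(patient, gsi=61))   # True via the two-elevation rule
```

The example shows how a profile can fall below the GSI cutoff yet still count as clinically elevated on the second criterion, which mirrors the gap between the two percentages reported for the sample.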
ANOVAs were repeated on the BDI-II, producing a large effect (η2p = .17), driven by the non-clinical range score of the FCR = 16 group (M = 12.5, SD = 10.5). The FCR = 15 group (M = 24.9, SD = 6.9) did not differ from the FCR ≤14 group (M = 22.9, SD = 11.8). Both of these means were in the range of moderate clinical depression, and significantly higher than the FCR = 16 mean (d = 1.40 and .93, respectively; both large).

To investigate whether these findings would generalize to other PVTs, a series of independent t-tests was performed comparing patients who passed and those who failed the WMT on SCL-90-R and BDI-II scores. All contrasts were significant, with the Fail group reporting higher levels of symptoms. Effect size estimates ranged from .46 (medium) to 1.01 (large). The analyses were repeated using a series of ANOVAs with the EI-5REC as a trichotomous independent variable (Pass/Borderline/Fail) and the SCL-90-R and BDI-II scores as dependent variables. All ANOVAs were significant (η2p: .06–.12; medium-large effects) with the exception of the SOM scale (Table 8). The only post hoc contrast that consistently reached significance was between the Pass and Fail groups, with effect sizes ranging from .43 (medium) to .87 (large). Unlike with FCR, there was a linear relationship between level of PVT failure and self-reported emotional distress: the Pass group reported the least and the Fail group the most emotional distress, with the Borderline group in the middle (Fig. 2).

Table 8.
SCL-90-R and BDI-II Scores as a Function of Passing or Failing the WMT and the EI-5REC

| PVT | Group | Statistic | SOM | O-C | I-S | DEP | ANX | HOS | PHO | PAR | PSY | GSI | Σ 63 | BDI-II |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| WMT | Pass | M | 57.1 | 62.0 | 54.1 | 60.0 | 54.2 | 53.4 | 53.7 | 53.2 | 56.2 | 58.8 | 2.6 | 12.3 |
| WMT | Pass | SD | 12.5 | 11.8 | 13.1 | 11.2 | 13.5 | 11.2 | 10.9 | 12.6 | 11.9 | 12.2 | 3.0 | 10.4 |
| WMT | Fail | M | 65.3 | 73.0 | 62.4 | 66.7 | 62.2 | 60.0 | 63.1 | 59.1 | 64.2 | 68.7 | 5.3 | 20.3 |
| WMT | Fail | SD | 11.6 | 10.0 | 13.4 | 11.3 | 13.6 | 10.3 | 13.4 | 12.9 | 11.7 | 11.0 | 3.2 | 11.7 |
| WMT | | p | <.01 | <.01 | <.01 | <.01 | <.01 | <.01 | <.01 | <.05 | <.01 | <.01 | <.01 | <.01 |
| WMT | | d | .68 | 1.01 | .63 | .60 | .59 | .61 | .77 | .46 | .68 | .85 | .87 | .72 |
| EI-5REC | Pass (n = 51) | M | 58.8 | 62.3 | 53.8 | 59.9 | 53.4 | 52.2 | 54.1 | 52.8 | 56.1 | 58.7 | 2.6 | 12.3 |
| EI-5REC | Pass (n = 51) | SD | 12.7 | 11.4 | 13.3 | 11.5 | 13.9 | 11.2 | 11.1 | 13.0 | 12.1 | 12.7 | 3.0 | 10.7 |
| EI-5REC | Borderline (n = 18) | M | 61.1 | 67.6 | 56.9 | 63.8 | 60.4 | 58.1 | 58.6 | 56.7 | 59.0 | 63.7 | 4.4 | 17.9 |
| EI-5REC | Borderline (n = 18) | SD | 11.1 | 13.3 | 13.1 | 10.5 | 10.4 | 11.7 | 12.3 | 11.9 | 10.6 | 11.0 | 3.2 | 12.3 |
| EI-5REC | Fail (n = 29) | M | 64.8 | 72.1 | 63.3 | 66.3 | 62.0 | 59.3 | 62.0 | 59.3 | 64.9 | 68.6 | 5.0 | 19.0 |
| EI-5REC | Fail (n = 29) | SD | 12.7 | 11.0 | 13.3 | 11.8 | 14.5 | 10.3 | 14.2 | 13.0 | 12.4 | 11.6 | 3.4 | 11.2 |
| EI-5REC | | p | .06 | <.01 | <.01 | .06 | <.05 | <.05 | <.05 | .09 | <.01 | <.01 | <.01 | <.05 |
| EI-5REC | | η2p | .06 | .13 | .09 | .06 | .09 | .06 | .08 | .05 | .10 | .12 | .12 | .08 |
| EI-5REC | | dP-F | .43 | .87 | .72 | .55 | .61 | .66 | .62 | .50 | .72 | .81 | .75 | .61 |

Note: All SCL-90-R scales are in T-scores (M = 50, SD = 10); SCL-90-R: Symptom Checklist-90-Revised; SOM: somatic distress; O-C: obsessive-compulsive symptoms; I-S: interpersonal sensitivity; DEP: depression; ANX: anxiety; HOS: hostility; PHO: phobic anxiety; PAR: paranoia; PSY: psychotic symptoms; GSI: Global Severity Index; Σ 63: Sum of T-scores ≥63 on the SCL-90-R clinical scales; BDI-II: Beck Depression Inventory—Second Edition; WMT: Word Memory Test; EI-5REC: Erdodi Index Five—Recognition based; dP-F: Cohen’s d for the Pass vs. Fail post hoc contrast. Italic and bold values represent standard deviations and Cohen’s d, respectively.

Fig. 2.
SCL-90-R profiles associated with the three levels of EI-5REC performance; number of participants in the Pass range (0–1) is 51; number of participants in the Borderline range (2–3) is 18; number of participants in the Fail range (≥4) is 29.

Discussion

The present study was designed to compare the de facto FCR cutoff (≤14) to a more liberal alternative (≤15) in a sample of clinically referred patients with TBI. Both cutoffs performed around the “Larrabee limit” (.50 sensitivity at .90 specificity). The hypothesis that increasing the cutoff would improve sensitivity while maintaining specificity was supported by the data. On average, FCR ≤15 correctly classified an additional 6% of the invalid response sets, while maintaining a false positive rate of <10%. Likewise, the alternative cutoff produced comparable or better classification accuracy at the level of likelihood ratios. This pattern of findings was remarkably consistent across a wide range of reference PVTs, including auditory and visual, univariate and multivariate criteria, free-standing and embedded PVTs, indicators based on the FCR paradigm and those derived from tests of attention. The replicable superiority of the new cutoff against a variety of criterion measures addresses previous concerns about modality specificity (Erdodi, Tyson, Abeare et al., 2017; Erdodi, Tyson, Shahein et al., 2017), and provides empirical validation for earlier predictions that even a single error on FCR may indicate invalid responding (D. Delis, personal communication, 10 May 2012). Our results are also consistent with research on the child version of FCR (Lichtenstein et al., 2017). In addition, the consistently high specificity and +LR of the new cutoff against multiple reference PVTs suggest that the more liberal FCR cutoff does not inflate false positive rates.
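The classification statistics discussed above (sensitivity, specificity and likelihood ratios) all derive from a 2×2 table that crosses the outcome on a cutoff with the outcome on a criterion PVT. The following is a minimal sketch of that arithmetic. The counts are hypothetical, chosen only so that the result lands on the “Larrabee limit”; they are not data from this study, and the names `ConfusionCounts` and `classification_accuracy` are illustrative.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Cell counts of a 2x2 table: cutoff outcome vs. criterion PVT outcome."""
    true_pos: int   # failed the cutoff, invalid on the criterion
    false_neg: int  # passed the cutoff, invalid on the criterion
    true_neg: int   # passed the cutoff, valid on the criterion
    false_pos: int  # failed the cutoff, valid on the criterion


def classification_accuracy(c: ConfusionCounts) -> dict:
    """Return sensitivity, specificity and both likelihood ratios.

    The positive LR is undefined when there are no false positives, and the
    negative LR is undefined when specificity is zero; this sketch does not
    guard against those edge cases.
    """
    sensitivity = c.true_pos / (c.true_pos + c.false_neg)
    specificity = c.true_neg / (c.true_neg + c.false_pos)
    false_pos_rate = c.false_pos / (c.true_neg + c.false_pos)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "positive_lr": sensitivity / false_pos_rate,
        "negative_lr": (1 - sensitivity) / specificity,
    }


# Hypothetical counts (NOT from the study): 30 invalid and 70 valid profiles.
example = ConfusionCounts(true_pos=15, false_neg=15, true_neg=63, false_pos=7)
stats = classification_accuracy(example)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

With these illustrative counts, sensitivity is .50, specificity is .90 and the positive likelihood ratio is 5.0: a failing score is five times more likely in an invalid profile than in a valid one.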
Equally important, subsamples with FCR scores 16, 15 and ≤14 did not differ from each other in injury severity, neuroradiological findings, or on the measures known to be sensitive to TBI (Booklet Category Test, Tactual Performance Test and Trails B; Grant & Adams, 1996). These findings suggest that FCR is independent of objective measures of impairment, consistent with previous reports (Baldo et al., 2002; Donders & Strong, 2011). The fact that, paradoxically, a significant difference emerged on “hold” tests (Boone, 2013) known to be resistant to the deleterious effects of TBI (i.e., word knowledge, picture vocabulary and single word reading) provides further evidence that FCR is unrelated to cognitive impairment subsequent to TBI. In fact, internally inconsistent patterns of test scores have been identified as emergent markers of invalid responding (Boone, 2013; Larrabee, 2012; Slick, Sherman & Iverson, 1999). Furthermore, there was a “reverse injury severity effect” on FCR. In other words, patients with mTBI were two to three times more likely to fail FCR cutoffs than patients with moderate-to-severe TBI. Although counterintuitive, this phenomenon is well replicated in the research literature (Carone, 2008; Erdodi & Rai, 2017; Green, Iverson, & Allen, 1999; Green, Flaro, & Courtney, 2009; Sweet, Goldman, & Guidotti Breting, 2013). In the broader context of this well-established apparent paradox of elevated BRFail in mTBI, the current results should alleviate concerns about false positive errors on FCR due to genuine neurological impairment.

The second hypothesis, that performance on FCR would be related to self-reported emotional distress, was also supported. Patients who obtained a perfect score on FCR had the lowest level of depression on SCL-90-R and BDI-II, both as continuous scales and as percentage in the clinical range. Those who made any error on FCR reported more severe psychiatric symptoms globally, with large to very large effect sizes.
These findings are consistent with some of the existing literature that documents a link between psychiatric history and invalid performance on neurocognitive testing (Martens, Donders, & Millis, 2001; Moore & Donders, 2004), but contradict other reports that anxiety and depression are unrelated to PVT failure (Ashendorf, Constantinou & McCaffrey, 2004; Considine et al., 2011; Egeland et al., 2005; Rees et al., 2001).

The divergence between our study and some previous investigations on PVTs and psychological distress may be driven by two main factors. First, many of them operationalized performance validity using a single criterion measure, such as the TOMM or the Rey 15-item test at traditional cutoffs (Trial 2 <45 and free recall <9, respectively), which are known to have limited sensitivity to invalid responding (Green, 2007; Reznek, 2005). Therefore, those negative findings may reflect undetected invalid profiles. Second, those studies focused on psychiatric disorders, whereas our sample consisted of patients with TBI, some of whom also reported emotional problems. As such, our positive findings could be due to the additive effect of neuropsychological deficits subsequent to TBI, pre-existing or emerging deficits in emotional regulation, or other contextual factors uniquely related to TBI and post-TBI depression and anxiety.

While the evidence linking depression and memory deficits is mixed both within and between studies (Bearden et al., 2006; Ilsley, Moffoot, & Carroll, 1995; Keiski, Shore, & Hamilton, 2007; Kessels, Ruis, & Kappelle, 2007; Langenecker et al., 2005; Raskin, Mateer, & Tweeten, 1998), there is growing evidence that memory tests are impacted more by invalid responding than by psychiatric disorders (Boone, 2013; Coleman, Rapport, Millis, Ricker, & Farchione, 1998; Larrabee, 2012; Suhr, Tranel, Wefel, & Barrash, 1997; Trueblood, 1994).
In fact, Rohling, Green, Allen and Iverson (2002) argue that a meaningful investigation of the interaction between depression and cognitive functioning must exclude individuals who fail PVTs. Our findings are congruent with this line of research on co-existing TBI, self-reported emotional problems and PVT failures.

As FCR performance correlates with scores on both PVTs and psychiatric symptom inventories, the clinical interpretation of failing this validity indicator is a challenge. The group-level pattern of scores observed in this sample fits several criteria of “Cogniform Disorder” introduced by Delis and Wetter (2007): internally inconsistent neurocognitive profiles, combinations of test scores that are rare in patients with genuine neurological impairment, and objective evidence of poor effort. Although the observational data presented in this study do not allow for causal attributions, they raise some important questions. Does genuine emotional distress increase vulnerability to PVT failures? Do patients with non-credible presentation exaggerate both emotional distress and cognitive deficits? Are certain PVTs more sensitive than others to both forms of invalid responding? Although there is an emerging consensus that symptom and performance validity are distinct constructs and, therefore, should be evaluated separately (van Dyke et al., 2013), it is plausible that they share part of their etiology. If the link between FCR and psychiatric symptoms is replicated in future studies, failing FCR might become a marker of not only invalid performance, but perhaps also of “psychogenic interference”: a failure to demonstrate one’s true ability level on cognitive testing due to acute psychiatric symptoms.

It is interesting that the FCR = 15 group reported more severe psychiatric symptomatology than the FCR ≤ 14 group.
Also, the FCR = 16 group produced a pattern of performance that is consistent with the bona fide cognitive sequelae subsequent to TBI (i.e., intact performance on “hold” tests, and mild deficits on measures known to be sensitive to head injury). In contrast, the FCR ≤ 14 group demonstrated uniformly low performance across both types of tests, with the FCR = 15 group in between. It is possible that there are group-level differences in the etiology of PVT failures, with predominantly psychogenic influences having a milder impact than other factors that are known to have strong effects on PVT performance, such as external incentives to appear impaired (Boone, 2013; Larrabee, 2012). However, this cannot be determined with the current sample, given the absence of data on litigation status. While previous research found that certain PVTs appear to be uniquely sensitive to emotional distress (Erdodi et al., 2016), it failed to disentangle the relative contribution of psychogenic interference and volitional suppression of performance on cognitive testing.

The cumulative clinical evidence suggests that the etiology of invalid performance is likely multifactorial. A PVT failure can be the expression of several contributing and potentially interacting factors and, hence, does not automatically mean deliberate suppression of cognitive ability (i.e., malingering). A full consideration of alternative explanations for non-credible presentation is instrumental in providing an accurate, nuanced and clinically helpful interpretation of neurocognitive profiles (Bigler, 2012, 2015).

Developing a conceptually sound and empirically supported model for subtyping non-credible responding has important forensic and clinical applications. For example, in personal injury litigation, multiple unequivocal PVT failures raise the possibility of malingering and, thus, have obvious implications for the legitimacy of the lawsuit.
In contrast, a neuropsychologist’s conclusion that a plaintiff failed to put forth adequate effort, but not deliberately so, may shift the focus to exploring other plausible clinical issues that may or may not be related to the accident (depression, unresolved developmental trauma, exacerbation of a pre-existing psychological vulnerability, righteous anger towards the perpetrator of the injury, etc.). In those cases, the assessor’s responsibility is to (a) determine whether the data are consistent with an alternative accident-related etiology; (b) render an opinion that even if psychogenic factors are operative, they cannot account for the level of impairment demonstrated during testing; or (c) conclude that regardless of the reason behind unexpectedly low scores, they cannot be attributed to accident-related factors.

Even in clinical settings and in the absence of apparent external incentives to appear impaired, assessors often face the complex task of interpreting co-existing PVT failures and medically verified neurological problems (Erdodi et al., 2016). In such cases, it is the neuropsychologist’s responsibility to determine whether (a) low scores are a manifestation of a legitimate disease process; (b) even in the context of documented severe impairments, the low scores are still not credible; or (c) independent of neurological manifestations, ancillary issues are contributing to low performance (e.g., living with a debilitating neurological impairment for many years has resulted in unremitting dependence or chronic resignation in the face of cognitive demands).

These considerations are important for optimizing the clinical management of the patient. If an evaluation is deemed valid (i.e., PVT failures are attributable to despondent resignation that deflated scores throughout testing), certain aspects of the patient’s impairment might be reversible.
In such cases, referral for psychotherapy or cognitive rehabilitation has the potential to restore some of the cognitive functioning. For example, in the present sample, elevations on SCL-90-R were related to errors on FCR and failures on other PVTs. If self-reported psychopathology is causally related to invalid responding, treating the psychiatric symptoms could conceivably improve cognitive performance. Although speculations about the reasons behind poor effort are epistemologically risky, providing a sound, albeit tentative, explanation could be important, as the clinical outcome hinges on the correct interpretation of non-credible presentation. Beyond the simple “valid/invalid” dichotomy, the assessor carries the responsibility of determining whether a meaningful intervention is feasible. Erring on either side can be costly. Dismissing a patient as non-credible may deprive the individual of the opportunity to recover lost function. Recommending therapy for a malingerer may allocate limited health care services to an individual who is invested in appearing impaired and, thus, is unlikely to benefit from the intervention.

In conclusion, FCR scores should be interpreted in the larger context of injury severity, clinical and psycho-social history, incentive status as well as the rest of the neurocognitive profile. Marginal failures (FCR = 15) likely have a different clinical meaning in patients with medically verified severe pathology than in those with mild or questionable TBI. In the former group, a single error may be a direct manifestation of the injury. Conversely, in the latter group it may raise concerns about non-neurological factors contributing to the presentation.

The present study has several strengths. It provided a direct comparison of the classification accuracy of two different FCR cutoffs across a wide range of reference PVTs in a clinically referred sample with mild and moderate-to-severe TBI.
We also examined the link between FCR failures and self-reported emotional functioning. Inevitably, the study also has a number of limitations. The sample is relatively small and geographically restricted. More importantly, the FCR = 15 subsample was too small to draw definite conclusions about the neurocognitive profile of patients who only failed the liberal cutoff on FCR. In addition, as psychiatric symptoms were assessed using face-valid self-report inventories without built-in validity scales, the veracity of these data is unknown, which is a considerable limitation of our measurement model. However, given the limited research on the link between emotional functioning and performance validity, documenting a systematic difference in the level of self-reported psychiatric symptoms as a function of passing or failing PVTs is a meaningful initial step towards a better understanding of this complex relationship. The fact that previous research that controlled for response bias in self-report produced similar results (Erdodi, Sagar et al., 2017; Erdodi, Seke et al., 2017) suggests that the shared variance between elevated symptom report and PVT failure cannot be attributed to a common “malingering factor” (i.e., the same people fabricate/exaggerate both psychiatric problems and cognitive deficits). More importantly, the nature of the data (archival/observational) precludes causal modeling of the main effects. Prospective experimental and longitudinal studies that can separate invalid performance from psychiatric history by design are needed to determine the clinical meaning of FCR failures: evidence of non-credible responding, emotional distress or both?

Conclusion

Even a single error on the FCR is a reliable marker of invalid responding. Based on its superior classification accuracy, ≤15 should replace the current de facto FCR cutoff of ≤14. Failing the FCR was associated with elevated self-reported psychiatric symptoms.
Given that the link between invalid performance and emotional distress is poorly understood, further research is needed to explore the underlying causal mechanisms.

Funding

None.

Conflict of Interest

None declared.

Acknowledgments

The authors would like to thank Drs. Donders and Marshall for providing additional data on the clinical samples used in their studies that were not included in the original publications.

References

American Psychiatric Association. (1996). Diagnostic and statistical manual of mental disorders (4th ed.). Washington, DC: Author.
An, K. Y., Kaploun, K., Erdodi, L. A., & Abeare, C. A. (2017). Performance validity in undergraduate research participants: A comparison of failure rates across tests and cutoffs. The Clinical Neuropsychologist, 31, 193–206. doi:10.1080/13854046.2016.1217046
An, K. Y., Zakzanis, K. K., & Joordens, S. (2012). Conducting research with non-clinical healthy undergraduates: Does effort play a role in neuropsychological test performance? Archives of Clinical Neuropsychology, 27, 849–857.
Arnold, G., Boone, K. B., Lu, P., Dean, A., Wen, J., Nitch, S., et al. (2005). Sensitivity and specificity of finger tapping test scores for the detection of suspect effort. The Clinical Neuropsychologist, 19, 105–120.
Ashendorf, L., Constantinou, M., & McCaffrey, R. J. (2004). The effect of depression and anxiety on the TOMM in community-dwelling older adults. Archives of Clinical Neuropsychology, 19, 125–130.
Axelrod, B. N., Fichtenberg, N. L., Millis, S. R., & Wertheimer, J. C. (2006). Detecting incomplete effort with Digit Span from the Wechsler Adult Intelligence Scale—Third Edition. The Clinical Neuropsychologist, 10, 513–523.
Baker, K. A., Schmidt, M. F., Heinemann, A. W., Langley, M., & Miranti, S. V. (1998). The validity of the Katz Adjustment Scale among people with traumatic brain injury. Rehabilitation Psychology, 43, 30–40.
Baldo, J. V., Delis, D., Kramer, J., & Shimamura, A. (2002). Memory performance on the California Verbal Learning Test-II: Findings from patients with focal frontal lesions. Journal of the International Neuropsychological Society, 8, 539–546.
Bauer, L., Yantz, C. L., Ryan, L. M., Warden, D. L., & McCaffrey, R. J. (2005). An examination of the California Verbal Learning Test II to detect incomplete effort in a traumatic brain injury sample. Applied Neuropsychology, 12, 202–207.
Bearden, C. E., Glahn, D. C., Monkul, E. S., Barrett, J., Najt, P., Villarreal, V., et al. (2006). Patterns of memory impairment in bipolar disorder and unipolar major depression. Psychiatry Research, 142, 139–150.
Beck, A. T., Steer, R. A., & Brown, G. K. (1996). Beck Depression Inventory (2nd ed.). San Antonio, TX: Psychological Corporation.
Bigler, E. D. (2012). Symptom validity testing, effort and neuropsychological assessment. Journal of the International Neuropsychological Society, 18, 632–642.
Bigler, E. D. (2015). Neuroimaging as a biomarker in symptom validity and performance validity testing. Brain Imaging and Behavior, 9, 421–444.
Boone, K. B. (2013). Clinical practice of forensic neuropsychology. New York: Guilford.
Boone, K. B. (2009). The need for continuous and comprehensive sampling of effort/response bias during neuropsychological examinations. The Clinical Neuropsychologist, 23, 729–741.
Bortnik, K. E., Boone, K. B., Marion, S. D., Amano, S., Ziegler, E., Victor, T. L., et al. (2010). Examination of various WMS-III logical memory scores in the assessment of response bias. The Clinical Neuropsychologist, 24, 344–357.
Carone, D. A. (2008). Children with moderate/severe brain damage/dysfunction outperform adults with mild-to-no brain damage on the Medical Symptom Validity Test. Brain Injury, 22, 960–971.
Chafetz, M. D., Williams, M. A., Ben-Porath, Y. S., Bianchini, K. J., Boone, K. B., Kirkwood, M. W., et al. (2015). Official position of the American Academy of Clinical Neuropsychology Social Security Administration policy on validity testing: Guidance and recommendations for change. The Clinical Neuropsychologist, 29, 723–740.
Connor, D. J., Drake, A. I., Bondi, M. W., & Delis, D. C. (1997). Detection of feigned cognitive impairments in patients with a history of mild to severe closed head injury. Paper presented at the American Academy of Neurology, Boston.
Clark, L. R., Stricker, N. H., Libon, D. J., Delano-Wood, L., Salmon, D. P., Delis, D. C., et al. (2012). Yes/No versus forced-choice recognition memory in mild cognitive impairment and Alzheimer’s disease: Patterns of impairment and associations with dementia severity. The Clinical Neuropsychologist, 16, 1201–1216.
Coleman, R. D., Rapport, L. J., Millis, S. R., Ricker, J. H., & Farchione, T. J. (1998). Effects of coaching on the California Verbal Learning Test. Journal of Clinical and Experimental Neuropsychology, 20, 201–210.
Considine, C., Weisenbach, S. L., Walker, S. J., McFadden, E. M., Franti, L. M., Bieliauskas, L. A., et al. (2011). Auditory memory decrements, without dissimulation, among patients with major depressive disorder. Archives of Clinical Neuropsychology, 26, 445–453.
Constantinou, M., Bauer, L., Ashendorf, L., Fisher, J. M., & McCaffrey, R. J. (2005). Is poor performance on recognition memory effort measures indicative of generalized poor performance on neuropsychological tests? Archives of Clinical Neuropsychology, 20, 191–198.
Delis, D. C., Kramer, J. H., Kaplan, E., & Ober, B. A. (2000). California Verbal Learning Test—Second edition. San Antonio, TX: Psychological Corporation.
Delis, D., & Wetter, S. R. (2007). Cogniform disorder and cogniform condition: Proposed diagnoses for excessive cognitive symptoms. Archives of Clinical Neuropsychology, 22, 589–604.
Derogatis, L. R. (1994). SCL-90-R: Administration, scoring, and procedures manual (3rd ed.). Minneapolis, MN: National Computer Systems.
Donders, J., & Strong, C. A. (2011). Embedded effort indicators on the California Verbal Learning Test—Second Edition: An attempted cross-validation. The Clinical Neuropsychologist, 25, 173–184.
Egeland, J., Lund, A., Landro, N. I., Rund, B. R., Sudet, K., Asbjornsen, A., et al. (2005). Cortisol level predicts executive and memory function in depression, symptom level predicts psychomotor speed. Acta Psychiatrica Scandinavica, 112, 434–441.
Erdodi, L. A. (2017). Aggregating validity indicators: The salience of domain specificity and the indeterminate range in multivariate models of performance validity assessment. Applied Neuropsychology: Adult. Advance online publication. doi:10.1080/23279095.2017.1384925
Erdodi, L. A., & Rai, J. K. (2017). A single error is one too many: Examining alternative cutoffs on Trial 2 on the TOMM. Brain Injury, 31, 1362–1368. doi:10.1080/02699052.2017.1332386
Erdodi, L. A., Kirsch, N. L., Lajiness-O’Neill, R., Vingilis, E., & Medoff, B. (2014). Comparing the Recognition Memory Test and the Word Choice Test in a mixed clinical sample: Are they equivalent? Psychological Injury and Law, 7, 255–263. doi:10.1007/s12207-014-9197-8
Erdodi, L. A., Roth, R. M., Kirsch, N. L., Lajiness-O’Neill, R., & Medoff, B. (2014). Aggregating validity indicators embedded in Conners’ CPT-II outperforms individual cutoffs at separating valid from invalid performance in adults with traumatic brain injury. Archives of Clinical Neuropsychology, 29, 456–466.
Erdodi, L. A., Sagar, S., Seke, K., Zuccato, B. G., Schwartz, E. S., & Roth, R. M. (2017). The Stroop Test as a measure of performance validity in adults clinically referred for neuropsychological assessment. Psychological Assessment. doi:10.1037/pas0000525
Erdodi, L. A., Seke, K. R., Shahein, A., Tyson, B. T., Sagar, S., & Roth, R. M. (2017). Low scores on the Grooved Pegboard Test are associated with invalid responding and psychiatric symptoms. Psychology and Neuroscience, 10, 325–344. doi:10.1037/pne0000103
Erdodi, L. A., Tyson, B., Abeare, T., Lichtenstein, C. A., Pelletier, J. D., Rai, C. L., et al. (2016). The BDAE Complex Ideational Material—A measure of receptive language or performance validity? Psychological Injury and Law, 9, 112–120. doi:10.1007/s12207-016-9254-6
Erdodi, L. A., Tyson, B. T., Abeare, C. A., Zuccato, B. G., Rai, J. K., Seke, K. R., et al. (2017). Utility of critical items within the Recognition Memory Test and Word Choice Test. Applied Neuropsychology: Adult. Advance online publication. doi:10.1080/23279095.2017.1298600
Erdodi, L. A., Tyson, B. T., Shahein, A., Lichtenstein, J. D., Abeare, C. A., Pelletiere, C. L., et al. (2017). The power of timing: Adding a time-to-completion cutoff to the Word Choice Test and Recognition Memory Test improves classification accuracy. Journal of Clinical and Experimental Neuropsychology, 39, 369–383. doi:10.1080/13803395.2016.1230181
Frederick, R. I. (2003). VIP: Validity indicator profile. Manual (2nd ed.). Minneapolis, MN: NCS Pearson.
Grant, I., & Adams, K. M. (Eds.). (1996). Neuropsychological assessment of neuropsychiatric disorders. New York: Oxford University Press.
Greve, K. W., & Bianchini, K. J. (2004). Setting empirical cut-offs on psychometric indicators of negative response bias: A methodological commentary with recommendations. Archives of Clinical Neuropsychology, 19, 533–541.
Greve, K. W., Bianchini, K. J., Mathias, C. W., Houston, R. J., & Crouch, J. A. (2002). Detecting malingered neurocognitive dysfunction with the Wisconsin Card Sorting Test: A preliminary investigation in traumatic brain injury. The Clinical Neuropsychologist, 16, 179–191.
Green, P. (2003). Green’s Word Memory Test. Edmonton, Canada: Green’s Publishing.
Green, P. (2007). Spoiled for choice: Making comparisons between forced-choice effort tests. In K. B. Boone (Ed.), Assessment of feigned cognitive impairment (pp. 50–77). New York, NY: Guilford.
Green, P., Iverson, G., & Allen, L. (1999). Detecting malingering in head injury litigation with the Word Memory Test. Brain Injury, 13, 813–819.
Green, P., Flaro, L., & Courtney, J.
( 2009 ). Examining false positives on the word memory test in adults with mild traumatic brain injury . Brain Injury , 23 , 741 – 750 . Google Scholar CrossRef Search ADS PubMed Greiffenstein , M. F. , Baker , W. J. , & Gola , T. ( 1994 ). Validation of malingered amnesic measures with a large clinical sample . Psychological Assessment , 6 , 218 – 224 . Google Scholar CrossRef Search ADS Heaton , R. K. , Miller , S. W. , Taylor , M. J. , & Grant , L. ( 2004 ). Revised comprehensive norms for an expanded Halstead-Reitan battery: Demographically adjusted neuropsychological norms for African American and Caucasian adults . Lutz, Fla. : PAR . Heilbronner , R. L. , Sweet , J. J. , Morgan , J. E. , Larrabee , G. J. , & Millis , S. R. ( 2009 ). American Academy of Neuropsychology Consensus Conference Statement on the neuropsychological assessment of effort, response bias, and malingering . The Clinical Neuropsychologist , 23 , 1093 – 1129 . Google Scholar CrossRef Search ADS PubMed Hoofien , D. , Barak , O. , Vakil , E. , & Gilboa , A. ( 2005 ). Symptom Checklist 90 Revised scores in persons with traumatic brain injury: Affective reactions of neurobehavioral outcomes of the injury? Applied Neuropsychology , 12 , 30 – 39 . Google Scholar CrossRef Search ADS PubMed Ilsley , J. E. , Moffoot , A. P. R. , & O’Carroll , R. E. ( 1995 ). An analysis of memory dysfunction in major depression . Journal of Affective Disorders , 35 ( 1-2 ), 1 – 9 . Google Scholar CrossRef Search ADS PubMed Iverson , G. L. ( 2003 ). Detecting malingering in civil forensic evaluations. In Horton A. M. , & Hartlage L. C. (Eds.) , Handbook of forensic neuropsychology (pp. 137 – 177 ). New York : Springer Publishing Company . Keiski , M. A. , Shore , D. L. , & Hamilton , J. M. ( 2007 ). The role of depression in verbal memory following traumatic brain injury . The Clinical Neuropsychologist , 21 , 744 – 761 . Google Scholar CrossRef Search ADS PubMed Kessels , R. P. C. , Ruis , C. , & Kappelle , L. J. 
( 2007 ). The impact of self-reported depressive symptoms on memory function in neurological outpatients . Clinical Neurology and Neurosurgery , 109 , 323 – 326 . Google Scholar CrossRef Search ADS PubMed Lange , R. T. , Iverson , G. L. , Brickell , T. A. , Staver , T. , Pancholi , S. , Bhagwat , A. , et al. . ( 2013 ). Clinical utility of the Conners’ Continuous Performance Test-II to detect poor effort in U.S. military personnel following traumatic brain injury . Psychological Assessment , 25 , 339 – 352 . Google Scholar CrossRef Search ADS PubMed Langenecker , S. A. , Bieliauskas , L. A. , Rapport , L. J. , Zubieta , J. K. , Wilde , E. A. , & Berent , S. ( 2005 ). Face emotion perception and executive functioning deficits in depression . Journal of Clinical and Experimental Psychology , 27 , 320 – 333 . Larrabee , G. J. ( 2003 ). Detection of malingering using atypical performance on standard neuropsychological tests . The Clinical Neuropsychologist , 17 , 410 – 425 . Google Scholar CrossRef Search ADS PubMed Larrabee , G. J. ( 2012 ). Assessment of malingering. In Larrabee G. J. (Ed.) , Forensic neuropsychology: A scientific approach . NY : Oxford University Press . Leighton , A. , Weinborn , M. , & Maybery , M. ( 2014 ). Bridging the gap between neurocognitive processing theory and performance validity assessment among the cognitively impaired: A review and methodological approach . Journal of the International Neuropsychological Society , 20 , 873 – 886 . doi:10.1017/S135561771400085X . Google Scholar CrossRef Search ADS PubMed Lichtenstein , J. D. , Erdodi , L. A. , & Linnea , K. S. ( 2017 ). Introducing a forced-choice recognition task to the California Verbal Learning Test—Children’s Version . Child Neuropsychology , 23 , 284 – 299 . doi:10.1080/09297049.2015.1135422 . Google Scholar CrossRef Search ADS PubMed Marschark , M. , Richtsmeier , L. M. , Richardson , J. T. E. , Crovitz , H. F. , & Henry , J. ( 2000 ). 
Intellectual and emotional functioning in college students following mild traumatic brain injury in childhood and adolescence . Journal of Head Trauma Rehabilitation , 15 , 1227 – 1245 . Google Scholar CrossRef Search ADS PubMed Marshall , P. , & Happe , M. ( 2007 ). The performance of individuals with mental retardation on cognitive tests assessing effort and motivation . The Clinical Neuropsychologist , 21 , 826 – 840 . Google Scholar CrossRef Search ADS PubMed Martens , M. , Donders , J. , & Millis , S. R. ( 2001 ). Evaluation of invalid response sets after traumatic head injury . Journal of Forensic Neuropsychology , 2 ( 1 ), 1 – 18 . Google Scholar CrossRef Search ADS Moore , B. A. , & Donders , J. ( 2004 ). Predictors of invalid neuropsychological test performance after traumatic brain injury . Brain Injury , 18 , 975 – 984 . Google Scholar CrossRef Search ADS PubMed Ord , J. S. , Boettcher , A. C. , Greve , K. J. , & Bianchini , K. J. ( 2010 ). Detection of malingering in mild traumatic brain injury with the Conners’ Continuous Performance Test-II . Journal of Clinical and Experimental Neuropsychology , 32 , 380 – 387 . Google Scholar CrossRef Search ADS PubMed Palav , A. , Ortega , A. , & McCaffrey , R. J. ( 2001 ). Incremental validity of the MMPI-2 content scales: A preliminary study with brain-injured patients . Journal of Head Trauma Rehabilitation , 16 , 275 – 283 . Google Scholar CrossRef Search ADS PubMed Pearson ( 2009 ). Advanced clinical solutions for the WAIS-IV and WMS-IV—Technical manual. San Antonio, TX : Author . Perneger , T. V. ( 1998 ). What’s wrong with Bonferroni adjustments . BMJ (Clinical research ed.) , 316 , 1236 – 1238 . Google Scholar CrossRef Search ADS PubMed Raskin , S. A. , Mateer , C. A. , & Tweeten , R. ( 1998 ). Neuropsychological assessment of individuals with mild traumatic brain injury . The Clinical Neuropsychologist , 12 , 21 – 30 . Google Scholar CrossRef Search ADS Rees , L. M. , Tombaugh , T. N. , & Boulay , L. 
( 2001 ). Depression and the Test of Memory Malingering . Archives of Clinical Neuropsychology , 16 , 501 – 506 . Google Scholar CrossRef Search ADS PubMed Reese , C. S. , Suhr , J. A. , & Riddle , T. L. ( 2012 ). Exploration of malingering indices in the Wechsler Adult Intelligence Scale—Fourth Edition Digit Span subtest . Archives of Clinical Neuropsychology , 27 , 176 – 181 . Google Scholar CrossRef Search ADS PubMed Reznek , L. ( 2005 ). The Rey 15-item memory test for malingering: A meta-analysis . Brain Injury , 19 , 539 – 543 . doi:10.1080/02699050400005242 . Google Scholar CrossRef Search ADS PubMed Rohling , M. L. , Green , P. , Allen , L. M. , & Iverson , G. L. ( 2002 ). Depressive symptoms and neurocognitive test scores in patients passing symptom validity tests . Archives of Clinical Neuropsychology , 17 , 205 – 222 . Google Scholar CrossRef Search ADS PubMed Root , J. C. , Robbins , R. N. , Chang , L. , & van Gorp , W. ( 2006 ). Detection of inadequate effort on the California Verbal Learning Test-Second edition: Forced choice recognition and critical item analysis . Journal of the International Neuropsychological Society , 12 , 688 – 696 . Google Scholar CrossRef Search ADS PubMed Ross , T. P. , Poston , A. M. , Rein , P. A. , Salvatore , A. N. , Wills , N. L. , & York , T. M. ( 2016 ). Performance invalidity base rates among healthy undergraduate research participants . Archives of Clinical Neuropsychology , 31 , 97 – 104 . Google Scholar CrossRef Search ADS PubMed Rothman , K. J. ( 1990 ). No adjustments are needed for multiple comparisons . Epidemiology (Cambridge, Mass.) , 1 , 43 – 46 . Google Scholar CrossRef Search ADS PubMed Santos , O. A. , Kazakov , D. , Reamer , M. K. , Park , S. E. , & Osmon , D. C. ( 2014 ). Effort in college undergraduate is sufficient on the Word Memory Test . Archives of Clinical Neuropsychology , 29 , 609 – 613 . Google Scholar CrossRef Search ADS PubMed Schwartz , E. S. , Erdodi , L. , Rodriguez , N. , Jyotsna , J. G. 
, Curtain , J. R. , Flashman , L. A. , et al. . ( 2016 ). CVLT-II forced choice recognition trial as an embedded validity indicator: A systematic review of the evidence . Journal of the International Neuropsychological Society , 22 , 851 – 858 . doi:10.1017/S1355617716000746 . Google Scholar CrossRef Search ADS PubMed Slick , D. J. , Sherman , E. M. S. , Grant , L. , & Iverson , G. L. ( 1999 ). Diagnostic criteria for malingered neurocognitive dysfunction: Proposed standards for clinical practice and research . The Clinical Neuropsychologist , 13 , 545 – 561 . Google Scholar CrossRef Search ADS PubMed Spencer , R. J. , Axelrod , B. N. , Drag , L. L. , Waldron-Perrine , B. , Pangilinan , P. H. , & Bieliauskas , L. A. ( 2013 ). WAIS-IV reliable digit span is no more accurate than age corrected scaled score as an indicator of invalid performance in a veteran sample undergoing evaluation for mTBI . The Clinical Neuropsychologist , 27 , 1362 – 1372 . Google Scholar CrossRef Search ADS PubMed Sprinkle , S. D. , Lurie , D. , Insko , S. L. , Atkinson , G. , Jones , G. L. , Logan , A. R. , et al. . ( 2002 ). Criterion validity, severity cut scores, and test-retest reliability of the Beck Depression Inventory-II in a university counseling center sample . Journal of Counseling Psychology , 49 , 381 . Google Scholar CrossRef Search ADS Storch , E. A. , Roberti , J. W. , & Roth , D. A. ( 2004 ). Factor structure, concurrent validity, and internal consistency of the Beck Depression Inventory—Second edition in a sample of college students . Depression and Anxiety , 19 , 187 – 189 . Google Scholar CrossRef Search ADS PubMed Suhr , J. A. , & Boyer , D. ( 1999 ). Use of the Wisconsin Card Sorting Test in the detection of malingering in student simulator and patient samples . Journal of Clinical and Experimental Psychology , 21 , 701 – 708 . doi:10.1076/jcen.21.5.701.868 . Suhr , J. , Tranel , D. , Wefel , J. , & Barrash , J. ( 1997 ). 
Memory performance after head injury: Contributions of malingering, litigation status, psychological factors, and medication use . Journal of Clinical and Experimental Psychology , 19 , 500 – 514 . Sugarman , M. A. , & Axelrod , B. N. ( 2014 ). Embedded measures of performance validity using verbal fluency tests in a clinical sample . Applied Neuropsychology: Adult . DOI:10.1080/23279095.2013.873439 . Sugarman , M. A. , & Axelrod , B. N. ( 2015 ). Embedded measures of performance validity using verbal fluency tests in a clinical sample . Applied Neuropsychology: Adult , 22 , 141 – 146 . Google Scholar CrossRef Search ADS PubMed Sweet , J. J. , Goldman , D. J. , & Guidotti Breting , L. M. ( 2013 ). Traumatic brain injury: Guidance in a forensic context from outcome, dose-response, and response bias research . Behavioral Sciences and the Law , 31 , 756 – 778 . Google Scholar CrossRef Search ADS PubMed Tombaugh , T. N. ( 1996 ). Test of Memory Malingering . New York : Multi-Health Systems . Trueblood , W. ( 1994 ). Qualitative and quantitative characteristics of malingered and other invalid WAIS-R and clinical memory data . Journal of Clinical and Experimental Neuropsychology , 14 , 597 – 607 . Google Scholar CrossRef Search ADS van Dyke , S. A. , Millis , S. R. , Axelrod , B. N. , & Hanks , R. A. ( 2013 ). Assessing effort: Differentiating performance and symptom validity . The Clinical Neuropsychologist , 27 , 1234 – 1246 . Google Scholar CrossRef Search ADS PubMed Westcott , M. C. , & Alfano , D. P. ( 2005 ). The Symptom Checklist-90-Revised and mild traumatic brain injury . Brain Injury , 19 , 1261 – 1267 . Google Scholar CrossRef Search ADS PubMed Wolfe , P. L. , Millis , S. R. , Hanks , R. , Fichtenberg , N. , Larrabee , G. J. , & Sweet , J. J. ( 2010 ). Effort indicators within the California Verbal Learning Test-II (CVLT-II) . The Clinical Neuropsychologist , 24 , 153 – 168 . Google Scholar CrossRef Search ADS PubMed © The Author 2017. 
Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

Journal

Archives of Clinical Neuropsychology, Oxford University Press

Published: Dec 26, 2017
