Performance of the Immediate Post-Concussion Assessment and Cognitive Testing Protocol Validity Indices

Abstract

Objective

Immediate Post-Concussion Assessment and Cognitive Testing (ImPACT) is a computerized neuropsychological test battery. Athletes provide preseason baseline ImPACT scores to which post-injury scores can be compared to aid concussion diagnosis. However, if baseline scores do not accurately represent an athlete’s abilities, the utility of post-injury score comparison is diminished. For this reason, ImPACT includes low score thresholds on five validity indices to identify insufficient effort at baseline, though evidence of these indices’ performance is limited. The present study examines the classification accuracy and concurrent validity of the existing ImPACT validity indices and three proposed indices (Word Memory Correct Distractors, Design Memory Correct Distractors, Total Symptom Score).

Methods

The ImPACT, Word Memory Test (WMT), and Minnesota Multiphasic Personality Inventory-2-Restructured Form (MMPI-2-RF) were administered to 242 undergraduate students. Participants were instructed either to give full effort on testing or to simulate sports-related concussion (SRC).

Results

Sensitivity of the existing ImPACT validity indices was marginally improved with adjusted score thresholds while maintaining acceptable specificity (0.90). Alternative score thresholds and novel validity indices demonstrated adequate specificity while improving sensitivity overall. Positive and negative predictive powers are provided to inform use of protocol validity indices across diverse treatment settings.

Conclusions

The existing ImPACT indices’ high specificity at the expense of lower sensitivity compared to external validity measures may under-identify poor effort, resulting in premature return-to-play decisions for athletes with concussion.
Improvements or additions to the existing indices may raise sensitivity while maintaining acceptable specificity, aiding in the protection of athletes and safe athletic participation.

Keywords: Assessment, Head injury, Traumatic brain injury, Malingering/symptom validity testing

Introduction

The Immediate Post-Concussion Assessment and Cognitive Testing (ImPACT) is a widely used tool for neurocognitive evaluation in sports-related concussion (SRC), and is one piece of a multifaceted approach toward assessment of readiness for return-to-play. SRC is known to result in acute cognitive, affective, and behavioral changes, and a second blow to the head prior to full resolution of an initial SRC has been associated with long-lasting impairment and even death (Cantu, 1998; Kelly & Rosenberg, 1997). As risk for adverse outcomes increases when players return to contact sport participation before full resolution of SRC, post-injury ImPACT scores can be compared to preseason baseline scores to aid in assessment of readiness for return-to-play.

However, not all baseline ImPACT protocols provide accurate assessments of pre-injury neurocognitive functioning. For example, some athletes may be distracted or fatigued at the time of testing and thus unable to give maximum performances (Lovell, 2015). In other instances, athletes may intentionally suppress scores, or “sandbag,” their baselines to obscure post-injury deficits in the event of later SRC (Lovell, 2015). Obscuring post-injury impairments, whether intentionally or unintentionally, may lead to premature return-to-play and increased risk of adverse outcomes.

In an effort to identify protocols that do not capture an athlete’s true abilities, ImPACT uses embedded protocol validity indices that alert practitioners to uncommonly low scores. Specifically, five ImPACT scores are used as validity indices.
A score beyond the predetermined validity threshold on any one of these five indices automatically triggers an invalid protocol warning on the ImPACT. The score thresholds for the five ImPACT validity indices are based on confidence intervals, such that 95% of athletes taking the ImPACT score higher than the validity threshold (Lovell, 2015). While scores in the fifth percentile and lower are by definition uncommon, they do not necessarily indicate poor effort. It has been demonstrated that athletes with preexisting conditions, such as attention deficit hyperactivity disorder (ADHD) or learning disorders (LD), score more poorly on the ImPACT and are more likely to produce a protocol flagged as invalid (Elbin et al., 2013; Johnson, Pardini, Sandel, & Lovell, 2014; Manderino & Gunstad, 2017; Schatz, Moser, Solomon, Ott, & Karpf, 2012).

Assessing effort, often referred to as validity testing, is common practice in neuropsychological assessment. Validity tests consist of tasks or questions that are more negatively impacted by effort than they are by genuine neurocognitive deficits. Such tasks may exist as standalone effort assessments, or may be embedded within another assessment, as in the case of the ImPACT. Moreover, effort tests can be differentiated by which of two separate constructs related to test validity they assess. Performance validity tests attempt to capture poor effort on tests of neurocognitive ability, while symptom validity tests assess the genuineness of self-reported symptoms (Millis, 2009). Although more recent evidence has suggested that symptom and performance validity are not necessarily unitary (Van Dyke, Millis, Axelrod, & Hanks, 2013), it has long been held that the failure of a symptom validity test (SVT) reflects a generalized response bias in an individual’s approach to testing that may affect both neurocognitive performance and self-reported symptoms (Meyers, Volbrecht, Axelrod, & Reinsch-Boothby, 2011).
For this reason, it can be informative to include assessments of both symptom validity, as would be assessed by self-report measures such as the Minnesota Multiphasic Personality Inventory-2-Restructured Form, and performance validity, as assessed by instruments such as the Word Memory Test, in neuropsychological testing.

Few studies have empirically investigated the ImPACT validity indices’ identification of poor effort. Erdal (2012) found that 11% of ex-collegiate athletes were able to successfully lower scores from their own baseline test performance without reaching threshold on any of the five embedded validity indices. Another study identified two scores within the ImPACT that are not currently used as validity indices but that mimic a common design for standalone performance validity tasks. Schatz and Glatts (2013) found that the addition of the score thresholds Word Memory Correct Distractors (WMCD; Immediate + Delayed) < 22 and Design Memory Correct Distractors (DMCD; Immediate + Delayed) < 16 identified 95% of naïve malingerers and 100% of coached malingerers in a simulated malingering study. Without these novel indices, only 70% of naïve malingerers and 65% of coached malingerers were correctly identified as feigning (Schatz & Glatts, 2013).

Additionally, the existing validity indices are limited to scores from the neurocognitive tests and thus reflect performance validity. While the ImPACT does not currently assess symptom validity, it does include a self-report measure of concussion symptoms, the Total Symptom Score. It is possible that examination of the Total Symptom Score may provide additional clinical information regarding sandbagging at baseline testing.

The scarcity of evidence on the performance of ImPACT’s protocol validity indices limits the ability of clinicians to fully interpret validity scores and determine whether a retest is needed.
The present study sought to examine the ImPACT’s protocol validity classification accuracy through a simulated malingering design. Here, the performance of the ImPACT is compared to the performance of well-validated external measures of performance and symptom validity, to determine the ImPACT validity indices’ concurrent validity as measures of effort. It was hypothesized that: (1) the ImPACT will be less sensitive to poor effort than the MMPI-2-RF and the WMT; (2) sensitivity will be improved with more liberal validity score thresholds on the existing indices, and with the addition of WMCD and DMCD as performance validity indices and of Total Symptom Score as a symptom validity index; and (3) performance of the ImPACT validity indices will be affected by base rate of poor effort.

Methods

Participants

A total of 277 participants were recruited from the psychology department subject pool. Analyses were limited to participants who had complete data, spoke English as a first language, and appeared to follow testing instructions per their group assignment (n = 242). Participants were randomly assigned to either the simulating (n = 118) or control group (n = 124). Participant age ranged from 18 to 32 (mean age = 19.6 ± 1.99). A total of 23.1% of participants reported at least one previous concussion. Independent t-tests showed that persons in the simulating and control groups were similar in demographic characteristics, though they differed in years of education (simulating group: 12.92 ± 1.93 years, full effort group: 12.33 ± 2.18 years). See Table 1. Data regarding previous exposure to the ImPACT were available for a subset of participants (subset n = 102), of which 32.4% reported that they had taken the ImPACT at least once in the past. This proportion did not significantly differ between groups (χ²(1) = 0.08, p = .47).

Table 1.
Examination of group differences on key demographic and testing variables

| Characteristic | Full effort (n = 124), mean (SD) or % | Simulators (n = 118), mean (SD) or % | Test statistic | p |
|---|---|---|---|---|
| Age | 19.43 (2.11) | 19.71 (1.85) | −1.11 | .27 |
| Years of education | 12.33 (2.18) | 12.92 (1.93) | −2.20 | .03 |
| Hours of sleep last night | 4.94 | 4.87 | 0.33 | .74 |
| History of speech therapy | 16.9% | 16.1% | 0.30 | .50 |
| History of special education classes | 4.8% | 5.1% | 0.01 | .58 |
| Repeated one or more years of school | 6.5% | 3.4% | 1.20 | .21 |
| Diagnosed with: Learning disability | 6.5% | 5.1% | 0.21 | .43 |
| Diagnosed with: Dyslexia | 0.8% | 1.7% | 0.39 | .48 |
| Diagnosed with: ADHD/ADD | 9.7% | 10.2% | 0.02 | .53 |
| Diagnosed with: Autism | 0.8% | 0.0% | 0.96 | .51 |
| History of treatment for: Headaches | 17.7% | 26.3% | 2.57 | .07 |
| History of treatment for: Migraines | 15.3% | 18.6% | 0.47 | .30 |
| History of treatment for: Epilepsy/seizures | 3.2% | 0.8% | 1.69 | .20 |
| History of treatment for: Brain surgery | 0.0% | 0.0% | n/a | n/a |
| History of treatment for: Meningitis | 0.0% | 0.0% | n/a | n/a |
| History of treatment for: Substance/alcohol abuse | 3.2% | 2.5% | 0.10 | .53 |
| History of treatment for: Psychiatric condition | 21.0% | 28.0% | 1.61 | .13 |
| History of ≥1 concussion | 25.0% | 22.0% | 0.30 | .35 |
| Number of previous concussions | 1.66 (1.37) | 1.85 (1.57) | −0.46 | .65 |

Measures

Immediate Post-Concussion Assessment and Cognitive Testing (ImPACT) (Lovell, 2015): The ImPACT is a computerized test of neurocognitive function that has been well-validated for use in athletes with suspected SRC (Allen & Gfeller, 2011; Elbin, Schatz, & Covassin, 2011; Maerlender et al., 2010; Nakayama, Covassin, Schatz, Nogle, & Kovan, 2014; Schatz & Sandel, 2012). ImPACT includes six subtest modules that are used to produce five neurocognitive composite scores. It also includes five scores with validity thresholds which, when surpassed, trigger an invalid protocol warning: Xs and Os Total Incorrect > 30, Impulse Control Composite > 30, Word Memory Learning Percent Correct < 69%, Design Memory Learning Percent Correct < 50%, and Three Letters Total Letters Correct < 8. The proposed validity indices, WMCD and DMCD, are calculated by summing the correct distractor (i.e., true negative) scores from the immediate and delayed recall portions of the Word Memory and Design Memory subtests, respectively (Schatz & Glatts, 2013). The ImPACT also includes a total concussion symptom score. Examinees are presented with a list of 22 common SRC symptoms (e.g., headache, nausea, dizziness) and are asked to rate the severity of each symptom on a 6-point Likert scale ranging from 1 (minor discomfort) to 6 (severe) or to check a box to indicate that they are not experiencing the symptom at all. The Total Symptom Score is the sum of all symptom ratings.

Word Memory Test (WMT) (Green, 2003): The WMT is a computerized, forced-choice recognition symptom validity test. Participants are presented with 10 pairs of semantically related words and, in a series of immediate and delayed tasks, must demonstrate memory of the learned list.
The WMT has demonstrated good sensitivity and specificity as a measure of effort in a variety of patient samples (Green, 2003; Green, Flaro, & Courtney, 2009).

Minnesota Multiphasic Personality Inventory-2-Restructured Form (MMPI-2-RF) (Ben-Porath & Tellegen, 2008): The MMPI-2-RF contains 338 True/False items regarding aspects of personality and psychopathology, from which clinical and validity scales are derived. The final protocol includes five over-reporting validity scales that are well-validated measures of poor effort and malingering (Tarescavage, Wygant, Gervais, & Ben-Porath, 2013). Three of these scales capture responses that are infrequent in the normative population (F-r scale), responses that are infrequent in a population of individuals with psychopathology (Fp-r scale), and somatic complaints that are infrequent in medical and chronic pain patients (Fs scale). The Fp-r and Fs scales provide comparisons for the F-r scale, which can be confounded by genuine, albeit infrequent, psychological and somatic complaints. The remaining two are a scale for detection of exaggerated symptom reporting (FBS-r) and a response bias scale that serves as a measure of feigned test performance (RBS).

Procedures

All study procedures were approved by the local ethical review board, and all participants provided written informed consent before participating in the study. Participants completed computerized measures in groups of approximately 10–20 per testing session in a university computer laboratory. Groups of participants were randomly assigned to either the control or simulating condition (all participants in a given session were in the same condition), and instructions were read aloud by a study team member at the beginning of the testing session (for simulating instructions, see Supplementary Material online, Appendix A). Instructions to either put forth full effort or simulate a concussion were also reiterated by computerized prompts.
Participants in both groups were informed that the tests they would be taking were designed to identify individuals not putting forth full effort, and that only those successfully putting forth full effort or feigning without detection would be entered into a gift card raffle, in order to provide external incentive for successful sandbagging. All participants received the computerized tests in a standardized order to optimize timing of the delayed recall portion of the WMT: (1) Demographic survey, (2) MMPI-2-RF, (3) WMT Learning, (4) ImPACT Test, (5) WMT Delayed Recall, and (6) Exit survey. This ordering allowed for an average delay of approximately 30 min between WMT Learning and Delayed Recall. The exit survey asked participants to honestly describe their group instructions as a validity check for feigning and to answer questions regarding the amount of effort put into following group instructions. Protocols for each measure (i.e., WMT, ImPACT, MMPI-2-RF) were initially classified as invalid based on published score thresholds (Table 2; Ben-Porath & Tellegen, 2008; Green, 2003; Lovell, 2015; Schatz & Glatts, 2013). In the cases of scales with multiple published validity thresholds, the most liberal cutoffs were used, as there was no presumed impairment in the sample of healthy college students. Table 2. 
Number of participants properly and improperly identified by published validity score thresholds

| Index | Cutoff | True + | False + | False − | True − | Sens | Spec | PPP | NPP |
|---|---|---|---|---|---|---|---|---|---|
| ImPACT (standard), overall | | 50 | 8 | 68 | 116 | 0.42 | 0.94 | 0.86 | 0.63 |
| Impulse Control Composite | >30 | 12 | 0 | 106 | 124 | 0.10 | 1.00 | 1.00 | 0.54 |
| Word Memory LPC | <69% | 40 | 3 | 78 | 121 | 0.34 | 0.98 | 0.93 | 0.61 |
| Design Memory LPC | <50% | 15 | 4 | 103 | 120 | 0.13 | 0.97 | 0.79 | 0.54 |
| Xs and Os Total Incorrect | >30 | 8 | 0 | 110 | 124 | 0.07 | 1.00 | 1.00 | 0.53 |
| 3 Letters TLC | <8 | 23 | 3 | 95 | 121 | 0.19 | 0.98 | 0.88 | 0.56 |
| MMPI-2-RF, overall | | 99 | 32 | 19 | 92 | 0.84 | 0.74 | 0.76 | 0.83 |
| F-r | ≥79 | 71 | 20 | 47 | 104 | 0.60 | 0.84 | 0.78 | 0.69 |
| Fp-r | ≥70 | 64 | 16 | 54 | 108 | 0.54 | 0.87 | 0.80 | 0.67 |
| Fs | ≥80 | 83 | 18 | 35 | 106 | 0.70 | 0.85 | 0.82 | 0.75 |
| FBS-r | ≥80 | 52 | 7 | 66 | 117 | 0.44 | 0.94 | 0.88 | 0.64 |
| RBS | ≥80 | 58 | 6 | 60 | 118 | 0.49 | 0.95 | 0.91 | 0.66 |
| WMT, overall | | 71 | 19 | 47 | 105 | 0.60 | 0.85 | 0.79 | 0.69 |
| Immediate Recall | <90 | 58 | 6 | 60 | 118 | 0.49 | 0.95 | 0.91 | 0.66 |
| Delayed Recall | <90 | 58 | 7 | 60 | 117 | 0.49 | 0.94 | 0.89 | 0.66 |
| Consistency Score | <90 | 70 | 18 | 48 | 106 | 0.59 | 0.85 | 0.80 | 0.69 |
| ImPACT (exploratory) | | | | | | | | | |
| WMCD | <22 | 87 | 42 | 31 | 82 | 0.74 | 0.66 | 0.67 | 0.73 |
| DMCD | <16 | 82 | 43 | 36 | 81 | 0.69 | 0.65 | 0.66 | 0.69 |
| Tot. Symptoms | >20 | 89 | 39 | 29 | 81 | 0.75 | 0.68 | 0.70 | 0.74 |

Note: All cutoffs are published validity thresholds, except for Total Symptom Score, a novel index. True + = simulating condition participants accurately identified as sandbagging; False + = control condition participants inaccurately identified as sandbagging; False − = simulating condition participants inaccurately identified as giving adequate effort; True − = control condition participants accurately identified as giving adequate effort; Sens = sensitivity, Spec = specificity, LPC = Learning Percent Correct, TLC = Total Letters Correct.
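The accuracy statistics in Table 2 follow from the four cell counts in each row. The sketch below is a minimal illustration of that arithmetic and is not the authors' analysis code; the names `Counts`, `classification_stats`, and `predictive_power_at_base_rate` are hypothetical. The second function applies Bayes' rule to show how PPP and NPP would be expected to shift at a different base rate of poor effort. The 10% base rate used in the example is an assumed value for illustration only, not a figure from this study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Cells of a 2x2 table; simulators are treated as the positive class."""
    tp: int  # simulators flagged as invalid
    fp: int  # full-effort controls flagged as invalid
    fn: int  # simulators not flagged
    tn: int  # full-effort controls not flagged


def classification_stats(c: Counts) -> dict[str, float]:
    """Sensitivity, specificity, and positive/negative predictive power."""
    return {
        "sens": c.tp / (c.tp + c.fn),
        "spec": c.tn / (c.tn + c.fp),
        "ppp": c.tp / (c.tp + c.fp),
        "npp": c.tn / (c.tn + c.fn),
    }


def predictive_power_at_base_rate(
    sens: float, spec: float, base_rate: float
) -> tuple[float, float]:
    """PPP and NPP implied by Bayes' rule at a given base rate of poor effort."""
    true_pos = sens * base_rate
    false_pos = (1 - spec) * (1 - base_rate)
    true_neg = spec * (1 - base_rate)
    false_neg = (1 - sens) * base_rate
    return true_pos / (true_pos + false_pos), true_neg / (true_neg + false_neg)


# Standard ImPACT row of Table 2 (any one of the five indices flagged).
impact_standard = Counts(tp=50, fp=8, fn=68, tn=116)
stats = classification_stats(impact_standard)
print({name: round(value, 2) for name, value in stats.items()})

# Assumed 10% base rate of poor effort, for illustration only.
print(predictive_power_at_base_rate(stats["sens"], stats["spec"], 0.10))
```

The first print statement reproduces the standard ImPACT row of Table 2 (0.42, 0.94, 0.86, 0.63). Because roughly half of this sample was instructed to simulate, predictive powers computed directly from the table reflect a base rate near 49%. At a lower base rate the same sensitivity and specificity give a lower PPP and a higher NPP.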
Results

Group Differences on Validity Indices

A series of t-tests was performed on all standard and proposed validity indices to examine group differences for full-effort and sandbagging performances, to ensure that individuals in the feigning group effectively suppressed test scores and inflated symptom report. Significant group differences were found on the five ImPACT validity indices, three WMT validity indices, and five MMPI-2-RF validity indices (p < .001). Significant group differences were also observed on all three ImPACT exploratory validity indices (WMCD, DMCD, and Total Symptom Score) (Table 3).

Table 3. Results of t-tests examining group differences on validity indices

| Index | t | df | p | Full effort, mean (SD) | Simulators, mean (SD) | Effect size |
|---|---|---|---|---|---|---|
| ImPACT (standard) | | | | | | |
| Impulse Control Composite* | −6.23 | 136.10 | <.001 | 5.26 (3.99) | 13.37 (13.61) | 0.81 |
| Word Memory Learning % Correct* | 8.61 | 167.45 | <.001 | 94.14 (8.86) | 78.14 (18.22) | 1.12 |
| Design Memory Learning % Correct | 6.75 | 240 | <.001 | 80.56 (15.82) | 66.64 (16.25) | 0.87 |
| Xs and Os Total Incorrect* | −5.97 | 141.62 | <.001 | 4.97 (3.75) | 11.47 (11.24) | 0.78 |
| 3 Letters Total Letters Correct* | 7.79 | 180.38 | <.001 | 13.74 (1.95) | 10.87 (3.52) | 1.01 |
| MMPI-2-RF | | | | | | |
| F-r* | −9.46 | 211.93 | <.001 | 60.12 (18.86) | 87.97 (26.18) | 1.22 |
| Fp-r* | −7.17 | 203.70 | <.001 | 58.97 (16.27) | 78.04 (24.16) | 0.93 |
| Fs* | −11.27 | 203.40 | <.001 | 60.19 (17.26) | 92.04 (25.68) | 1.46 |
| FBS-r* | −10.58 | 228.22 | <.001 | 55.70 (12.55) | 74.58 (15.02) | 1.36 |
| RBS* | −12.52 | 227.36 | <.001 | 57.05 (15.63) | 85.02 (18.88) | 1.61 |
| WMT | | | | | | |
| Immediate Recall* | 8.52 | 130.85 | <.001 | 97.52 (4.53) | 82.56 (18.55) | 1.11 |
| Delayed Recall* | 8.79 | 130.85 | <.001 | 96.53 (4.95) | 80.04 (19.81) | 1.14 |
| Consistency Score* | 9.80 | 140.07 | <.001 | 95.06 (6.10) | 77.21 (18.88) | 1.27 |
| ImPACT (exploratory) | | | | | | |
| Word Memory Correct Distractors* | 8.06 | 197.58 | <.001 | 21.47 (3.37) | 16.86 (5.26) | 1.04 |
| Design Memory Correct Distractors | 6.45 | 240 | <.001 | 17.59 (5.20) | 13.34 (5.03) | 0.83 |
| Total Symptom Score* | −9.74 | 195.09 | <.001 | 17.19 (17.76) | 46.94 (28.29) | 1.26 |

*Equal variances not assumed.

Classification Accuracy of Validity Indices

Sensitivity and specificity were calculated for all validity indices, to characterize their abilities to differentiate sandbagging from full effort. Classification accuracy statistics initially employed the most liberal published cutoffs of each instrument (ImPACT, MMPI-2-RF, and WMT; Ben-Porath & Tellegen, 2008; Green, 2003; Lovell, 2015; Schatz & Glatts, 2013). Table 2 presents classification accuracy statistics for each instrument as a whole (e.g., the sensitivity and specificity for the ImPACT overall, as a product of identification by any one of the five validity indices), as well as each index individually. As the Total Symptom Score is suggested here for the first time as a possible validity index, a score threshold has not been previously proposed. Group means and standard deviations were examined to determine a proposed Total Symptom Score invalidity threshold of >20 for initial classification accuracy investigations, though other score thresholds are also presented below.
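The instrument-level classification described above treats a protocol as invalid when any one index crosses its threshold. The following sketch illustrates that rule using the five standard ImPACT thresholds listed under Measures and the three exploratory thresholds (WMCD < 22, DMCD < 16, Total Symptom Score > 20). It is an illustration under stated assumptions, not ImPACT's software logic: the class `ImpactScores`, its field names, and the example scores are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImpactScores:
    """Hypothetical container for the scores used as validity indices."""
    xs_os_total_incorrect: int
    impulse_control_composite: int
    word_memory_lpc: float    # Learning Percent Correct, 0-100
    design_memory_lpc: float  # Learning Percent Correct, 0-100
    three_letters_tlc: int    # Total Letters Correct
    wm_distractors_immediate: int
    wm_distractors_delayed: int
    dm_distractors_immediate: int
    dm_distractors_delayed: int
    total_symptom_score: int


def standard_flags(s: ImpactScores) -> dict[str, bool]:
    """The five existing indices with their published thresholds."""
    return {
        "Xs and Os Total Incorrect > 30": s.xs_os_total_incorrect > 30,
        "Impulse Control Composite > 30": s.impulse_control_composite > 30,
        "Word Memory LPC < 69%": s.word_memory_lpc < 69,
        "Design Memory LPC < 50%": s.design_memory_lpc < 50,
        "Three Letters TLC < 8": s.three_letters_tlc < 8,
    }


def exploratory_flags(s: ImpactScores) -> dict[str, bool]:
    """The three proposed indices; WMCD and DMCD sum immediate and delayed."""
    wmcd = s.wm_distractors_immediate + s.wm_distractors_delayed
    dmcd = s.dm_distractors_immediate + s.dm_distractors_delayed
    return {
        "WMCD < 22": wmcd < 22,
        "DMCD < 16": dmcd < 16,
        "Total Symptom Score > 20": s.total_symptom_score > 20,
    }


def is_invalid(flags: dict[str, bool]) -> bool:
    """A protocol is flagged when any single index crosses its threshold."""
    return any(flags.values())


# Hypothetical examinee: passes every standard index but fails WMCD.
examinee = ImpactScores(12, 10, 75.0, 62.5, 11, 10, 9, 9, 8, 14)
print(is_invalid(standard_flags(examinee)))     # False
print(is_invalid(exploratory_flags(examinee)))  # True
```

Keeping the standard and exploratory rules in separate functions mirrors how the two sets of indices are evaluated separately in Table 2.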
Overall, the ImPACT demonstrated very high specificity (0.94) at the expense of lower sensitivity (0.42). Consequently, positive predictive power (PPP; 0.86) was higher than negative predictive power (NPP; 0.63). The three proposed ImPACT validity indices demonstrated comparatively high sensitivity, though with lower specificity. Specifically, WMCD demonstrated a sensitivity of 0.74 with a specificity of 0.66, DMCD demonstrated a sensitivity of 0.69 and a specificity of 0.65, and Total Symptom Score demonstrated a sensitivity of 0.75 and a specificity of 0.68.

Further, we examined how many of the five validity index score thresholds were surpassed by protocols that were flagged invalid by the existing ImPACT protocol validity indices. Of the 58 protocols flagged as invalid by the standard ImPACT indices, 50.0% surpassed the threshold on only one index, 29.3% on two indices, 6.9% on three indices, and 12.1% on four indices. Only one protocol (1.7%) surpassed thresholds on all five standard ImPACT validity indices.

Sensitivity and Specificity Comparisons for Existing Score Cutoffs

Classification accuracy for the standard ImPACT as a whole and for the three exploratory ImPACT indices was then statistically compared to that of the MMPI-2-RF and the WMT using McNemar’s Tests to determine whether any current or exploratory score thresholds can reach the same level of performance as these gold-standard instruments. These results are presented in Table 4. Overall, the sensitivity of the standard ImPACT was significantly lower than the sensitivity rates of the MMPI-2-RF and the WMT, as well as each of the three exploratory validity indices alone. The specificity of the standard ImPACT was significantly higher than that of the MMPI-2-RF and the WMT, as well as all three exploratory ImPACT indices.

Table 4.
Resulting p values from McNemar’s Tests, comparing sensitivities and specificities of the ImPACT validity indices to WMT and MMPI-2-RF indices, using only participants from simulating condition (n = 118)

Index (rate)                          vs. Standard ImPACT  vs. MMPI-2-RF  vs. WMT
Sensitivity (n = 118)
  MMPI-2-RF (0.84)                    p < .001
  WMT (0.60)                          p < .001
  ImPACT WMCD (0.74)                  p < .001             p = .088       p < .005
  ImPACT DMCD (0.69)                  p < .001             p < .050       p = .080
  ImPACT Total Symptom Score (0.75)   p < .001             p = .143       p < .05
Specificity (n = 124)
  MMPI-2-RF (0.74)                    p < .001
  WMT (0.85)                          p < .05
  ImPACT WMCD (0.66)                  p < .001             p = .220       p < .001
  ImPACT DMCD (0.65)                  p < .001             p = .161       p < .001
  ImPACT Total Symptom Score (0.68)   p < .001             p < .05        p < .001

Note: McNemar’s Tests comparing sensitivities use only participants from the simulating condition, while comparisons of specificities use only participants from the full effort condition.

Exploratory Score Thresholds and Base Rate Analyses

Classification accuracy statistics were calculated for exploratory score thresholds on the five existing and three proposed ImPACT validity indices to determine whether more liberal cutoffs may improve sensitivity without detrimentally affecting specificity. In selecting alternate score thresholds, sensitivity was maximized while a minimum specificity of 0.90 was preserved.
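One way to carry out the threshold selection just described is sketched below in Python. This is an illustration under stated assumptions rather than the procedure the authors ran: the function name, the candidate cutoffs, and the scores are hypothetical, and the index is assumed to be one on which low scores suggest poor effort, so that a protocol is flagged when its score falls below the cutoff.

```python
from typing import Optional, Sequence, Tuple


def best_low_score_cutoff(
    full_effort: Sequence[float],
    simulators: Sequence[float],
    candidates: Sequence[float],
    min_specificity: float = 0.90,
) -> Optional[Tuple[float, float, float]]:
    """Pick the cutoff with the highest sensitivity whose specificity meets the floor.

    A protocol is flagged when score < cutoff. Returns
    (cutoff, sensitivity, specificity), or None if no candidate qualifies.
    """
    best: Optional[Tuple[float, float, float]] = None
    for cutoff in candidates:
        # Sensitivity: proportion of simulators that are flagged.
        sens = sum(1 for s in simulators if s < cutoff) / len(simulators)
        # Specificity: proportion of full-effort participants that are not flagged.
        spec = sum(1 for s in full_effort if s >= cutoff) / len(full_effort)
        if spec >= min_specificity and (best is None or sens > best[1]):
            best = (cutoff, sens, spec)
    return best


# Hypothetical scores on a 0-15 scale; these are not data from the study.
full_effort_scores = [14, 15, 13, 12, 15, 14, 13, 15, 14, 11]
simulator_scores = [9, 10, 12, 13, 8, 11, 14, 10]

print(best_low_score_cutoff(full_effort_scores, simulator_scores, [9, 10, 11, 12, 13]))
```

For an index on which high scores suggest poor effort (for example, an error count), the two comparisons would be reversed. The specificity floor is a parameter because the acceptable false positive rate depends on the setting.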
Positive predictive power (PPP) and negative predictive power (NPP) were also calculated for hypothetical base rates of sandbagging as informed by the literature. These results are presented in Table 5.

Table 5. Classification accuracy statistics for exploratory score cutoffs

                                                   Predictive power (positive/negative) by base rate
Index                   Cutoff  Sens  Spec   BR 0.05    BR 0.10    BR 0.15    BR 0.20    BR 0.25
ImPACT (traditional)
  Impulse Control       >13     0.35  0.95   0.27/0.97  0.44/0.93  0.55/0.89  0.64/0.85  0.70/0.81
                        >11     0.44  0.92   0.22/0.97  0.38/0.94  0.49/0.90  0.58/0.87  0.65/0.83
                        >10     0.47  0.90   0.20/0.97  0.34/0.94  0.45/0.91  0.54/0.87  0.61/0.84
                        >9      0.48  0.87   0.10/0.97  0.19/0.94  0.27/0.91  0.34/0.87  0.41/0.84
  Word Memory LPC       <75     0.38  0.98   0.50/0.97  0.68/0.93  0.63/0.94  0.71/0.92  0.86/0.83
                        <78     0.44  0.96   0.37/0.97  0.55/0.94  0.66/0.91  0.73/0.87  0.79/0.84
                        <81     0.49  0.91   0.22/0.97  0.38/0.94  0.49/0.91  0.58/0.88  0.64/0.84
                        <83     0.58  0.89   0.22/0.98  0.37/0.95  0.48/0.92  0.57/0.89  0.64/0.86
  Design Memory LPC     <56     0.31  0.90   0.14/0.96  0.26/0.92  0.35/0.88  0.44/0.84  0.51/0.80
                        <62     0.39  0.84   0.11/0.96  0.21/0.79  0.30/0.89  0.38/0.85  0.45/0.81
                        <68     0.60  0.74   0.11/0.97  0.20/0.94  0.29/0.91  0.37/0.88  0.43/0.85
                        <74     0.68  0.67   0.10/0.98  0.19/0.95  0.27/0.92  0.34/0.89  0.41/0.86
  Xs & Os Tot. Incorrect  >20   0.21  0.98   0.36/0.96  0.54/0.92  0.65/0.88  0.72/0.83  0.78/0.79
                        >16     0.25  0.98   0.40/0.96  0.58/0.92  0.69/0.88  0.76/0.84  0.81/0.80
                        >12     0.30  0.97   0.34/0.96  0.53/0.93  0.64/0.89  0.71/0.85  0.77/0.81
                        >8      0.43  0.90   0.18/0.97  0.32/0.93  0.43/0.90  0.52/0.86  0.59/0.83
  3 Letters TLC         <9      0.25  0.98   0.40/0.96  0.58/0.92  0.69/0.88  0.76/0.84  0.81/0.80
                        <10     0.34  0.95   0.26/0.96  0.43/0.93  0.55/0.89  0.63/0.85  0.69/0.81
                        <11     0.42  0.94   0.27/0.97  0.44/0.94  0.55/0.90  0.64/0.87  0.70/0.83
                        <12     0.52  0.90   0.21/0.97  0.37/0.94  0.48/0.91  0.57/0.88  0.63/0.85
ImPACT (exploratory)
  WMCD                  <24     0.87  0.27   0.06/0.98  0.12/0.95  0.17/0.92  0.23/0.89  0.28/0.86
                        <20     0.61  0.81   0.14/0.98  0.26/0.95  0.36/0.92  0.45/0.89  0.52/0.86
                        <18     0.54  0.90   0.22/0.78  0.38/0.95  0.49/0.92  0.57/0.89  0.64/0.85
                        <16     0.43  0.95   0.31/0.97  0.49/0.94  0.60/0.90  0.68/0.87  0.74/0.83
  DMCD                  <18     0.77  0.57   0.09/0.98  0.17/0.96  0.24/0.93  0.31/0.91  0.37/0.88
                        <14     0.58  0.74   0.11/0.97  0.20/0.94  0.28/0.91  0.36/0.88  0.43/0.84
                        <12     0.40  0.85   0.12/0.96  0.23/0.93  0.32/0.89  0.40/0.85  0.47/0.81
                        <10     0.25  0.94   0.18/0.96  0.32/0.92  0.42/0.88  0.51/0.83  0.58/0.79
  Total Symptom Score   >26     0.72  0.73   0.12/0.98  0.23/0.96  0.32/0.94  0.40/0.91  0.47/0.89
                        >32     0.62  0.81   0.15/0.98  0.27/0.95  0.37/0.92  0.45/0.90  0.52/0.86
                        >38     0.58  0.86   0.18/0.97  0.32/0.95  0.42/0.92  0.51/0.89  0.58/0.86
                        >44     0.54  0.93   0.29/0.97  0.46/0.95  0.58/0.92  0.66/0.89  0.72/0.86

Note: Positive predictive powers represent the probability that an invalid protocol warning at the given score threshold is reflective of poor effort, rather than a false positive result. Negative predictive powers similarly represent the probability that a failure to detect poor effort is reflective of full effort. These probabilities vary by the base rate of poor effort. Sens = sensitivity; Spec = specificity; BR = base rate; LPC = Learning Percent Correct; Tot = total; TLC = Total Letters Correct.

In general, adjusted score thresholds marginally improved the sensitivities of individual validity indices while maintaining acceptable specificity, though rarely to above 50%. Because positive and negative predictive power depend on base rates, both were calculated for each presented alternate score threshold at various hypothetical base rates of sandbagging. The selected hypothetical base rates (5%, 10%, 15%, 20%, and 25%) were informed in part by the current literature on ImPACT protocol invalidity (ranging from 2.7% [Nelson et al., 2015] to 6.3% [Schatz, Moser, Solomon, Ott, & Karpf, 2012]), as well as by literature on the prevalence of malingering in forensic settings (Mittenberg, Patton, Canyock, & Condit, 2002).

Discussion

The present results indicate that the currently employed protocol validity indices of the ImPACT have high specificity at the expense of poorer sensitivity as compared to external performance validity measures.
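The dependence of predictive power on base rate reported in Table 5 follows from Bayes' theorem and can be sketched as below. This is a minimal illustration, not the authors' code: the Python function name is hypothetical, and the inputs are the rounded sensitivity and specificity printed in the table, so results for some rows may differ from the tabled values.

```python
from typing import Tuple


def predictive_powers(
    sensitivity: float, specificity: float, base_rate: float
) -> Tuple[float, float]:
    """Return (PPP, NPP) for a given base rate of poor effort.

    PPP = P(poor effort | flagged invalid)
    NPP = P(full effort | not flagged)
    """
    if not 0.0 < base_rate < 1.0:
        raise ValueError("base_rate must be strictly between 0 and 1")
    true_pos = sensitivity * base_rate
    false_pos = (1.0 - specificity) * (1.0 - base_rate)
    true_neg = specificity * (1.0 - base_rate)
    false_neg = (1.0 - sensitivity) * base_rate
    return true_pos / (true_pos + false_pos), true_neg / (true_neg + false_neg)


# Rounded values for the Impulse Control cutoff of >13 as printed in Table 5.
for base_rate in (0.05, 0.10, 0.15, 0.20, 0.25):
    ppp, npp = predictive_powers(0.35, 0.95, base_rate)
    print(f"BR={base_rate:.2f}  PPP={ppp:.2f}  NPP={npp:.2f}")
```

With these inputs the sketch gives 0.27/0.97 at a 5% base rate and 0.70/0.81 at a 25% base rate, in line with that row of Table 5. At low base rates even a highly specific index yields a modest PPP, because most flagged protocols then come from the large full-effort group.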
An exploration of alternate validity thresholds on the ImPACT revealed that the sensitivity of the current indices can be only marginally improved while maintaining high specificity, though two additional ImPACT validity indices previously proposed in the literature (WMCD and DMCD) and one novel index proposed here (Total Symptom Score) may have the potential to improve accuracy. Several aspects of these findings warrant further discussion.

The high specificity of the ImPACT’s protocol validity indices is particularly striking when compared to the specificities of external validity measures. The MMPI-2-RF and the WMT are frequently used alongside neuropsychological testing in forensic settings with TBI populations (Green, Flaro, & Courtney, 2009; Hartman, 2002; Iverson, Green, & Gervais, 1999; Tarescavage, Wygant, Gervais, & Ben-Porath, 2013) as measures of symptom and performance validity. In the present study, both of these measures were found to have significantly higher sensitivity, though lower specificity, to simulated SRC than the ImPACT validity indices (Table 4). Due to the potential for serious repercussions of false positives in forensic settings, tests used in these settings are held to a high standard of specificity. Comparatively, the costs associated with a false positive on the ImPACT (likely including only the time required for a second test administration) appear relatively minor. As a result, maximizing the identification of poor effort, despite some degree of sacrificed specificity, may benefit the safety of athletes. Further research in this area, including investigation of the performance of the ImPACT validity indices in a genuine sample of athletes at baseline testing, is needed.

However, it could also be argued that an increase in the number of test protocols identified as invalid (both correctly and incorrectly) has meaningful costs of its own. Greater numbers of invalid assessments would require additional resources from clinicians and may not necessarily translate to improved athlete care. A study on the implementation of the ImPACT across a variety of sports medicine settings indicated that, while almost all treatment settings using the ImPACT administer a baseline test (94.7%), only about half of these sites examine baseline protocol validity (54.8%) (Covassin, Elbin, Stiller-Ostrowski, & Kontos, 2009). Additionally, unnecessary retests may contribute to athletes’ negative feelings toward the test or prompt them to change their strategies during testing. In both situations, retest baseline scores may in fact be less representative of an athlete’s typical approach to testing, thus confounding observed post-injury changes.

Related to the above, it has been proposed within the literature that baseline testing may not be effective at capturing post-injury deficits, particularly due to low reliability (which is likely due, at least in part, to variable effort at baseline testing; Elbin, Schatz, & Covassin, 2011; Nakayama et al., 2014; Schatz & Ferris, 2013). Rather, comparison to normative data has been proposed as better suited to capture post-injury deficits in some settings (Echemendia et al., 2012; Randolph, 2011) and would resolve the issue of poor effort at baseline testing. On the other hand, there is research to suggest that athletes whose neurocognitive abilities are above or below the mean (e.g., individuals with ADHD or learning disorder) are at risk for misclassification at post-injury testing when compared only to normative data (Elbin et al., 2013; Schatz & Robertshaw, 2014). For these athletes specifically, baseline testing is well suited to accommodate individual differences, though only when baseline scores accurately represent uninjured abilities.
However, the baseline testing validity indices may function differently in these populations (Elbin et al., 2013; Johnson, Pardini, Sandel, & Lovell, 2014; Manderino & Gunstad, 2017). Although the prevalence of such histories did not differ between groups in the present study, the current findings may not be fully representative of the validity indices’ performance in athletes with such conditions. Continued research in this area may lead to improvements of the validity score thresholds for such athletes.

Further discussion of the two previously proposed validity indices, WMCD and DMCD, is also warranted. Consistent with past results (identifying 90%–100% of malingerers in a simulation design; Schatz & Glatts, 2013), these indices demonstrated higher sensitivity than the traditional ImPACT validity indices (0.74 sensitivity for WMCD, 0.69 for DMCD). This increased sensitivity was found to be at the expense of considerably lower specificity than the existing ImPACT validity indices. Lower specificity may not preclude the use of WMCD and DMCD as validity indices, due to the relatively low risk of false positives on baseline testing as discussed earlier. Test administrators should nonetheless note these limitations, because protocol invalidity identified by these indices warrants more cautious interpretation and investigation of cause. Even when a minimum specificity of 0.90 is maintained, the sensitivity of WMCD (0.54) is greater than that of the existing ImPACT indices, suggesting its potential to increase the overall sensitivity of the ImPACT.

The performance of the Total Symptom Score as a symptom validity index also encourages further investigation. While maintaining high levels of specificity, the Total Symptom Score demonstrated higher sensitivity to simulated SRC than any one of the existing validity indices alone. In the absence of SRC or other neurologic conditions, nonspecific SRC symptoms (e.g., fatigue, headache) may be experienced but are expected to be few in number and mild in severity, suggesting that a significant elevation of symptom reporting at baseline may be indicative of feigning regardless of neurocognitive performance. In this way, the Total Symptom Score may provide a measure of symptom validity, in addition to the existing validity indices that assess only performance validity, thus tapping into a validity-related construct not yet assessed by ImPACT. However, simulated malingering studies often yield effect sizes on effort measures much larger than those of actual malingerers (Vickery, Berry, Inman, Harris, & Orey, 2001). While it is possible that a naïve sandbagger would elevate the Total Symptom Score at baseline, it would take very little coaching or experience with the test for an athlete to adopt a more sophisticated approach and limit symptom reporting. For this reason, the threshold scores used in this study are unlikely to provide the same degree of clinical utility in genuine athletic settings. Further research on the Total Symptom Score, particularly using non-simulating study designs, is needed to determine its potential as a validity index.

The current findings are limited in several ways. First, testing was performed in group settings, which has been demonstrated in past work to affect test scores and protocol validity (Moser, Schatz, Neidzwski, & Ott, 2011). While this may introduce extraneous sources of variance, it does make the present results generalizable to the many athletic settings that administer the ImPACT to large numbers of athletes simultaneously. Second, while the standalone validity instruments used here were selected for their well-validated measures of effort, they may be too burdensome to be integrated into standard baseline testing.
Rather, they are employed here as points of comparison to inform use of the embedded ImPACT indices. Finally, as noted above, the ImPACT validity indices have been shown to function differently in athletes with certain histories (e.g., ADHD, learning disorders), and further research using such athletes and replication of the present results in a non-simulating sample is needed.

In summary, the high specificity and low sensitivity of the existing ImPACT protocol validity indices found in the present study, if replicated in a non-simulating study design, may diminish the clinical utility of the ImPACT’s user-friendly protocol validity warning in clinical settings. Clinicians who are better informed of the risks associated with premature return to play and of the limitations of the ImPACT may play a more active role in discouraging and identifying poor-effort baseline assessments. With continued improvements to and education on the ImPACT’s protocol validity indices, clinicians may most efficiently allocate resources to ensure athletes’ continued, safe athletic participation, relieving the burden of SRC and promoting safe return-to-play.

Supplementary Material

Supplementary material is available at Archives of Clinical Neuropsychology online.

Funding

None.

Conflict of Interest

None declared.

Acknowledgments

The authors wish to acknowledge Yossef Ben-Porath, PhD, for his input on study measures. We also thank Anthony Tarescavage, PhD, for his statistical guidance.

References

Allen, B. J., & Gfeller, J. D. (2011). The Immediate Post-Concussion Assessment and Cognitive Testing battery and traditional neuropsychological measures: A construct and concurrent validity study. Brain Injury, 25, 179–191.
Ben-Porath, Y. S., & Tellegen, A. (2008). The Minnesota Multiphasic Personality Inventory – 2 – Restructured Form: Manual for administration, scoring, and interpretation. Minneapolis, MN: University of Minnesota Press.
Cantu, R. C. (1998). Second-impact syndrome. Clinics in Sports Medicine, 17, 37–44.
Covassin, T., Elbin, R. J., Stiller-Ostrowski, J. L., & Kontos, A. P. (2009). Immediate Post-Concussion Assessment and Cognitive Testing (ImPACT) practices of sports medicine professionals. Journal of Athletic Training, 44, 639–644.
Echemendia, R. J., Bruce, J. M., Bailey, C. M., Sanders, J. F., Arnett, P., & Vargas, G. (2012). The utility of post-concussion neuropsychological data in identifying cognitive change following sports-related mTBI in the absence of baseline data. The Clinical Neuropsychologist, 26, 1077–1091.
Elbin, R. J., Kontos, A. P., Kegel, N., Johnson, E., Burkhart, S., & Schatz, P. (2013). Individual and combined effects of LD and ADHD on computerized neurocognitive concussion test performance: Evidence for separate norms. Archives of Clinical Neuropsychology, 28, 476–484.
Elbin, R. J., Schatz, P., & Covassin, T. (2011). One-year test-retest reliability of the online version of ImPACT in high school athletes. The American Journal of Sports Medicine, 39, 2319–2324.
Erdal, K. (2012). Neuropsychological testing for sports-related concussion: How athletes can sandbag their baseline testing without detection. Archives of Clinical Neuropsychology, 27, 473–479.
Green, P. (2003). Word Memory Test for Windows: User’s manual and program. Edmonton, Alberta, Canada: Author.
Green, P., Flaro, L., & Courtney, J. (2009). Examining false positives on the Word Memory Test in adults with mild traumatic brain injury. Brain Injury, 23, 741–750.
Hartman, D. E. (2002). The unexamined lie is a lie worth fibbing: Neuropsychological malingering and the Word Memory Test. Archives of Clinical Neuropsychology, 17, 709–714.
Iverson, G., Green, P., & Gervais, R. (1999). Using the word memory test to detect biased responding in head injury litigation. Journal of Cognitive Rehabilitation, 17(2), 4–8.
Johnson, E., Pardini, J., Sandel, N., & Lovell, M. (2014). Do athletes with dyslexia differ at baseline and/or at concussion post-injury assessment on a computer-based test battery? Archives of Clinical Neuropsychology, 29, 590–591.
Kelly, J. P., & Rosenberg, J. H. (1997). Diagnosis and management of concussion in sports. Neurology, 48, 575–580.
Lovell, M. R. (2015). ImPACT Test Administration and Interpretation Manual. Retrieved from http://www.impacttest.com.
Maerlender, A., Flashman, L., Kessler, A., Kumbhani, S., Greenwald, R., Tosteson, T., et al. (2010). Examination of the Construct Validity of ImPACT Computerized Test, traditional, and experimental neuropsychological measures. The Clinical Neuropsychologist, 24, 1309–1325.
Manderino, L., & Gunstad, J. (2017). Collegiate student-athletes with history of ADHD or academic difficulties are more likely to produce an invalid protocol on baseline ImPACT testing. Clinical Journal of Sports Medicine, doi: 10.1097/JSM.0000000000000433.
Meyers, J. E., Volbrecht, M., Axelrod, B. N., & Reinsch-Boothby, L. (2011). Embedded symptom validity tests and overall neuropsychological test performance. Archives of Clinical Neuropsychology, 26, 8–15.
Millis, S. R. (2009). Methodological challenges in assessment of cognition following head injury: Response to Malojcic et al. 2008. Journal of Neurotrauma, 26, 2409–2410.
Mittenberg, W., Patton, C., Canyock, E. M., & Condit, D. C. (2002). Base rates of malingering and symptom exaggeration. Journal of Clinical and Experimental Neuropsychology, 24, 1094–1102.
Moser, R. S., Schatz, P., Neidzwski, K., & Ott, S. D. (2011). Group versus individual administration affects baseline neurocognitive test performance. American Journal of Sports Medicine, 39, 2325–2330.
Nakayama, Y., Covassin, T., Schatz, P., Nogle, S., & Kovan, J. (2014). Examination of the test-retest reliability of a computerized neurocognitive test battery. The American Journal of Sports Medicine, 42, 2000–2005.
Nelson, L. D., Pfaller, A. Y., Rein, L. E., & McCrea, M. A. (2015). Rates and predictors of invalid baseline test performance in high school and collegiate athletes for 3 computerized neurocognitive tests. The American Journal of Sports Medicine, 42, 2018–2026.
Randolph, C. (2011). Baseline neuropsychological testing in managing sport-related concussion: Does it modify risk? Current Sports Medicine Reports, 10, 21–26.
Schatz, P., & Ferris, C. S. (2013). One-month test-retest reliability of the ImPACT test battery. Archives of Clinical Neuropsychology, 28(5), 499–504.
Schatz, P., & Glatts, C. (2013). “Sandbagging” baseline test performance on ImPACT, without detection, is more difficult than it appears. Archives of Clinical Neuropsychology, 28, 236–244.
Schatz, P., Moser, R. S., Solomon, G. S., Ott, S. D., & Karpf, R. (2012). Prevalence of Invalid Computerized Baseline Neurocognitive Test Results in high school and collegiate athletes. Journal of Athletic Training, 47, 289–296.
Schatz, P., & Robertshaw, S. (2014). Comparing post-concussive neurocognitive test data to normative data presents risks for under-classifying “above average” athletes. Archives of Clinical Neuropsychology, 29, 625–632.
Schatz, P., & Sandel, N. (2012). Sensitivity and Specificity of the Online Version of ImPACT in High School and Collegiate Athletes. The American Journal of Sports Medicine, 41, 321–326.
Tarescavage, A. M., Wygant, D. B., Gervais, R. O., & Ben-Porath, Y. S. (2013). Association between the MMPI-2 Restructured Form (MMPI-2-RF) and malingered neurocognitive dysfunction among non-head injury disability claimants. The Clinical Neuropsychologist, 27, 313–335.
Van Dyke, S. A., Millis, S. R., Axelrod, B. N., & Hanks, R. A. (2013). Assessing effort: Differentiating performance and symptom validity. The Clinical Neuropsychologist, 27, 1234–1246.
Vickery, C. D., Berry, D. T. R., Inman, T. H., Harris, M. J., & Orey, S. A. (2001). Detection of inadequate effort on neuropsychological testing: A meta-analytic review of selected procedures. Archives of Clinical Neuropsychology, 16, 45–73.

© The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Archives of Clinical Neuropsychology, Oxford University Press

Performance of the Immediate Post-Concussion Assessment and Cognitive Testing Protocol Validity Indices

Publisher
Oxford University Press
Copyright
© The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
ISSN
0887-6177
eISSN
1873-5843
D.O.I.
10.1093/arclin/acx102

Abstract

Objective: Immediate Post-Concussion Assessment and Cognitive Testing (ImPACT) is a computerized neuropsychological test battery. Athletes provide preseason baseline ImPACT scores to which post-injury scores can be compared to aid concussion diagnosis. However, if baseline scores are not accurately representative of abilities, the utility of post-injury score comparison is diminished. For this reason, ImPACT includes low score thresholds on five validity indices to identify insufficient effort at baseline, though evidence of these indices’ performance is limited. The present study examines the classification accuracy and concurrent validity of the existing ImPACT validity indices and three proposed indices (Word Memory Correct Distractors, Design Memory Correct Distractors, Total Symptom Score).

Methods: The ImPACT, Word Memory Test (WMT), and Minnesota Multiphasic Personality Inventory-2-Restructured Form (MMPI-2-RF) were administered to 242 undergraduate students. Participants were instructed either to give full effort on testing or to simulate sports-related concussion (SRC).

Results: Sensitivity of the existing ImPACT validity indices was marginally improved with adjusted score thresholds while maintaining acceptable specificity (0.90). Alternative score thresholds and novel validity indices demonstrated adequate specificity while improving sensitivity overall. Positive and negative predictive powers are provided to inform use of protocol validity indices across diverse treatment settings.

Conclusions: The existing ImPACT indices’ high specificity at the expense of lower sensitivity compared to external validity measures may under-identify poor effort, resulting in premature return-to-play decisions for athletes with concussion. Improvements or additions to the existing indices may raise sensitivity while maintaining acceptable specificity, aiding in the protection of athletes and safe athletic participation.
Keywords: Assessment, Head injury, Traumatic brain injury, Malingering/symptom validity testing

Introduction

The Immediate Post-Concussion Assessment and Cognitive Testing (ImPACT) is a widely used tool for neurocognitive evaluation in sports-related concussion (SRC), being one piece of a multifaceted approach toward assessment of readiness for return-to-play. SRC is known to result in acute cognitive, affective, and behavioral changes, and a second blow to the head prior to full resolution of an initial SRC has been associated with long-lasting impairment and even death (Cantu, 1998; Kelly & Rosenberg, 1997). As risk for adverse outcomes increases when players return to contact sport participation before full resolution of SRC, post-injury ImPACT scores can be compared to preseason baseline scores to aid in assessment of readiness for return-to-play.

However, not all baseline ImPACT protocols provide accurate assessments of pre-injury neurocognitive functioning. For example, some athletes may be distracted or fatigued at the time of testing and thus unable to give maximum performances (Lovell, 2015). In other instances, athletes may intentionally suppress scores, or “sandbag,” their baselines to obscure post-injury deficits in the event of later SRC (Lovell, 2015). Obscuring post-injury impairments, whether intentionally or unintentionally, may lead to premature return-to-play and increased risk of adverse outcomes.

In an effort to identify protocols in which an athlete’s true abilities are not captured, ImPACT uses embedded protocol validity indices that alert practitioners to uncommonly low scores. Specifically, five ImPACT scores are used as validity indices. A score beyond the predetermined validity threshold on any one of these five indices automatically triggers an invalid protocol warning on the ImPACT. The score thresholds for the five ImPACT validity indices are based on confidence intervals, such that 95% of athletes taking the ImPACT score higher than the validity threshold (Lovell, 2015). While scores in the fifth percentile and lower are by definition uncommon, they do not necessarily indicate poor effort. It has been demonstrated that athletes with preexisting conditions, such as attention deficit hyperactivity disorder (ADHD) or learning disorders (LD), score more poorly on the ImPACT and are more likely to produce a protocol flagged as invalid (Elbin et al., 2013; Johnson, Pardini, Sandel, & Lovell, 2014; Manderino & Gunstad, 2017; Schatz, Moser, Solomon, Ott, & Karpf, 2012).

Assessing effort, often referred to as validity testing, is common practice in neuropsychological assessment. Validity tests consist of tasks or questions that are more negatively impacted by effort than they are by genuine neurocognitive deficits. Such tasks may exist as standalone effort assessments, or may be embedded within another assessment, as in the case of the ImPACT. Moreover, effort tests can be differentiated by assessment of two separate constructs related to test validity. Performance validity tests attempt to capture poor effort on tests of neurocognitive ability, while symptom validity tests assess the genuineness of self-reported symptoms (Millis, 2009). Although more recent evidence has suggested that symptom and performance validity are not necessarily unitary (Van Dyke, Millis, Axelrod, & Hanks, 2013), it has long been held that the failure of a symptom validity test (SVT) reflects a generalized response bias in an individual’s approach to testing that may affect both neurocognitive performance and self-reported symptoms (Meyers, Volbrecht, Axelrod, & Reinsch-Boothby, 2011). For this reason, it can be informative to include assessments of both symptom validity, as would be assessed by self-report measures such as the Minnesota Multiphasic Personality Inventory-2-Restructured Form, and performance validity, as assessed by instruments such as the Word Memory Test, in neuropsychological testing.

Few studies have empirically investigated the ImPACT validity indices’ identification of poor effort. Erdal (2012) found that 11% of ex-collegiate athletes were able to successfully lower scores from their own baseline test performance without reaching threshold on any of the five embedded validity indices. Another study identified two scores within the ImPACT that mimic a common design for standalone performance validity tasks but are not currently used as validity indices. Schatz and Glatts (2013) found that the addition of the score thresholds Word Memory Correct Distractors (WMCD; Immediate + Delayed) < 22 and Design Memory Correct Distractors (DMCD; Immediate + Delayed) < 16 identified 95% of naïve malingerers and 100% of coached malingerers in a simulated malingering study. Without these novel indices, only 70% of naïve malingerers and 65% of coached malingerers were correctly identified as feigning (Schatz & Glatts, 2013). Additionally, the existing validity indices are limited to scores from the neurocognitive tests and thus reflect performance validity. While the ImPACT does not currently assess symptom validity, it does include a self-report measure of concussion symptoms, the Total Symptom Score. It is possible that examination of the Total Symptom Score may provide additional clinical information regarding sandbagging at baseline testing. The limited evidence on the performance of ImPACT’s protocol validity indices restricts clinicians’ ability to fully interpret validity scores and determine whether a retest is needed.
The present study sought to examine the ImPACT’s protocol validity classification accuracy through a simulated malingering design. Here, the performance of the ImPACT is compared to the performance of well-validated external measures of performance and symptom validity, to determine the ImPACT validity indices’ concurrent validity as measures of effort. It was hypothesized that: (1) the ImPACT will be less sensitive to poor effort than the MMPI-2-RF and the WMT; (2) sensitivity will be improved with more liberal validity score thresholds on the existing indices, and with the additions of WMCD and DMCD as additional performance validity indices, as well as Total Symptom Score as a symptom validity index; and (3) performance of the ImPACT validity indices will be affected by base rate of poor effort.

Methods

Participants

A total of 277 participants were recruited from the psychology department subject pool. Analyses were limited to participants who had complete data, spoke English as a first language, and appeared to follow testing instructions per their group assignment (n = 242). Participants were randomly assigned to either the simulating (n = 118) or control group (n = 124). Participant age ranged from 18 to 32 (mean age = 19.6 ± 1.99). A total of 23.1% of participants reported at least one previous concussion. Independent-samples t-tests showed that the simulating and control groups were similar in demographic characteristics but differed in years of education (simulating group: 12.92 ± 1.93 years, full effort group: 12.33 ± 2.18 years). See Table 1. Data regarding previous exposure to the ImPACT were available for a subset of participants (subset n = 102), of whom 32.4% reported that they had taken the ImPACT at least once in the past. This proportion did not significantly differ between groups (χ²(1) = 0.08, p = .47).

Table 1. Examination of group differences on key demographic and testing variables. Group values are Mean (SD) or %.

| Characteristic | Full effort (n = 124) | Simulators (n = 118) | Test statistic | p |
|---|---|---|---|---|
| Age | 19.43 (2.11) | 19.71 (1.85) | −1.11 | .27 |
| Years of education | 12.33 (2.18) | 12.92 (1.93) | −2.20 | .03 |
| Hours of sleep last night | 4.94 | 4.87 | 0.33 | .74 |
| History of speech therapy | 16.9% | 16.1% | 0.30 | .50 |
| History of special education classes | 4.8% | 5.1% | 0.01 | .58 |
| Repeated one or more years of school | 6.5% | 3.4% | 1.20 | .21 |
| Diagnosed with: Learning disability | 6.5% | 5.1% | 0.21 | .43 |
| Diagnosed with: Dyslexia | 0.8% | 1.7% | 0.39 | .48 |
| Diagnosed with: ADHD/ADD | 9.7% | 10.2% | 0.02 | .53 |
| Diagnosed with: Autism | 0.8% | 0.0% | 0.96 | .51 |
| History of treatment for: Headaches | 17.7% | 26.3% | 2.57 | .07 |
| History of treatment for: Migraines | 15.3% | 18.6% | 0.47 | .30 |
| History of treatment for: Epilepsy/seizures | 3.2% | 0.8% | 1.69 | .20 |
| History of treatment for: Brain surgery | 0.0% | 0.0% | n/a | n/a |
| History of treatment for: Meningitis | 0.0% | 0.0% | n/a | n/a |
| History of treatment for: Substance/alcohol abuse | 3.2% | 2.5% | 0.10 | .53 |
| History of treatment for: Psychiatric condition | 21.0% | 28.0% | 1.61 | .13 |
| History of ≥1 concussion | 25.0% | 22.0% | 0.30 | .35 |
| Number of previous concussions | 1.66 (1.37) | 1.85 (1.57) | −0.46 | .65 |

Measures

Immediate Post-Concussion Assessment and Cognitive Testing (ImPACT) (Lovell, 2015): The ImPACT is a computerized test of neurocognitive function that has been well-validated for use in athletes with suspected SRC (Allen & Gfeller, 2011; Elbin, Schatz, & Covassin, 2011; Maerlender et al., 2010; Nakayama, Covassin, Schatz, Nogle, & Kovan, 2014; Schatz & Sandel, 2012). ImPACT includes six subtest modules that are used to produce five neurocognitive composite scores. It also includes five scores with validity thresholds which, when surpassed, trigger an invalid protocol warning: Xs and Os Total Incorrect > 30, Impulse Control Composite > 30, Word Memory Learning Percent Correct < 69%, Design Memory Learning Percent Correct < 50%, and Three Letters Total Letters Correct < 8. The proposed validity indices, WMCD and DMCD, are calculated by summing the correct distractor (i.e., true negative) scores from the immediate and delayed recall portions of the Word Memory and Design Memory subtests, respectively (Schatz & Glatts, 2013). The ImPACT also includes a total concussion symptom score. Examinees are presented with a list of 22 common SRC symptoms (e.g., headache, nausea, dizziness) and are asked to rate the severity of each symptom on a 6-point Likert scale ranging from 1 (minor discomfort) to 6 (severe) or to check a box to indicate that they are not experiencing the symptom at all. The Total Symptom Score is the sum of all symptom ratings.

Word Memory Test (WMT) (Green, 2003): The WMT is a computerized, forced-choice recognition symptom validity test. Participants are presented with 10 pairs of semantically related words and, in a series of immediate and delayed tasks, must demonstrate memory of the learned list.
The WMT has demonstrated good sensitivity and specificity as a measure of effort in a variety of patient samples (Green, 2003; Green, Flaro, & Courtney, 2009).

Minnesota Multiphasic Personality Inventory-2-Restructured Form (MMPI-2-RF) (Ben-Porath & Tellegen, 2008): The MMPI-2-RF contains 338 True/False items regarding aspects of personality and psychopathology, from which clinical and validity scales are derived. The final protocol includes five over-reporting validity scales that are well-validated measures of poor effort and malingering (Tarescavage, Wygant, Gervais, & Ben-Porath, 2013). These five scales capture responses that are infrequent in the normative population (F-r scale), responses that are infrequent in a population of individuals with psychopathology (Fp-r scale), and somatic complaints that are infrequent in medical and chronic pain patients (Fs scale). The Fp-r and Fs scales provide comparisons for the F-r scale, which can be confounded by genuine, albeit infrequent, psychological and somatic complaints. The MMPI-2-RF also includes a scale for detection of exaggerated symptom reporting (FBS-r) and a response bias scale as a measure of feigned test performance (RBS).

Procedures

All study procedures were approved by the local ethical review board, and all participants provided written informed consent before participating in the study. Participants completed computerized measures in groups of approximately 10–20 per testing session in a university computer laboratory. Groups of participants were randomly assigned to either the control or simulating condition (all participants in a given session were in the same condition), and instructions were read aloud by a study team member at the beginning of the testing session (for simulating instructions, see Supplementary Material online, Appendix A). Instructions to either put forth full effort or simulate a concussion were also reiterated by computerized prompts. Participants in both groups were informed that the tests they would be taking were designed to identify individuals not putting forth full effort, and that only those successfully putting forth full effort or feigning without detection would be entered into a gift card raffle, in order to provide external incentive for successful sandbagging.

All participants received the computerized tests in a standardized order to optimize timing of the delayed recall portion of the WMT: (1) Demographic survey, (2) MMPI-2-RF, (3) WMT Learning, (4) ImPACT Test, (5) WMT Delayed Recall, and (6) Exit survey. This ordering allowed for an average delay of approximately 30 min between WMT Learning and Delayed Recall. The exit survey asked participants to honestly describe their group instructions as a validity check for feigning and to answer questions regarding the amount of effort put into following group instructions.

Protocols for each measure (i.e., WMT, ImPACT, MMPI-2-RF) were initially classified as invalid based on published score thresholds (Table 2; Ben-Porath & Tellegen, 2008; Green, 2003; Lovell, 2015; Schatz & Glatts, 2013). For scales with multiple published validity thresholds, the most liberal cutoffs were used, as there was no presumed impairment in the sample of healthy college students.

Table 2. Number of participants properly and improperly identified by published validity score thresholds

| Index | Cutoff | True + | False + | False – | True – | Sens | Spec | PPP | NPP |
|---|---|---|---|---|---|---|---|---|---|
| ImPACT (standard), overall | | 50 | 8 | 68 | 116 | 0.42 | 0.94 | 0.86 | 0.63 |
| Impulse Control Composite | >30 | 12 | 0 | 106 | 124 | 0.10 | 1.00 | 1.00 | 0.54 |
| Word Memory LPC | <69% | 40 | 3 | 78 | 121 | 0.34 | 0.98 | 0.93 | 0.61 |
| Design Memory LPC | <50% | 15 | 4 | 103 | 120 | 0.13 | 0.97 | 0.79 | 0.54 |
| Xs and Os Total Incorrect | >30 | 8 | 0 | 110 | 124 | 0.07 | 1.00 | 1.00 | 0.53 |
| 3 Letters TLC | <8 | 23 | 3 | 95 | 121 | 0.19 | 0.98 | 0.88 | 0.56 |
| MMPI-2-RF, overall | | 99 | 32 | 19 | 92 | 0.84 | 0.74 | 0.76 | 0.83 |
| F-r | ≥79 | 71 | 20 | 47 | 104 | 0.60 | 0.84 | 0.78 | 0.69 |
| Fp-r | ≥70 | 64 | 16 | 54 | 108 | 0.54 | 0.87 | 0.80 | 0.67 |
| Fs | ≥80 | 83 | 18 | 35 | 106 | 0.70 | 0.85 | 0.82 | 0.75 |
| FBS-r | ≥80 | 52 | 7 | 66 | 117 | 0.44 | 0.94 | 0.88 | 0.64 |
| RBS | ≥80 | 58 | 6 | 60 | 118 | 0.49 | 0.95 | 0.91 | 0.66 |
| WMT, overall | | 71 | 19 | 47 | 105 | 0.60 | 0.85 | 0.79 | 0.69 |
| Immediate Recall | <90 | 58 | 6 | 60 | 118 | 0.49 | 0.95 | 0.91 | 0.66 |
| Delayed Recall | <90 | 58 | 7 | 60 | 117 | 0.49 | 0.94 | 0.89 | 0.66 |
| Consistency Score | <90 | 70 | 18 | 48 | 106 | 0.59 | 0.85 | 0.80 | 0.69 |
| ImPACT (exploratory): WMCD | <22 | 87 | 42 | 31 | 82 | 0.74 | 0.66 | 0.67 | 0.73 |
| ImPACT (exploratory): DMCD | <16 | 82 | 43 | 36 | 81 | 0.69 | 0.65 | 0.66 | 0.69 |
| ImPACT (exploratory): Total Symptom Score | >20 | 89 | 39 | 29 | 81 | 0.75 | 0.68 | 0.70 | 0.74 |

Note: All cutoffs are published validity thresholds, except for Total Symptom Score, a novel index. True + = simulating condition participants accurately identified as sandbagging; False + = control condition participants inaccurately identified as sandbagging; False – = simulating condition participants inaccurately identified as giving adequate effort; True – = control condition participants accurately identified as giving adequate effort; Sens = sensitivity; Spec = specificity; PPP = positive predictive power; NPP = negative predictive power; LPC = Learning Percent Correct; TLC = Total Letters Correct.
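The accuracy statistics in Table 2 follow directly from the four cell counts in each row. The sketch below is illustrative only and is not the authors' analysis code: the class and function names are hypothetical, and the base-rate adjustment is the standard Bayes' theorem form, included on the assumption that it matches how predictive power would be recomputed for settings whose rate of poor effort differs from the roughly 49% (118 of 242) in this sample.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Counts:
    """Cell counts for one row of Table 2 (hypothetical container)."""
    true_pos: int   # simulators flagged as invalid
    false_pos: int  # full-effort controls flagged as invalid
    false_neg: int  # simulators not flagged
    true_neg: int   # full-effort controls not flagged


def sensitivity(c: Counts) -> float:
    # Proportion of simulators that the index flags.
    return c.true_pos / (c.true_pos + c.false_neg)


def specificity(c: Counts) -> float:
    # Proportion of full-effort controls that the index does not flag.
    return c.true_neg / (c.true_neg + c.false_pos)


def ppp(c: Counts) -> float:
    # Positive predictive power: flagged protocols that are truly simulated.
    return c.true_pos / (c.true_pos + c.false_pos)


def npp(c: Counts) -> float:
    # Negative predictive power: unflagged protocols that are truly full effort.
    return c.true_neg / (c.true_neg + c.false_neg)


def predictive_power_at_base_rate(
    sens: float, spec: float, base_rate: float
) -> Tuple[float, float]:
    """Return (PPP, NPP) at a hypothetical base rate of poor effort."""
    tp = sens * base_rate
    fn = (1 - sens) * base_rate
    fp = (1 - spec) * (1 - base_rate)
    tn = spec * (1 - base_rate)
    return tp / (tp + fp), tn / (tn + fn)


# Counts for the standard ImPACT indices as a whole, taken from Table 2.
impact_standard = Counts(true_pos=50, false_pos=8, false_neg=68, true_neg=116)

print(round(sensitivity(impact_standard), 2), round(specificity(impact_standard), 2))
print(round(ppp(impact_standard), 2), round(npp(impact_standard), 2))

# The same index at an assumed 10% base rate of poor effort.
print(predictive_power_at_base_rate(
    sensitivity(impact_standard), specificity(impact_standard), 0.10))
```

Because sensitivity and specificity do not depend on how common poor effort is, they can be carried over to a new setting, whereas PPP and NPP have to be recomputed for each assumed base rate.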
Results Group Differences on Validity Indices A series of t-tests was performed on all standard and proposed validity indices to examine group differences for full-effort and sandbagging performances, to ensure that individuals in the feigning group effectively suppressed test scores and inflated symptom report. Significant group differences were found on the five ImPACT validity indices, three WMT validity indices, and five MMPI-2-RF validity indices (p < .001). Significant group differences were also observed on all three ImPACT exploratory validity indices (WMCD, DMCD, and Total Symptom Score) (Table 3). Table 3. Results of t-tests examining group differences on validity indices Index  T  df  p  Mean (SD)  Effect size  Full effort  Simulators    ImPACT (standard)   Impulse Control Composite*  −6.23  136.10  <.001  5.26 (3.99)  13.37 (13.61)  0.81   Word Memory Learning % Correct*  8.61  167.45  <.001  94.14 (8.86)  78.14 (18.22)  1.12   Design Memory Learning % Correct  6.75  240  <.001  80.56 (15.82)  66.64 (16.25)  0.87   Xs and Os Total Incorrect*  −5.97  141.62  <.001  4.97 (3.75)  11.47 (11.24)  0.78   3 Letters Total Letters Correct*  7.79  180.38  <.001  13.74 (1.95)  10.87 (3.52)  1.01  MMPI-2-RF   F-r*  −9.46  211.93  <.001  60.12 (18.86)  87.97 (26.18)  1.22   Fp-r*  −7.17  203.70  <.001  58.97 (16.27)  78.04 (24.16)  0.93   Fs*  −11.27  203.40  <.001  60.19 (17.26)  92.04 (25.68)  1.46   FBS-r*  −10.58  228.22  <.001  55.70 (12.55)  74.58 (15.02)  1.36   RBS*  −12.52  227.36  <.001  57.05 (15.63)  85.02 (18.88)  1.61  WMT   Immediate Recall*  8.52  130.85  <.001  97.52 (4.53)  82.56 (18.55)  1.11   Delayed Recall*  8.79  130.85  <.001  96.53 (4.95)  80.04 (19.81)  1.14   Consistency Score*  9.80  140.07  <.001  95.06 (6.10)  77.21 (18.88)  1.27  ImPACT (exploratory)   Word Memory Correct Distractors*  8.06  197.58  <.001  21.47 (3.37)  16.86 (5.26)  1.04   Design Memory Correct Distractors  6.45  240  <.001  17.59 (5.20)  13.34 (5.03)  0.83   Total 
Symptom Score*  −9.74  195.09  <.001  17.19 (17.76)  46.94 (28.29)  1.26

Table 3. Results of t-tests examining group differences on validity indices

                                                           Mean (SD)
Index                               t       df      p      Full effort    Simulators     Effect size
ImPACT (standard)
Impulse Control Composite*          −6.23   136.10  <.001  5.26 (3.99)    13.37 (13.61)  0.81
Word Memory Learning % Correct*     8.61    167.45  <.001  94.14 (8.86)   78.14 (18.22)  1.12
Design Memory Learning % Correct    6.75    240     <.001  80.56 (15.82)  66.64 (16.25)  0.87
Xs and Os Total Incorrect*          −5.97   141.62  <.001  4.97 (3.75)    11.47 (11.24)  0.78
3 Letters Total Letters Correct*    7.79    180.38  <.001  13.74 (1.95)   10.87 (3.52)   1.01
MMPI-2-RF
F-r*                                −9.46   211.93  <.001  60.12 (18.86)  87.97 (26.18)  1.22
Fp-r*                               −7.17   203.70  <.001  58.97 (16.27)  78.04 (24.16)  0.93
Fs*                                 −11.27  203.40  <.001  60.19 (17.26)  92.04 (25.68)  1.46
FBS-r*                              −10.58  228.22  <.001  55.70 (12.55)  74.58 (15.02)  1.36
RBS*                                −12.52  227.36  <.001  57.05 (15.63)  85.02 (18.88)  1.61
WMT
Immediate Recall*                   8.52    130.85  <.001  97.52 (4.53)   82.56 (18.55)  1.11
Delayed Recall*                     8.79    130.85  <.001  96.53 (4.95)   80.04 (19.81)  1.14
Consistency Score*                  9.80    140.07  <.001  95.06 (6.10)   77.21 (18.88)  1.27
ImPACT (exploratory)
Word Memory Correct Distractors*    8.06    197.58  <.001  21.47 (3.37)   16.86 (5.26)   1.04
Design Memory Correct Distractors   6.45    240     <.001  17.59 (5.20)   13.34 (5.03)   0.83
Total Symptom Score*                −9.74   195.09  <.001  17.19 (17.76)  46.94 (28.29)  1.26

*Equal variances not assumed.

Classification Accuracy of Validity Indices

Sensitivity and specificity were calculated for all validity indices, to characterize their abilities to differentiate sandbagging from full effort. Classification accuracy statistics initially employed the most liberal published cutoffs of each instrument (ImPACT, MMPI-2-RF, and WMT; Ben-Porath & Tellegen, 2008; Green, 2003; Lovell, 2015; Schatz & Glatts, 2013). Table 2 presents classification accuracy statistics for each instrument as a whole (e.g., the sensitivity and specificity for the ImPACT overall, as a product of identification by any one of the five validity indices), as well as each index individually. As the Total Symptom Score is suggested here for the first time as a possible validity index, a score threshold has not been previously proposed. Group means and standard deviations were examined to determine a proposed Total Symptom Score invalidity threshold of >20 for initial classification accuracy investigations, though other score thresholds are also presented below.
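The instrument-level rule described above (a protocol counts as invalid when any one of the five indices crosses its cutoff) and the resulting sensitivity and specificity can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the index names, the four made-up protocols, and the cutoff values are placeholders, not the published ImPACT thresholds or the study data.

```python
from typing import Dict, List, Tuple

# (direction, cutoff): ">" flags scores above the cutoff, "<" flags scores below it.
# Placeholder values for illustration only; these are NOT the published cutoffs.
PLACEHOLDER_CUTOFFS: Dict[str, Tuple[str, float]] = {
    "impulse_control": (">", 20.0),
    "word_memory_lpc": ("<", 70.0),
    "design_memory_lpc": ("<", 50.0),
    "xs_os_incorrect": (">", 20.0),
    "three_letters_tlc": ("<", 8.0),
}


def failed_indices(scores: Dict[str, float],
                   cutoffs: Dict[str, Tuple[str, float]]) -> List[str]:
    """Return the names of the indices whose cutoff this protocol crosses."""
    failed = []
    for name, (direction, cutoff) in cutoffs.items():
        value = scores[name]
        if (direction == ">" and value > cutoff) or (direction == "<" and value < cutoff):
            failed.append(name)
    return failed


def is_flagged(scores: Dict[str, float],
               cutoffs: Dict[str, Tuple[str, float]]) -> bool:
    """Instrument-level rule: the protocol is invalid if any single index is failed."""
    return len(failed_indices(scores, cutoffs)) > 0


def sensitivity_specificity(flags: List[bool],
                            simulating: List[bool]) -> Tuple[float, float]:
    """Sensitivity = flagged simulators / all simulators;
    specificity = unflagged full-effort protocols / all full-effort protocols."""
    true_pos = sum(1 for f, s in zip(flags, simulating) if f and s)
    false_neg = sum(1 for f, s in zip(flags, simulating) if not f and s)
    true_neg = sum(1 for f, s in zip(flags, simulating) if not f and not s)
    false_pos = sum(1 for f, s in zip(flags, simulating) if f and not s)
    return true_pos / (true_pos + false_neg), true_neg / (true_neg + false_pos)


# Four made-up protocols: (scores, was the participant instructed to simulate?)
protocols = [
    ({"impulse_control": 25, "word_memory_lpc": 90, "design_memory_lpc": 80,
      "xs_os_incorrect": 5, "three_letters_tlc": 14}, True),
    ({"impulse_control": 6, "word_memory_lpc": 85, "design_memory_lpc": 70,
      "xs_os_incorrect": 4, "three_letters_tlc": 12}, True),
    ({"impulse_control": 5, "word_memory_lpc": 95, "design_memory_lpc": 82,
      "xs_os_incorrect": 3, "three_letters_tlc": 14}, False),
    ({"impulse_control": 4, "word_memory_lpc": 92, "design_memory_lpc": 45,
      "xs_os_incorrect": 6, "three_letters_tlc": 13}, False),
]

flags = [is_flagged(scores, PLACEHOLDER_CUTOFFS) for scores, _ in protocols]
simulating = [sim for _, sim in protocols]
sens, spec = sensitivity_specificity(flags, simulating)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Because `failed_indices` returns the list of failed indices rather than a single yes/no, the same helper could also be used to count how many thresholds a flagged protocol surpasses. The sketch does not guard against a group with no members, which would cause a division by zero.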
Overall, the ImPACT demonstrated very high specificity (0.94) at the expense of lower sensitivity (0.42). Consequently, positive predictive power (PPP; 0.86) was higher than negative predictive power (NPP; 0.63). The three proposed ImPACT validity indices demonstrated comparatively high sensitivity rates, though with low specificity. Specifically, WMCD demonstrated a sensitivity of 0.74 with specificity of 0.66, while DMCD demonstrated a sensitivity of 0.69 and a specificity of 0.65, and Total Symptom Score demonstrated a sensitivity of 0.75 and a specificity of 0.68.

Further, we examined how many of the five validity index score thresholds were surpassed by protocols that were flagged invalid by existing ImPACT protocol validity indices. Of the 58 protocols that were flagged as invalid by the standard ImPACT indices, 50.0% surpassed the validity threshold for only one index, 29.3% invalidated two indices, 6.9% invalidated three indices, and 12.1% invalidated four indices. Only one protocol (1.7%) surpassed thresholds on all five standard ImPACT validity indices.

Sensitivity and Specificity Comparisons for Existing Score Cutoffs

Classification accuracies for the standard ImPACT as a whole and the three exploratory ImPACT indices were then statistically compared to those of the MMPI-2-RF and the WMT using McNemar’s Tests to determine whether any current or exploratory score thresholds can reach the same level of performance as these gold-standard instruments. These results are presented in Table 4. Overall, the sensitivity of the standard ImPACT was significantly lower than the sensitivity rates of the MMPI-2-RF and the WMT, as well as each of the three exploratory validity indices alone. The specificity of the standard ImPACT was significantly higher than those of the MMPI-2-RF and the WMT, as well as all three exploratory ImPACT indices.

Table 4.
Resulting p values from McNemar’s Tests, comparing sensitivities and specificities of the ImPACT validity indices to WMT and MMPI-2-RF indices, using only participants from simulating condition (n = 118)

                                      Comparison with
Index                                 Standard ImPACT   MMPI-2-RF   WMT
Sensitivity (n = 118)
MMPI-2-RF (0.84)                      p < .001
WMT (0.60)                            p < .001
ImPACT WMCD (0.74)                    p < .001          p = .088    p < .005
ImPACT DMCD (0.69)                    p < .001          p < .050    p = .080
ImPACT Total Symptom Score (0.75)     p < .001          p = .143    p < .05
Specificity (n = 124)
MMPI-2-RF (0.74)                      p < .001
WMT (0.85)                            p < .05
ImPACT WMCD (0.66)                    p < .001          p = .220    p < .001
ImPACT DMCD (0.65)                    p < .001          p = .161    p < .001
ImPACT Total Symptom Score (0.68)     p < .001          p < .05     p < .001

Note: McNemar’s Tests comparing sensitivities use only participants from the simulating condition, while comparisons of specificities use only participants from the full effort condition.

Exploratory Score Thresholds and Base Rate Analyses

Classification accuracy statistics were calculated for exploratory score thresholds on the five existing and three proposed ImPACT validity indices to determine whether more liberal cutoffs may improve sensitivity without detrimentally affecting specificity. In selecting alternate score thresholds, sensitivity was maximized while a minimum specificity of 0.90 was preserved.
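The selection rule just described, together with the base-rate-adjusted predictive powers reported in Table 5, can be sketched in Python as follows. This is a hedged illustration rather than the authors' procedure: the function names and the score lists are made up, and the predictive-power formulas are the standard Bayes' theorem forms, which the text does not spell out.

```python
from typing import Optional, Sequence, Tuple


def rates_at_cutoff(sim_scores: Sequence[float], full_scores: Sequence[float],
                    cutoff: float, flag_below: bool) -> Tuple[float, float]:
    """Sensitivity and specificity when scores past `cutoff` are flagged as invalid."""
    def flagged(score: float) -> bool:
        return score < cutoff if flag_below else score > cutoff

    sens = sum(flagged(s) for s in sim_scores) / len(sim_scores)
    spec = sum(not flagged(s) for s in full_scores) / len(full_scores)
    return sens, spec


def best_cutoff(sim_scores: Sequence[float], full_scores: Sequence[float],
                candidates: Sequence[float], flag_below: bool,
                min_spec: float = 0.90) -> Optional[Tuple[float, float, float]]:
    """Among candidates with specificity >= min_spec, return the
    (cutoff, sensitivity, specificity) triple with the highest sensitivity."""
    best = None
    for cutoff in candidates:
        sens, spec = rates_at_cutoff(sim_scores, full_scores, cutoff, flag_below)
        if spec >= min_spec and (best is None or sens > best[1]):
            best = (cutoff, sens, spec)
    return best


def predictive_power(sens: float, spec: float, base_rate: float) -> Tuple[float, float]:
    """PPP and NPP from Bayes' theorem for a given base rate of poor effort."""
    ppp = sens * base_rate / (sens * base_rate + (1 - spec) * (1 - base_rate))
    npp = spec * (1 - base_rate) / (spec * (1 - base_rate) + (1 - sens) * base_rate)
    return ppp, npp


# Made-up percent-correct scores (lower = more suspicious); not study data.
sim_scores = [60, 65, 70, 72, 78, 80, 85, 88, 90, 95]
full_scores = [74, 82, 84, 86, 88, 90, 92, 94, 96, 98]
choice = best_cutoff(sim_scores, full_scores, candidates=[75, 78, 81, 83], flag_below=True)
print("selected (cutoff, sensitivity, specificity):", choice)

# Rounded Impulse Control > 13 entries from Table 5: sensitivity 0.35, specificity 0.95.
for base_rate in (0.05, 0.10, 0.15, 0.20, 0.25):
    ppp, npp = predictive_power(0.35, 0.95, base_rate)
    print(f"base rate {base_rate:.2f}: PPP={ppp:.2f}, NPP={npp:.2f}")
```

With the made-up scores, the cutoff of 83 is rejected because specificity falls to 0.80, so 81 is selected. For the Impulse Control > 13 row at a base rate of 0.05, the formulas give roughly 0.27 and 0.97, in line with Table 5. Because the inputs are rounded to two decimals, other table entries may differ slightly from values computed this way.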
Positive predictive power (PPP) and negative predictive power (NPP) were also calculated for hypothetical base rates of sandbagging as informed by the literature. These results are presented in Table 5.

Table 5. Classification accuracy statistics for exploratory score cutoffs

                                            Predictive power (Positive/Negative) by base rate (BR)
Index                   Cutoff  Sens  Spec  0.05       0.10       0.15       0.20       0.25
ImPACT (traditional)
Impulse Control         >13     0.35  0.95  0.27/0.97  0.44/0.93  0.55/0.89  0.64/0.85  0.70/0.81
Impulse Control         >11     0.44  0.92  0.22/0.97  0.38/0.94  0.49/0.90  0.58/0.87  0.65/0.83
Impulse Control         >10     0.47  0.90  0.20/0.97  0.34/0.94  0.45/0.91  0.54/0.87  0.61/0.84
Impulse Control         >9      0.48  0.87  0.10/0.97  0.19/0.94  0.27/0.91  0.34/0.87  0.41/0.84
Word Memory LPC         <75     0.38  0.98  0.50/0.97  0.68/0.93  0.63/0.94  0.71/0.92  0.86/0.83
Word Memory LPC         <78     0.44  0.96  0.37/0.97  0.55/0.94  0.66/0.91  0.73/0.87  0.79/0.84
Word Memory LPC         <81     0.49  0.91  0.22/0.97  0.38/0.94  0.49/0.91  0.58/0.88  0.64/0.84
Word Memory LPC         <83     0.58  0.89  0.22/0.98  0.37/0.95  0.48/0.92  0.57/0.89  0.64/0.86
Design Memory LPC       <56     0.31  0.90  0.14/0.96  0.26/0.92  0.35/0.88  0.44/0.84  0.51/0.80
Design Memory LPC       <62     0.39  0.84  0.11/0.96  0.21/0.79  0.30/0.89  0.38/0.85  0.45/0.81
Design Memory LPC       <68     0.60  0.74  0.11/0.97  0.20/0.94  0.29/0.91  0.37/0.88  0.43/0.85
Design Memory LPC       <74     0.68  0.67  0.10/0.98  0.19/0.95  0.27/0.92  0.34/0.89  0.41/0.86
Xs & Os Tot. Incorrect  >20     0.21  0.98  0.36/0.96  0.54/0.92  0.65/0.88  0.72/0.83  0.78/0.79
Xs & Os Tot. Incorrect  >16     0.25  0.98  0.40/0.96  0.58/0.92  0.69/0.88  0.76/0.84  0.81/0.80
Xs & Os Tot. Incorrect  >12     0.30  0.97  0.34/0.96  0.53/0.93  0.64/0.89  0.71/0.85  0.77/0.81
Xs & Os Tot. Incorrect  >8      0.43  0.90  0.18/0.97  0.32/0.93  0.43/0.90  0.52/0.86  0.59/0.83
3 Letters TLC           <9      0.25  0.98  0.40/0.96  0.58/0.92  0.69/0.88  0.76/0.84  0.81/0.80
3 Letters TLC           <10     0.34  0.95  0.26/0.96  0.43/0.93  0.55/0.89  0.63/0.85  0.69/0.81
3 Letters TLC           <11     0.42  0.94  0.27/0.97  0.44/0.94  0.55/0.90  0.64/0.87  0.70/0.83
3 Letters TLC           <12     0.52  0.90  0.21/0.97  0.37/0.94  0.48/0.91  0.57/0.88  0.63/0.85
ImPACT (exploratory)
WMCD                    <24     0.87  0.27  0.06/0.98  0.12/0.95  0.17/0.92  0.23/0.89  0.28/0.86
WMCD                    <20     0.61  0.81  0.14/0.98  0.26/0.95  0.36/0.92  0.45/0.89  0.52/0.86
WMCD                    <18     0.54  0.90  0.22/0.78  0.38/0.95  0.49/0.92  0.57/0.89  0.64/0.85
WMCD                    <16     0.43  0.95  0.31/0.97  0.49/0.94  0.60/0.90  0.68/0.87  0.74/0.83
DMCD                    <18     0.77  0.57  0.09/0.98  0.17/0.96  0.24/0.93  0.31/0.91  0.37/0.88
DMCD                    <14     0.58  0.74  0.11/0.97  0.20/0.94  0.28/0.91  0.36/0.88  0.43/0.84
DMCD                    <12     0.40  0.85  0.12/0.96  0.23/0.93  0.32/0.89  0.40/0.85  0.47/0.81
DMCD                    <10     0.25  0.94  0.18/0.96  0.32/0.92  0.42/0.88  0.51/0.83  0.58/0.79
Total Symptom Score     >26     0.72  0.73  0.12/0.98  0.23/0.96  0.32/0.94  0.40/0.91  0.47/0.89
Total Symptom Score     >32     0.62  0.81  0.15/0.98  0.27/0.95  0.37/0.92  0.45/0.90  0.52/0.86
Total Symptom Score     >38     0.58  0.86  0.18/0.97  0.32/0.95  0.42/0.92  0.51/0.89  0.58/0.86
Total Symptom Score     >44     0.54  0.93  0.29/0.97  0.46/0.95  0.58/0.92  0.66/0.89  0.72/0.86

Note: Positive predictive powers represent the probability that an invalid protocol warning at the given score threshold is reflective of poor effort, rather than a false positive result. Negative predictive powers similarly represent the probability that a failure to detect poor effort is reflective of full effort. These probabilities vary by the base rate of poor effort. Sens = sensitivity; Spec = specificity; BR = base rate; LPC = Learning Percent Correct; Tot = total; TLC = Total Letters Correct.

Generally, adjusted score thresholds marginally improved the sensitivities of individual validity indices while maintaining acceptable specificity, though rarely to above 50%. Because positive and negative predictive powers are affected by base rates, they were calculated for each alternate score threshold at several hypothetical base rates of sandbagging. The selected hypothetical base rates (5%, 10%, 15%, 20%, and 25%) were informed in part by the current literature on ImPACT protocol invalidity (ranging from 2.7% [Nelson et al., 2015] to 6.3% [Schatz, Moser, Solomon, Ott, & Karpf, 2012]), as well as by literature on the prevalence of malingering in forensic settings (Mittenberg, Patton, Canyock, & Condit, 2002).

Discussion

The present results indicate that the currently employed protocol validity indices of the ImPACT have high specificity at the expense of poorer sensitivity as compared to external performance validity measures.
An exploration of alternate validity thresholds on the ImPACT revealed that the sensitivity of the current indices can be only marginally improved while maintaining high specificity, though two additional ImPACT validity indices previously proposed in the literature (WMCD and DMCD) and one novel index proposed here (Total Symptom Score) may have the potential to improve accuracy. Several aspects of these findings warrant further discussion.

The high specificity of the ImPACT’s protocol validity indices is particularly striking when compared to the specificities of external validity measures. The MMPI-2-RF and the WMT are frequently used alongside neuropsychological testing in forensic settings with TBI populations (Green, Flaro, & Courtney, 2009; Hartman, 2002; Iverson, Green, & Gervais, 1999; Tarescavage, Wygant, Gervais, & Ben-Porath, 2013) as measures of symptom and performance validity. In the present study, both of these measures were found to have significantly higher sensitivity to simulated SRC, though lower specificity, than the ImPACT validity indices (Table 4). Due to the potential for serious repercussions of false positives in forensic settings, tests used in these settings are held to a high standard of specificity. Comparatively, the costs associated with a false positive on the ImPACT (likely including only the time required for a second test administration) appear relatively minor. As a result, maximizing the identification of poor effort, despite some degree of sacrificed specificity, may benefit the safety of athletes. Further research in this area, including investigation of the performance of the ImPACT validity indices in a genuine sample of athletes at baseline testing, is needed.

However, it could also be argued that an increase in the number of test protocols identified as invalid (both correctly and incorrectly) also has meaningful costs.
Greater numbers of invalid assessments would require additional resources from clinicians and may not necessarily translate to improved athlete care. A study on the implementation of the ImPACT across a variety of sports medicine settings indicated that, while almost all treatment settings using the ImPACT administer a baseline test (94.7%), only about half of these sites examine baseline protocol validity (54.8%) (Covassin, Elbin, Stiller-Ostrowski, & Kontos, 2009). Additionally, unnecessary retests may contribute to athletes’ negative feelings toward the test or prompt them to change their strategies during testing. In both situations, retest baseline scores may in fact be less representative of an athlete’s typical approach to testing, thus confounding observed post-injury changes.

Related to the above, it has been proposed within the literature that baseline testing may not be effective at capturing post-injury deficits, particularly due to low reliability (which is likely due, at least in part, to variable effort at baseline testing; Elbin, Schatz, & Covassin, 2011; Nakayama et al., 2014; Schatz & Ferris, 2013). Rather, comparison to normative data has been proposed as better suited to capture post-injury deficits in some settings (Echemendia et al., 2012; Randolph, 2011) and would resolve the issue of poor effort at baseline testing. On the other hand, there is research to suggest that athletes whose neurocognitive abilities are above or below the mean (e.g., individuals with ADHD or a learning disorder) are at risk for misclassification at post-injury testing when compared only to normative data (Elbin et al., 2013; Schatz & Robertshaw, 2014). For these athletes specifically, baseline testing is well suited to accommodate individual differences, though only when baseline scores accurately represent uninjured abilities.
However, the baseline testing validity indices may function differently in these populations (Elbin et al., 2013; Johnson, Pardini, Sandel, & Lovell, 2014; Manderino & Gunstad, 2017). Although the prevalence of such histories did not differ between groups in the present study, the current findings may not be fully representative of the validity indices’ performance in athletes with such conditions. Continued research in this area may lead to improvements of the validity score thresholds for such athletes.

Further discussion of the two previously proposed validity indices, WMCD and DMCD, is also warranted. Consistent with past results (identifying 90%–100% of malingerers in a simulation design; Schatz & Glatts, 2013), these indices demonstrated higher sensitivity than the traditional ImPACT validity indices (0.74 sensitivity for WMCD, 0.69 for DMCD). This increased sensitivity came at the expense of considerably lower specificity than that of the existing ImPACT validity indices. Lower specificity may not preclude the use of WMCD and DMCD as validity indices, due to the relatively low risk of false positives on baseline testing as discussed earlier. Test administrators should note these limitations, as protocol invalidity identified by these indices warrants more cautious interpretation and investigation of cause. Despite this, even while maintaining a minimum specificity of 0.90, the sensitivity of WMCD (0.54) is greater than that of the existing ImPACT indices, suggesting its potential to increase the overall sensitivity of the ImPACT.

The Total Symptom Score also merits further investigation as a symptom validity index. While maintaining high levels of specificity, the Total Symptom Score demonstrated higher sensitivity to simulated SRC than any one of the existing validity indices alone.
In the absence of SRC or other neurologic conditions, nonspecific SRC symptoms (e.g., fatigue, headache) may be experienced but are expected to be few in number and mild in severity, suggesting that a significant elevation of symptom reporting at baseline may be indicative of feigning regardless of neurocognitive performance. In this way, the Total Symptom Score may provide a measure of symptom validity, in addition to the existing validity indices that assess only performance validity, thus tapping into a validity-related construct not yet assessed by ImPACT. However, simulated malingering studies often yield effect sizes on effort measures much larger than those observed in actual malingerers (Vickery, Berry, Inman, Harris, & Orey, 2001). While it is possible that a naïve sandbagger would elevate the Total Symptom Score at baseline, it would take very little coaching or experience with the test for an athlete to adopt a more sophisticated approach and limit symptom reporting. For this reason, the threshold scores used in this study are unlikely to provide the same degree of clinical utility in genuine athletic settings. Further research on the Total Symptom Score, particularly using non-simulating study designs, is needed to determine its potential as a validity index.

The current findings are limited in several ways. First, testing was performed in group settings, which has been demonstrated in past work to affect test scores and protocol validity (Moser, Schatz, Neidzwski, & Ott, 2011). While this may introduce extraneous sources of variance, it does make the present results generalizable to the many athletic settings that administer the ImPACT to large numbers of athletes simultaneously. Second, while the standalone validity instruments used here were selected for their well-validated measures of effort, they may be too burdensome to be integrated into standard baseline testing.
Rather, they are employed here as points of comparison to inform use of the embedded ImPACT indices. Finally, as noted above, the ImPACT validity indices have been shown to function differently in athletes with certain histories (e.g., ADHD, learning disorders), and further research using such athletes and replication of the present results in a non-simulating sample are needed.

In summary, the high specificity and low sensitivity of the existing ImPACT protocol validity indices found in the present study, if replicated in a non-simulating study design, may diminish the clinical utility of the ImPACT’s user-friendly protocol validity warning in clinical settings. Clinicians who are better informed of the risks associated with premature return to play and of the limitations of the ImPACT may play a more active role in discouraging and identifying poor-effort baseline assessments. With continued improvements to and education on the ImPACT’s protocol validity indices, clinicians may more efficiently allocate resources to ensure athletes’ continued, safe athletic participation, relieving the burden of SRC and promoting safe return-to-play.

Supplementary Material

Supplementary material is available at Archives of Clinical Neuropsychology online.

Funding

None.

Conflict of Interest

None declared.

Acknowledgments

The authors wish to acknowledge Yossef Ben-Porath, PhD, for his input on study measures. We also thank Anthony Tarescavage, PhD, for his statistical guidance.

References

Allen, B. J., & Gfeller, J. D. (2011). The Immediate Post-Concussion Assessment and Cognitive Testing battery and traditional neuropsychological measures: A construct and concurrent validity study. Brain Injury, 25, 179–191.
Ben-Porath, Y. S., & Tellegen, A. (2008). The Minnesota Multiphasic Personality Inventory – 2 – Restructured Form: Manual for administration, scoring, and interpretation. Minneapolis, MN: University of Minnesota Press.
Cantu, R. C. (1998). Second-impact syndrome. Clinics in Sports Medicine, 17, 37–44.
Covassin, T., Elbin, R. J., Stiller-Ostrowski, J. L., & Kontos, A. P. (2009). Immediate Post-Concussion Assessment and Cognitive Testing (ImPACT) practices of sports medicine professionals. Journal of Athletic Training, 44, 639–644.
Echemendia, R. J., Bruce, J. M., Bailey, C. M., Sanders, J. F., Arnett, P., & Vargas, G. (2012). The utility of post-concussion neuropsychological data in identifying cognitive change following sports-related mTBI in the absence of baseline data. The Clinical Neuropsychologist, 26, 1077–1091.
Elbin, R. J., Kontos, A. P., Kegel, N., Johnson, E., Burkhart, S., & Schatz, P. (2013). Individual and combined effects of LD and ADHD on computerized neurocognitive concussion test performance: Evidence for separate norms. Archives of Clinical Neuropsychology, 28, 476–484.
Elbin, R. J., Schatz, P., & Covassin, T. (2011). One-year test-retest reliability of the online version of ImPACT in high school athletes. The American Journal of Sports Medicine, 39, 2319–2324.
Erdal, K. (2012). Neuropsychological testing for sports-related concussion: How athletes can sandbag their baseline testing without detection. Archives of Clinical Neuropsychology, 27, 473–479.
Green, P. (2003). Word Memory Test for Windows: User’s manual and program. Edmonton, Alberta, Canada: Author.
Green, P., Flaro, L., & Courtney, J. (2009). Examining false positives on the Word Memory Test in adults with mild traumatic brain injury. Brain Injury, 23, 741–750.
Hartman, D. E. (2002). The unexamined lie is a lie worth fibbing: Neuropsychological malingering and the Word Memory Test. Archives of Clinical Neuropsychology, 17, 709–714.
Iverson, G., Green, P., & Gervais, R. (1999). Using the word memory test to detect biased responding in head injury litigation. Journal of Cognitive Rehabilitation, 17(2), 4–8.
Johnson, E., Pardini, J., Sandel, N., & Lovell, M. (2014). Do athletes with dyslexia differ at baseline and/or at concussion post-injury assessment on a computer-based test battery? Archives of Clinical Neuropsychology, 29, 590–591.
Kelly, J. P., & Rosenberg, J. H. (1997). Diagnosis and management of concussion in sports. Neurology, 48, 575–580.
Lovell, M. R. (2015). ImPACT Test Administration and Interpretation Manual. Retrieved from http://www.impacttest.com.
Maerlender, A., Flashman, L., Kessler, A., Kumbhani, S., Greenwald, R., Tosteson, T., et al. (2010). Examination of the Construct Validity of ImPACT Computerized Test, traditional, and experimental neuropsychological measures. The Clinical Neuropsychologist, 24, 1309–1325.
Manderino, L., & Gunstad, J. (2017). Collegiate student-athletes with history of ADHD or academic difficulties are more likely to produce an invalid protocol on baseline ImPACT testing. Clinical Journal of Sports Medicine, doi: 10.1097/JSM.0000000000000433.
Millis, S. R. (2009). Methodological challenges in assessment of cognition following head injury: Response to Malojcic et al. 2008. Journal of Neurotrauma, 26, 2409–2410.
Mittenberg, W., Patton, C., Canyock, E. M., & Condit, D. C. (2002). Base rates of malingering and symptom exaggeration. Journal of Clinical and Experimental Neuropsychology, 24, 1094–1102.
Moser, R. S., Schatz, P., Neidzwski, K., & Ott, S. D. (2011). Group versus individual administration affects baseline neurocognitive test performance. American Journal of Sports Medicine, 39, 2325–2330.
Meyers, J. E., Volbrecht, M., Axelrod, B. N., & Reinsch-Boothby, L. (2011). Embedded symptom validity tests and overall neuropsychological test performance. Archives of Clinical Neuropsychology, 26, 8–15.
Nakayama, Y., Covassin, T., Schatz, P., Nogle, S., & Kovan, J. (2014). Examination of the test-retest reliability of a computerized neurocognitive test battery. The American Journal of Sports Medicine, 42, 2000–2005.
Nelson, L. D., Pfaller, A. Y., Rein, L. E., & McCrea, M. A. (2015). Rates and predictors of invalid baseline test performance in high school and collegiate athletes for 3 computerized neurocognitive tests. The American Journal of Sports Medicine, 42, 2018–2026.
Randolph, C. (2011). Baseline neuropsychological testing in managing sport-related concussion: Does it modify risk? Current Sports Medicine Reports, 10, 21–26.
Schatz, P., & Ferris, C. S. (2013). One-month test-retest reliability of the ImPACT test battery. Archives of Clinical Neuropsychology, 28(5), 499–504.
Schatz, P., & Glatts, C. (2013). “Sandbagging” baseline test performance on ImPACT, without detection, is more difficult than it appears. Archives of Clinical Neuropsychology, 28, 236–244.
Schatz, P., Moser, R. S., Solomon, G. S., Ott, S. D., & Karpf, R. (2012). Prevalence of Invalid Computerized Baseline Neurocognitive Test Results in high school and collegiate athletes. Journal of Athletic Training, 47, 289–296.
Schatz, P., & Robertshaw, S. (2014). Comparing post-concussive neurocognitive test data to normative data presents risks for under-classifying “above average” athletes. Archives of Clinical Neuropsychology, 29, 625–632.
Schatz, P., & Sandel, N. (2012). Sensitivity and Specificity of the Online Version of ImPACT in High School and Collegiate Athletes. The American Journal of Sports Medicine, 41, 321–326.
Tarescavage, A. M., Wygant, D. B., Gervais, R. O., & Ben-Porath, Y. S. (2013). Association between the MMPI-2 Restructured Form (MMPI-2-RF) and malingered neurocognitive dysfunction among non-head injury disability claimants. The Clinical Neuropsychologist, 27, 313–335.
Van Dyke, S. A., Millis, S. R., Axelrod, B. N., & Hanks, R. A. (2013). Assessing effort: Differentiating performance and symptom validity. The Clinical Neuropsychologist, 27, 1234–1246.
Vickery, C. D., Berry, D. T. R., Inman, T. H., Harris, M. J., & Orey, S. A. (2001). Detection of inadequate effort on neuropsychological testing: A meta-analytic review of selected procedures. Archives of Clinical Neuropsychology, 16, 45–73.

© The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

Journal

Archives of Clinical Neuropsychology, Oxford University Press

Published: Oct 27, 2017
