Test–Retest Reliability and Minimal Detectable Change of the D2 Test of Attention in Patients with Schizophrenia

Abstract

Objective
The d2 Test of Attention (D2) is a commonly used measure of selective attention for patients with schizophrenia. However, its test–retest reliability and minimal detectable change (MDC) are unknown in patients with schizophrenia, limiting its utility in both clinical and research settings. The aim of the present study was to examine the test–retest reliability and MDC of the D2 in patients with schizophrenia.

Method
A rater administered the D2 to 108 patients with schizophrenia twice at a 1-month interval. Test–retest reliability was determined by calculating the intra-class correlation coefficient (ICC). We also carried out a Bland–Altman analysis, which included a scatter plot of the differences between test and retest against their mean. Systematic biases were evaluated with a paired t-test.

Results
The ICCs for the D2 ranged from 0.78 to 0.94. The MDCs (MDC%) of the seven subscores (TN, E1, E2, E, TN-E, CP, and FR) were 102.3 (29.7), 19.4 (85.0), 7.2 (94.6), 21.0 (69.0), 104.0 (33.1), 105.0 (35.8), and 7.8 (47.8), respectively, which represented limited-to-acceptable random measurement error. Trends were noted in the Bland–Altman plots of the omissions (E1), commissions (E2), and errors (E), indicating that these data were heteroscedastic.

Conclusions
According to the results, the D2 had good test–retest reliability, especially in the scores of TN, TN-E, and CP. For future research, finding a way to improve the administration procedure to reduce random measurement error would be important for the E1, E2, E, and FR subscores.

Keywords: Attention, Schizophrenia, d2 Test of Attention

Introduction

Among the variety of cognitive deficits in schizophrenia, the most robust are deficits of attention (Elvevag & Goldberg, 2000; Nuechterlein et al., 2004).
Selective attention refers to the ability to maintain a behavioral or cognitive set in the face of distracting or competing stimuli (Biehl et al., 2013; Lavie, 2005). It thus incorporates the notion of freedom from distractibility (Sohlberg & Mateer, 2001b). Individuals with selective attention deficits are easily drawn off task by extraneous, irrelevant stimuli (Smith & Lilienfeld, 2015). Examples of problems include an inability to perform therapy tasks in a stimulating environment (e.g., an open treatment area) or to prepare a meal with children playing in the background (Sohlberg & Mateer, 2001a). Therefore, the selective attention function is measured periodically to detect and monitor the attention status of patients with schizophrenia in clinical settings. For reliable measurements, a measurement for assessing selective attention with sound reliability is needed. Test–retest reliability is the highest-rated test criterion among surveyed experts because this property is critical for detecting changes with treatment of schizophrenia (Green et al., 2004). Intra-class correlation (ICC) methods have been commonly used for assessing test–retest reliability in recent studies (Brennan & Silman, 1992). The ICC reflects both the degree of correlation and level of agreement between measures (Lee, Li, Liu, & Hsieh, 2011), and the ICC is also widely accepted to better represent the reliability of a measure than a value of Pearson’s r (Brennan & Silman, 1992; Patten, Kothari, Whitney, Lexell, & Lum, 2003; Prince, Makrides, & Richman, 1980; Shrout & Fleiss, 1979). In addition, to determine whether a change score between repeated measurements is significant, researchers need to provide minimal detectable change (MDC) scores for users to interpret the results of repeated measurements (Goldsmith, Boers, Bombardier, & Tugwell, 1993; Schuck & Christian, 2003; Steffen & Seney, 2008). 
The MDC is defined as the minimum amount of change that is not due to variation in measurement (Schreuders et al., 2003). Namely, when a difference between two successive measurements exceeds the MDC, the change is more likely to be viewed as a real difference, rather than merely random error (Michon, Kroon, Van Weeghel, & Schene, 2004). Once the MDC is determined on a particular test for a given population, clinicians can interpret whether the change scores of their patients are at or above the minimal level of detectable change (Ries, Echternach, Nof, & Gagnon, 2009; Wagner, Rhodes, & Patten, 2008). Thus, the MDC is crucial for clinicians and researchers to determine real change in repeated measurements for an individual patient (Kovacs et al., 2008). The d2 Test of Attention (D2), a cancellation test, is a user-friendly and valid measure of selective attention often used in related research for patients with schizophrenia (Boyer et al., 2013, 2014; Brickenkamp & Zillmer, 1998; Ernst, Haugaard, Olrik, Munk-Jorgensen, & Ostergaard, 2013; Rehse et al., 2016). The D2 is used because it is low in cost, easy and fast to administer, and applicable to testing large numbers of people simultaneously (Van den Berg et al., 2016). The D2 assesses performance in terms of visual perceptual speed and concentrative capacities by assessing an individual’s ability to selectively, quickly, and accurately focus on certain relevant aspects in a task while ignoring irrelevant aspects (Vanhelst et al., 2016). In terms of reliability, the D2 has good test–retest reliability in a normal adult sample (Brickenkamp & Zillmer, 1998). However, the test–retest reliability and MDC of the D2 have been examined little in patients with schizophrenia, limiting the interpretability and applicability of this measure for research and clinical settings. Thus, the purpose of our study was to examine the test–retest reliability and MDC of the D2 in patients with schizophrenia. 
Methods

Participants

Participants were recruited from a clinical psychiatric hospital, and all were Taiwanese. Participants with the following characteristics were included: (A) diagnosis of schizophrenia according to the International Classification of Diseases, 9th Revision, Clinical Modification (ICD-9-CM) diagnostic criteria; (B) clinical stability with a stable dose of antipsychotic medication for at least 3 months prior to the study; (C) sufficient reading or listening comprehension to complete the D2 (Mini Mental State Examination [MMSE] score > 24); (D) absence of substance abuse, severe medical or neurological condition, or other psychiatric disorder that required treatment (e.g., mental retardation, dementia, or developmental disability); (E) age 20–65 years; and (F) provision of informed consent. Criterion (B) was used to support the assumption underlying test–retest reliability (i.e., the attention performance of participants was assumed to be stable between the two measurements). Participants were excluded from the study if they: (A) were participating in another clinical trial; (B) were suffering an episode of major depression; and/or (C) exhibited difficulties in recognizing the letters of the English alphabet. The study protocol was reviewed and approved by the Institutional Review Board of the study hospital.

Originally, a total of 150 participants were approached for recruitment in the study, and 33 of them did not meet the criteria. In the end, 117 participants who were eligible for the study were recruited. Of these, 11 participants did not complete the second session because they were discharged from the hospital, withdrew, or were otherwise lost to follow-up. Thus, the final sample included 106 participants. Participants completed the test within about 8 min. All participants confirmed that they had continued with their regular therapeutic activities during the study period.
Research Procedure

The D2 was administered to the participants twice, at an interval of 1 month, by an occupational therapist who had been trained specifically to administer the test. The test was administered in a quiet room with no environmental distractions. Participants’ demographic data were collected from medical records. The psychopathological stability of the participants with schizophrenia was assessed with the Clinical Global Impression-Severity (CGI-S) (Guy, 1976). Only participants with a stable CGI-S (i.e., the CGI-S category was exactly the same at the two time points) were included for further analysis. The CGI-S was administered by the same rater. The participants continued with their regular therapeutic activities during the study period.

Measures

The D2 is a one-page paper-and-pencil test of selective attention. The standard version of the D2 consists of 14 rows (trials), each with 47 randomly mixed “p” and “d” letters (Brickenkamp & Zillmer, 1998), for a total of 658 letters. The target symbol is a “d” with two dashes (hence “d2”). The participant’s task is to cancel out as many target symbols as possible, moving from left to right, with a time limit of 20 s per trial. The seven subscores of the D2 are total number (TN), omissions (E1), commissions (E2), errors (E), total-errors (TN-E), concentration performance (CP), and fluctuation rate (FR). TN is a quantitative measure of performance: the number of all items that were processed. E1 occurs when relevant items (“d” with two dashes) are not crossed out. E2 occurs when irrelevant letters are crossed out in violation of the instructions. The raw score E is the sum of all mistakes (E1 + E2). TN-E is the total number of items scanned minus the error score (E). CP is the number of correctly crossed-out relevant items minus E2. FR is the difference between the line (or lines) with the maximum and the line (or lines) with the minimum number of items processed (Brickenkamp & Zillmer, 1998).
The test–retest reliability coefficients of the D2 are good for adults at an interval of less than 3 months (r = 0.90). The results of adults are more stable than those of children or adolescents over a longer period. Regarding the validity of the D2, its strongest relationship is with the Symbol Digit Modalities Test (SDMT). A factor analysis conducted to examine construct validity further supported the sensitivity of the D2 as a test of concentration and attention. Therefore, the construct and convergent validity of the D2 are good for assessing attention (Bates & Lemay, 2004). The D2 is a reliable and valid measurement tool.

The CGI-S is commonly used for assessing symptom severity in schizophrenia research and clinical practice (Busner & Targum, 2007). It is a 1-item clinician-rated instrument (Guy, 1976) that rates the severity of psychiatric illness on a 7-point scale (Huang et al., 2012). Clinicians rate the severity of the patient’s mental illness at the time of measurement (1 = not at all, 2 = borderline, 3 = mild, 4 = moderate, 5 = marked, 6 = severe, and 7 = extremely ill) (Otto et al., 2010). The inter-rater reliability has an ICC of 0.75 (Haro et al., 2003). The content validity of the CGI-S is strong (Allen et al., 2012). The CGI-S was used to assess the change in clinical status of the participants (Leucht, Davis, Engel, Kissling, & Kane, 2009; Patrick et al., 2009; Wu et al., 2013).

Statistical Analysis

Data were analyzed in the Statistical Package for the Social Sciences, version 22.0 (SPSS Inc., Chicago, IL). The alpha level was set at 0.05 for all statistical tests, and all p-values were two-tailed.

Test–Retest Reliability

Test–retest reliability was determined through calculation of the ICC(2,1), a two-way random-effects, single-measure reliability coefficient (absolute agreement) (Rankin & Stokes, 1998; Shrout & Fleiss, 1979).
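As a minimal sketch of how an ICC(2,1) can be obtained from the two-way ANOVA mean squares in the Shrout and Fleiss formulation, the Python function below may help. It is an illustration only and is not the SPSS routine used in the study; the function name `icc_2_1` and the input layout (one row per participant, one column per session) are assumptions made for this example.

```python
from statistics import mean


def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    `scores` is a list of rows, one per participant; each row holds the
    k repeated measurements (k = 2 for a test-retest design).
    The layout and the function name are illustrative assumptions.
    """
    n = len(scores)        # number of participants
    k = len(scores[0])     # number of sessions
    grand = mean(x for row in scores for x in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(k)]

    # Sums of squares of the two-way ANOVA without replication
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    bms = ss_rows / (n - 1)                  # between-subjects mean square
    jms = ss_cols / (k - 1)                  # between-sessions mean square
    ems = ss_error / ((n - 1) * (k - 1))     # residual mean square

    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)
```

Because the between-sessions mean square appears in the denominator, a systematic shift between sessions (for example, a practice effect) lowers this coefficient, which is what distinguishes absolute agreement from consistency.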
The ICC is the ratio of the inter-subject component of variance to the total variance (inter-subject variance + within-subject variance) (Prince et al., 1980). There is no universally agreed level for ICC values in relation to levels of reliability, but the following scheme has been previously reported as acceptable: 0.90–0.99, excellent reliability; 0.80–0.89, good reliability; 0.70–0.79, fair reliability; 0.69 or below, poor reliability (Arnall, Koumantakis, Oldham, & Cooper, 2002). The ICCs of the D2 were computed for the test and retest sessions (Haley & Fragala-Pinkham, 2006).

Minimal Detectable Change (MDC) and MDC Percentage (MDC%)

The MDC was calculated based on the standard error of measurement (SEM) according to the equations below (Michon et al., 2004):

$$\mathrm{SEM} = SD_{\text{all testing scores}} \times \sqrt{1 - r}$$

$$\mathrm{MDC} = 1.96 \times \sqrt{2} \times \mathrm{SEM}$$

In these formulas, 1.96 is the z-score at the 95% confidence level, √2 accounts for the additional uncertainty introduced by measuring at two time points, and r is the coefficient of test–retest reliability, which was represented by the ICC. In addition, we calculated the MDC% (= MDC/mean × 100%), which expresses the relative amount of random measurement error. The “mean” in this equation is the mean score of all trials. An MDC% of 30% or less is considered acceptable, and one of 10% or less, excellent (Smidt et al., 2002).

Systematic Bias

We used a paired t-test to examine whether systematic biases existed. The differences between the test and retest assessments were considered significant if p-values were smaller than .05. We also calculated the effect size, defined as the mean change divided by the standard deviation of the first-session scores, to determine the extent of bias (Kazis, Anderson, & Meenan, 1989). According to Cohen’s criteria, an effect size of greater than 0.8 was considered large; 0.5–0.8, moderate; and 0.2–0.5, small (Cohen, Nordahl, Semple, Andreason, & Pickar, 1998).
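The SEM, MDC, MDC%, and effect-size definitions above can be sketched in a few lines of Python. This is an illustrative calculation, not the authors’ analysis script; the function names are hypothetical. The worked example uses the TN values reported in Table 2 and assumes that the “mean score of all trials” equals the average of the two session means, which holds when both sessions have the same number of participants.

```python
import math


def sem(sd_all_scores, icc):
    """Standard error of measurement: SD of all testing scores x sqrt(1 - r)."""
    return sd_all_scores * math.sqrt(1.0 - icc)


def mdc95(sem_value):
    """Minimal detectable change at the 95% confidence level."""
    return 1.96 * math.sqrt(2.0) * sem_value


def mdc_percent(mdc, mean_all_scores):
    """MDC expressed as a percentage of the mean of all scores."""
    return mdc / mean_all_scores * 100.0


def effect_size(mean_change, sd_first_session):
    """Mean change divided by the SD of the first-session scores."""
    return mean_change / sd_first_session


# Worked example with the TN values reported in Table 2.
tn_sem = 36.89                       # reported SEM for TN
tn_mean = (340.33 + 349.41) / 2      # assumption: mean of the two session means
tn_mdc = mdc95(tn_sem)
tn_mdc_pct = mdc_percent(tn_mdc, tn_mean)
tn_es = effect_size(9.08, 137.88)    # mean difference / first-session SD

print(f"TN: MDC = {tn_mdc:.1f}, MDC% = {tn_mdc_pct:.2f}, effect size = {tn_es:.2f}")
```

Starting from the reported SEM rather than from the rounded ICC avoids rounding drift, since an ICC given to two decimals cannot reproduce the SEM exactly.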
Bland–Altman Plots

Bland–Altman plots were used to visually examine the agreement of a test by plotting the difference scores against the mean score of each pair of measurements (Bland & Altman, 1986). Assuming the differences follow a normal distribution, 95% of the differences should lie within the limits of agreement (LOA), d ± 1.96 × SD, where d represents the mean difference between test and retest scores, and SD is the standard deviation of the differences of each pair (Bland & Altman, 2003; Liaw et al., 2008). The plot shows the difference between test sessions 2 and 1 (2 − 1) against the mean of the two test sessions for each subject (Bland & Altman, 2003).

Heteroscedasticity

In addition, we used Pearson’s r to examine the association between the absolute difference and the mean of each pair of repeated measurements to examine the possibility of heteroscedasticity, meaning that a systematic trend (e.g., the higher the scores, the larger the differences) exists. If heteroscedasticity exists, a single MDC should not be applied to patients at different levels of the measured attribute (e.g., attention deficits in this study). Following Atkinson and Nevill (1998), the data were considered heteroscedastic if the absolute value of Pearson’s r was greater than 0.3.

Results

Demographic and Clinical Characteristics of Participants

All participants had the same CGI-S category at both sessions. Their CGI-S scores were mostly mild (40.6%) or borderline (31.1%). Table 1 displays demographic data for the 106 participants. The mean age was approximately 44 years (SD = 8.9), and 56.6% of the participants were male. All participants were receiving maintenance medication, taking antipsychotic medication alone (the three most common medications were haloperidol, clozapine, and sulpiride), and there were no significant changes in medication during the 1-month study period.

Table 1.
Demographic characteristics of the sample (n = 106)

Variable                          Mean    SD
Age                               43.6    8.9
Onset                             22.9    6.6
Psychiatric history in years      20.7    9.0

Variable                          N       %
Gender
  Female                          46      43.4
  Male                            60      56.6
Education status
  College                         12      11.3
  Senior high school              56      52.8
  Junior high school              33      31.1
  Elementary school                3       2.8
  None                             2       1.9
Schizophrenia subtypes
  Disorganized                     9       8.5
  Paranoid                        84      79.3
  Residual                        11      10.4
  Undifferentiated                 2       1.9
CGI-S scores
  Not at all (1)                  30      28.3
  Mild (2)                        43      40.6
  Borderline (3)                  33      31.1

Notes: SD = standard deviation; CGI-S = Clinical Global Impression-Severity.

Test–Retest Reliability

The reliability indices of the D2 are listed in Table 2. The ICCs for the seven subscores of the D2 between successive sessions were between 0.78 and 0.94. The paired t-test showed that all the subscores were significantly different. The effect sizes were 0.07, 0.04, 0.03, 0.05, 0.07, 0.08, and 0.06 for the TN, E1, E2, E, TN-E, CP, and FR, respectively.

Table 2.
Test–retest reliability of the D2 (raw scores)

Measure                           First testing     Second testing    Difference       ICC (95% CI)        SEM     MDC (MDC%)
                                  M (SD)            M (SD)            M (SD)
TN (total number)                 340.33 (137.88)   349.41 (141.46)    9.08 (53.43)    0.93 (0.89–0.95)    36.89   102.3 (29.7)
E1 (omissions)                     23.16 (16.43)     22.48 (16.64)    −0.68 (9.95)     0.82 (0.75–0.87)     7.00    19.4 (85.0)
E2 (commissions)                    7.71 (7.37)       7.47 (7.64)     −0.24 (3.69)     0.88 (0.83–0.92)     2.59     7.2 (94.6)
E (errors)                         30.87 (19.35)     29.95 (19.83)    −0.92 (10.75)    0.85 (0.79–0.90)     7.57    21.0 (69.0)
TN-E (total-errors)               309.46 (140.26)   319.45 (143.89)    9.99 (54.28)    0.93 (0.90–0.95)    37.53   104.0 (33.1)
CP (concentration performance)    287.36 (154.69)   299.58 (154.99)   12.22 (51.67)    0.94 (0.92–0.96)    37.87   105.0 (35.8)
FR (fluctuation rate)              16.42 (6.04)      16.08 (5.90)     −0.34 (3.94)     0.78 (0.70–0.85)     2.80     7.8 (47.8)

Notes: TN = total number; E1 = omissions; E2 = commissions; E = errors; TN-E = total-errors; CP = concentration performance; FR = fluctuation rate; M = mean; SD = standard deviation; ICC = intra-class correlation coefficient; CI = confidence interval; SEM = standard error of measurement; MDC = minimal detectable change.

Minimal Detectable Change (MDC)

The MDCs of the subscores of the D2 were 102.3, 19.4, 7.2, 21.0, 104.0, 105.0, and 7.8 for the TN, E1, E2, E, TN-E, CP, and FR, respectively. The MDC% values of the seven subscores ranged from 29.7% to 94.6%. Seven Bland–Altman plots (Fig.
1) that were representative of the D2 showed the LOA ranges to be −95.6 to 113.8, −20.2 to 18.8, −7.5 to 7.0, −22.0 to 20.2, −96.4 to 116.3, −89.1 to 113.5, and −8.1 to 7.4 for the TN, E1, E2, E, TN-E, CP, and FR, respectively. In addition, trends in the plots of the E1, E2, and E were noted (Pearson’s r values were 0.37, 0.62, and 0.33, respectively).

Fig. 1. Bland–Altman method for plotting the difference in scores against the mean scores of the D2. The two bold lines define the limits of agreement (mean of difference ± 1.96 SD).

Discussion

This study was the first to examine the test–retest reliability and MDC of the D2 in persons with schizophrenia. We found that the ICC values of the D2 were fair to excellent. For repeated use of the D2, the scores of TN (0.93), TN-E (0.93), and CP (0.94) were found to be more reliable than those of E1 (0.82), E2 (0.88), E (0.85), and FR (0.78). In the published D2 manual, test–retest reliability is reported for samples of 38 children, 37 adolescents, and 31 adults (Brickenkamp & Zillmer, 1998). The test–retest interval was approximately 3 months, and the Pearson coefficients ranged from 0.72 to 0.90 for TN-E. These results were close to ours, implying that the test–retest reliability of the D2 in persons with schizophrenia is not clearly different from that in the aforementioned samples. However, clinicians and researchers who use the D2 to assess patients with schizophrenia should take into account the merely fair agreement between test and retest sessions, especially for the FR index. On the other hand, the ICC values of the TN, TN-E, and CP indices in the current study, which ranged from 0.93 (TN and TN-E) to 0.94 (CP), indicate that these D2 subscores have excellent reliability in patients with schizophrenia.
Systematic bias, which may result from a practice effect, was noted in three subscores (TN, TN-E, and CP), and this may affect the accuracy of our findings. However, the effect sizes for the TN, TN-E, and CP subscores were small. For these subscores, additional repeated testing (about 2–3 administrations) or more practice before testing may reduce the influence of the practice effect (Bartels, Wegrzyn, Wiedl, Ackermann, & Ehrenreich, 2010). Clinicians should consider the practice effect when interpreting the test results of their clients: a difference in scores needs to be greater than the practice effect induced by the measure before true change can be inferred.

The current study revealed that only the MDC% of TN (29.65%) was smaller than the acceptable criterion (30%) (Chen et al., 2014; Huang et al., 2011). The results suggest that when the D2 is used in patients with schizophrenia, caution must be taken in interpreting change scores in attentional function. According to our results, a change of 103 points or more in TN is needed for the change to be considered real (i.e., beyond measurement error). This value is high relative to the mean TN score. In other words, it is very difficult for a patient with schizophrenia to improve by 103 points and thereby demonstrate real improvement. Of the seven subscores, TN, TN-E, and CP had smaller relative random measurement error than the others. Therefore, the TN, TN-E, and CP subscores would have better utility in patients with schizophrenia in clinical settings.

Our results reveal that the D2 has acceptable agreement between repeated measurements but large random measurement error in patients with schizophrenia. A high standard deviation would be the major cause of the high MDC and MDC% values. The results imply that the participants in the study had a wide range of ability.
On the other hand, the participants completed the D2 as quickly as possible, and working too fast could result in numerous wrong answers. This may explain why the MDC% values of E1 (85.0%), E2 (94.6%), and E (69.0%) are larger. It also implies that the error scores on the D2 are less reliable in individuals with schizophrenia in particular. The scores of E, E1, and E2 need to be applied carefully in patients with schizophrenia.

Bland–Altman plots can be used to visualize systematic variations around the zero line and the LOA, and to illustrate whether a trend (heteroscedasticity) exists in a study sample (Flansbjer, Holmback, Downham, Patten, & Lexell, 2005). The presence of heteroscedastic errors in measurements could have practical research implications. The current study revealed that the scores of E1, E2, and E were heteroscedastic. That is, for these subscores, higher values entail a greater amount of measurement error.

Other important measures include the reliable change (RC) index. The RC index enables identification of patients whose response to treatment is clinically as well as statistically significant (Christensen & Mendoza, 1986; Jacobson, Follette, & Revenstorf, 1984). The RC index is calculated by determining the difference between pretest and posttest scores and then dividing this difference by a standard error measure that includes the standard deviation of the measure and the reliability coefficient (Turk, Okifuji, Sinclair, & Terence, 1998). The reliable change index modified for practice (RCIp) takes the practice effect into account when interpreting the change in scores of an individual patient. An RCIp with a 90% confidence interval (CI) gives the range of change scores used to determine whether the change in an individual patient’s scores is beyond random measurement error and the practice effect (Koh et al., 2011). The RC index or RCIp may be used in future related research that considers the practice effect.
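The Bland–Altman limits of agreement and the heteroscedasticity check discussed in this article can be sketched in Python as follows. This is a hedged illustration rather than the procedure actually run in SPSS; all function names are hypothetical, and the 0.3 threshold follows the criterion stated in the Methods.

```python
import math
from statistics import mean, stdev


def loa_from_summary(mean_diff, sd_diff):
    """95% limits of agreement from the mean and SD of the paired differences."""
    return mean_diff - 1.96 * sd_diff, mean_diff + 1.96 * sd_diff


def limits_of_agreement(test, retest):
    """95% limits of agreement from paired scores (difference = retest - test)."""
    diffs = [b - a for a, b in zip(test, retest)]
    return loa_from_summary(mean(diffs), stdev(diffs))


def pearson_r(x, y):
    """Pearson correlation; returns 0.0 if either variable has no variance."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def is_heteroscedastic(test, retest, threshold=0.3):
    """Flag heteroscedasticity when |r| between the absolute differences
    and the pair means exceeds the threshold."""
    abs_diffs = [abs(b - a) for a, b in zip(test, retest)]
    pair_means = [(a + b) / 2 for a, b in zip(test, retest)]
    return abs(pearson_r(abs_diffs, pair_means)) > threshold


# TN summary values from Table 2: mean difference 9.08, SD of differences 53.43
tn_lower, tn_upper = loa_from_summary(9.08, 53.43)
print(f"TN limits of agreement: {tn_lower:.1f} to {tn_upper:.1f}")
```

Working from the summary statistics makes it possible to check the reported limits without the raw data, while the paired-score functions show what would be computed if individual scores were available.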
Other indicators, such as the Ruff 2 and 7 Selective Attention Test (RSAT), are also used for assessing cognition in patients with schizophrenia. However, the MDC of the RSAT has not been examined in other research articles, so the test–retest reliability and MDC of the RSAT will be a topic for future research. Currently, comparisons with other studies of different indicators of selective attention in persons with schizophrenia are limited.

Limitations

Some limitations of this research should be addressed. Schizophrenia is a heterogeneous disorder with fluctuating symptom severity, even in stable patients. The CGI-S was used for assessing symptom stability, which might be a limitation of the study. Notably, most of the participants were scored as mild or borderline on the CGI-S. Thus, the results of the study cannot be generalized to populations with more serious symptoms. Subtle changes in symptoms are a possible source of variability in assessing test–retest reliability in this sample. Future related studies may consider more precise assessment tools for monitoring stability in such patients.

Conclusion

In summary, our results indicated that the D2 has good reliability, especially in the scores of TN, TN-E, and CP, in monitoring the selective attention function of patients with schizophrenia. The MDCs of TN, TN-E, and CP also indicated that those subscores would be preferable for clinical use in patients with schizophrenia. Further research should seek a way to improve the administration procedure to reduce the random measurement error of the E1, E2, E, and FR subscores. The results of the present study can be used as a reference for the measurement error of the D2 to help clinicians and researchers determine true change between successive assessments of patients with schizophrenia.

Funding

This work was supported by the National Science Council (NSC) research project NSC 102-2622-H-214-002-CC3 in Taiwan.
The NSC had no further role in study design, collection, analysis and interpretation of data, composition of the report, or the decision to submit the paper for publication.

Conflict of Interest

None declared.

Acknowledgments

The authors would particularly like to thank all the participants involved in this study and the anonymous reviewers for their significant comments and suggestions, which helped to improve the quality of the manuscript.

References

Allen, M. H., Daniel, D. G., Revicki, D. A., Canuso, C. M., Turkoz, I., Fu, D. J., et al. (2012). Development and psychometric evaluation of a clinical global impression for schizoaffective disorder scale. Innovations in Clinical Neuroscience, 9, 15–24.
Arnall, F. A., Koumantakis, G. A., Oldham, J. A., & Cooper, R. G. (2002). Between-days reliability of electromyographic measures of paraspinal muscle fatigue at 40, 50 and 60% levels of maximal voluntary contractile force. Clinical Rehabilitation, 16, 761–771.
Atkinson, G., & Nevill, A. M. (1998). Statistical methods for assessing measurement error (reliability) in variables relevant to sports medicine. Sports Medicine, 26, 217–238.
Bartels, C., Wegrzyn, M., Wiedl, A., Ackermann, V., & Ehrenreich, H. (2010). Practice effects in healthy adults: A longitudinal study on frequent repetitive cognitive testing. BMC Neuroscience, 11, 118.
Bates, M. E., & Lemay, E. P., Jr. (2004). The d2 test of attention: Construct validity and extensions in scoring techniques. Journal of the International Neuropsychological Society, 10, 392–400.
Biehl, S. C., Ehlis, A. C., Muller, L. D., Niklaus, A., Pauli, P., & Herrmann, M. J. (2013). The impact of task relevance and degree of distraction on stimulus processing. BMC Neuroscience, 14, 107.
Bland, J. M., & Altman, D. G. (1986). Statistical methods for assessing agreement between two methods of clinical measurement. Lancet, 1, 307–310.
Bland, J. M., & Altman, D. G. (2003). Applying the right statistics: Analyses of measurement studies. Ultrasound in Obstetrics & Gynecology, 22, 85–93.
Boyer, L., Richieri, R., Guedj, E., Faget-Agius, C., Loundou, A., Llorca, P. M., et al. (2013). Validation of a functional remission threshold for the functional remission of general schizophrenia (FROGS) scale. Comprehensive Psychiatry, 54, 1016–1022.
Boyer, L., Testart, J., Michel, P., Richieri, R., Faget-Agius, C., Vanoye, V., et al. (2014). Neurophysiological correlates of metabolic syndrome and cognitive impairment in schizophrenia: A structural equation modeling approach. Psychoneuroendocrinology, 50, 95–105.
Brennan, P., & Silman, A. (1992). Statistical methods for assessing observer variability in clinical measures. BMJ (Clinical Research Ed.), 304, 1491–1494.
Brickenkamp, R., & Zillmer, E. (1998). The d2 Test of Attention. Seattle, WA: Hogrefe & Huber Publishers.
Busner, J., & Targum, S. D. (2007). The clinical global impressions scale: Applying a research tool in clinical practice. Psychiatry (Edgmont), 4(7), 28–37.
Chen, C. H., Lin, S. F., Yu, W. H., Lin, J. H., Chen, H. L., & Hsieh, C. L. (2014). Comparison of the test-retest reliability of the balance computerized adaptive test and a computerized posturography instrument in patients with stroke. Archives of Physical Medicine and Rehabilitation, 95, 1477–1483.
Cohen, R. M., Nordahl, T. E., Semple, W. E., Andreason, P., & Pickar, D. (1998). Abnormalities in the distributed network of sustained attention predict neuroleptic treatment response in schizophrenia. Neuropsychopharmacology, 19, 36–47.
Christensen, L., & Mendoza, J. L. (1986). A method of assessing change in a single subject: An alteration of the RC index. Behavior Therapy, 17, 305–308.
Elvevag, B., & Goldberg, T. E. (2000). Cognitive impairment in schizophrenia is the core of the disorder. Critical Reviews in Neurobiology, 14(1), 1–21.
Ernst, N. R., Haugaard, C., Olrik, W. J. S., Munk-Jorgensen, P., & Ostergaard Christensen, T. (2013). Prediction of patient contacts by cognition in schizophrenia. The Australian and New Zealand Journal of Psychiatry, 47, 637–645.
Flansbjer, U. B., Holmback, A. M., Downham, D., Patten, C., & Lexell, J. (2005). Reliability of gait performance tests in men and women with hemiparesis after stroke. Journal of Rehabilitation Medicine, 37, 75–82.
Goldsmith, C. H., Boers, M., Bombardier, C., & Tugwell, P. (1993). Criteria for clinically important changes in outcomes: Development, scoring and evaluation of rheumatoid arthritis patient and trial profiles. The Journal of Rheumatology, 20, 561–565.
Green, M. F., Nuechterlein, K. H., Gold, J. M., Barch, D. M., Cohen, J., Essock, S., et al. (2004). Approaching a consensus cognitive battery for clinical trials in schizophrenia: The NIMH-MATRICS conference to select cognitive domains and test criteria. Biological Psychiatry, 56, 301–307.
Guy, W. (1976). Assessment Manual for Psychopharmacology. Washington, DC: US Government Printing Office.
Haley, S. M., & Fragala-Pinkham, M. A. (2006). Interpreting change scores of tests and measures used in physical therapy. Physical Therapy, 86, 735–743.
Haro, J. M., Kamath, S. A., Ochoa, S., Novick, D., Rele, K., Fargas, A., et al. (2003). The Clinical Global Impression-Schizophrenia scale: A simple instrument to measure the diversity of symptoms present in schizophrenia. Acta Psychiatrica Scandinavica, 107(s416), 16–23.
Huang, R. R., Chen, Y. S., Chen, C. C., Chou, F. H., Su, S. F., Chen, M. C., et al. (2012). Quality of life and its associated factors among patients with two common types of chronic mental illness living in Kaohsiung City. Psychiatry and Clinical Neurosciences, 66, 482–490.
Huang, S. L., Hsieh, C. L., Wu, R. M., Tai, C. H., Lin, C. H., & Lu, W. S. (2011). Minimal detectable change of the timed “up & go” test and the dynamic gait index in people with Parkinson disease. Physical Therapy, 91, 114–121.
Jacobson, N. S., Follette, W., & Revenstorf, D. (1984). Psychotherapy outcome research: Methods for reporting variability and evaluating clinical significance. Behavior Therapy, 15, 336–352.
Kazis, L. E., Anderson, J. J., & Meenan, R. F. (1989). Effect sizes for interpreting changes in health status. Medical Care, 27, S178–S189.
Koh, C. L., Lu, W. S., Chen, H. C., Hsueh, I. P., Hsieh, J. J., & Hsieh, C. L. (2011). Test-retest reliability and practice effect of the oral-format Symbol Digit Modalities Test in patients with stroke. Archives of Clinical Neuropsychology, 26, 356–363. doi:10.1093/arclin/acr029.
Google Scholar CrossRef Search ADS PubMed Kovacs , F. M. , Abraira , V. , Royuela , A. , Corcoll , J. , Alegre , L. , Tomas , M. , et al. . ( 2008 ). Minimum detectable and minimal clinically important changes for pain in patients with nonspecific neck pain . BMC Musculoskeletal Disorders , 9 , 43 . Google Scholar CrossRef Search ADS PubMed Lavie , N. ( 2005 ). Distracted and confused? Selective attention under load . Trends in Cognitive Science , 9 , 75 – 82 . Google Scholar CrossRef Search ADS Lee , P. , Li , P. C. , Liu , C. H. , & Hsieh , C. L. ( 2011 ). Test-retest reliability of two attention tests in schizophrenia . Archives of Clinical Neuropsychology , 26 , 405 – 411 . Google Scholar CrossRef Search ADS PubMed Leucht , S. , Davis , J. M. , Engel , R. R. , Kissling , W. , & Kane , J. M. ( 2009 ). Definitions of response and remission in schizophrenia: Recommendations for their use and their presentation . Acta Psychiatrica Scandinavica , 438 , 7 – 14 . Google Scholar CrossRef Search ADS Liaw , L. J. , Hsieh , C. L. , Lo , S. K. , Chen , H. M. , Lee , S. , & Lin , J. H. ( 2008 ). The relative and absolute reliability of two balance performance measures in chronic stroke patients . Disability and Rehabilitation , 30 , 656 – 661 . Google Scholar CrossRef Search ADS PubMed Michon , H. W. , Kroon , H. , Van Weeghel , J. , & Schene , A. H. ( 2004 ). The generic work behavior questionnaire (GWBQ): Assessment of core dimensions of generic work behavior of people with severe mental illnesses in vocational rehabilitation . Psychiatric Rehabilitation Journal , 28 , 40 – 47 . Google Scholar CrossRef Search ADS PubMed Nuechterlein , K. H. , Barch , D. M. , Gold , J. M. , Goldberg , T. E. , Green , M. F. , & Heaton , R. K. ( 2004 ). Identification of separable cognitive factors in schizophrenia . Schizophrenia Research , 72 , 29 – 39 . Google Scholar CrossRef Search ADS PubMed Otto , M. W. , Tolin , D. F. , Simon , N. M. , Pearlson , G. D. , Basden , S. , Meunier , S. A. 
, et al. . ( 2010 ). Efficacy of d-cycloserine for enhancing response to cognitive-behavior therapy for panic disorder . Biological Psychiatry , 67 , 365 – 370 . Google Scholar CrossRef Search ADS PubMed Patrick , D. L. , Burns , T. , Morosini , P. , Rothman , M. , Gagnon , D. D. , Wild , D. , et al. . ( 2009 ). Reliability, validity and ability to detect change of the clinician-rated personal and social performance scale in patients with acute symptoms of schizophrenia . Current Medical Research and Opinion , 25 , 325 – 338 . Google Scholar CrossRef Search ADS PubMed Patten , C. , Kothari , D. , Whitney , J. , Lexell , J. , & Lum , P. S. ( 2003 ). Reliability and responsiveness of elbow trajectory tracking in chronic poststroke hemiparesis . Journal of Rehabilitation Research and Development , 40 , 487 – 500 . Google Scholar CrossRef Search ADS PubMed Prince , B. , Makrides , L. , & Richman , J. ( 1980 ). Research methodology and applied statistics. Part 2: The literature search . Physiotherapy Canada , 32 , 201 – 206 . Google Scholar PubMed Rankin , G. , & Stokes , M. ( 1998 ). Reliability of assessment tools in rehabilitation: An illustration of appropriate statistical analyses . Clinical Rehabilitation , 12 , 187 – 199 . Google Scholar CrossRef Search ADS PubMed Rehse , M. , Bartolovic , M. , Baum , K. , Richter , D. , Weisbrod , M. , & Roesch-Ely , D. ( 2016 ). Influence of antipsychotic and anticholinergic loads on cognitive functions in patients with schizophrenia . Schizophrenia Research and Treatment , 2016 , 8213165 . Google Scholar CrossRef Search ADS PubMed Ries , J. D. , Echternach , J. L. , Nof , L. , & Gagnon Blodgett , M. ( 2009 ). Test-retest reliability and minimal detectable change scores for the timed “up & go” test, the six-minute walk test, and gait speed in people with Alzheimer disease . Physical Therapy , 89 , 569 – 579 . Google Scholar CrossRef Search ADS PubMed Schreuders , T. A. R. , Roebroeck , M. E. , Goumans , J. 
, Van Nieuwenhuijzen , J. F. , Stijnen , T. H. , & Stam , H. J. ( 2003 ). Measurement error in grip and pinch force measurements in patients with hand injuries . Physical Therapy , 83 , 806 – 815 . Google Scholar PubMed Schuck , P. , & Christian , Z. ( 2003 ). The ‘smallest real difference’ as a measure of sensitivity to change: A critical analysis . International Journal of Rehabilitation Research , 26 , 85 – 91 . Google Scholar CrossRef Search ADS PubMed Shrout , P. E. , & Fleiss , J. L. ( 1979 ). Intraclass correlations: Uses in assessing rater reliability . Psychological Bulletin , 86 , 420 – 428 . Google Scholar CrossRef Search ADS PubMed Smidt , N. , Van der Windt , D. A. , Assendelft , W. J. , Mourits , A. J. , Devill , W. L. , de Winter , A. F. , et al. . ( 2002 ). Interobserver reproducibility of the assessment of severity of complaints, grip strength, and pressure pain threshold in patients with lateral epicondylitis . Archives of Physical Medicine and Rehabilitation , 83 , 1145 – 1150 . Google Scholar CrossRef Search ADS PubMed Smith , S. F. , & Lilienfeld , S. O. ( 2015 ). The response modulation hypothesis of psychopathy: A meta-analytic and narrative analysis . Psychological Bulletin , 141 , 1145 – 1177 . Google Scholar CrossRef Search ADS PubMed Sohlberg , M. M. , & Mateer , C. A. ( 2001 a). Cognitive rehabilitation: An integrative neuropsychological approach . New York : Guilford Press . Sohlberg , M. M. , & Mateer , C. A. ( 2001 b). Improving attention and managing attentional problems. Adapting rehabilitation techniques to adults with ADD . Annals of the New York Academy Sciences , 931 , 359 – 375 . Google Scholar CrossRef Search ADS Steffen , T. , & Seney , M. ( 2008 ). Test-retest reliability and minimal detectable change on balance and ambulation tests, the 36-item short-form health survey, and the unified Parkinson disease rating scale in people with parkinsonism . Physical Therapy , 88 , 733 – 746 . 
Google Scholar CrossRef Search ADS PubMed Turk , D. C. , Okifuji , A. , Sinclair , J. D. , & Starz , T. W. ( 1998 ). Interdisciplinary treatment for fibromyalgia syndrome: Clinical and statistical significance . Arthritis Care and Research , 11 , 186 – 195 . Google Scholar CrossRef Search ADS PubMed Van den Berg , V. , Saliasi , E. , de Groot , R. H. , Jolles , J. , Chinapaw , M. J. , & Singh , A. S. ( 2016 ). Physical activity in the school setting: Cognitive performance is not affected by three different types of acute exercise . Frontiers in Psychology , 7 , 723 . Google Scholar CrossRef Search ADS PubMed Vanhelst , J. , Beghin , L. , Duhamel , A. , Manios , Y. , Molnar , D. , De Henauw , S. , et al. . ( 2016 ). Physical activity is associated with attention capacity in adolescents . The Journal of Pediatrics , 168 , 126 – 131 . Google Scholar CrossRef Search ADS PubMed Wagner , J. M. , Rhodes , J. A. , & Patten , C. ( 2008 ). Reproducibility and minimal detectable change of three-dimensional kinematic analysis of reaching tasks in people with hemiparesis after stroke . Physical Therapy , 88 , 652 – 663 . Google Scholar CrossRef Search ADS PubMed Wu , B. J. , Lin , C. H. , Tseng , H. F. , Liu , W. M. , Chen , W. C. , Huang , L. S. , et al. . ( 2013 ). Validation of the Taiwanese mandarin version of the personal and social performance scale in a sample of 655 stable schizophrenic patients . Schizophrenia Research , 146 , 34 – 39 . Google Scholar CrossRef Search ADS PubMed © The Author(s) 2017. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com. http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png Archives of Clinical Neuropsychology Oxford University Press

Publisher
Oxford University Press
ISSN
0887-6177
eISSN
1873-5843
D.O.I.
10.1093/arclin/acx123

Abstract

Objective: The d2 Test of Attention (D2) is a commonly used measure of selective attention for patients with schizophrenia. However, its test–retest reliability and minimal detectable change (MDC) are unknown in patients with schizophrenia, limiting its utility in both clinical and research settings. The aim of the present study was to examine the test–retest reliability and MDC of the D2 in patients with schizophrenia.

Method: A rater administered the D2 to 106 patients with schizophrenia twice at a 1-month interval. Test–retest reliability was determined through the calculation of the intra-class correlation coefficient (ICC). We also carried out Bland–Altman analysis, which included a scatter plot of the differences between test and retest against their mean. Systematic biases were evaluated with a paired t-test.

Results: The ICCs for the D2 ranged from 0.78 to 0.94. The MDCs (MDC%) of the seven subscores were 102.3 (29.7), 19.4 (85.0), 7.2 (94.6), 21.0 (69.0), 104.0 (33.1), 105.0 (35.8), and 7.8 (47.8), which represented limited-to-acceptable random measurement error. Trends in the Bland–Altman plots of the omissions (E1), commissions (E2), and errors (E) were noted, indicating that the data were heteroscedastic.

Conclusions: According to the results, the D2 had good test–retest reliability, especially in the scores of TN, TN-E, and CP. For future research, finding a way to improve the administration procedure to reduce random measurement error would be important for the E1, E2, E, and FR subscores.

Keywords: Attention; Schizophrenia; d2 Test of Attention

Introduction

Among the variety of cognitive deficits in schizophrenia, the most robust are deficits of attention (Elvevag & Goldberg, 2000; Nuechterlein et al., 2004). Selective attention refers to the ability to maintain a behavioral or cognitive set in the face of distracting or competing stimuli (Biehl et al., 2013; Lavie, 2005).
It thus incorporates the notion of freedom from distractibility (Sohlberg & Mateer, 2001b). Individuals with selective attention deficits are easily drawn off task by extraneous, irrelevant stimuli (Smith & Lilienfeld, 2015). Examples of problems include an inability to perform therapy tasks in a stimulating environment (e.g., an open treatment area) or to prepare a meal with children playing in the background (Sohlberg & Mateer, 2001a). Therefore, selective attention is measured periodically in clinical settings to detect and monitor the attention status of patients with schizophrenia. Such monitoring requires a measure of selective attention with sound reliability.

Test–retest reliability is the highest-rated test criterion among surveyed experts because this property is critical for detecting changes with treatment of schizophrenia (Green et al., 2004). Intra-class correlation (ICC) methods have been commonly used for assessing test–retest reliability in recent studies (Brennan & Silman, 1992). The ICC reflects both the degree of correlation and the level of agreement between measures (Lee, Li, Liu, & Hsieh, 2011), and it is widely accepted to represent the reliability of a measure better than Pearson’s r (Brennan & Silman, 1992; Patten, Kothari, Whitney, Lexell, & Lum, 2003; Prince, Makrides, & Richman, 1980; Shrout & Fleiss, 1979).

In addition, to determine whether a change score between repeated measurements is significant, researchers need to provide minimal detectable change (MDC) scores so that users can interpret the results of repeated measurements (Goldsmith, Boers, Bombardier, & Tugwell, 1993; Schuck & Christian, 2003; Steffen & Seney, 2008). The MDC is defined as the minimum amount of change that is not due to variation in measurement (Schreuders et al., 2003).
That is, when the difference between two successive measurements exceeds the MDC, the change is more likely to be a real difference than merely random error (Michon, Kroon, Van Weeghel, & Schene, 2004). Once the MDC is determined on a particular test for a given population, clinicians can interpret whether the change scores of their patients are at or above the minimal level of detectable change (Ries, Echternach, Nof, & Gagnon, 2009; Wagner, Rhodes, & Patten, 2008). Thus, the MDC is crucial for clinicians and researchers to determine real change in repeated measurements for an individual patient (Kovacs et al., 2008).

The d2 Test of Attention (D2), a cancellation test, is a user-friendly and valid measure of selective attention often used in research on patients with schizophrenia (Boyer et al., 2013, 2014; Brickenkamp & Zillmer, 1998; Ernst, Haugaard, Olrik, Munk-Jorgensen, & Ostergaard, 2013; Rehse et al., 2016). The D2 is used because it is low in cost, easy and fast to administer, and suitable for testing large numbers of people simultaneously (Van den Berg et al., 2016). The D2 assesses visual perceptual speed and concentrative capacity by measuring an individual’s ability to selectively, quickly, and accurately focus on the relevant aspects of a task while ignoring irrelevant aspects (Vanhelst et al., 2016). In terms of reliability, the D2 has good test–retest reliability in a normal adult sample (Brickenkamp & Zillmer, 1998). However, the test–retest reliability and MDC of the D2 have rarely been examined in patients with schizophrenia, limiting the interpretability and applicability of this measure in research and clinical settings. Thus, the purpose of our study was to examine the test–retest reliability and MDC of the D2 in patients with schizophrenia.

Methods

Participants

Participants were recruited from a clinical psychiatric hospital, and all were Taiwanese.
Participants with the following characteristics were included: (A) diagnosis of schizophrenia according to the International Classification of Diseases, 9th Revision, Clinical Modification (ICD-9-CM) diagnostic criteria; (B) clinical stability with a stable dose of antipsychotic medication for at least 3 months prior to the study; (C) sufficient reading or listening comprehension to complete the D2 (Mini Mental State Examination [MMSE] score > 24); (D) absence of substance abuse, severe medical or neurological condition, or other psychiatric disorder that required treatment (e.g., mental retardation, dementia, or developmental disability); (E) age 20–65 years; and (F) provision of informed consent. Criterion (B) was used to support the assumption underlying the test–retest analysis (i.e., that the attention performance of participants was stable between the two measurements). Participants were excluded from the study if they: (A) were participating in another clinical trial; (B) were suffering an episode of major depression; and/or (C) exhibited difficulties in recognizing the letters of the English alphabet. The study protocol was reviewed and approved by the Institutional Review Board of the study hospital.

A total of 150 participants were approached for recruitment, and 33 of them did not meet the criteria. The remaining 117 eligible participants were recruited. Of these, 11 did not complete the second session because they were discharged from the hospital, withdrew, or were otherwise lost to follow-up. Thus, the final sample included 106 participants. Participants completed the test within about 8 min. All participants confirmed that they had continued with their regular therapeutic activities during the study period.

Research Procedure

The D2 was administered to the participants by a specially trained occupational therapist twice, at an interval of 1 month.
The therapist’s training was specific to administration of the D2. The test was administered in a quiet room with no environmental distractions. Participants’ demographic data were collected from medical records. The psychopathological stability of the participants was assessed with the Clinical Global Impression-Severity (CGI-S) scale (Guy, 1976). Participants with a stable CGI-S (i.e., exactly the same CGI-S category at the two time points) were included for further analysis. The CGI-S was administered by the same rater. The participants continued with their regular therapeutic activities during the study period.

Measures

The D2 is a one-page paper-and-pencil test of selective attention. The standard version consists of 14 rows (trials), each with 47 randomly mixed “p” and “d” letters (Brickenkamp & Zillmer, 1998), for a total of 658 letters. The target symbol is a “d” with two dashes (hence “d2”). The participant’s task is to cancel out as many target symbols as possible, moving from left to right, with a time limit of 20 s per trial. The seven subscores of the D2 are total number (TN), omissions (E1), commissions (E2), errors (E), total-errors (TN-E), concentration performance (CP), and fluctuation rate (FR). TN is a quantitative measure of performance: the number of all items that were processed. E1 counts the relevant items (“d” with two dashes) that were not crossed out. E2 counts the irrelevant letters that were crossed out in violation of the instructions. The raw score E is the sum of all mistakes. TN-E is the total number of items scanned minus the error score (E). CP is the number of correctly crossed out relevant items minus E2. FR is the difference between the lines with the maximum and the minimum numbers of items processed (Brickenkamp & Zillmer, 1998). The coefficients of test–retest reliability are good for adults at an interval of less than 3 months (r = 0.90).
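The subscore definitions given for the D2 can be sketched in a few lines of code. This is only an illustration under stated assumptions, not the scoring procedure of the published manual: `RowCounts` and `d2_subscores` are hypothetical names, the per-row counts are assumed to have been tallied by hand beforehand, and FR is taken here as the range (maximum minus minimum) of items processed per row.

```python
from dataclasses import dataclass


@dataclass
class RowCounts:
    """Hand-tallied counts for one 20-s row (hypothetical container)."""
    processed: int    # items scanned in the row
    omissions: int    # targets left uncrossed (contributes to E1)
    commissions: int  # non-targets crossed out (contributes to E2)
    hits: int         # targets correctly crossed out


def d2_subscores(rows):
    """Return the seven D2 subscores from a list of RowCounts."""
    tn = sum(r.processed for r in rows)
    e1 = sum(r.omissions for r in rows)
    e2 = sum(r.commissions for r in rows)
    e = e1 + e2                           # all mistakes
    cp = sum(r.hits for r in rows) - e2   # correct hits minus commissions
    per_row = [r.processed for r in rows]
    fr = max(per_row) - min(per_row)      # assumed: range across rows
    return {"TN": tn, "E1": e1, "E2": e2, "E": e,
            "TN-E": tn - e, "CP": cp, "FR": fr}


# Two made-up rows, purely for illustration.
example = d2_subscores([RowCounts(30, 3, 1, 10), RowCounts(25, 2, 0, 9)])
```

The relations E = E1 + E2 and TN-E = TN − E used here agree with the group means the paper reports for the first session (23.16 + 7.71 = 30.87; 340.33 − 30.87 = 309.46).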
The results of adults are more stable than those of children or adolescents over a longer period. Regarding validity, the D2 correlates most strongly with the Symbol Digit Modalities Test (SDMT). Construct validity has been examined by factor analysis of the D2, and the results further supported the sensitivity of the D2 as a test of concentration and attention. Therefore, the construct and convergent validity of the D2 are good for assessing attention (Bates & Lemay, 2004). The D2 is thus a reliable and valid measurement tool.

The CGI-S is commonly used for assessing symptom severity in schizophrenia research and clinical practice (Busner & Targum, 2007). It is a 1-item, clinician-rated instrument (Guy, 1976) that uses a 7-point scale to rate the severity of psychiatric illness (Huang et al., 2012). Clinicians rate the severity of the patient’s mental illness at the time of measurement (1 = not at all, 2 = borderline, 3 = mild, 4 = moderate, 5 = marked, 6 = severe, and 7 = extremely ill) (Otto et al., 2010). The inter-rater reliability has an ICC of 0.75 (Haro et al., 2003). The content validity of the CGI-S is strong (Allen et al., 2012). The CGI-S was used to assess the change in clinical status of the participants (Leucht, Davis, Engel, Kissling, & Kane, 2009; Patrick et al., 2009; Wu et al., 2013).

Statistical Analysis

Data were analyzed with the Statistical Package for the Social Sciences (SPSS), version 22.0 (SPSS Inc., Chicago, IL). The alpha level was set at 0.05 for all statistical tests, and all p-values were two-tailed.

Test–Retest Reliability

Test–retest reliability was determined through calculation of the ICC(2,1), a two-way random-effects, single-measure reliability coefficient (absolute agreement) (Rankin & Stokes, 1998; Shrout & Fleiss, 1979). The ICC is the ratio of the inter-subject variance to the total variance (inter-subject variance + within-subject variance) (Prince et al., 1980).
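For readers who want to see how an ICC(2,1) can be obtained outside SPSS, the sketch here derives it from two-way ANOVA mean squares using the Shrout and Fleiss (1979) formula. The function name `icc_2_1` and the toy data are illustrative assumptions; the study itself reports values computed in SPSS.

```python
from statistics import mean


def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    `scores` is a list of rows, one per participant; each row holds that
    participant's score at every session (two sessions for test-retest).
    """
    n = len(scores)       # number of participants
    k = len(scores[0])    # number of sessions
    grand = mean(x for row in scores for x in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(k)]

    # Sums of squares for participants, sessions, and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Toy data: identical sessions, then a constant +1 shift at retest.
identical = icc_2_1([[1, 1], [2, 2], [3, 3]])
shifted = icc_2_1([[1, 2], [2, 3], [3, 4]])
```

With the toy data, the shifted case yields a lower coefficient than the identical case even though the rank order of participants is unchanged. This illustrates why an absolute-agreement ICC is sensitive to systematic differences between sessions in a way that Pearson’s r is not.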
There is no universally agreed level for ICC values in relation to levels of reliability, but the following scheme has been previously reported as acceptable: 0.90–0.99, excellent reliability; 0.80–0.89, good reliability; 0.70–0.79, fair reliability; 0.69 or below, poor reliability (Arnall, Koumantakis, Oldham, & Cooper, 2002). The ICCs of the D2 were computed for the test and retest sessions (Haley & Fragala-Pinkham, 2006).

Minimal Detectable Change (MDC) and MDC Percentage (MDC%)

The MDC was calculated based on the standard error of measurement (SEM) according to the equations below (Michon et al., 2004):

$$\mathrm{SEM} = SD_{\text{all testing scores}} \times \sqrt{1 - r}$$

$$\mathrm{MDC} = 1.96 \times \sqrt{2} \times \mathrm{SEM}$$

In these formulas, 1.96 is the z-score at the 95% confidence level, $\sqrt{2}$ accounts for the additional uncertainty introduced by measuring at two time points, and r is the coefficient of test–retest reliability, which was represented by the ICC. In addition, we calculated the MDC% (= MDC/mean × 100%), which expresses the relative amount of random measurement error. The “mean” in this equation is the mean of all scores from the two testing sessions. An MDC% of 30% or less is considered acceptable, and one of 10% or less, excellent (Smidt et al., 2002).

Systematic Bias

We used a paired t-test to examine whether systematic biases existed. The differences between the test and retest assessments were considered significant if p-values were smaller than .05. We also calculated the effect size, defined as the mean change divided by the standard deviation of the first-session scores, to determine the extent of bias (Kazis, Anderson, & Meenan, 1989). According to Cohen’s criteria, an effect size greater than 0.8 was considered large; 0.5–0.8, moderate; and 0.2–0.5, small (Cohen, Nordahl, Semple, Andreason, & Pickar, 1998).

Bland–Altman Plots

Bland–Altman plots were used to visually examine the agreement of a test by plotting the difference scores against the mean score of each pair of measurements (Bland & Altman, 1986).
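The quantities behind a Bland–Altman analysis can be sketched as follows. The limits of agreement are taken as the mean difference ± 1.96 × SD of the differences, and the Pearson correlation between absolute differences and pair means serves as the heteroscedasticity check, both as defined in this section. The function names and the four-pair data set are hypothetical; only the TN mean difference and SD in the last lines come from the paper’s Table 2.

```python
from math import sqrt
from statistics import mean, stdev


def pearson_r(x, y):
    """Plain Pearson correlation (undefined if either input is constant)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def limits_of_agreement(mean_diff, sd_diff):
    """95% limits of agreement: mean difference +/- 1.96 * SD."""
    return mean_diff - 1.96 * sd_diff, mean_diff + 1.96 * sd_diff


def bland_altman(test, retest):
    """Per-pair differences (session 2 - session 1), means, LOA, trend check."""
    diffs = [b - a for a, b in zip(test, retest)]
    means = [(a + b) / 2 for a, b in zip(test, retest)]
    d_bar = mean(diffs)
    sd = stdev(diffs)  # sample SD of the differences
    return {
        "diffs": diffs,
        "means": means,
        "mean_diff": d_bar,
        "loa": limits_of_agreement(d_bar, sd),
        # |r| > 0.3 is the threshold the paper uses to flag heteroscedasticity
        "hetero_r": pearson_r([abs(d) for d in diffs], means),
    }


# Hypothetical scores for four participants.
result = bland_altman([10, 20, 30, 40], [12, 19, 33, 38])

# TN in Table 2: mean difference 9.08, SD 53.43 -> roughly (-95.6, 113.8)
tn_loa = limits_of_agreement(9.08, 53.43)
```

Plotting `result["diffs"]` against `result["means"]` with horizontal lines at the two limits would give the usual Bland–Altman figure; the plotting step is omitted to keep the sketch free of third-party dependencies.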
Assuming the differences follow normal distribution, 95% of the differences (limits of agreement, LOA) should lie between d ± 1.96 × SD, where d represents the mean difference between test and retest scores, and SD is the standard deviation of differences of each pair (Bland & Altman, 2003; Liaw et al., 2008). The plot shows the difference between test sessions 2 and 1 (2–1) against the mean of the two test sessions for each subject (Bland & Altman, 2003). Heteroscedasticity In addition, we used Pearson’s r to examine the association between the absolute difference and the mean of each pair of repeated measurements to examine the possibility of heteroscedasticity, meaning that a systematic trend (e.g., the higher the scores, the larger the differences) exists. If heteroscedasticity exists, the MDC should not be applied for different levels (e.g., attention deficits in this study) of patients. According to Atkinson’s suggestion, if the absolute value of Pearson’s r is greater than 0.3, the data are heteroscedastic (Atkinson & Nevill, 1998). Results Demographic and Clinical Characteristics of Participants All of these participants had the same CGI-S category at both sessions. Their CGI-S scores were mostly mild (40.6%) or borderline (31.1%). Table 1 displays demographic data for the 106 participants. The mean age was approximately 44 years old (SD = 8.9), and 56.6% of the participants were male. All participants were receiving maintenance medication, taking antipsychotic medication alone (the three most common medications were haloperidol, clozapine, and sulpiride), and there were no significant changes in medication in the 1-month study period. Table 1. 
Demographic characteristics of the sample (n = 106) Variable Mean SD Age 43.6 8.9 Onset 22.9 6.6 Psychiatric history in years 20.7 9.0 Variable N % Gender  Female 46 43.4  Male 60 56.6 Education status  College 12 11.3  Senior high school 56 52.8  Junior high school 33 31.1  Elementary school 3 2.8  None 2 1.9 Schizophrenia subtypes  Disorganized 9 8.5  Paranoid 84 79.3  Residual 11 10.4  Undifferentiated 2 1.9 CGI-S scores  Not at all (1) 30 28.3  Mild (2) 43 40.6  Borderline (3) 33 31.1 Variable Mean SD Age 43.6 8.9 Onset 22.9 6.6 Psychiatric history in years 20.7 9.0 Variable N % Gender  Female 46 43.4  Male 60 56.6 Education status  College 12 11.3  Senior high school 56 52.8  Junior high school 33 31.1  Elementary school 3 2.8  None 2 1.9 Schizophrenia subtypes  Disorganized 9 8.5  Paranoid 84 79.3  Residual 11 10.4  Undifferentiated 2 1.9 CGI-S scores  Not at all (1) 30 28.3  Mild (2) 43 40.6  Borderline (3) 33 31.1 Notes: SD = standard deviation; CGI-S = Clinical Global Impression-Severity. Table 1. 
Demographic characteristics of the sample (n = 106) Variable Mean SD Age 43.6 8.9 Onset 22.9 6.6 Psychiatric history in years 20.7 9.0 Variable N % Gender  Female 46 43.4  Male 60 56.6 Education status  College 12 11.3  Senior high school 56 52.8  Junior high school 33 31.1  Elementary school 3 2.8  None 2 1.9 Schizophrenia subtypes  Disorganized 9 8.5  Paranoid 84 79.3  Residual 11 10.4  Undifferentiated 2 1.9 CGI-S scores  Not at all (1) 30 28.3  Mild (2) 43 40.6  Borderline (3) 33 31.1 Variable Mean SD Age 43.6 8.9 Onset 22.9 6.6 Psychiatric history in years 20.7 9.0 Variable N % Gender  Female 46 43.4  Male 60 56.6 Education status  College 12 11.3  Senior high school 56 52.8  Junior high school 33 31.1  Elementary school 3 2.8  None 2 1.9 Schizophrenia subtypes  Disorganized 9 8.5  Paranoid 84 79.3  Residual 11 10.4  Undifferentiated 2 1.9 CGI-S scores  Not at all (1) 30 28.3  Mild (2) 43 40.6  Borderline (3) 33 31.1 Notes: SD = standard deviation; CGI-S = Clinical Global Impression-Severity. Test–Retest Reliability The reliability indices of the D2 are listed in Table 2. The ICCs for the seven subscores of the D2 between successive sessions were between 0.78 and 0.94. The paired t-test showed that all the subscores were significantly different. The effect sizes were 0.07, 0.04, 0.03, 0.05, 0.07, 0.08, and 0.06 for the TN, E1, E2, E, TN-E, CP, and FR, respectively. Table 2. 
Test–retest reliability of the D2 (raw scores) Measure First testing M(SD) Second testing M (SD) Difference M (SD) ICC (95% CI) SEM MDC (%) TN (total number) 340.33 (137.88) 349.41 (141.46) 9.08 (53.43) 0.93 (0.89–0.95) 36.89 102.3 (29.7) Omissions: E1 23.16 (16.43) 22.48 (16.64) −0.68 (9.95) 0.82 (0.75–0.87) 7.00 19.4 (85.0) Commissions: E2 7.71 (7.37) 7.47 (7.64) −0.24 (3.69) 0.88 (0.83–0.92) 2.59 7.2 (94.6) E (errors) 30.87 (19.35) 29.95 (19.83) −0.92 (10.75) 0.85 (0.79–0.90) 7.57 21.0 (69.0) TN-E (total-errors) 309.46 (140.26) 319.45 (143.89) 9.99 (54.28) 0.93 (0.90–0.95) 37.53 104.0 (33.1) CP (concentration performance) 287.36 (154.69) 299.58 (154.99) 12.22 (51.67) 0.94 (0.92–0.96) 37.87 105.0 (35.8) FR (fluctuation rate) 16.42 (6.04) 16.08 (5.90) −0.34 (3.94) 0.78 (0.70–0.85) 2.80 7.8 (47.8) Measure First testing M(SD) Second testing M (SD) Difference M (SD) ICC (95% CI) SEM MDC (%) TN (total number) 340.33 (137.88) 349.41 (141.46) 9.08 (53.43) 0.93 (0.89–0.95) 36.89 102.3 (29.7) Omissions: E1 23.16 (16.43) 22.48 (16.64) −0.68 (9.95) 0.82 (0.75–0.87) 7.00 19.4 (85.0) Commissions: E2 7.71 (7.37) 7.47 (7.64) −0.24 (3.69) 0.88 (0.83–0.92) 2.59 7.2 (94.6) E (errors) 30.87 (19.35) 29.95 (19.83) −0.92 (10.75) 0.85 (0.79–0.90) 7.57 21.0 (69.0) TN-E (total-errors) 309.46 (140.26) 319.45 (143.89) 9.99 (54.28) 0.93 (0.90–0.95) 37.53 104.0 (33.1) CP (concentration performance) 287.36 (154.69) 299.58 (154.99) 12.22 (51.67) 0.94 (0.92–0.96) 37.87 105.0 (35.8) FR (fluctuation rate) 16.42 (6.04) 16.08 (5.90) −0.34 (3.94) 0.78 (0.70–0.85) 2.80 7.8 (47.8) Notes: TN = total number; E1 = Omissions; E2 = Commissions; E = errors; TNE = total-errors; CP = concentration performance; FR = fluctuation rate. Table 2. 
Test–retest reliability of the D2 (raw scores) Measure First testing M(SD) Second testing M (SD) Difference M (SD) ICC (95% CI) SEM MDC (%) TN (total number) 340.33 (137.88) 349.41 (141.46) 9.08 (53.43) 0.93 (0.89–0.95) 36.89 102.3 (29.7) Omissions: E1 23.16 (16.43) 22.48 (16.64) −0.68 (9.95) 0.82 (0.75–0.87) 7.00 19.4 (85.0) Commissions: E2 7.71 (7.37) 7.47 (7.64) −0.24 (3.69) 0.88 (0.83–0.92) 2.59 7.2 (94.6) E (errors) 30.87 (19.35) 29.95 (19.83) −0.92 (10.75) 0.85 (0.79–0.90) 7.57 21.0 (69.0) TN-E (total-errors) 309.46 (140.26) 319.45 (143.89) 9.99 (54.28) 0.93 (0.90–0.95) 37.53 104.0 (33.1) CP (concentration performance) 287.36 (154.69) 299.58 (154.99) 12.22 (51.67) 0.94 (0.92–0.96) 37.87 105.0 (35.8) FR (fluctuation rate) 16.42 (6.04) 16.08 (5.90) −0.34 (3.94) 0.78 (0.70–0.85) 2.80 7.8 (47.8) Measure First testing M(SD) Second testing M (SD) Difference M (SD) ICC (95% CI) SEM MDC (%) TN (total number) 340.33 (137.88) 349.41 (141.46) 9.08 (53.43) 0.93 (0.89–0.95) 36.89 102.3 (29.7) Omissions: E1 23.16 (16.43) 22.48 (16.64) −0.68 (9.95) 0.82 (0.75–0.87) 7.00 19.4 (85.0) Commissions: E2 7.71 (7.37) 7.47 (7.64) −0.24 (3.69) 0.88 (0.83–0.92) 2.59 7.2 (94.6) E (errors) 30.87 (19.35) 29.95 (19.83) −0.92 (10.75) 0.85 (0.79–0.90) 7.57 21.0 (69.0) TN-E (total-errors) 309.46 (140.26) 319.45 (143.89) 9.99 (54.28) 0.93 (0.90–0.95) 37.53 104.0 (33.1) CP (concentration performance) 287.36 (154.69) 299.58 (154.99) 12.22 (51.67) 0.94 (0.92–0.96) 37.87 105.0 (35.8) FR (fluctuation rate) 16.42 (6.04) 16.08 (5.90) −0.34 (3.94) 0.78 (0.70–0.85) 2.80 7.8 (47.8) Notes: TN = total number; E1 = Omissions; E2 = Commissions; E = errors; TNE = total-errors; CP = concentration performance; FR = fluctuation rate. Minimal Detectable Change (MDC) The MDCs of the subscores of the D2 were 102.3, 19.4, 7.2, 21.0, 104.0, 105.0, and 7.8 for the TN, E1, E2, E, TN-E, CP, and FR, respectively. All MDC% of the seven subscores were between 29.7% and 94.6%. Seven Bland–Altman plots (Fig. 
1) that were representative of the D2 showed the LOA ranges to be −95.6 to 113.8, −20.2 to 18.8, −7.5 to 7.0, −22.0 to 20.2, −96.4 to 116.3, −89.1 to 113.5, and −8.1 to 7.4 for the TN, E1, E2, E, TN-E, CP, and FR, respectively. In addition, trends in the plots of the E1, E2, and E were noted (Pearson’s r values were 0.37, 0.62, and 0.33, respectively).

Fig. 1. Bland–Altman method for plotting the difference in scores against the mean scores of D2. The two bold lines define the limits of agreement (mean of difference ± 1.96 SD).

Discussion

This study is the first to examine the test–retest reliability and MDC of the D2 in persons with schizophrenia. We found that the ICC values of the D2 were fair to excellent (0.78–0.94). For repeated use of the D2, the scores of TN (0.93), TN-E (0.93), and CP (0.94) were more reliable than those of E1 (0.82), E2 (0.88), E (0.85), and FR (0.78). In the published D2 manual, test–retest reliability is reported for samples of 38 children, 37 adolescents, and 31 adults (Brickenkamp & Zillmer, 1998). The test–retest interval was approximately 3 months, and the Pearson coefficients ranged from 0.72 to 0.90 for TN-E. These results were close to ours, implying that the test–retest reliability of the D2 in persons with schizophrenia is not clearly different from that in the aforementioned samples. However, clinicians and researchers who use the D2 to assess patients with schizophrenia should take into account the only fair agreement between test and retest sessions, especially for the FR index. On the other hand, the ICC values of the TN, TN-E, and CP indices in the current study, which ranged from 0.93 (TN and TN-E) to 0.94 (CP), indicate that the D2 has excellent reliability in patients with schizophrenia.
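The limits of agreement quoted for Fig. 1 follow from the mean and SD of the test–retest differences in Table 2 (mean of difference ± 1.96 SD). The short Python sketch below only illustrates that arithmetic; the function name `limits_of_agreement` is ours, and the inputs are the rounded values printed in Table 2, so the last digit may differ slightly from the published ranges.

```python
def limits_of_agreement(mean_diff, sd_diff, z=1.96):
    """Return the (lower, upper) 95% limits of agreement."""
    half_width = z * sd_diff
    return mean_diff - half_width, mean_diff + half_width


# Mean and SD of the (second - first) differences, copied from Table 2.
differences = {
    "TN": (9.08, 53.43),
    "E1": (-0.68, 9.95),
    "E2": (-0.24, 3.69),
    "E": (-0.92, 10.75),
    "TN-E": (9.99, 54.28),
    "CP": (12.22, 51.67),
    "FR": (-0.34, 3.94),
}

# Compute the limits for every subscore.
loa = {name: limits_of_agreement(m, sd) for name, (m, sd) in differences.items()}

for name, (lower, upper) in loa.items():
    print(f"{name}: {lower:.1f} to {upper:.1f}")
```

A wide interval relative to the mean score, as for E1, E2, and E, signals that individual retest differences can be large even when the mean difference is close to zero.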
Systematic bias, which may result from the practice effect, was noted in three subscores (TN, TN-E, and CP), a fact that may affect the accuracy of our findings. Fortunately, the effect sizes for the TN, TN-E, and CP subscores were small. For these subscores, additional repeated testing (about 2–3 administrations) or more practice before testing may reduce the influence of the practice effect (Bartels, Wegrzyn, Wiedl, Ackermann, & Ehrenreich, 2010). Clinicians should consider the practice effect when interpreting the test results of their clients. Thus, any difference in scores needs to be greater than the practice effect induced by the measure for true change to be determined.

The current study revealed that only the MDC% of TN (29.65%) was smaller than the acceptable criterion (30%) (Chen et al., 2014; Huang et al., 2011). The results suggest that when the D2 is used in patients with schizophrenia, caution must be taken in interpreting change scores in attentional function. According to our results, a change of 103 points or more in TN is needed for the change to be considered real (i.e., beyond measurement error). This change is large relative to the mean TN score. In other words, it is very difficult for a patient with schizophrenia to improve by 103 points and thereby demonstrate real improvement. Among the seven subscores, TN, TN-E, and CP had smaller random measurement error relative to their means (lower MDC%) than the other subscores. Therefore, the TN, TN-E, and CP subscores would have better utility in patients with schizophrenia in clinical settings. Our results reveal that the D2 has acceptable agreement between repeated measurements but large random measurement error in patients with schizophrenia. A high standard deviation is likely the major cause of the high MDC and MDC%. The results imply that the participants in the study had a large range of ability.
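To make the MDC figures discussed above easier to check, the sketch below applies the commonly used 95% formula, MDC = 1.96 × √2 × SEM, and expresses it as a percentage of the mean score. This section does not restate the formula, so treat it as an assumption; it does reproduce the Table 2 values when MDC% is taken relative to the mean of the two testing sessions (also an assumption). The function names are ours.

```python
import math


def mdc95(sem):
    """Minimal detectable change at the 95% confidence level."""
    return 1.96 * math.sqrt(2) * sem


def mdc_percent(mdc, mean_first, mean_second):
    """MDC as a percentage of the mean over both testing sessions (assumed)."""
    grand_mean = (mean_first + mean_second) / 2
    return 100 * mdc / grand_mean


# SEM and session means for TN and E1, copied from Table 2.
tn_mdc = mdc95(36.89)
tn_pct = mdc_percent(tn_mdc, 340.33, 349.41)

e1_mdc = mdc95(7.00)
e1_pct = mdc_percent(e1_mdc, 23.16, 22.48)

print(f"TN: MDC = {tn_mdc:.2f}, MDC% = {tn_pct:.2f}")
print(f"E1: MDC = {e1_mdc:.2f}, MDC% = {e1_pct:.2f}")
```

Because the MDC scales with the SEM while MDC% divides by the mean, subscores with small means such as E1 and E2 can show a small absolute MDC and still exceed the 30% criterion by a wide margin.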
On the other hand, the participants filled out the D2 as quickly as possible; working too fast could result in numerous wrong answers. This may be why the MDC% values of E1 (85.0%), E2 (94.6%), and E (69.0%) are larger. It also implies that the error scores on the D2 were less reliable in individuals with schizophrenia in particular. The scores of E, E1, and E2 need to be applied carefully in patients with schizophrenia.

Bland–Altman plots can be used to visualize systematic variation around the zero line and the LOA, and to illustrate whether a trend (heteroscedasticity) exists in the sample in a study (Flansbjer, Holmback, Downham, Patten, & Lexell, 2005). The presence of heteroscedastic errors in measurements could have practical research implications. The current study revealed that the scores of E1, E2, and E were heteroscedastic. Therefore, for these subscores, higher scores entail a greater amount of measurement error.

Other important measures include the reliable change (RC) index. The RC index enables identification of patients whose response to treatment is clinically as well as statistically significant (Christensen & Mendoza, 1986; Jacobson, Follette, & Revenstorf, 1984). The RC index is calculated by determining the difference between pretest and posttest scores and then dividing this difference by a standard error measure that includes the standard deviation of the measure and the reliability coefficient (Turk, Okifuji, Sinclair, & Starz, 1998). The reliable change index modified for practice (RCIp) takes the practice effect into account when interpreting the change in scores in an individual patient. An RCIp with a 90% confidence interval (CI) calculates the range of a change score to determine whether the change in an individual patient’s scores was beyond random measurement error and practice effect (Koh et al., 2011). The RC index or RCIp may be used in future related research when considering the practice effect.
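As an illustration of the RC index and RCIp described above, the following Python sketch uses the widely cited form in which the standard error of the difference is √2 × SD × √(1 − r). It is not an analysis performed in this study. The patient scores are hypothetical, and using the baseline SD, the ICC as the reliability coefficient, and the mean test–retest difference as the practice effect for TN are all assumptions made for the example.

```python
import math


def se_difference(sd, reliability):
    """Standard error of the difference between two scores."""
    sem = sd * math.sqrt(1 - reliability)
    return math.sqrt(2) * sem


def rc_index(pre, post, sd, reliability):
    """Reliable change index: change divided by its standard error."""
    return (post - pre) / se_difference(sd, reliability)


def rci_practice(pre, post, sd, reliability, practice_effect):
    """RC index after subtracting the mean practice effect (RCIp)."""
    return (post - pre - practice_effect) / se_difference(sd, reliability)


# Hypothetical patient; SD, ICC and mean difference for TN taken from Table 2.
pre_score, post_score = 300, 420
rci = rc_index(pre_score, post_score, sd=137.88, reliability=0.93)
rcip = rci_practice(pre_score, post_score, sd=137.88, reliability=0.93,
                    practice_effect=9.08)

# With a 90% confidence interval, |value| > 1.645 indicates reliable change.
Z_90 = 1.645
print(f"RC index = {rci:.2f}, reliable: {abs(rci) > Z_90}")
print(f"RCIp = {rcip:.2f}, reliable: {abs(rcip) > Z_90}")
```

Subtracting the practice effect lowers the index for an improving patient, so a change that passes the RC criterion may not pass the RCIp criterion when the practice effect is large.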
Other indicators, such as the Ruff 2 and 7 Selective Attention Test (RSAT), are also used for assessing cognition in patients with schizophrenia. However, the MDC of the RSAT has not been examined in published research, so the test–retest reliability and MDC of the RSAT remain a topic for future research. At present, comparisons across studies of different indicators of selective attention in persons with schizophrenia are limited.

Limitations

One limitation of this study should be noted. Schizophrenia is a heterogeneous disorder with fluctuating symptom severity, even in stable patients. The CGI-S was used to assess symptom stability, which may limit the study. Notably, most of the participants were scored as mild or borderline on the CGI-S. Thus, the results of the study cannot be generalized to populations with more serious symptoms. Subtle changes in symptoms are a possible source of variability in the test–retest reliability estimates for this sample. Future related studies may consider more precise assessment tools for monitoring stability in such patients.

Conclusion

In summary, our results indicated that the D2 has good reliability, especially in the scores of TN, TN-E, and CP, in monitoring the selective attention function of patients with schizophrenia. The MDCs of TN, TN-E, and CP also indicated that those subscores are better suited to clinical use in patients with schizophrenia. Further research should seek a way to improve the administration procedure to reduce the random measurement error of the E1, E2, E, and FR subscores. The results of the present study can be used as a reference for the measurement error of the D2 to help clinicians and researchers determine true change between successive assessments of patients with schizophrenia.

Funding

This work was supported by the National Science Council (NSC) research project NSC 102-2622-H-214-002-CC3 in Taiwan.
The NSC had no further role in study design, collection, analysis and interpretation of data, composition of the report, or the decision to submit the paper for publication.

Conflict of Interest

None declared.

Acknowledgments

The authors would particularly like to thank all the participants involved in this study and the anonymous reviewers for the significant comments and suggestions that helped to improve the quality of the manuscript.

References

Allen, M. H., Daniel, D. G., Revicki, D. A., Canuso, C. M., Turkoz, I., Fu, D. J., et al. (2012). Development and psychometric evaluation of a clinical global impression for schizoaffective disorder scale. Innovations in Clinical Neuroscience, 9, 15–24.
Arnall, F. A., Koumantakis, G. A., Oldham, J. A., & Cooper, R. G. (2002). Between-days reliability of electromyographic measures of paraspinal muscle fatigue at 40, 50 and 60% levels of maximal voluntary contractile force. Clinical Rehabilitation, 16, 761–771.
Atkinson, G., & Nevill, A. M. (1998). Statistical methods for assessing measurement error (reliability) in variables relevant to sports medicine. Sports Medicine, 26, 217–238.
Bartels, C., Wegrzyn, M., Wiedl, A., Ackermann, V., & Ehrenreich, H. (2010). Practice effects in healthy adults: A longitudinal study on frequent repetitive cognitive testing. BMC Neuroscience, 11, 118.
Bates, M. E., & Lemay, E. P., Jr. (2004). The d2 test of attention: Construct validity and extensions in scoring techniques. Journal of International Neuropsychological Society, 10, 392–400.
Biehl, S. C., Ehlis, A. C., Muller, L. D., Niklaus, A., Pauli, P., & Herrmann, M. J. (2013). The impact of task relevance and degree of distraction on stimulus processing. BMC Neuroscience, 14, 107.
Bland, J. M., & Altman, D. G. (1986). Statistical methods for assessing agreement between two methods of clinical measurement. Lancet, 1, 307–310.
Bland, J. M., & Altman, D. G. (2003). Applying the right statistics: Analyses of measurement studies. Ultrasound in Obstetrics Gynecology, 22, 85–93.
Boyer, L., Richieri, R., Guedj, E., Faget-Agius, C., Loundou, A., Llorca, P. M., et al. (2013). Validation of a functional remission threshold for the functional remission of general schizophrenia (FROGS) scale. Comprehensive Psychiatry, 54, 1016–1022.
Boyer, L., Testart, J., Michel, P., Richieri, R., Faget-Agius, C., Vanoye, V., et al. (2014). Neurophysiological correlates of metabolic syndrome and cognitive impairment in schizophrenia: A structural equation modeling approach. Psychoneuroendocrinology, 50, 95–105.
Brennan, P., & Silman, A. (1992). Statistical methods for assessing observer variability in clinical measures. BMJ (Clinical Research Ed.), 304, 1491–1494.
Brickenkamp, R., & Zillmer, E. (1998). The d2 Test of Attention. Seattle, WA: Hogrefe & Huber Publishers.
Busner, J., & Targum, S. D. (2007). The clinical global impressions scale: Applying a research tool in clinical practice. Psychiatry (Edgmont), 4(7), 28–37.
Chen, C. H., Lin, S. F., Yu, W. H., Lin, J. H., Chen, H. L., & Hsieh, C. L. (2014). Comparison of the test-retest reliability of the balance computerized adaptive test and a computerized posturography instrument in patients with stroke. Archives of Physical Medicine Rehabilitation, 95, 1477–1483.
Cohen, R. M., Nordahl, T. E., Semple, W. E., Andreason, P., & Pickar, D. (1998). Abnormalities in the distributed network of sustained attention predict neuroleptic treatment response in schizophrenia. Neuropsychopharmacology: Official Publication of the American College of Neuropsychopharmacology, 19, 36–47.
Christensen, L., & Mendoza, J. L. (1986). A method of assessing change in a single subject: An alteration of the RC index. Behavior Therapy, 17, 305–308.
Elvevag, B., & Goldberg, T. E. (2000). Cognitive impairment in schizophrenia is the core of the disorder. Critical Reviews in Neurobiology, 14(1), 1–21.
Ernst, N. R., Haugaard, C., Olrik, W. J. S., Munk-Jorgensen, P., & Ostergaard Christensen, T. (2013). Prediction of patient contacts by cognition in schizophrenia. The Australian and New Zealand Journal of Psychiatry, 47, 637–645.
Flansbjer, U. B., Holmback, A. M., Downham, D., Patten, C., & Lexell, J. (2005). Reliability of gait performance tests in men and women with hemiparesis after stroke. Journal of Rehabilitation Medicine, 37, 75–82.
Goldsmith, C. H., Boers, M., Bombardier, C., & Tugwell, P. (1993). Criteria for clinically important changes in outcomes: Development, scoring and evaluation of rheumatoid arthritis patient and trial profiles. The Journal of Rheumatology, 20, 561–565.
Green, M. F., Nuechterlein, K. H., Gold, J. M., Barch, D. M., Cohen, J., Essock, S., et al. (2004). Approaching a consensus cognitive battery for clinical trials in schizophrenia: The NIMH-MATRICS conference to select cognitive domains and test criteria. Biological Psychiatry, 56, 301–307.
Guy, W. (1976). Assessment Manual for Psychopharmacology. Washington, DC: US Government Printing Office.
Haley, S. M., & Fragala-Pinkham, M. A. (2006). Interpreting change scores of tests and measures used in physical therapy. Physical Therapy, 86, 735–743.
Haro, J. M., Kamath, S. A., Ochoa, S., Novick, D., Rele, K., Fargas, A., et al. (2003). The Clinical Global Impression-Schizophrenia scale: A simple instrument to measure the diversity of symptoms present in schizophrenia. Acta Psychiatrica Scandinavica, 107(s416), 16–23.
Huang, R. R., Chen, Y. S., Chen, C. C., Chou, F. H., Su, S. F., Chen, M. C., et al. (2012). Quality of life and its associated factors among patients with two common types of chronic mental illness living in Kaohsiung City. Psychiatry and Clinical Neurosciences, 66, 482–490.
Huang, S. L., Hsieh, C. L., Wu, R. M., Tai, C. H., Lin, C. H., & Lu, W. S. (2011). Minimal detectable change of the timed “up & go” test and the dynamic gait index in people with Parkinson disease. Physical Therapy, 91, 114–121.
Jacobson, N. S., Follette, W., & Revenstorf, D. (1984). Psychotherapy outcome research: Methods for reporting variability and evaluating clinical significance. Behavior Therapy, 15, 336–352.
Kazis, L. E., Anderson, J. J., & Meenan, R. F. (1989). Effect sizes for interpreting changes in health status. Medical Care, 27, S178–S189.
Koh, C. L., Lu, W. S., Chen, H. C., Hsueh, I. P., Hsieh, J. J., & Hsieh, C. L. (2011). Test-retest reliability and practice effect of the oral-format Symbol Digit Modalities Test in patients with stroke. Archives of Clinical Neuropsychology, 26, 356–363. doi:10.1093/arclin/acr029.
Kovacs, F. M., Abraira, V., Royuela, A., Corcoll, J., Alegre, L., Tomas, M., et al. (2008). Minimum detectable and minimal clinically important changes for pain in patients with nonspecific neck pain. BMC Musculoskeletal Disorders, 9, 43.
Lavie, N. (2005). Distracted and confused? Selective attention under load. Trends in Cognitive Science, 9, 75–82.
Lee, P., Li, P. C., Liu, C. H., & Hsieh, C. L. (2011). Test-retest reliability of two attention tests in schizophrenia. Archives of Clinical Neuropsychology, 26, 405–411.
Leucht, S., Davis, J. M., Engel, R. R., Kissling, W., & Kane, J. M. (2009). Definitions of response and remission in schizophrenia: Recommendations for their use and their presentation. Acta Psychiatrica Scandinavica, 438, 7–14.
Liaw, L. J., Hsieh, C. L., Lo, S. K., Chen, H. M., Lee, S., & Lin, J. H. (2008). The relative and absolute reliability of two balance performance measures in chronic stroke patients. Disability and Rehabilitation, 30, 656–661.
Michon, H. W., Kroon, H., Van Weeghel, J., & Schene, A. H. (2004). The generic work behavior questionnaire (GWBQ): Assessment of core dimensions of generic work behavior of people with severe mental illnesses in vocational rehabilitation. Psychiatric Rehabilitation Journal, 28, 40–47.
Nuechterlein, K. H., Barch, D. M., Gold, J. M., Goldberg, T. E., Green, M. F., & Heaton, R. K. (2004). Identification of separable cognitive factors in schizophrenia. Schizophrenia Research, 72, 29–39.
Otto, M. W., Tolin, D. F., Simon, N. M., Pearlson, G. D., Basden, S., Meunier, S. A., et al. (2010). Efficacy of d-cycloserine for enhancing response to cognitive-behavior therapy for panic disorder. Biological Psychiatry, 67, 365–370.
Patrick, D. L., Burns, T., Morosini, P., Rothman, M., Gagnon, D. D., Wild, D., et al. (2009). Reliability, validity and ability to detect change of the clinician-rated personal and social performance scale in patients with acute symptoms of schizophrenia. Current Medical Research and Opinion, 25, 325–338.
Patten, C., Kothari, D., Whitney, J., Lexell, J., & Lum, P. S. (2003). Reliability and responsiveness of elbow trajectory tracking in chronic poststroke hemiparesis. Journal of Rehabilitation Research and Development, 40, 487–500.
Prince, B., Makrides, L., & Richman, J. (1980). Research methodology and applied statistics. Part 2: The literature search. Physiotherapy Canada, 32, 201–206.
Rankin, G., & Stokes, M. (1998). Reliability of assessment tools in rehabilitation: An illustration of appropriate statistical analyses. Clinical Rehabilitation, 12, 187–199.
Rehse, M., Bartolovic, M., Baum, K., Richter, D., Weisbrod, M., & Roesch-Ely, D. (2016). Influence of antipsychotic and anticholinergic loads on cognitive functions in patients with schizophrenia. Schizophrenia Research and Treatment, 2016, 8213165.
Ries, J. D., Echternach, J. L., Nof, L., & Gagnon Blodgett, M. (2009). Test-retest reliability and minimal detectable change scores for the timed “up & go” test, the six-minute walk test, and gait speed in people with Alzheimer disease. Physical Therapy, 89, 569–579.
Schreuders, T. A. R., Roebroeck, M. E., Goumans, J., Van Nieuwenhuijzen, J. F., Stijnen, T. H., & Stam, H. J. (2003). Measurement error in grip and pinch force measurements in patients with hand injuries. Physical Therapy, 83, 806–815.
Schuck, P., & Christian, Z. (2003). The ‘smallest real difference’ as a measure of sensitivity to change: A critical analysis. International Journal of Rehabilitation Research, 26, 85–91.
Shrout, P. E., & Fleiss, J. L. (1979). Intraclass correlations: Uses in assessing rater reliability. Psychological Bulletin, 86, 420–428.
Smidt, N., Van der Windt, D. A., Assendelft, W. J., Mourits, A. J., Devill, W. L., de Winter, A. F., et al. (2002). Interobserver reproducibility of the assessment of severity of complaints, grip strength, and pressure pain threshold in patients with lateral epicondylitis. Archives of Physical Medicine and Rehabilitation, 83, 1145–1150.
Smith, S. F., & Lilienfeld, S. O. (2015). The response modulation hypothesis of psychopathy: A meta-analytic and narrative analysis. Psychological Bulletin, 141, 1145–1177.
Sohlberg, M. M., & Mateer, C. A. (2001a). Cognitive rehabilitation: An integrative neuropsychological approach. New York: Guilford Press.
Sohlberg, M. M., & Mateer, C. A. (2001b). Improving attention and managing attentional problems. Adapting rehabilitation techniques to adults with ADD. Annals of the New York Academy Sciences, 931, 359–375.
Steffen, T., & Seney, M. (2008). Test-retest reliability and minimal detectable change on balance and ambulation tests, the 36-item short-form health survey, and the unified Parkinson disease rating scale in people with parkinsonism. Physical Therapy, 88, 733–746.
Turk, D. C., Okifuji, A., Sinclair, J. D., & Starz, T. W. (1998). Interdisciplinary treatment for fibromyalgia syndrome: Clinical and statistical significance. Arthritis Care and Research, 11, 186–195.
Van den Berg, V., Saliasi, E., de Groot, R. H., Jolles, J., Chinapaw, M. J., & Singh, A. S. (2016). Physical activity in the school setting: Cognitive performance is not affected by three different types of acute exercise. Frontiers in Psychology, 7, 723.
Vanhelst, J., Beghin, L., Duhamel, A., Manios, Y., Molnar, D., De Henauw, S., et al. (2016). Physical activity is associated with attention capacity in adolescents. The Journal of Pediatrics, 168, 126–131.
Wagner, J. M., Rhodes, J. A., & Patten, C. (2008). Reproducibility and minimal detectable change of three-dimensional kinematic analysis of reaching tasks in people with hemiparesis after stroke. Physical Therapy, 88, 652–663.
Wu, B. J., Lin, C. H., Tseng, H. F., Liu, W. M., Chen, W. C., Huang, L. S., et al. (2013). Validation of the Taiwanese mandarin version of the personal and social performance scale in a sample of 655 stable schizophrenic patients. Schizophrenia Research, 146, 34–39.

© The Author(s) 2017. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

Journal

Archives of Clinical Neuropsychology, Oxford University Press

Published: Dec 8, 2017
