Abstract

Objective
The Brown Location Test (BLT) was developed to remedy some of the problems in existing visual-based memory tests. The hand version has demonstrated good psychometric properties and the ability to provide lateralization information for mesial temporal lobe epilepsy patients, and it has normative data. The purpose of this study was to compare the hand administered format to the more recently developed computer administered format.

Methods
We used Generalizability Theory analyses to assess the degree of variability in scores across the hand and computer versions of the test, and across alternate test forms, A and B. We also compared the means and standard deviations for the different versions and forms using paired t-tests, and computed Pearson correlation coefficients.

Results
There was minimal variability and high levels of score similarity across the various test administration formats and forms.

Conclusions
The high degree of comparability between versions allows one to apply the validity findings and normative data collected using the hand administered version to the computer version of the BLT.

Keywords: Nonverbal, Visual, Memory, Reliability, Computer, Location

Introduction
The Brown Location Test (Brown, Roth, Saykin, & Gibson-Beverly, 2007) was developed to assess visual memory for location without the drawing, fine motor, and verbally mediated stimuli present in many traditional visual-based memory tests.
Problems cited with traditional visual-based memory tests have included a heavy reliance on drawing skills (Benedict & Groninger, 1995; Benedict, Schretlen, Groninger, Dobraski, & Shpritz, 1996; Paolo, Troester, & Ryan, 1998), poor lateralization to the right (non-dominant) temporal lobe as predicted by lateralization of memory functioning (Barr, 1997; Barr et al., 1997; Barr, Morrison, Zaroff, & Devinsky, 2004; Lee, Yip, & Jones-Gotman, 2002), evidence that “nonverbal” stimuli were in fact affected by verbal strategies (Golby et al., 2001; Lee et al., 2002), and that many visual based memory tests appeared to be affected by disorders of executive and related non-memory dysfunctions such as Attention-Deficit/Hyperactivity Disorder (AD/HD; Barnett, Maruff, & Vance, 2009; Gropper & Tannock, 2009; Johnson et al., 2001; Muller et al., 2007; Shang & Gau, 2011), and Obsessive-Compulsive Disorder (Moritz, Kloss, von Eckstaedt, & Jelinek, 2009). The above and other findings supported the need for a novel measure of visual or nonverbal memory functions. Consistent with the above concerns, there have been several studies that supported the clinical utility of the BLT. In a study of presurgical temporal lobe epilepsy patients, the BLT Long Delay (LD) Z-score correctly classified 87.5% of right mesial temporal lobe epilepsy patients (RTLE), and 76.9% of left mesial temporal lobe epilepsy patients (LTLE), per a binary logistic regression model. A receiver-operator characteristic (ROC) analysis further revealed that a BLT LD score of ≤−1.3 had an 87.5% positive predictive power in lateralizing to the RTLE, and a 68.8% negative predictive power. From a practical perspective, this meant that 14 out of 16 RTLE patients had a score of −1.3 or less, whereas only 2/13 LTLE patients had such a score (Brown, Hirsch, & Spencer, 2015). 
A small postsurgical study (Brown et al., 2010) found that those who had a right anterior temporal lobectomy had significantly (F (1,16) = 12.14, p = .003, partial η2 = 0.431) lower learning trial Z-scores (M = −1.69, SD = 0.24) than those who had a left anterior temporal lobectomy (M = −1.09, SD = 0.46). Thus, these studies indicated that the BLT was useful at differentiating right from left mesial temporal lobe pathology. The BLT format has also demonstrated resilience to non-memory factors such as attention span. Specifically, a study compared patients with AD/HD with healthy participants on the BLT. The results indicated that the groups did not differ on the learning trials or primary delayed memory scores (Brown et al., 2010), whereas prior studies found that single-exposure, drawing based visual memory tests were often lower among AD/HD patients (Kibby & Cohen, 2008; Shang & Gau, 2011). The exception to these findings was that AD/HD patients scored significantly lower than controls on a rotated long delay subtest. The authors concluded that this may reflect the greater connectivity demands required by recalling visual spatial information from a position other than in which it was originally learned (Brown et al., 2010). Regardless, the primary memory subtests appeared relatively resistant to the effects of AD/HD. The BLT was originally developed in a hand-administered format with materials consisting of a binder of stimulus items, single-page display for dot position recall, and wooden red and black chips for the examinee to use to denote stimulus location. Specifically, the examiner presents 12 pages each with a red dot in a different location superimposed on a background with an asymmetrical array of circles. The circle array is identical on each page, whereas the dots are in different locations (see Fig. 1 for a simplified example). Each page is presented for exactly 4 s. 
This timing is important, as it was found that reducing the presentation to 3 s resulted in less reliable and lower test scores among older volunteers during the development phase of this test. After the 12 pages and the respective dot locations are presented, the examiner gives the examinee 12 checker-like wooden chips and asks him or her to place them on the asymmetrical circle array where the red dots were previously located. The examiner then marks off the location where the examinee placed the red chips on a score sheet. This procedure is repeated for a total of five learning trials in which the same 12 dot locations are presented each trial. Next, the examiner administers a learning interference trial that uses 12 black dots at different locations, again on the same asymmetrical circle array. The interference trial is immediately followed by a short delay recall trial where the examinee is asked to indicate the red dot locations. Twenty minutes after the short delay, a long delay recall trial is administered. This is followed by a rotated long delay recall trial where the response page is covertly rotated 90°, and the examinee is again asked to indicate the red dot locations. The test concludes with a recognition yes/no formatted subtest. The two different forms (BLT-A and BLT-B) both have the same number of circles and dot locations; however, the circle array configuration and the dot locations are different. The BLT has demonstrated satisfactory alternate form (A and B) reliabilities (Brown et al., 2007), and normative data were also previously published for the BLT (Brown et al., 2010) to support its clinical use.

Fig. 1. Simplified version of dot location.

The developers of the test employed the above format to make the administration of the test easy in order to ensure good reliability among different examiners (Brown et al., 2007).
However, as the test has become used by more people who were not part of its original development team, it has become apparent that this format can periodically challenge examiners and therefore possibly affect its reliability. The most common administrative error appears to be that not all examiners consistently display each page for exactly 4 s. There is a tendency for some examiners to start rushing the page exposure, reducing it to three seconds. A second concern involves differing visual spatial skills among examiners. There are 56 circles in the array, and some examiners have reported that they found it difficult to rapidly mark off the correct location on the answer sheet, as it required matching the same circle seen on the answer board with the smaller display on the answer sheet. Most reported that this scoring difficulty improved with practice, but it does increase the risk of an error for those administering the test who have weakness with visual spatial skills. Though variations in standardized procedures and administration challenges can occur among most neuropsychological measures, we noticed that the format of this test (e.g., visual presentation without any verbal responses required by the examinee), made conversion to a computer administration format straightforward and we believed this would correct some of the administrator challenges. The computer version of the BLT uses the same visual stimuli as the hand version, with the exception that it is presented on a computer screen. Instead of using red checker-like chips to indicate the locations, the computer version requires that the examinee point and click with a mouse on a screen with the circles to indicate the red dot locations. In order to emulate the original version as much as possible, in which there are a total of 12 red checker-like chips, the computer screen shows the unused red dots to the side of the display. 
Each time a person clicks on a circle to indicate a red dot location, one red dot disappears from the red dot tray, until all 12 red dots are used. This provides a similar experience to the hand version. Among those with fine motor control difficulties, or lack of computer familiarity, the examinee can simply point to the correct dot location on the screen, with the examiner clicking on that location with the mouse or touch screen. Though the computer version appears similar to the hand version, it is possible that there are differences between the hand and computer versions. For example, the different administration formats may result in different test scores for the same persons. Thus, it is important to determine that the forms are psychometrically equivalent. Historically, classical test theory has been used to assess test reliability. However, traditional methods examine just one source of measurement error at a time. This may result in underestimates of reliability because traditional methods do not estimate or take into account other sources of error variance that may also exist for a measure. Generalizability theory (G theory) allows one to examine multiple sources of test score variance (person, form, administration type, etc.) and their interactions at the same time (Bloch & Norman, 2012; Brennan, 2001; Mushquash & O’Connor, 2006). It is also possible to estimate the various combinations of levels of the facets that are required to reach specific levels of reliability (e.g., 0.80). Though important for any form differences, we felt it would be particularly helpful to use G-theory when the administration methods also varied. The present study used G-theory to address the following questions regarding the BLT: What proportion of the variability in BLT scores is due to differences between respondents and what proportions are due to other sources of measurement error (versions, trials, and interactions)? 
What is the overall reliability estimate of the BLT scores? How is the overall reliability estimate of BLT scores affected by varying the number of trials? Are the means and standard deviations of the hand and computer versions essentially the same so that the different forms can be used interchangeably with the same normative data?

Method

Participants
There were three different samples used within this study. The sample for the Hand-A to Computer-A comparison was taken from a group of healthy participants that were part of a larger multiple sclerosis study. Potential participants were informed about the study through a university website, advertisement flyers, and word of mouth. The Computer-A to Computer-B comparison group was recruited from undergraduate psychology courses. These students were required to choose from a variety of research projects or other activities as partial fulfillment of a research experience requirement within their courses. The Hand-A and Hand-B comparison group was a re-analysis from a previously reported study (Brown et al., 2007) whose sample was recruited from among staff, faculty, and study volunteers from universities within Louisiana, New Hampshire, and Connecticut. The demographic characteristics of these samples are described in the Results section. The recruitment of healthy participants was approved by the relevant institutional review boards for human subjects. The study was reviewed with potential participants, and informed consent was provided by those who agreed to participate. Prior to consideration for inclusion, potential participants were screened on the phone and then again at the time of their participation for inclusion and exclusion criteria. Participants were required to be at least 18 years of age.
They were excluded if they had a medical history that included a neurological disorder, cardiovascular disease (e.g., myocardial infarction, bypass surgery, and other vascular surgery), cerebral vascular disease (e.g., stroke), Type II Diabetes, sleep apnea, history of concussion with loss of consciousness, developmental disorder, learning disorder, attention-deficit/hyperactivity disorder, bipolar disorder, or schizophrenia. Individuals with severe depression or anxiety that was not well controlled at the time of the study, as demonstrated by significant distress, were excluded. Individuals were included if their depression and/or anxiety symptoms were well controlled with counseling and/or medication, and did not interfere with their daily functioning or social activities. To determine whether a psychiatric problem interfered with a volunteer’s functioning, individuals who indicated the presence of a psychiatric disorder or symptoms were further interviewed by the first author to determine the severity of the symptoms. If a potential participant was having difficulties within their family, social life, vocation, or education that were attributable to their psychiatric symptoms, he or she was excluded from the study.

Materials
The hand administered version of the BLT (Brown et al., 2007) was described in detail within the Introduction section. The computer version of the test was also generally described earlier. More specifically, the computer version was written for the Windows operating system using the .NET framework with Visual Basic. The first author’s original stimuli used for the hand version were digitally converted to JPEG format and used for the program’s display stimuli. During the learning trials, each target dot stimulus is shown for exactly four seconds followed by a 750 ms inter-stimulus interval.
This inter-stimulus interval was selected during pilot testing to mimic the hand version as much as possible while minimizing after-images or motion effects. The computer program administers the BLT exactly according to the original hand version’s specifications. Only minor wording changes were made due to the change in interface. For example, “Point and click with your mouse on the circles where the red dots were earlier” replaced “Place the red chips where you saw the red dots.” The computer version handles all scoring and generation of normed scores. For the current analysis, we included the raw scores for each learning trial (1 through 5), the learning trials total, the interference trial total, the short delay recall trial, long delay recall trial, rotated long delay recall trial, and recognition total. Demographic questionnaires and a structured interview, developed for this study, were used to obtain relevant background information.

Procedures
As mentioned in the Introduction section, the BLT previously demonstrated good alternate form reliabilities when comparing the Hand A and Hand B versions. Therefore, the current study included reliability analyses examining the Hand BLT-A with Computer BLT-A and the Computer BLT-A with Computer BLT-B, and re-analyzed the Hand BLT-A and Hand BLT-B scores using G-theory rather than the classical reliability analyses used in the original BLT study (Brown et al., 2007). However, classical reliability analyses in the form of Pearson correlation coefficients are also provided for comparison with the more traditional alternate form reliabilities.

Hand BLT-A versus Computer BLT-A comparison
The BLT-A hand and computer comparison procedures consisted of the following. Participants were administered an approximately three-hour neuropsychological test battery as part of a larger study. During the first evaluation, they were administered either the hand or computer version of the BLT test, which was chosen through random assignment.
The BLT was administered approximately one hour into the evaluation. Individuals who completed this first study were asked if they were interested in being retested 3 weeks later in an approximately half hour appointment. Those who agreed were administered the BLT version (hand or computer) that they had not been administered during the first evaluation. Between the immediate learning trials and the delay, they completed questionnaires as part of a different study. Thirty-five out of the 50 participants who took part in the first study participated in the second part of the study. The demographic variables of these groups are described in the Results section.

Computer BLT-A versus Computer BLT-B comparison
The participants in the Computer BLT-A and BLT-B comparison group were administered either the BLT-A or BLT-B on a computer (per random assignment) during the first time point. They completed demographic and other questionnaires from a different study during the time between the short and long delays. Approximately 3 weeks later, they were administered on the computer the alternate BLT form (A or B) that they had not taken the first time.

Hand BLT-A versus Hand BLT-B comparison
The participants in the Hand BLT-A and BLT-B comparison group were randomly administered either the A or B version along with some demographics and other study questionnaires during the first time point. Approximately 3 weeks later, they were administered the BLT form not previously administered.

Statistical Methods
Means and standard deviations were calculated for participants’ ages and education levels. One-way ANOVAs were used to examine these variables according to administration order in the three samples, and retest versus non-retest participants within the Hand-A and Computer-A comparison sample.
Frequencies were calculated for gender and ethnicity; chi-square analyses were used to compare these categorical variables according to administration order within the three samples, and retest versus non-retest participants within the Hand-A and Computer-A comparison sample. BLT raw score means and standard deviations for each administration type and/or form were compared with paired t-tests. Pearson r correlation coefficients were also calculated for the paired subtests to provide classical test theory alternate form reliabilities. G-theory methods were used to estimate the reliability and the variance components of the BLT scores. Specifically, we used G-theory to obtain estimates of the variance due to between-person differences in BLT scores (universe score variance), and of the variances due to BLT versions, trials, and the interactions among these sources of variance. A G-theory investigation typically involves two parts: a generalizability or “G-study,” and a decision or “D-study.” A G-study estimates the reliability and the variance components for the examined facets, which in our case were persons, versions, and trials. Restricted maximum likelihood estimation was used to compute the variance components (identical results were obtained using the ANOVA method), using the software provided by Mushquash and O’Connor (2006). We also provide the standard error of measurement, which is the standard deviation of the differences between observed scores and universe scores. The three facets were fully crossed: for each comparison, all of the respondents were administered the same versions of the BLT, across five learning trials in each case. The facets were treated as random because we wished to generalize the findings beyond the specificities of this particular G study (Fig. 2).

Fig. 2. D-study G-coefficients.
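The ANOVA route to the variance components described above can be illustrated with a minimal sketch. It assumes a fully crossed persons × versions × trials design with one score per cell and solves the standard expected-mean-square equations for random facets. It is not the Mushquash and O’Connor (2006) software, and it does not implement restricted maximum likelihood. The function name `variance_components`, the nested-list data layout, and the toy data are all hypothetical choices made for illustration.

```python
from itertools import product


def variance_components(x):
    """ANOVA estimates for a fully crossed p x v x t random design.

    x[p][v][t] is the score of person p on version v at trial t
    (hypothetical layout; exactly one score per cell is assumed).
    """
    n_p, n_v, n_t = len(x), len(x[0]), len(x[0][0])
    P, V, T = range(n_p), range(n_v), range(n_t)
    grand = sum(x[p][v][t] for p, v, t in product(P, V, T)) / (n_p * n_v * n_t)

    # Marginal means for each facet and each pair of facets.
    m_p = [sum(x[p][v][t] for v in V for t in T) / (n_v * n_t) for p in P]
    m_v = [sum(x[p][v][t] for p in P for t in T) / (n_p * n_t) for v in V]
    m_t = [sum(x[p][v][t] for p in P for v in V) / (n_p * n_v) for t in T]
    m_pv = [[sum(x[p][v][t] for t in T) / n_t for v in V] for p in P]
    m_pt = [[sum(x[p][v][t] for v in V) / n_v for t in T] for p in P]
    m_vt = [[sum(x[p][v][t] for p in P) / n_p for t in T] for v in V]

    # Mean squares: sums of squares divided by their degrees of freedom.
    ms_p = n_v * n_t * sum((m - grand) ** 2 for m in m_p) / (n_p - 1)
    ms_v = n_p * n_t * sum((m - grand) ** 2 for m in m_v) / (n_v - 1)
    ms_t = n_p * n_v * sum((m - grand) ** 2 for m in m_t) / (n_t - 1)
    ms_pv = n_t * sum((m_pv[p][v] - m_p[p] - m_v[v] + grand) ** 2
                      for p, v in product(P, V)) / ((n_p - 1) * (n_v - 1))
    ms_pt = n_v * sum((m_pt[p][t] - m_p[p] - m_t[t] + grand) ** 2
                      for p, t in product(P, T)) / ((n_p - 1) * (n_t - 1))
    ms_vt = n_p * sum((m_vt[v][t] - m_v[v] - m_t[t] + grand) ** 2
                      for v, t in product(V, T)) / ((n_v - 1) * (n_t - 1))
    ms_res = sum((x[p][v][t] - m_pv[p][v] - m_pt[p][t] - m_vt[v][t]
                  + m_p[p] + m_v[v] + m_t[t] - grand) ** 2
                 for p, v, t in product(P, V, T)) / ((n_p - 1) * (n_v - 1) * (n_t - 1))

    # Solve the expected-mean-square equations for each component.
    estimates = {
        "p": (ms_p - ms_pv - ms_pt + ms_res) / (n_v * n_t),
        "v": (ms_v - ms_pv - ms_vt + ms_res) / (n_p * n_t),
        "t": (ms_t - ms_pt - ms_vt + ms_res) / (n_p * n_v),
        "pv": (ms_pv - ms_res) / n_t,
        "pt": (ms_pt - ms_res) / n_v,
        "vt": (ms_vt - ms_res) / n_p,
        "res": ms_res,
    }
    # Negative ANOVA estimates are set to zero (a common convention).
    return {name: max(0.0, value) for name, value in estimates.items()}


# Toy data (not BLT data): 3 persons x 2 versions x 5 trials, purely additive.
person_effect = [0, 2, 4]
trial_effect = [1, 2, 3, 4, 5]
toy = [[[a + b for b in trial_effect] for _ in range(2)] for a in person_effect]
components = variance_components(toy)
```

In the toy data, scores depend only on the person and the trial, so the Version, interaction, and residual components come out as zero while the Persons and Trial components equal the sample variances of the two effect lists. Dividing each component by the sum of all components gives the kind of variance proportions reported for a G-study.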
The statistical information that was gained from the G-study was then used to conduct a D-study. The goal was to determine how reliability would change under differing assessment conditions, such as differing numbers of trials. A D-study is simply a forecast based on formulas and on the G-study variance components. It does not involve new raw data collection. The D-study in our case made it possible to determine the different possible combinations of versions and trials that are required to obtain a G-coefficient of at least 0.80.

Results

Participant Demographics

Hand BLT-A versus Computer BLT-A comparison
A total of 35 (22 females, 13 males) participants completed the BLT-A Hand and BLT-A Computer versions within the 3-week time period. The participants who completed the test and retest condition were 33 non-Hispanic White and two Hispanic individuals. Their mean age was 42.37 years (SD = 12.79), ranging from 22 to 63 years old. They had completed between 11 and 20 years of education (M = 16.31, SD = 1.78). A total of 15 (14 females, 1 male; 15 non-Hispanic White) participants did not complete the retest condition. Their mean age was 45.27 years (SD = 11.13), and they had completed between 16 and 20 years of education (M = 17.07, SD = 1.34). One-way ANOVAs did not reveal any differences in age or education according to test form order administration, or when comparing those who completed versus did not complete the retest condition. Chi-square analysis did not reveal any differences in ethnicity according to test form order administration, or when comparing those who completed versus did not complete the retest condition. There was, however, a greater proportion (χ2 (1,49) = 4.837, p = .028) of females to males in the group that chose not to complete the retest when compared to those who completed both testing sessions.
Computer BLT-A versus Computer BLT-B comparison
A total of 30 (24 females, 6 males) participants completed the BLT-A and BLT-B computer versions. This sample was 86.7% non-Hispanic White, 10% African American, and 3.3% Asian. Their mean age was 22.63 years (SD = 4.21), ranging from 20 to 43 years old. They were all students who had completed between 12 and 15 years of education (M = 14.67, SD = 0.66). One-way ANOVAs did not reveal any differences in age or education according to test form order administration. Chi-square analysis did not reveal any differences in ethnicity or gender according to test form order administration.

Hand BLT-A versus Hand BLT-B comparison
A total of 56 (38 females, 18 males) participants completed the BLT-A and BLT-B hand versions. This sample was 89.3% non-Hispanic White, 7.1% African American, and 3.6% Asian. Their mean age was 39.54 years (SD = 17.00), ranging from 18 to 71 years old. They had completed between 12 and 23 years of education (M = 15.75, SD = 2.54). One-way ANOVAs did not reveal any differences in education according to test form order administration. However, participants in the BLT-A then BLT-B condition (M = 44.31, SD = 16.59) were significantly (F (1,54) = 17.121, p < .001) older than those who completed the BLT-B and then the BLT-A (M = 25.21, SD = 7.75). Chi-square analysis did not reveal any differences in ethnicity or gender according to test form order administration.

Score Similarities Across Comparison Conditions
The means, standard deviations, Cohen’s d effect sizes for the mean differences (Cohen, 1988), and the Pearson correlations between scores for the comparison conditions are provided in Tables 1–3. The BLT Hand-A and BLT Computer-A scores were not significantly different on paired t-tests. The effect sizes for the differences between the means were weak, based on the interpretation conventions for d-value effect sizes.
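As an illustration of the paired comparisons summarized above, the sketch below computes a paired t statistic, a paired-samples effect size, and a Pearson r for two sets of scores from the same people. It assumes the effect size is the mean difference divided by the standard deviation of the differences (often written d_z); the article does not state which d formula was used, so that choice is an assumption. The function name `paired_summary` and the example scores are hypothetical and are not BLT data.

```python
import math


def paired_summary(first, second):
    """Paired t statistic, d_z effect size, and Pearson r for paired scores."""
    n = len(first)
    if n != len(second) or n < 2:
        raise ValueError("need two equally long score lists with n >= 2")

    mean_first = sum(first) / n
    mean_second = sum(second) / n

    # Paired differences and their sample standard deviation (n - 1).
    diffs = [b - a for a, b in zip(first, second)]
    mean_diff = sum(diffs) / n
    sd_diff = math.sqrt(sum((d - mean_diff) ** 2 for d in diffs) / (n - 1))
    if sd_diff == 0:
        raise ValueError("all paired differences are identical")

    # Paired t statistic with n - 1 degrees of freedom.
    t_stat = mean_diff / (sd_diff / math.sqrt(n))

    # Assumed effect size: mean difference in units of the SD of differences.
    d_z = mean_diff / sd_diff

    # Pearson correlation between the two administrations.
    sxy = sum((a - mean_first) * (b - mean_second) for a, b in zip(first, second))
    sxx = sum((a - mean_first) ** 2 for a in first)
    syy = sum((b - mean_second) ** 2 for b in second)
    r = sxy / math.sqrt(sxx * syy)

    return {"t": t_stat, "df": n - 1, "d_z": d_z, "r": r}


# Hypothetical scores for five people on two administrations.
time_one = [1, 2, 3, 4, 5]
time_two = [2, 2, 4, 4, 6]
summary = paired_summary(time_one, time_two)
```

The sketch stops at the t statistic because the Python standard library has no t distribution; a p value would need a statistics package or a table lookup with the returned degrees of freedom.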
The Pearson correlations between scores on the individual subtests were generally very high, although the recognition subtest had a slightly weaker correlation. Similarly, the BLT Computer A and Computer B scores were not significantly different on paired t-tests and the effect sizes for the differences between the means were weak. The Pearson correlations between scores on the individual subtests were generally very high, except for the interference and recognition subtests. Finally, the BLT Hand A and BLT Hand B scores were not significantly different according to paired t-tests and the effect sizes for the differences between the means were weak. The correlations were again quite high, except for the correlations for the interference and recognition subtests.

Table 1. Mean scores for the Hand A and Computer A comparison group

| Subtest score | Hand BLT-A (n = 36) M | Hand BLT-A SD | Computer BLT-A (n = 36) M | Computer BLT-A SD | d | r* |
| --- | --- | --- | --- | --- | --- | --- |
| Trial 1 | 5.29 | 1.60 | 5.23 | 1.82 | 0.06 | 0.82 |
| Trial 2 | 6.91 | 2.03 | 7.23 | 2.29 | 0.22 | 0.77 |
| Trial 3 | 8.29 | 2.41 | 8.80 | 2.34 | 0.29 | 0.73 |
| Trial 4 | 9.77 | 1.93 | 9.57 | 2.53 | 0.14 | 0.78 |
| Trial 5 | 10.46 | 1.79 | 10.57 | 1.70 | 0.12 | 0.85 |
| Trials 1–5 | 40.69 | 8.14 | 41.49 | 8.76 | 0.28 | 0.91 |
| Interference | 5.03 | 1.48 | 5.22 | 1.65 | 0.16 | 0.71 |
| Short delay | 9.00 | 2.50 | 8.77 | 2.76 | 0.20 | 0.90 |
| Long delay | 9.03 | 2.61 | 9.02 | 2.60 | 0.01 | 0.85 |
| Rotated long delay | 7.40 | 2.75 | 7.74 | 2.74 | 0.22 | 0.84 |
| Recognition total | 19.37 | 3.40 | 19.49 | 2.98 | 0.04 | 0.53 |

Note: *p < .001 for all correlations; d = Cohen’s d effect size; r = Pearson’s r correlation coefficient.

Table 2. Mean scores for the Computer A and Computer B comparison group

| Subtest score | Computer BLT-A (n = 30) M | Computer BLT-A SD | Computer BLT-B (n = 30) M | Computer BLT-B SD | d | r* |
| --- | --- | --- | --- | --- | --- | --- |
| Trial 1 | 6.43 | 2.47 | 5.87 | 2.03 | 0.32 | 0.70 |
| Trial 2 | 8.07 | 2.64 | 7.63 | 2.06 | 0.24 | 0.70 |
| Trial 3 | 9.73 | 2.13 | 9.17 | 1.98 | 0.31 | 0.61 |
| Trial 4 | 10.10 | 2.06 | 9.87 | 1.74 | 0.19 | 0.80 |
| Trial 5 | 10.67 | 1.71 | 10.40 | 1.45 | 0.27 | 0.79 |
| Trials 1–5 | 44.90 | 9.99 | 43.00 | 8.07 | 0.38 | 0.85 |
| Interference | 4.47 | 1.85 | 4.63 | 2.31 | 0.07 | 0.46 |
| Short delay | 9.90 | 2.29 | 9.50 | 1.87 | 0.32 | 0.82 |
| Long delay | 9.70 | 2.14 | 9.43 | 1.92 | 0.22 | 0.81 |
| Rotated long delay | 7.40 | 2.43 | 6.97 | 1.88 | 0.28 | 0.75 |
| Recognition total | 17.93 | 2.10 | 17.60 | 2.22 | 0.15 | 0.48 |

Note: *p < .001 for all correlations; d = Cohen’s d effect size; r = Pearson’s r correlation coefficient.

Table 3. Mean scores for the Hand A and Hand B comparison group

| Subtest score | Hand BLT-A (n = 56) M | Hand BLT-A SD | Hand BLT-B (n = 56) M | Hand BLT-B SD | d | r* |
| --- | --- | --- | --- | --- | --- | --- |
| Trial 1 | 4.95 | 1.99 | 5.49 | 2.01 | 0.33 | 0.66 |
| Trial 2 | 6.93 | 2.21 | 7.05 | 1.99 | 0.07 | 0.65 |
| Trial 3 | 8.70 | 2.20 | 8.61 | 1.93 | 0.06 | 0.69 |
| Trial 4 | 9.61 | 1.98 | 9.77 | 1.91 | 0.10 | 0.69 |
| Trial 5 | 10.68 | 1.55 | 10.55 | 1.64 | 0.12 | 0.78 |
| Trials 1–5 | 40.86 | 8.08 | 41.48 | 7.97 | 0.14 | 0.84 |
| Interference | 4.68 | 2.14 | 4.91 | 1.54 | 0.12 | 0.45 |
| Short delay | 9.46 | 2.20 | 9.29 | 2.10 | 0.09 | 0.63 |
| Long delay | 9.25 | 2.13 | 9.30 | 2.13 | 0.03 | 0.75 |
| Rotated long delay | 8.27 | 2.35 | 8.63 | 2.26 | 0.20 | 0.70 |
| Recognition total | 19.50 | 2.71 | 20.09 | 2.39 | 0.24 | 0.52 |

Note: *p < .001 for all correlations; d = Cohen’s d effect size; r = Pearson’s r correlation coefficient.

Generalizability Theory Results
The variance components and the corresponding variance proportions from the generalizability theory analyses for the three comparisons are provided in Table 4. The results were notably consistent across the three kinds of comparisons. The largest variance component in each case was for Trial, with variance proportions ranging from 0.43 to 0.55. This is reasonable, as scores should increase from one learning trial to the next due to learning effects among healthy participants. The second largest variance component was for Persons, with variance proportions ranging from 0.25 to 0.36. Again, this was what one would expect because different individuals will have different mean performances.
None of the remaining variance proportions were above 0.12, even for the interactions, including the residual terms. Across all three comparisons, the variance components for Version were essentially zero, and the variance components for interactions involving Version were very small.

Table 4. Variance components and proportions from generalizability theory analyses

| Source | Hand BLT-A versus Computer BLT-A: variance component | Hand BLT-A versus Computer BLT-A: variance proportion | Hand BLT-A versus Hand BLT-B: variance component | Hand BLT-A versus Hand BLT-B: variance proportion | Computer BLT-A versus Computer BLT-B: variance component | Computer BLT-A versus Computer BLT-B: variance proportion |
| --- | --- | --- | --- | --- | --- | --- |
| Persons | 2.442 | 0.281 | 2.646 | 0.357 | 2.093 | 0.247 |
| Version | 0.001 | 0.001 | 0.007 | 0.001 | 0.053 | 0.006 |
| Trial | 4.328 | 0.499 | 3.165 | 0.427 | 4.633 | 0.547 |
| Persons × Version | 0.094 | 0.011 | 0.481 | 0.065 | 0.148 | 0.017 |
| Persons × Trial | 0.876 | 0.101 | 0.270 | 0.036 | 0.531 | 0.063 |
| Version × Trial | 0.016 | 0.002 | 0.000 | 0.000 | 0.011 | 0.001 |
| Residual | 0.920 | 0.106 | 0.845 | 0.114 | 1.006 | 0.119 |

The G-coefficients were high: 0.89 (SEM = 0.56) for the Hand BLT-A versus Computer BLT-A comparison; 0.88 (SEM = 0.62) for the Hand BLT-A versus Hand BLT-B comparison; and 0.88 (SEM = 0.53) for the Computer BLT-A versus Computer BLT-B comparison. These G-coefficients are the reliability levels that occur when two versions of the BLT are administered across five trials. The results of the D-study analyses were also consistent across the three comparisons, which was to be expected given the consistency in the variance components.
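The link between the Table 4 variance components and the reported G-coefficients can be sketched with the standard relative (norm-referenced) G-coefficient for a fully crossed persons × versions × trials random design. That this matches the exact computation in the authors’ software is an assumption, although the formula reproduces the values reported for the Hand BLT-A versus Computer BLT-A comparison from the rounded components. The function names and dictionary keys below are hypothetical.

```python
import math

# Variance components for Hand BLT-A versus Computer BLT-A, copied from Table 4.
# Keys are illustrative: p = Persons, pv = Persons x Version,
# pt = Persons x Trial, res = Residual.
HAND_A_VS_COMPUTER_A = {"p": 2.442, "pv": 0.094, "pt": 0.876, "res": 0.920}


def relative_error_variance(vc, n_versions, n_trials):
    """Error variance for relative decisions: every component that
    interacts with persons, divided by the number of conditions averaged."""
    return (vc["pv"] / n_versions
            + vc["pt"] / n_trials
            + vc["res"] / (n_versions * n_trials))


def g_coefficient(vc, n_versions, n_trials):
    """Universe-score variance over universe-score plus relative error variance."""
    error = relative_error_variance(vc, n_versions, n_trials)
    return vc["p"] / (vc["p"] + error)


def standard_error_of_measurement(vc, n_versions, n_trials):
    """Square root of the relative error variance."""
    return math.sqrt(relative_error_variance(vc, n_versions, n_trials))


# G-study conditions: two versions, five trials.
g_study = g_coefficient(HAND_A_VS_COMPUTER_A, n_versions=2, n_trials=5)
sem = standard_error_of_measurement(HAND_A_VS_COMPUTER_A, n_versions=2, n_trials=5)

# D-study forecasts for a single version and one to eight trials.
forecast = {n: g_coefficient(HAND_A_VS_COMPUTER_A, 1, n) for n in range(1, 9)}
```

Under these assumptions, `g_study` rounds to 0.89 and `sem` to 0.56, and the single-version forecasts round to 0.78 for three trials and 0.84 for five trials. Because Table 4 reports rounded components, the same calculation for the other two comparisons may differ from the published coefficients in the second decimal place.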
A plot of the forecasted G-coefficient values for eight values of Trial (1, 2, 3, 4, 5, 6, 7, 8) for the Hand BLT-A versus Computer BLT-A analyses is provided in Fig. 2. The biggest jump in G coefficients occurred when going from one to two trials. Reliability did not increase noticeably after five trials. Doubling the number of versions (from one to two) resulted in a negligible, almost indiscernible, increment in G coefficients. Only one version of the BLT would typically be administered in research or clinical work. Reliability (the G coefficient) was 0.78 when just one version and three trials were used. Reliability was 0.84 when one version and five trials were used (see Fig. 2).

Discussion
We used G-theory rather than classical test methods to examine the reliabilities of the hand and computer BLT while simultaneously taking into account multiple sources of variance. The greatest amount of variance on each test was due to the number of learning trials (i.e., the scores changed with each learning trial as expected), followed by individual variance due to the persons taking the test. There was virtually no variance between tests based on administration type. Consistent with the lack of variance between administration types, the G-coefficients were high (0.88–0.89) for the various test forms when comparing the reliability across all five learning trials. The current reliabilities using G-theory were similar to or slightly higher than the classical test theory reliabilities that were reported in the original study examining two different hand versions (Brown et al., 2007). Furthermore, we found that the use of five learning trials provided noticeably greater reliability (G = 0.84) than if the test only used three learning trials (G = 0.78). In contrast, there was no noticeable improvement in reliability when additional learning trials were forecast.
Thus, G-theory supported the decision to use five learning trials, and demonstrated high levels of consistency in scores across the hand and computer forms of the BLT.

In addition to G-theory, several other statistics supported the comparability between test forms and between the computer and hand versions of the BLT. First, the means and standard deviations across versions and subtests were essentially the same. None of the tests for differences between means were significant, and the d-value effect sizes for the mean differences were in the small/weak range (Cohen, 1988). Second, classical test theory methods (Pearson r correlation coefficients) suggested good reliabilities between the various test forms and administration types. Finally, there were slightly stronger correlations on Form A when comparing the hand and computer versions than when comparing Forms A and B, regardless of the administration type. Together, these findings suggest that the administration format did not result in differences on the BLT.

This high degree of consistency between the two hand versions, and between the hand and computer versions, suggested that the normative data collected for the hand version can also be applied to the computer version, as the scores did not differ across form or administration type. This is important for several reasons. First, it suggested that an examinee’s test scores can be compared over repeated testing sessions that use different forms or administration methods. It further indicated that large deviations in test scores were not likely to reflect normal variation due to psychometric properties. This may be particularly important when monitoring changes in scores among epilepsy surgical patients, one of the key groups for which this test was originally developed. Finally, these results indicated that one can generalize the results of studies using one administration type to other administration types of the BLT.
While these were strong results within the current samples, there were some considerations that may be addressed in future studies. Although it is a general problem within neuropsychology, the present samples were not ethnically diverse. The Computer-A and Computer-B samples were college students only, which accounts for them being somewhat younger than the other two samples. The other samples were older (average ages ranging from the upper 30s to the mid-40s). Further data collection among more diverse samples is recommended so that we can more confidently interpret BLT results among patients whose demographic variables differ from those of the current study and the normative sample. It may be worth collecting reliability data on an older population, as they may be less comfortable with computer-based administrations (Bauer et al., 2012), although one may argue that computer use is more common among older age populations now than it was even 10 years ago, with older patients often using computers to increase social interactions. Of course, an examiner should always ensure the examinee is comfortable using computers, and use the hand version if there are any concerns in that area. While some may believe that a larger sample size would be ideal, the degree of significance, the effect sizes, and the similar reliabilities among all three studies reported within this article suggest that larger sample sizes would be highly likely to replicate the current findings.

In conclusion, these results supported a high degree of consistency between the computer and hand versions of the BLT, and among different forms of the computer and hand versions. Thus, the results suggested that either version can be used within clinical or research populations with reasonable confidence that the administration format should not affect the test results.
The computer version has the advantage of removing some risks associated with deviation from the standardized administration, as well as automatically calculating scores and normative values. Consistent with this, the alternate form reliabilities using the computer were slightly stronger than the hand version reliabilities. The computer version is also easier to distribute than the hand version, as it can be downloaded easily and does not require the materials needed for the hand version (e.g., making the wooden marker chips, administration booklets). Perhaps the most apparent disadvantage of the computer version will occur for individuals whose fine motor difficulties interfere with controlling the mouse. However, this and similar disadvantages can easily be addressed by the examiner, who can administer the hand version instead, use a touch-screen computer, or operate the mouse for the patient. Regardless, these results support that the computer version can now be confidently used instead of the hand version for individuals who are comfortable with such an administration format.

Funding

This project was supported in part by the Connecticut State University – AAUP Research Grant Award, and the EMD Serono Investigator Initiated Award, “A novel measure of cognitive inefficiency: sensitivity to subjective complaints and abnormal resting state functional connectivity in multiple sclerosis.”

Conflict of Interest

The first author holds a copyright to the Brown Location Test (BLT), and operates a website to distribute the BLT. The fees charged for the BLT cover materials, distribution, software engineer support, time to manage the site and material distribution, and related expenses. Any residual amount from the sales of the BLT is kept for future development of this test. Individual users are welcome to reproduce the score sheets as needed without charge.
Acknowledgments

In addition to funding sources, this project would not have been possible without space and other material supports provided by Yale University, Central Connecticut State University, and Eastern Connecticut State University.

References

Barnett, R., Maruff, P., & Vance, A. (2009). Neurocognitive function in Attention-Deficit/Hyperactivity Disorder with and without comorbid disruptive behaviour disorders. Australian and New Zealand Journal of Psychiatry, 43, 722–730.

Barr, W., Morrison, C., Zaroff, C., & Devinsky, O. (2004). Use of the Brief Visuospatial Memory Test—Revised (BVMT-R) in neuropsychological evaluation of epilepsy surgery candidates. Epilepsy & Behavior, 5, 175–179.

Barr, W. B. (1997). Examining the right temporal lobe’s role in nonverbal memory. Brain and Cognition, 35, 26–41.

Barr, W. B., Chelune, G. J., Hermann, B. P., Loring, D. W., Perrine, K., Strauss, E., et al. (1997). The use of figural reproduction tests as measures of nonverbal memory in epilepsy surgery candidates. Journal of the International Neuropsychological Society, 3, 435–443.

Bauer, R. M., Iverson, G. L., Cernich, A. N., Binder, L. M., Ruff, R. M., & Naugle, R. I. (2012). Computerized neuropsychological assessment devices: Joint position paper of the American Academy of Clinical Neuropsychology and the National Academy of Neuropsychology. The Clinical Neuropsychologist, 26, 177–196.

Benedict, R. H. B., & Groninger, L. (1995). Preliminary standardization of a new visuospatial memory test with six alternate forms. Clinical Neuropsychologist, 9, 11–16.

Benedict, R. H. B., Schretlen, D., Groninger, L., Dobraski, M., & Shpritz, B. (1996). Revision of the Brief Visuospatial Memory Test: Studies of normal performance, reliability, and validity. Psychological Assessment, 8, 145–153.

Bloch, R., & Norman, G. (2012). Generalizability theory for the perplexed: A practical introduction and guide: AMEE Guide No. 68. Medical Teacher, 34, 960–992. doi:10.3109/0142159X.2012.703791.

Brennan, R. L. (2001). Generalizability theory. New York: Springer.

Brown, F. C., Hirsch, L. J., & Spencer, D. D. (2015). Spatial memory for asymmetrical dot locations predicts lateralization among patients with presurgical mesial temporal lobe epilepsy. Epilepsy and Behavior, 52A, 19–24.

Brown, F. C., Roth, R. M., Saykin, A. J., & Gibson-Beverly, G. (2007). A new measure of visual location learning and memory: Development and psychometric properties for the Brown Location Test (BLT). The Clinical Neuropsychologist, 21, 811–825.

Brown, F. C., Tuttle, E., Westerveld, M., Ferraro, R., Chmielowiec, T., Vandemore, M., et al. (2010). Visual memory in patients after anterior right temporal lobectomy and adult normative data for the Brown Location Test. Epilepsy and Behavior, 17, 215–220.

Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates.

Golby, A. J., Poldrack, R. A., Brewer, J. B., Spencer, D., Desmond, J. E., Aron, A. P., et al. (2001). Material-specific lateralization in the medial temporal lobe and prefrontal cortex during memory encoding. Brain, 124, 1841–1854.

Gropper, R. J., & Tannock, R. (2009). A pilot study of working memory and academic achievement in college students with ADHD. Journal of Attention Disorders, 12, 574–581.

Johnson, D. E., Epstein, J. N., Waid, L. R., Latham, P. K., Voronin, K. E., & Anton, R. F. (2001). Neuropsychological performance deficits in adults with Attention Deficit/Hyperactivity Disorder. Archives of Clinical Neuropsychology, 16, 587–604.

Kibby, M., & Cohen, M. J. (2008). Memory functioning in children with reading disabilities and/or Attention Deficit/Hyperactivity Disorder: A clinical investigation of their working memory and long-term memory functioning. Child Neuropsychology, 14, 525–546.

Lee, T. M. C., Yip, J. T. H., & Jones-Gotman, M. (2002). Memory deficits after resection from left or right anterior temporal lobes in humans: A meta-analytic review. Epilepsia, 43, 283–291.

Moritz, S., Kloss, M., von Eckstaedt, F. V., & Jelinek, L. (2009). Comparable performance of patients with obsessive-compulsive disorder (OCD) and healthy controls for verbal and nonverbal memory accuracy and confidence: Time to forget the forgetfulness hypothesis of OCD? Psychiatry Research, 166, 247–253.

Muller, B. W., Gimbel, K., Keller-Pliessnig, A., Sartory, G., Gastpar, M., & Davids, E. (2007). Neuropsychological assessment of adult patients with Attention-Deficit/Hyperactivity Disorder. European Archives of Psychiatry and Clinical Neuroscience, 257, 112–119.

Mushquash, C., & O’Connor, B. P. (2006). SPSS and SAS programs for generalizability theory analyses. Behavior Research Methods, 38, 542–547.

Paolo, A. M., Troester, A. I., & Ryan, J. J. (1998). Continuous Visual Memory Test performance in healthy persons 60 to 94 years of age. Archives of Clinical Neuropsychology, 13, 333–337.

Shang, C. Y., & Gau, S. S. (2011). Visual memory as a potential cognitive endophenotype of Attention Deficit Hyperactivity Disorder. Psychological Medicine: A Journal of Research in Psychiatry and the Allied Sciences, 41, 2603–2614.

© The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: firstname.lastname@example.org.
Archives of Clinical Neuropsychology – Oxford University Press
Published: Feb 1, 2018