Repeated Administration Effects on Psychomotor Vigilance Test Performance

Abstract

Study Objectives
The Psychomotor Vigilance Test (PVT) is reported to be free of practice effects that can otherwise confound the effects of sleep loss and circadian misalignment on performance. This differentiates the PVT from more complex cognitive tests. To the best of our knowledge, no study has systematically investigated practice effects on the PVT across multiple outcome domains, depending on administration interval, and in ecologically more valid settings.

Methods
We administered a validated 3-minute PVT (PVT-B) 16 times in 45 participants (23 male, mean ± SD age 32.6 ± 7.3 years, range 25–54 years) with administration intervals of ≥10 days, ≤5 days, or 4 times per day. We investigated linear and logarithmic trends across repeated administrations in 10 PVT-B outcome variables.

Results
The fastest 10% of response times (RT; p_lin = .0002), minimum RT (p_log = .0010), and the slowest 10% of reciprocal RT (p_log = .0124) increased while false starts (p_log = .0050) decreased with repeated administration, collectively decreasing RT variability (p_log = .0010) across administrations. However, the observed absolute changes were small (e.g., −0.03 false starts per administration, linear fit) and are probably irrelevant in practice. Test administration interval did not modify the effects of repeated administration on PVT-B performance (all p > .13 for interaction). Importantly, mean and median RT, response speed, and lapses, which are among the most frequently used PVT outcomes, did not change systematically with repeated administration.

Conclusions
PVT-B showed stable performance across repeated administrations. Combined with its high sensitivity, this corroborates the status of the PVT as the de facto gold standard measure of the neurobehavioral effects of sleep loss and circadian misalignment.
Keywords: alertness, psychomotor vigilance performance, fatigue, sleep deprivation, effects of sleep restriction on cognition and affect

Statement of Significance
Practice effects confound the effects of sleep loss on cognitive performance of more complex tests. The Psychomotor Vigilance Test (PVT) is reported to be free of practice effects, but a rigorous PVT practice effect assessment in ecologically more valid settings has been missing. Although some statistically significant changes in PVT-B performance were found with repeated administration in this study, they were small and did not affect the most frequently investigated PVT outcomes. To the best of our knowledge, this study is the first to demonstrate no relevant PVT-B practice effects in a controlled testing environment outside the laboratory, and no effect of test administration intervals. The findings stress the value of this brief assay of vigilant attention for sleep research and sleep medicine.

INTRODUCTION
The Psychomotor Vigilance Test (PVT) measures sustained or vigilant attention by recording response times (RT) to visual (or auditory) stimuli that occur at random inter-stimulus intervals (ISIs).1–7 Sleep deprivation induces reliable changes in PVT performance, causing an overall slowing of RT, a steady increase in the number of errors of omission (i.e., lapses of attention, historically defined as RTs ≥ 500 milliseconds), and a more modest increase in errors of commission (i.e., responses without a stimulus, or false starts).8,9 These effects typically increase with time-on-task.10 The PVT has become arguably the most widely used measure of behavioral alertness owing in large part to its high sensitivity to sleep deprivation and circadian misalignment, and the fact that it can reflect real-world risks because deficits in sustained attention and timely reactions adversely affect many applied tasks.1,5 Another reason for the popularity of the PVT is that, in contrast to other more complex cognitive tests, the PVT is reported to be free of aptitude and practice effects.1,11 The latter can confound the effects of sleep deprivation or circadian misalignment on cognitive performance.

Practice or learning effects refer to the phenomenon of gradually improving performance (i.e., higher accuracy and faster speed) with repeated administration of a cognitive test.12 Also, the interval between test administrations can influence practice effects.13 Learning curves are often curvilinear with a more rapid improvement of performance during early test administrations and a decelerating improvement during later test administrations, with test performance eventually reaching a plateau or asymptote.12,14 In many studies, it is neither practical to train participants until they reach a performance plateau prior to entering a study, nor is it possible to investigate a control group that could be used to adjust for some of the effects of repeated testing, stressing the value of a cognitive test without practice effects.

Manuscripts reporting on studies that used the PVT often either state the PVT has no practice effects without citing evidence or they cite the same few papers. Manuscripts frequently cited include van Dongen and Dinges,15 Dinges et al.,3 Kribbs and Dinges,16 van Dongen et al.,9 and Dinges and Kribbs.6

There are several published laboratory studies that administered the PVT repeatedly during baseline and control conditions. However, as Paterson et al.17 point out, 8 hours sleep opportunity is often not long enough to guarantee that participants obtain their pre-study habitual sleep times in the unfamiliar laboratory environment. Two studies that provided 9 hours sleep opportunity over seven consecutive nights showed conflicting results.
Belenky et al.18 found no significant changes in PVT response speed, lapses, or fastest 10% RT, whereas Paterson et al.17 found a significant decline in PVT performance reflected in response speed, average RT, and lapses starting as early as the first night after baseline. Both studies treated day as a factor, i.e., they did not investigate linear or non-linear trends over time. The laboratory environments differed in the degree of confinement, environmental conditions, study group size, research team, and demand characteristics, among other factors, which may explain the differences found between these two studies.17

In light of this laboratory effect and based on the fact that more ecologically valid studies outside the sleep laboratory are needed to inform sleep interventions and health policy, studies are required that investigate PVT practice effects in more naturalistic settings. To the best of our knowledge, none of the previously published studies rigorously assessed practice effects on the PVT by (a) inspecting multiple outcome domains of the PVT (including errors of commission and omission, response speed, changes in the fast and slow RT domains, and RT variability) and by (b) taking administration interval into account. We thus designed a study to systematically investigate practice effects on the PVT during 16 repeated administrations in a setting with increased ecologic validity (i.e., sleep at home) in three groups of N = 14–16 participants that differed in PVT administration interval (≥10 days, ≤5 days, and 4 times per day with 15-minute breaks between tests).

METHODS

Participants and Protocol
A total of N = 45 astronaut surrogate participants (23 male, mean ± SD age 32.6 ± 7.3 years, range 25–54 years) were recruited for the study through on campus and online postings. Participants filled out a screening questionnaire prior to enrollment.
They had to be proficient in English, between 25 and 55 years old, and have at least a Master’s degree (or equivalent). Participants with a history of a medical disorder that may significantly affect brain function (e.g., neurological disorders, severe head trauma) as well as participants with a psychiatric illness were excluded from study participation. After the study, participants were asked to fill out the Epworth Sleepiness Scale (ESS)19 and answer a question on how much sleep they usually got on weekdays or workdays during the past year. Valid reports on ESS and sleep duration were obtained from all but one participant who had moved outside the country. The study was reviewed by the Institutional Review Board of the University of Pennsylvania and considered exempt. Written informed consent was obtained from study participants prior to data collection. Participants were compensated for each session they attended. They received $235 (ultrashort group) and $390 (short and long groups) if they completed all test sessions. Participants performed a validated 3-minute version of the PVT (PVT-B)20 for a total of 16 times in an office in the Unit for Experimental Psychiatry at the University of Pennsylvania. The office door was kept closed and a “Testing in Session – Quiet Please – Thank you” warning sign was set up in the hallway during cognitive testing to keep the level of distractions low. Participants were not instructed to follow a specific sleep schedule. They were allowed to go about their normal lives. Participants were free to choose the time of day for testing (i.e., they did not perform the PVT at the same time of day every time they were tested). The start time of 97.1% of testing sessions fell between 8 am and 6 pm (earliest 7:10 am, latest 8:24 pm). PVT-B utilizes 2- to 5-second ISIs in contrast to the 2- to 10-second ISIs of the standard 10-minute PVT. 
Also, PVT-B utilizes a 355-millisecond lapse threshold in contrast to the 500-millisecond lapse threshold of the standard 10-minute PVT. These modifications increase sensitivity of PVT-B, which is comparable to the standard 10-minute PVT, despite the substantial decrease in test duration.20 PVT-B has been successfully administered to medical interns and residents,21,22 astronauts,23 and astronaut surrogate populations.24

We used block-randomization based on sex to assign participants to one of three groups that differed in PVT-B administration interval: participants performed the PVT-B with ≥10 days between test administrations (long group, N = 14), with ≤5 days between test administrations (short group, N = 16), or four times on a single day with 15-minute breaks between test battery administrations for a total of four visits (ultrashort group, N = 15). In the short and ultrashort groups, there had to be at least 1 day without testing between study site visits. In the ultrashort group, sessions were scheduled with 2- to 6-day intervals. Due to last minute scheduling conflicts, this interval had to be extended to 7 days in two participants in one instance each. The three groups were comparable in age and sex composition (Table 1).

Before the first test administration, participants watched a standardized familiarization video and performed a brief practice version of the PVT. In the video, the principal investigator performed a short version of all 10 Cognition tests (see below) after reading out loud the standardized instructions for each test. During all test administrations, participants were supervised by a research coordinator who could answer questions and address technical problems at any time.

Table 1. Participant Characteristics.

| | Long group | Short group | Ultrashort group | All |
|---|---|---|---|---|
| Administration interval^a | ≥10 days | ≤5 days | 4× per day | |
| Number of participants | 14 | 16 | 15 | 45 |
| Mean participant age (SD) | 32.4 (7.4) years | 34.4 (6.9) years | 30.9 (7.5) years | 32.6 (7.3) years |
| Male | 57.1% | 50.0% | 46.7% | 51.1% |

SD = Standard deviation.
^a Consecutive test days had to be separated by at least one test-free day.

Measurement and Outcomes
The PVT-B was administered on a calibrated laptop computer (Dell Latitude E6430 with a 14" screen diagonal, 16:9 aspect ratio) using the Cognition software.23 The average response latency of the spacebar (30.8 milliseconds, standard deviation 2.8 milliseconds) was determined with a robotic calibrator before the start of the study and subtracted from each RT. Before each testing session, participants provided their sleep duration on the night preceding the test and rated their alertness on an 11-point Likert scale ranging from 0 (Tired) to 10 (Alert). For PVT-B administration, participants were instructed to monitor a rectangular box and press the spacebar as fast as possible as soon as a millisecond counter started incrementing, without committing premature responses (false starts). After test completion, participants were presented with a feedback score ranging between 0 (worst possible performance) and 1000 (best possible performance). The feedback score was based on RT and the number of false starts.
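The scoring rules used in this study can be sketched in a few lines of Python. This is a minimal illustration and not the implementation used by the Cognition software: the function name `pvt_outcomes`, its arguments, the choice to apply the latency correction before thresholding, the exclusion of false starts from the RT statistics, and the rounding rule for the 10% subsets are all assumptions made for the example. The 355-millisecond lapse threshold, the 100-millisecond false start threshold, and the 30.8-millisecond spacebar latency are the values reported in this paper.

```python
import statistics

LAPSE_MS = 355.0        # PVT-B lapse threshold reported in the paper
FALSE_START_MS = 100.0  # RTs below this count as coincident false starts
KEY_LATENCY_MS = 30.8   # mean spacebar latency reported for the study laptop


def pvt_outcomes(raw_rts_ms, premature_responses=0):
    """Summarize one PVT-B bout (hypothetical helper, assumptions in lead-in).

    raw_rts_ms: response times to stimuli in ms, before latency correction.
    premature_responses: key presses made while no stimulus was shown.
    """
    # Assumption: hardware latency is removed before any threshold is applied.
    corrected = [rt - KEY_LATENCY_MS for rt in raw_rts_ms]
    coincident = [rt for rt in corrected if rt < FALSE_START_MS]
    # Assumption: false starts are excluded from all RT statistics.
    valid = sorted(rt for rt in corrected if rt >= FALSE_START_MS)
    if not valid:
        raise ValueError("no valid responses in this bout")
    # Assumption: the 10% subsets hold at least one response.
    k = max(1, round(0.1 * len(valid)))
    return {
        "lapses": sum(rt >= LAPSE_MS for rt in valid),
        "false_starts": premature_responses + len(coincident),
        "mean_rt": statistics.fmean(valid),
        "median_rt": statistics.median(valid),
        "sd_rt": statistics.stdev(valid) if len(valid) > 1 else 0.0,
        # Response speed is the mean reciprocal RT, expressed in 1/s.
        "response_speed": statistics.fmean(1000.0 / rt for rt in valid),
        "fastest_10pct_rt": statistics.fmean(valid[:k]),
        "slowest_10pct_rrt": statistics.fmean(1000.0 / rt for rt in valid[-k:]),
        "min_rt": valid[0],
        "max_rt": valid[-1],
    }
```

Calling `pvt_outcomes(rts, premature_responses=2)` on the raw RTs of one bout returns a dictionary with one entry per outcome, which could then be collected across the 16 administrations of a participant.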
PVT-B was preceded by the following nine cognitive tests that compose the Cognition test battery and cover a wide range of cognitive domains: Motor Praxis, Visual Object Learning, Fractal 2-Back, Abstract Matching, Line Orientation, Emotion Recognition, Matrix Reasoning, Digit Symbol Substitution, and Balloon Analog Risk. It took participants ca. 15 minutes to complete these nine tests prior to PVT-B administration. The overall goal of the study was to characterize practice and stimulus set difficulty effects for all 10 tests of the Cognition battery. Detailed results for the other nine tests will be reported elsewhere.

The following standard outcomes were generated for each PVT-B test bout:25 (1) the number of lapses (errors of omission, RT ≥ 355 milliseconds); (2) response speed (reciprocal response time, 1/RT); (3) average RT; (4) median RT; (5) fastest 10% of RT; (6) slowest 10% reciprocal RT; (7) standard deviation of RT; (8) false starts (errors of commission, including coincident false starts defined as RT < 100 milliseconds); (9) minimum RT; and (10) maximum RT.

Data Analyses
Each PVT-B was inspected for outliers and participant non-adherence, but none of the tests needed to be excluded from data analysis. All data were analyzed with linear mixed effects models in SAS (Version 9.3, Cary, NC). In model #1, administration number was entered as a factor (effect coding) in a random subject intercept model testing whether at least one administration differed significantly from the overall mean across administrations. In models #2 and #3, administration number or the natural logarithmic transform of administration number were entered as a continuous variable in a random subject intercept model with random slopes (unstructured covariance).
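Models #2 and #3 differ only in whether administration number enters linearly or as its natural logarithm. The sketch below illustrates that comparison for a single series of values using ordinary least squares and a Gaussian Akaike Information Criterion. It is a deliberately simplified stand-in: the study itself used linear mixed effects models in SAS with random intercepts and random slopes, which this example does not reproduce, and the function name `fit_trend` is hypothetical.

```python
import math


def fit_trend(admin_numbers, values, transform="linear"):
    """Fit values = intercept + slope * f(administration number) by OLS.

    transform: "linear" uses the administration number itself,
               "log" uses its natural logarithm.
    Returns (slope, aic). Assumes the fit is not perfect (RSS > 0).
    """
    if transform == "log":
        x = [math.log(a) for a in admin_numbers]
    elif transform == "linear":
        x = [float(a) for a in admin_numbers]
    else:
        raise ValueError("transform must be 'linear' or 'log'")

    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(values) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, values))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Residual sum of squares of the fitted line.
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, values))

    # Gaussian AIC up to an additive constant; three estimated parameters
    # (intercept, slope, residual variance).
    n_params = 3
    aic = n * math.log(rss / n) + 2 * n_params
    return slope, aic
```

The fit with the lower AIC is preferred. Because both fits in this simplified setting have the same number of parameters, the comparison amounts to choosing the smaller residual sum of squares.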
Degrees of freedom were corrected with Satterthwaite’s method.26 The Akaike Information Criterion (AIC) was used to determine whether linear or logarithmic administration number fit the data better.27 In model #4, a factor for administration interval group was introduced together with an interaction term between administration number and administration interval to investigate whether slopes for administration number differed significantly between administration interval groups.

For each of the 45 participants, we calculated individual slope estimates based on the random effect terms of the mixed effect models. We determined the number of negative and positive slopes, and whether they were significantly different from zero. The latter tests were adjusted for multiple testing with the false discovery rate method at p < .05.28

To investigate potential time of day effects on practice, we first determined the average time of day of the start of the 16 test sessions for each of the 45 participants. We then sorted participants by this variable in ascending order and generated a variable that indicated whether a participant belonged to the first, second, or third tertile of ordered session start times. This variable and an interaction term with administration number (or the natural logarithmic transform of administration number, whichever had the lowest AIC value) were introduced in model #5.

RESULTS
Two participants assigned to the long administration interval group withdrew from the study after 4 and 6 PVT-B administrations, respectively, and were excluded from data analysis (reducing the total N from 47 to 45). Participants reported on average 7.1 hours sleep (SD 1.1 hours, range 5–9.5 hours) on weekdays or workdays during the past year. The average ESS score was 6.6 (SD 3.4, range 1–13). Mild-to-moderate excessive daytime sleepiness (ESS scores 11–15) was reported by 18.2% of participants, and no participant reported severe excessive daytime sleepiness.
Alertness ratings provided before test administration averaged 6.1 (SE 0.2) on an 11-point Likert scale ranging from 0 (Tired) to 10 (Alert). There was no linear trend across administration days (p = .8882). Sleep duration on the night before the test was taken averaged 7.0 hours (SD 1.1 hours, range 3–10 hours). There was no linear trend across administration days (p = .4078). The average (range) administration interval between site visits was 12.0 (10–28) days for the long group, 3.1 (2–5) days for the short group, and 3.5 (2–7) days for the ultrashort group. Participants were found to be high performers and adherent to the test with 0.98 lapses, 1.29 false starts, and 187.4 milliseconds fastest 10% RT on average (Table 2).

Table 2. Regression Outcomes.

| Variable | Overall mean across administrations (SD)^a | Range across administrations^b | Logarithmic fit slope (p-value; AIC) | Linear fit slope (p-value; AIC) | ANOVA p-value | Negative slope N (N pFDR<0.05) | Positive slope N (N pFDR<0.05) | Interaction slope × ToD p-value |
|---|---|---|---|---|---|---|---|---|
| Fastest 10% RT [ms] | 187.4 (15.7) | 10.2 (−5.4; 5.2) | 3.024 (<.0001; 5237.0) | 0.496 (.0002; 5226.1) | <.0001 | 8 (0) | 37 (6) | .6506 |
| Standard Deviation RT [ms] | 37.4 (15.3) | 8.6 (−3.6; 5.0) | −2.714 (.0010; 5638.5) | −0.449 (.0006; 5646.8) | .0008 | 38 (0) | 7 (0) | .0867 |
| Minimum RT [ms] | 176.7 (20.2) | 16.8 (−10.6; 6.2) | 3.966 (.0010; 6132.8) | 0.611 (.0023; 6138.6) | .0002 | 7 (0) | 38 (3) | .1630 |
| False Starts [N] | 1.29 (1.42) | 0.8 (−0.3; 0.5) | −0.196 (.0050; 2165.3) | −0.028 (.0104; 2175.5) | .0010 | 39 (0) | 6 (0) | .1512 |
| Slowest 10% RRT [1/s] | 3.36 (0.50) | 0.20 (−0.12; 0.08) | 0.068 (.0124; 383.2) | 0.011 (.0126; 400.7) | .0014 | 13 (3) | 32 (10) | .1906 |
| Median RT [ms] | 219.2 (21.2) | 8.2 (−2.4; 5.8) | 1.458 (.1261; 5513.9) | 0.309 (.0536; 5508.0) | .0132 | 18 (1) | 27 (6) | .8858 |
| Maximum RT [ms] | 365.3 (95.1) | 39.9 (−18.3; 21.6) | −6.609 (.1443; 8426.0) | −1.219 (.1083; 8429.2) | .2947 | 37 (0) | 8 (0) | .0891 |
| Response speed [1/s] | 4.52 (0.41) | 0.13 (−0.09; 0.04) | −0.020 (.2351; −276.8) | −0.004 (.1759; −278.9) | .0545 | 25 (5) | 20 (2) | .8524 |
| Mean RT [ms] | 228.3 (23.0) | 7.4 (−2.6; 4.8) | 0.535 (.5938; 5601.9) | 0.132 (.4387; 5597.3) | .1820 | 25 (1) | 20 (5) | .7666 |
| Lapses [N] | 0.98 (1.61) | 0.4 (−0.2; 0.2) | −0.026 (.7604; 2333.3) | 0.001 (.9578; 2329.2) | .8290 | 25 (0) | 20 (2) | .3775 |

RRT = reciprocal response time; SE = standard error; AIC = Akaike Information Criterion; FDR = False Discovery Rate; ToD = Time of Day; ANOVA = administration number was entered as a factor in this model.
^a Across administrations.
^b Relative to overall mean.

Statistically significant linear and logarithmic changes in PVT-B outcomes with repeated administration were observed for five out of the 10 variables (Table 2 and Figure 1). In four out of these five variables, the logarithmic fit outperformed the linear fit. The fastest 10% of RT, minimum RT, and response speed of the slowest 10% of RT increased with repeated administration. Both the number of false starts and the standard deviation of RT decreased with repeated administration. For these five variables, the majority of participants (between 35 and 39 out of 45) followed the same pattern indicated by the sign of individually determined slopes. No statistically significant changes with repeated administration were found for median and mean RT, maximum RT, response speed, and the number of lapses. This was also reflected in individually determined slopes (Table 2).

Figure 1. Changes in PVT outcomes with repeated test administration. Each participant performed the PVT-B 16 times. A significant linear and logarithmic trend was found for the fastest 10% RT, standard deviation of RT, minimum RT, false starts, and the slowest 10% of reciprocal RT. Black dots represent estimated means relative to the overall mean across test administrations. Error bars reflect standard errors. The red line reflects a trend line fitted to the estimated means (a linear or logarithmic fit was chosen based on AIC).

Administration interval did not modify the effects of repeated administration on PVT-B performance (all p > .13 for interaction, Figure 2). Also, no significant interaction was found between tertiles of average time of day of session start times and administration number (all p > .08 for interaction). Significant practice effects were found for all of the nine other cognitive tests in the speed domain and for six out of the nine tests in the accuracy domain (detailed results will be reported elsewhere).

Figure 2. Changes in PVT performance with repeated administration were not moderated by test administration interval. Regression models suggested no significant interaction effects (all p > .13) between test administration number and administration interval. The choice of linear or logarithmic fit was based on AIC. Long: ≥10 days; Short: ≤5 days; Ultrashort: 4 times per day with 15-minute breaks between tests.

DISCUSSION
To the best of our knowledge, this is the first study to systematically investigate the effects of repeated administration on PVT-B performance in a controlled testing environment outside the sleep laboratory. In contrast to the current doctrine of the PVT not being affected by practice effects, we found significant linear and logarithmic trends with repeated administration in five out of the 10 investigated PVT outcomes.
These changes can best be characterized by a decrease in variability of RTs, reflected in an increase of minimum RT, fewer false starts, longer RTs in the fast domain, and shorter RTs in the slow domain, which collectively caused the standard deviation of RTs to decrease. These systematic changes with repeated administration may reflect a change in response strategy rather than a practice effect, as participants sacrificed speed in the fast RT domain to avoid false starts and showed fewer long RTs at the same time. Although statistically significant, these changes are exceedingly small and hence cannot be meaningfully interpreted. For example, the fastest 10% RT increased by 0.5 milliseconds per administration, and the number of false starts decreased by 0.03 per administration (both linear fits). Importantly, mean and median RT, response speed, and the number of lapses, which are among the most frequently used outcomes of the PVT,11,29 did not show any systematic changes with repeated administration. Administration interval, which varied between 15 minutes and 28 days, did not significantly modify the effects of repeated administration on PVT-B performance.

All cognitive tests (including the PVT) require participant motivation. Although we did not explicitly measure motivation in our study, the overall high levels of performance (e.g., 0.98 lapses and 1.29 false starts on average) and the fact that at most six of the 45 participants demonstrated a statistically significant declining trend in PVT performance over time (Table 2) suggest that our participant population was highly motivated.

Strengths and Limitations
Strengths of the study include the relatively large sample size, the increased ecologic validity relative to laboratory studies with participants sleeping in their own home, and the fact that we systematically investigated the effects of administration interval on PVT-B performance across multiple outcome domains.
Potential limitations include the following: We did not fix time of day across PVT-B administrations within participants. This potentially increased variability across test administrations, but was unlikely to influence practice effects (corroborated by results of model #5). Furthermore, aside from sleep inertia effects during the first hour or so after waking up, PVT performance is typically very stable across the first 16 hours of the wake period.20 We used a validated 3-minute version of the PVT instead of the standard 10-minute version. It is therefore unclear whether our findings generalize to the 10-minute or longer versions of the PVT that are more affected by time-on-task effects.1,4,30 Furthermore, we investigated a high performing astronaut surrogate population. It is unclear whether our findings translate to other populations with lower degrees of educational attainment or lower levels of motivation. However, due to its simplicity, the PVT is generally believed to be free of intellectual aptitude effects. In contrast to the standard PVT, participants were presented with a feedback score at the end of each test. It is unclear whether this influenced our findings. The standard PVT does, however, provide RT feedback after each response. Thus, participants typically have an idea how well they performed after test completion. Finally, PVT-B administration was preceded by nine other cognitive tests that may have influenced performance due to time-on-task or other contaminating effects.

CONCLUSIONS
This study found statistically significant systematic changes with repeated administration in several PVT-B outcomes that led to a decrease in variability of RTs and were consistent with a change in response strategy rather than practice effects. Overall, the observed absolute changes were small and are probably irrelevant in practice. Test administration interval did not modify the effects of repeated administration on PVT-B performance.
Importantly, mean and median RT, response speed, and lapses, which are among the most frequently used outcomes of the PVT, did not show any systematic changes with repeated administration. In conclusion, our findings support that the PVT, in contrast to many other more complex cognitive tests (including those administered in this study), shows stable performance across repeated administrations and is not influenced by practice effects. Combined with its high sensitivity, this finding corroborates the status of the PVT as a de facto gold standard measure of the neurobehavioral effects induced by sleep loss and/or circadian misalignment. FUNDING This work was funded by the National Space Biomedical Research Institute (NSBRI) through grant NBPF00012 (NASA NCC 9–58) and by the National Aeronautics and Space Administration (NASA) through grant NNX14AH98G. DISCLOSURE STATEMENT None declared. REFERENCES 1. Lim J, Dinges DF. Sleep deprivation and vigilant attention. Ann N Y Acad Sci . 2008; 1129: 305– 322. Google Scholar CrossRef Search ADS PubMed  2. Dinges DF, Powell JW. Microcomputer analysis of performance on a portable, simple visual RT task during sustained operations. Behav Res Methods Instrum Comput . 1985; 6( 17): 652–65 5. Google Scholar CrossRef Search ADS   3. Dinges DF, Pack F, Williams Ket al.   Cumulative sleepiness, mood disturbance, and psychomotor vigilance performance decrements during a week of sleep restricted to 4-5 hours per night. Sleep . 1997; 20( 4): 267– 277. Google Scholar PubMed  4. Doran SM, Van Dongen HP, Dinges DF. Sustained attention performance during sleep deprivation: evidence of state instability. Arch Ital Biol . 2001; 139( 3): 1– 15. 5. Dorrian J, Rogers NL, Dinges DF, Kushida CA. Psychomotor vigilance performance: Neurocognitive assay sensitive to sleep loss. In: Kushida CA, ed. Sleep Deprivation: Clinical Issues, Pharmacology and Sleep Loss Effects . New York, NY: Marcel Dekker, Inc., 2005: 39– 70. 
6. Dinges DF, Kribbs NB. Performing while sleepy: Effects of experimentally-induced sleepiness. In: Monk TH, ed. Sleep, Sleepiness and Performance. Chichester, United Kingdom: John Wiley and Sons, Ltd.; 1991: 97–128.
7. Warm JS, Parasuraman R, Matthews G. Vigilance requires hard mental work and is stressful. Hum Factors. 2008; 50(3): 433–441.
8. Dinges DF, Mallis M. Managing fatigue by drowsiness detection: can technological promises be realized? In: Hartley L, ed. Managing Fatigue in Transportation - Proceedings of the 3rd Fatigue in Transportation Conference. Fremantle, Western Australia: Pergamon; 1998: 209–29.
9. Van Dongen HP, Maislin G, Mullington JM, Dinges DF. The cumulative cost of additional wakefulness: dose-response effects on neurobehavioral functions and sleep physiology from chronic sleep restriction and total sleep deprivation. Sleep. 2003; 26(2): 117–126.
10. Gunzelmann G, Moore LR, Gluck KA, Van Dongen HP, Dinges DF. Fatigue in sustained attention: Generalizing mechanisms for time awake to time on task. In: Ackerman PL, ed. Cognitive fatigue: Multidisciplinary perspectives on current research and future applications. Washington, D.C.: American Psychological Association; 2010: 83–101.
11. Basner M, Dinges DF. Maximizing sensitivity of the psychomotor vigilance test (PVT) to sleep loss. Sleep. 2011; 34(5): 581–591.
12. Beglinger LJ, Gaydos B, Tangphao-Daniels O, et al. Practice effects and the use of alternate forms in serial neuropsychological testing. Arch Clin Neuropsychol. 2005; 20(4): 517–529.
13. Falleti MG, Maruff P, Collie A, Darby DG. Practice effects associated with the repeated assessment of cognitive function using the CogState battery at 10-minute, one week and one month test-retest intervals. J Clin Exp Neuropsychol. 2006; 28(7): 1095–1112.
14. Schlegel RE, Shehab RL, Gilliland K, Eddy DR, Schiflett SG. Microgravity effects on cognitive performance measures: practice schedules to acquire and maintain performance stability. Brooks Air Force Base, TX; 1995. Report No.: AL/CF-TR-1994-0040.
15. Van Dongen HP, Dinges DF. Sleep, circadian rhythms, and psychomotor vigilance. Clin Sports Med. 2005; 24(2): 237–49, vii.
16. Kribbs NB, Dinges DF. Vigilance decrement and sleepiness. In: Harsh J, Ogilvie RD, eds. Sleep onset mechanisms. Washington, D.C.: American Psychological Association; 1994: 113–125.
17. Paterson JL, Dorrian J, Ferguson SA, Jay SM, Dawson D. What happens to mood, performance and sleep in a laboratory study with no sleep deprivation? Sleep Biol Rhythms. 2013; 11(3): 200–209.
18. Belenky G, Wesensten NJ, Thorne DR, et al. Patterns of performance degradation and restoration during sleep restriction and subsequent recovery: a sleep dose-response study. J Sleep Res. 2003; 12(1): 1–12.
19. Johns MW. A new method for measuring daytime sleepiness: the Epworth sleepiness scale. Sleep. 1991; 14(6): 540–545.
20. Basner M, Mollicone D, Dinges DF. Validity and Sensitivity of a Brief Psychomotor Vigilance Test (PVT-B) to Total and Partial Sleep Deprivation. Acta Astronaut. 2011; 69(11-12): 949–959.
21. Basner M, Dinges DF, Shea JA, et al. Sleep and alertness in medical interns and residents: an observational study on the role of extended shifts. Sleep. 2017; 40(4): 1–8.
22. Volpp KG, Shea JA, Small DS, et al. Effect of a protected sleep period on hours slept during extended overnight in-hospital duty hours among medical interns: a randomized trial. JAMA. 2012; 308(21): 2208–2217.
23. Basner M, Savitt A, Moore TM, et al. Development and validation of the Cognition test battery. Aerosp Med Hum Perf. 2015; 86(11): 942–52.
24. Basner M, Dinges DF, Mollicone D, et al. Mars 520-d mission simulation reveals protracted crew hypokinesis and alterations of sleep duration and timing. Proc Natl Acad Sci USA. 2013; 110(7): 2635–2640.
25. Basner M, Dinges DF. Maximizing sensitivity of the psychomotor vigilance test (PVT) to sleep loss. Sleep. 2011; 34(5): 581–591.
26. Satterthwaite FE. An approximate distribution of estimates of variance components. Biometrics. 1946; 2(6): 110–114.
27. Bozdogan H. Model Selection and Akaike Information Criterion (AIC) - the general-theory and its analytical extensions. Psychometrika. 1987; 52(3): 345–70.
28. Curran-Everett D. Multiple comparisons: philosophies and illustrations. Am J Physiol Regul Integr Comp Physiol. 2000; 279(1): R1–R8.
29. Basner M, Mcguire S, Goel N, Rao H, Dinges DF. A new likelihood ratio metric for the psychomotor vigilance test and its sensitivity to sleep loss. J Sleep Res. 2015; 24(6): 702–713.
30. Lim J, Wu WC, Wang J, Detre JA, Dinges DF, Rao H. Imaging brain fatigue from sustained mental workload: an ASL perfusion study of the time-on-task effect. Neuroimage. 2010; 49(4): 3426–3435.

© Sleep Research Society 2017. Published by Oxford University Press on behalf of the Sleep Research Society. All rights reserved. For permissions, please e-mail journals.permissions@oup.com.

SLEEP, Oxford University Press

Publisher
Sleep Research Society
Copyright
© Sleep Research Society 2017. Published by Oxford University Press on behalf of the Sleep Research Society. All rights reserved. For permissions, please e-mail journals.permissions@oup.com.
ISSN
0161-8105
eISSN
1550-9109
D.O.I.
10.1093/sleep/zsx187

Abstract

Abstract Study Objectives The Psychomotor Vigilance Test (PVT) is reported to be free of practice effects that can otherwise confound the effects of sleep loss and circadian misalignment on performance. This differentiates the PVT from more complex cognitive tests. To the best of our knowledge, no study has systematically investigated practice effects on the PVT across multiple outcome domains, depending on administration interval, and in ecologically more valid settings. Methods We administered a validated 3-minute PVT (PVT-B) 16 times in 45 participants (23 male, mean ± SD age 32.6 ± 7.3 years, range 25–54 years) with administration intervals of ≥10 days, ≤5 days, or 4 times per day. We investigated linear and logarithmic trends across repeated administrations in 10 PVT-B outcome variables. Results The fastest 10% of response times (RT; plin = .0002), minimum RT (plog = .0010), and the slowest 10% of reciprocal RT (plog = .0124) increased while false starts (plog = 0.0050) decreased with repeated administration, collectively decreasing RT variability (plog = .0010) across administrations. However, the observed absolute changes were small (e.g., −0.03 false starts per administration, linear fit) and are probably irrelevant in practice. Test administration interval did not modify the effects of repeated administration on PVT-B performance (all p > .13 for interaction). Importantly, mean and median RT, response speed, and lapses, which are among the most frequently used PVT outcomes, did not change systematically with repeated administration. Conclusions PVT-B showed stable performance across repeated administrations. Combined with its high sensitivity, this corroborates the status of the PVT as the de facto gold standard measure of the neurobehavioral effects of sleep loss and circadian misalignment. 
alertness, psychomotor vigilance performance, fatigue, sleep deprivation, effects of sleep restriction on cognition and affect Statement of Significance Practice effects confound the effects of sleep loss on cognitive performance of more complex tests. The Psychomotor Vigilance Test (PVT) is reported to be free of practice effects, but a rigorous PVT practice effect assessment in ecologically more valid settings has been missing. Although some statistically significant changes in PVT-B performance were found with repeated administration in this study, they were small and did not affect the most frequently investigated PVT outcomes. To the best of our knowledge, this study is the first to demonstrate no relevant PVT-B practice effects in a controlled testing environment outside the laboratory, and no effect of test administration intervals. The findings stress the value of this brief assay of vigilant attention for sleep research and sleep medicine. INTRODUCTION The Psychomotor Vigilance Test (PVT) measures sustained or vigilant attention by recording response times (RT) to visual (or auditory) stimuli that occur at random inter-stimulus intervals (ISIs).1–7 Sleep deprivation induces reliable changes in PVT performance, causing an overall slowing of RT, a steady increase in the number of errors of omission (i.e., lapses of attention, historically defined as RTs ≥ 500 milliseconds), and a more modest increase in errors of commission (i.e., responses without a stimulus, or false starts).8,9 These effects typically increase with time-on-task.10 The PVT has become arguably the most widely used measure of behavioral alertness owing in large part to its high sensitivity to sleep deprivation and circadian misalignment, and the fact that it can reflect real-world risks because deficits in sustained attention and timely reactions adversely affect many applied tasks.1,5 Another reason for the popularity of the PVT is that, in contrast to other more complex cognitive tests, 
the PVT is reported to be free of aptitude and practice effects.1,11 The latter can confound the effects of sleep deprivation or circadian misalignment on cognitive performance. Practice or learning effects refer to the phenomenon of gradually improving performance (i.e., higher accuracy and faster speed) with repeated administration of a cognitive test.12 Also, the interval between test administrations can influence practice effects.13 Learning curves are often curvilinear with a more rapid improvement of performance during early test administrations and a decelerating improvement during later test administrations, with test performance eventually reaching a plateau or asymptote.12,14 In many studies, it is neither practical to train participants until they reach a performance plateau prior to entering a study, nor is it possible to investigate a control group that could be used to adjust for some of the effects of repeated testing, stressing the value of a cognitive test without practice effects. Manuscripts reporting on studies that used the PVT often either state the PVT has no practice effects without citing evidence or they cite the same few papers. Manuscripts frequently cited include van Dongen and Dinges,15 Dinges et al.,3 Kribbs and Dinges,16 van Dongen et al.,9 and Dinges and Kribbs6. There are several published laboratory studies that administered the PVT repeatedly during baseline and control conditions. However, as Paterson et al.17 point out, 8 hours sleep opportunity is often not long enough to guarantee that participants obtain their pre-study habitual sleep times in the unfamiliar laboratory environment. Two studies that provided 9 hours sleep opportunity over seven consecutive nights showed conflicting results. 
Belenky et al.18 found no significant changes in PVT response speed, lapses, or fastest 10% RT, whereas Paterson et al.17 found a significant decline in PVT performance reflected in response speed, average RT, and lapses starting as early as the first night after baseline. Both studies treated day as a factor, i.e., they did not investigate linear or non-linear trends over time. The laboratory environments differed in the degree of confinement, environmental conditions, study group size, research team, and demand characteristics, among other factors, which may explain the differences found between these two studies.17 In light of this laboratory effect and based on the fact that more ecologically valid studies outside the sleep laboratory are needed to inform sleep interventions and health policy, studies are required that investigate PVT practice effects in more naturalistic settings. To the best of our knowledge, none of the previously published studies rigorously assessed practice effects on the PVT by (a) inspecting multiple outcome domains of the PVT (including errors of commission and omission, response speed, changes in the fast and slow RT domains, and RT variability) and by (b) taking administration interval into account. We thus designed a study to systematically investigate practice effects on the PVT during 16 repeated administrations in a setting with increased ecologic validity (i.e., sleep at home) in three groups of N = 14–16 participants that differed in PVT administration interval (≥10 days, ≤5 days, and 4 times per day with 15-minute breaks between tests). METHODS Participants and Protocol A total of N = 45 astronaut surrogate participants (23 male, mean ± SD age 32.6 ± 7.3 years, range 25–54 years) were recruited for the study through on campus and online postings. Participants filled out a screening questionnaire prior to enrollment. 
They had to be proficient in English, between 25 and 55 years old, and have at least a Master’s degree (or equivalent). Participants with a history of a medical disorder that may significantly affect brain function (e.g., neurological disorders, severe head trauma) as well as participants with a psychiatric illness were excluded from study participation. After the study, participants were asked to fill out the Epworth Sleepiness Scale (ESS)19 and answer a question on how much sleep they usually got on weekdays or workdays during the past year. Valid reports on ESS and sleep duration were obtained from all but one participant who had moved outside the country. The study was reviewed by the Institutional Review Board of the University of Pennsylvania and considered exempt. Written informed consent was obtained from study participants prior to data collection. Participants were compensated for each session they attended. They received $235 (ultrashort group) and $390 (short and long groups) if they completed all test sessions. Participants performed a validated 3-minute version of the PVT (PVT-B)20 for a total of 16 times in an office in the Unit for Experimental Psychiatry at the University of Pennsylvania. The office door was kept closed and a “Testing in Session – Quiet Please – Thank you” warning sign was set up in the hallway during cognitive testing to keep the level of distractions low. Participants were not instructed to follow a specific sleep schedule. They were allowed to go about their normal lives. Participants were free to choose the time of day for testing (i.e., they did not perform the PVT at the same time of day every time they were tested). The start time of 97.1% of testing sessions fell between 8 am and 6 pm (earliest 7:10 am, latest 8:24 pm). PVT-B utilizes 2- to 5-second ISIs in contrast to the 2- to 10-second ISIs of the standard 10-minute PVT. 
Also, PVT-B utilizes a 355-millisecond lapse threshold in contrast to the 500-millisecond lapse threshold of the standard 10-minute PVT. These modifications increase sensitivity of PVT-B, which is comparable to the standard 10-minute PVT, despite the substantial decrease in test duration.20 PVT-B has been successfully administered to medical interns and residents,21,22 astronauts,23 and astronaut surrogate populations.24 We used block-randomization based on sex to assign participants to one of three groups that differed in PVT-B administration interval: participants performed the PVT-B with ≥10 days between test administrations (long group, N = 14), with ≤5 days between test administrations (short group, N = 16), or four times on a single day with 15-minute breaks between test battery administrations for a total of four visits (ultrashort group, N = 15). In the short and ultrashort groups, there had to be at least 1 day without testing between study site visits. In the ultrashort group, sessions were scheduled with 2- to 6-day intervals. Due to last minute scheduling conflicts, this interval had to be extended to 7 days in two participants in one instance each. The three groups were comparable in age and sex composition (Table 1). Before the first test administration, participants watched a standardized familiarization video and performed a brief practice version of the PVT. In the video, the principal investigator performed a short version of all 10 Cognition tests (see below) after reading out loud the standardized instructions for each test. During all test administrations, participants were supervised by a research coordinator who could answer questions and address technical problems at any time. Table 1 Participant Characteristics.   
Long group  Short group  Ultrashort group  All  Administration intervala  ≥10 days  ≤5 days  4× per day    Number of participants  14  16  15  45  Mean participant age (SD)  32.4 (7.4) years  34.4 (6.9) years  30.9 (7.5) years  32.6 (7.3) years  Male  57.1%  50.0%  46.7%  51.1%    Long group  Short group  Ultrashort group  All  Administration intervala  ≥10 days  ≤5 days  4× per day    Number of participants  14  16  15  45  Mean participant age (SD)  32.4 (7.4) years  34.4 (6.9) years  30.9 (7.5) years  32.6 (7.3) years  Male  57.1%  50.0%  46.7%  51.1%  SD = Standard deviation. aConsecutive test days had to be separated by at least one test-free day. View Large Measurement and Outcomes The PVT-B was administered on a calibrated laptop computer (Dell Latitude E6430 with a 14" screen diagonal, 16:9 aspect ratio) using the Cognition software.23 The average response latency of the spacebar (30.8 milliseconds, standard deviation 2.8 milliseconds) was determined with a robotic calibrator before the start of the study and subtracted from each RT. Before each testing session, participants provided their sleep duration on the night preceding the test and rated their alertness on an 11-point Likert scale ranging from 0 (Tired) to 10 (Alert). For PVT-B administration, participants were instructed to monitor a rectangular box and press the spacebar as fast as possible as soon as a millisecond counter started incrementing, without committing premature responses (false starts). After test completion, participants were presented with a feedback score ranging between 0 (worst possible performance) and 1000 (best possible performance). The feedback score was based on RT and the number of false starts. 
PVT-B was preceded by the following nine cognitive tests that compose the Cognition test battery and cover a wide range of cognitive domains: Motor Praxis, Visual Object Learning, Fractal 2-Back, Abstract Matching, Line Orientation, Emotion Recognition, Matrix Reasoning, Digit Symbol Substitution, and Balloon Analog Risk. It took participants ca. 15 minutes to complete these nine tests prior to PVT-B administration. The overall goal of the study was to characterize practice and stimulus set difficulty effects for all 10 tests of the Cognition battery. Detailed results for the other nine tests will be reported elsewhere. The following standard outcomes were generated for each PVT-B test bout:25 (1) the number of lapses (errors of omission, RT ≥ 355 milliseconds); (2) response speed (reciprocal response time, 1/RT); (3) average RT; (4) median RT; (5) fastest 10% of RT; (6) slowest 10% reciprocal RT; (7) standard deviation of RT; (8) false starts (errors of commission, including coincident false starts defined as RT < 100 milliseconds); (9) minimum RT; and (10) maximum RT. Data Analyses Each PVT-B was inspected for outliers and participant non-adherence, but none of the tests needed to be excluded from data analysis. All data were analyzed with linear mixed effects models in SAS (Version 9.3, Cary, NC). In model #1, administration number was entered as a factor (effect coding) in a random subject intercept model testing whether at least one administration differed significantly from the overall mean across administrations. In models #2 and #3, administration number or the natural logarithmic transform of administration number were entered as a continuous variable in a random subject intercept model with random slopes (unstructured covariance). 
Degrees of freedom were corrected with Satterthwaite’s method.26 The Akaike Information Criterion (AIC) was used to determine whether linear or logarithmic administration number fit the data better.27 In model #4, a factor for administration interval group was introduced together with an interaction term between administration number and administration interval to investigate whether slopes for administration number differed significantly between administration interval groups. For each of the 45 participants, we calculated individual slope estimates based on the random effect terms of the mixed effect models. We determined the number of negative and positive slopes, and whether they were significantly different from zero. The latter tests were adjusted for multiple testing with the false discovery rate method at p < .05.28 To investigate potential time of day effects on practice, we first determined the average time of day of the start of the 16 test sessions for each of the 45 participants. We then sorted participants by this variable in ascending order and generated a variable that indicated whether a participant belonged to the first, second, or third tertile of ordered session start times. This variable and an interaction term with administration number (or the natural logarithmic transform of administration number, whichever had the lowest AIC value) were introduced in model #5. RESULTS Two participants assigned to the long administration interval group withdrew from the study after 4 and 6 PVT-B administrations, respectively, and were excluded from data analysis (reducing the total N from 47 to 45). Participants reported on average 7.1 hours sleep (SD 1.1 hours, range 5–9.5 hours) on weekdays or workdays during the past year. The average ESS score was 6.6 (SD 3.4, range 1–13). Mild-to-moderate excessive daytime sleepiness (ESS scores 11–15) was reported by 18.2% of participants, and no participant reported severe excessive daytime sleepiness. 
Alertness ratings provided before test administration averaged 6.1 (SE 0.2) on an 11-point Likert scale ranging from 0 (Tired) to 10 (Alert). There was no linear trend across administration days (p = .8882). Sleep duration on the night before the test was taken averaged 7.0 hours (SD 1.1 hours, range 3–10 hours). There was no linear trend across administration days (p = .4078). The average (range) administration interval between site visits was 12.0 (10–28) days for the long group, 3.1 (2–5) days for the short group, and 3.5 (2–7) days for the ultrashort group. Participants were found to be high performers and adherent to the test with 0.98 lapses, 1.29 false starts, and 187.4 milliseconds fastest 10% RT on average (Table 2). Table 2 Regression Outcomes. Variable  Overall mean across administrations (SD)a  Range across administrationsb  Logarithmic fit slope (p-value; AIC)  Linear fit slope (p-value; AIC)  ANOVA p-value  Negative slope N (N pFDR<0.05)  Positive slope N (N pFDR<0.05)  Interaction slope × ToD p-value  Fastest 10% RT [ms]  187.4 (15.7)  10.2 (−5.4; 5.2)  3.024 (<.0001; 5237.0)  0.496 (.0002; 5226.1)  <.0001  8 (0)  37 (6)  .6506  Standard Deviation RT [ms]  37.4 (15.3)  8.6 (−3.6; 5.0)  −2.714 (.0010; 5638.5)  −0.449 (.0006; 5646.8)  .0008  38 (0)  7 (0)  .0867  Minimum RT [ms]  176.7 (20.2)  16.8 (−10.6; 6.2)  3.966 (.0010; 6132.8)  0.611 (.0023; 6138.6)  .0002  7 (0)  38 (3)  .1630  False Starts [N]  1.29 (1.42)  0.8 (−0.3; 0.5)  −0.196 (.0050; 2165.3)  −0.028 (.0104; 2175.5)  .0010  39 (0)  6 (0)  .1512  Slowest 10% RRT [1/s]  3.36 (0.50)  0.20 (−0.12; 0.08)  0.068 (.0124; 383.2)  0.011 (.0126; 400.7)  .0014  13 (3)  32 (10)  .1906  Median RT [ms]  219.2 (21.2)  8.2 (−2.4; 5.8)  1.458 (.1261; 5513.9)  0.309 (.0536; 5508.0)  .0132  18 (1)  27 (6)  .8858  Maximum RT [ms]  365.3 (95.1)  39.9 (−18.3; 21.6)  −6.609 (.1443; 8426.0)  −1.219 (.1083; 8429.2)  .2947  37 (0)  8 (0)  .0891  Response speed [1/s]  4.52 (0.41)  0.13 (−0.09; 0.04)  −0.020 (.2351; 
-276.8)  −0.004 (.1759; −278.9)  .0545  25 (5)  20 (2)  .8524  Mean RT [ms]  228.3 (23.0)  7.4 (−2.6; 4.8)  0.535 (.5938; 5601.9)  0.132 (.4387; 5597.3)  .1820  25 (1)  20 (5)  .7666  Lapses [N]  0.98 (1.61)  0.4 (−0.2; 0.2)  −0.026 (.7604; 2333.3)  0.001 (.9578; 2329.2)  .8290  25 (0)  20 (2)  .3775  Variable  Overall mean across administrations (SD)a  Range across administrationsb  Logarithmic fit slope (p-value; AIC)  Linear fit slope (p-value; AIC)  ANOVA p-value  Negative slope N (N pFDR<0.05)  Positive slope N (N pFDR<0.05)  Interaction slope × ToD p-value  Fastest 10% RT [ms]  187.4 (15.7)  10.2 (−5.4; 5.2)  3.024 (<.0001; 5237.0)  0.496 (.0002; 5226.1)  <.0001  8 (0)  37 (6)  .6506  Standard Deviation RT [ms]  37.4 (15.3)  8.6 (−3.6; 5.0)  −2.714 (.0010; 5638.5)  −0.449 (.0006; 5646.8)  .0008  38 (0)  7 (0)  .0867  Minimum RT [ms]  176.7 (20.2)  16.8 (−10.6; 6.2)  3.966 (.0010; 6132.8)  0.611 (.0023; 6138.6)  .0002  7 (0)  38 (3)  .1630  False Starts [N]  1.29 (1.42)  0.8 (−0.3; 0.5)  −0.196 (.0050; 2165.3)  −0.028 (.0104; 2175.5)  .0010  39 (0)  6 (0)  .1512  Slowest 10% RRT [1/s]  3.36 (0.50)  0.20 (−0.12; 0.08)  0.068 (.0124; 383.2)  0.011 (.0126; 400.7)  .0014  13 (3)  32 (10)  .1906  Median RT [ms]  219.2 (21.2)  8.2 (−2.4; 5.8)  1.458 (.1261; 5513.9)  0.309 (.0536; 5508.0)  .0132  18 (1)  27 (6)  .8858  Maximum RT [ms]  365.3 (95.1)  39.9 (−18.3; 21.6)  −6.609 (.1443; 8426.0)  −1.219 (.1083; 8429.2)  .2947  37 (0)  8 (0)  .0891  Response speed [1/s]  4.52 (0.41)  0.13 (−0.09; 0.04)  −0.020 (.2351; -276.8)  −0.004 (.1759; −278.9)  .0545  25 (5)  20 (2)  .8524  Mean RT [ms]  228.3 (23.0)  7.4 (−2.6; 4.8)  0.535 (.5938; 5601.9)  0.132 (.4387; 5597.3)  .1820  25 (1)  20 (5)  .7666  Lapses [N]  0.98 (1.61)  0.4 (−0.2; 0.2)  −0.026 (.7604; 2333.3)  0.001 (.9578; 2329.2)  .8290  25 (0)  20 (2)  .3775  RRT = reciprocal response time; SE = standard error; AIC = Akaike Information Criterion; FDR = False Discovery Rate; ToD = Time of Day; ANOVA = administration 
number was entered as a factor in this model. aAcross administrations. bRelative to overall mean. View Large Statistically significant linear and logarithmic changes in PVT-B outcomes with repeated administration were observed for five out of the 10 variables (Table 2 and Figure 1). In four out of these five variables, the logarithmic fit outperformed the linear fit. The fastest 10% of RT, minimum RT, and response speed of the slowest 10% of RT increased with repeated administration. Both the number of false starts and the standard deviation of RT decreased with repeated administration. For these five variables, the majority of participants (between 35 and 39 out of 45) followed the same pattern indicated by the sign of individually determined slopes. No statistically significant changes with repeated administration were found for median and mean RT, maximum RT, response speed, and the number of lapses. This was also reflected in individually determined slopes (Table 2). Figure 1 View largeDownload slide Changes in PVT outcomes with repeated test administration. Each participant performed the PVT-B 16 times. A significant linear and logarithmic trend was found for the fastest 10% RT, standard deviation of RT, minimum RT, false starts, and the slowest 10% of reciprocal RT. Black dots represent estimated means relative to the overall mean across test administrations. Error bars reflect standard errors. The red line reflects a trend line fitted to the estimated means (a linear or logarithmic fit was chosen based on AIC). Figure 1 View largeDownload slide Changes in PVT outcomes with repeated test administration. Each participant performed the PVT-B 16 times. A significant linear and logarithmic trend was found for the fastest 10% RT, standard deviation of RT, minimum RT, false starts, and the slowest 10% of reciprocal RT. Black dots represent estimated means relative to the overall mean across test administrations. Error bars reflect standard errors. 
The red line reflects a trend line fitted to the estimated means (a linear or logarithmic fit was chosen based on AIC). Administration interval did not modify the effects of repeated administration on PVT-B performance (all p > .13 for interaction, Figure 2). Also, no significant interaction was found between tertiles of average time of day of session start times and administration number (all p > .08 for interaction). Significant practice effects were found for all of the nine other cognitive tests in the speed domain and for six out of the nine tests in the accuracy domain (detailed results will be reported elsewhere). Figure 2 View largeDownload slide Changes in PVT performance with repeated administration were not moderated by test administration interval. Regression models suggested no significant interaction effects (all p > .13) between test administration number and administration interval. The choice of linear or logarithmic fit was based on AIC. Long: ≥10 days; Short: ≤5 days; Ultrashort: 4 times per day with 15-minute breaks between tests. Figure 2 View largeDownload slide Changes in PVT performance with repeated administration were not moderated by test administration interval. Regression models suggested no significant interaction effects (all p > .13) between test administration number and administration interval. The choice of linear or logarithmic fit was based on AIC. Long: ≥10 days; Short: ≤5 days; Ultrashort: 4 times per day with 15-minute breaks between tests. DISCUSSION To the best of our knowledge, this is the first study to systematically investigate the effects of repeated administration on PVT-B performance in a controlled testing environment outside the sleep laboratory. In contrast to the current doctrine of the PVT not being affected by practice effects, we found significant linear and logarithmic trends with repeated administration in five out of the 10 investigated PVT outcomes. 
These changes can best be characterized by a decrease in variability of RTs, reflected in an increase of minimum RT, fewer false starts, longer RTs in the fast domain, and shorter RTs in the slow domain, which collectively caused the standard deviation of RTs to decrease. These systematic changes with repeated administration may reflect a change in response strategy rather than a practice effect, as participants sacrificed speed in the fast RT domain to avoid false starts and showed fewer long RTs at the same time. Although statistically significant, these changes are exceedingly small and hence cannot be meaningfully interpreted. For example, the fastest 10% RT increased by 0.5 milliseconds per administration, and the number of false starts decreased by 0.03 per administration (both linear fits). Importantly, mean and median RT, response speed, and the number of lapses, which are among the most frequently used outcomes of the PVT,11,29 did not show any systematic changes with repeated administration. Administration interval, which varied between 15 minutes and 28 days, did not significantly modify the effects of repeated administration on PVT-B performance. All cognitive tests (including the PVT) require participant motivation. Although we did not explicitly measure motivation in our study, the overall high levels of performance (e.g., 0.98 lapses and 1.29 false starts on average) and the fact that maximally six out of 45 participants demonstrated a statistically significantly declining trend in PVT performance over time (Table 2) suggest that our participant population was highly motivated. Strengths and Limitations Strengths of the study include the relatively large sample size, the increased ecologic validity relative to laboratory studies with participants sleeping in their own home, and the fact that we systematically investigated the effects of administration interval on PVT-B performance across multiple outcome domains. 
Potential limitations include the following: We did not fix time of day across PVT-B administrations within participants. This potentially increased variability across test administrations, but was unlikely to influence practice effects (corroborated by results of model #5). Furthermore, aside from sleep inertia effects during the first hour or so after waking up, PVT performance is typically very stable across the first 16 hours of the wake period.20 We used a validated 3-minute version of the PVT instead of the standard 10-minute version. It is therefore unclear whether our findings generalize to the 10-minute or longer versions of the PVT that are more affected by time-on-task effects.1,4,30 Furthermore, we investigated a high performing astronaut surrogate population. It is unclear whether our findings translate to other populations with lower degrees of educational attainment or lower levels of motivation. However, due to its simplicity, the PVT is generally believed to be free of intellectual aptitude effects. In contrast to the standard PVT, participants were presented with a feedback score at the end of each test. It is unclear whether this influenced our findings. The standard PVT does, however, provide RT feedback after each response. Thus, participants typically have an idea how well they performed after test completion. Finally, PVT-B administration was preceded by nine other cognitive tests that may have influenced performance due to time-on-task or other contaminating effects. CONCLUSIONS This study found statistically significant systematic changes with repeated administration in several PVT-B outcomes that led to a decrease in variability of RTs and were consistent with a change in response strategy rather than practice effects. Overall, the observed absolute changes were small and are probably irrelevant in practice. Test administration interval did not modify the effects of repeated administration on PVT-B performance. 
Importantly, mean and median RT, response speed, and lapses, which are among the most frequently used outcomes of the PVT, did not show any systematic changes with repeated administration. In conclusion, our findings support the view that the PVT, in contrast to many other more complex cognitive tests (including those administered in this study), shows stable performance across repeated administrations and is not influenced by practice effects. Combined with its high sensitivity, this finding corroborates the status of the PVT as a de facto gold standard measure of the neurobehavioral effects induced by sleep loss and/or circadian misalignment.

FUNDING

This work was funded by the National Space Biomedical Research Institute (NSBRI) through grant NBPF00012 (NASA NCC 9–58) and by the National Aeronautics and Space Administration (NASA) through grant NNX14AH98G.

DISCLOSURE STATEMENT

None declared.

REFERENCES

1. Lim J, Dinges DF. Sleep deprivation and vigilant attention. Ann N Y Acad Sci. 2008; 1129: 305–322.
2. Dinges DF, Powell JW. Microcomputer analysis of performance on a portable, simple visual RT task during sustained operations. Behav Res Methods Instrum Comput. 1985; 6(17): 652–655.
3. Dinges DF, Pack F, Williams K, et al. Cumulative sleepiness, mood disturbance, and psychomotor vigilance performance decrements during a week of sleep restricted to 4-5 hours per night. Sleep. 1997; 20(4): 267–277.
4. Doran SM, Van Dongen HP, Dinges DF. Sustained attention performance during sleep deprivation: evidence of state instability. Arch Ital Biol. 2001; 139(3): 1–15.
5. Dorrian J, Rogers NL, Dinges DF, Kushida CA. Psychomotor vigilance performance: Neurocognitive assay sensitive to sleep loss. In: Kushida CA, ed. Sleep Deprivation: Clinical Issues, Pharmacology and Sleep Loss Effects. New York, NY: Marcel Dekker, Inc.; 2005: 39–70.
6. Dinges DF, Kribbs NB. Performing while sleepy: Effects of experimentally-induced sleepiness. In: Monk TH, ed. Sleep, Sleepiness and Performance. Chichester, United Kingdom: John Wiley and Sons, Ltd.; 1991: 97–128.
7. Warm JS, Parasuraman R, Matthews G. Vigilance requires hard mental work and is stressful. Hum Factors. 2008; 50(3): 433–441.
8. Dinges DF, Mallis M. Managing fatigue by drowsiness detection: can technological promises be realized? In: Hartley L, ed. Managing Fatigue in Transportation - Proceedings of the 3rd Fatigue in Transportation Conference. Fremantle, Western Australia: Pergamon; 1998: 209–29.
9. Van Dongen HP, Maislin G, Mullington JM, Dinges DF. The cumulative cost of additional wakefulness: dose-response effects on neurobehavioral functions and sleep physiology from chronic sleep restriction and total sleep deprivation. Sleep. 2003; 26(2): 117–126.
10. Gunzelmann G, Moore LR, Gluck KA, Van Dongen HP, Dinges DF. Fatigue in sustained attention: Generalizing mechanisms for time awake to time on task. In: Ackerman PL, ed. Cognitive fatigue: Multidisciplinary perspectives on current research and future applications. Washington, D.C.: American Psychological Association; 2010: 83–101.
11. Basner M, Dinges DF. Maximizing sensitivity of the psychomotor vigilance test (PVT) to sleep loss. Sleep. 2011; 34(5): 581–591.
12. Beglinger LJ, Gaydos B, Tangphao-Daniels O, et al. Practice effects and the use of alternate forms in serial neuropsychological testing. Arch Clin Neuropsychol. 2005; 20(4): 517–529.
13. Falleti MG, Maruff P, Collie A, Darby DG. Practice effects associated with the repeated assessment of cognitive function using the CogState battery at 10-minute, one week and one month test-retest intervals. J Clin Exp Neuropsychol. 2006; 28(7): 1095–1112.
14. Schlegel RE, Shehab RL, Gilliland K, Eddy DR, Schiflett SG. Microgravity effects on cognitive performance measures: practice schedules to acquire and maintain performance stability. Brooks Air Force Base, TX; 1995. Report No.: AL/CF-TR-1994-0040.
15. Van Dongen HP, Dinges DF. Sleep, circadian rhythms, and psychomotor vigilance. Clin Sports Med. 2005; 24(2): 237–49, vii.
16. Kribbs NB, Dinges DF. Vigilance decrement and sleepiness. In: Harsh J, Ogilvie RD, eds. Sleep onset mechanisms. Washington, D.C.: American Psychological Association; 1994: 113–125.
17. Paterson JL, Dorrian J, Ferguson SA, Jay SM, Dawson D. What happens to mood, performance and sleep in a laboratory study with no sleep deprivation? Sleep Biol Rhythms. 2013; 11(3): 200–209.
18. Belenky G, Wesensten NJ, Thorne DR, et al. Patterns of performance degradation and restoration during sleep restriction and subsequent recovery: a sleep dose-response study. J Sleep Res. 2003; 12(1): 1–12.
19. Johns MW. A new method for measuring daytime sleepiness: the Epworth sleepiness scale. Sleep. 1991; 14(6): 540–545.
20. Basner M, Mollicone D, Dinges DF. Validity and Sensitivity of a Brief Psychomotor Vigilance Test (PVT-B) to Total and Partial Sleep Deprivation. Acta Astronaut. 2011; 69(11-12): 949–959.
21. Basner M, Dinges DF, Shea JA, et al. Sleep and alertness in medical interns and residents: an observational study on the role of extended shifts. Sleep. 2017; 40(4): 1–8.
22. Volpp KG, Shea JA, Small DS, et al. Effect of a protected sleep period on hours slept during extended overnight in-hospital duty hours among medical interns: a randomized trial. JAMA. 2012; 308(21): 2208–2217.
23. Basner M, Savitt A, Moore TM, et al. Development and validation of the Cognition test battery. Aerosp Med Hum Perf. 2015; 86(11): 942–52.
24. Basner M, Dinges DF, Mollicone D, et al. Mars 520-d mission simulation reveals protracted crew hypokinesis and alterations of sleep duration and timing. Proc Natl Acad Sci USA. 2013; 110(7): 2635–2640.
25. Basner M, Dinges DF. Maximizing sensitivity of the psychomotor vigilance test (PVT) to sleep loss. Sleep. 2011; 34(5): 581–591.
26. Satterthwaite FE. An approximate distribution of estimates of variance components. Biometrics. 1946; 2(6): 110–114.
27. Bozdogan H. Model Selection and Akaike Information Criterion (AIC) - the general-theory and its analytical extensions. Psychometrika. 1987; 52(3): 345–70.
28. Curran-Everett D. Multiple comparisons: philosophies and illustrations. Am J Physiol Regul Integr Comp Physiol. 2000; 279(1): R1–R8.
29. Basner M, McGuire S, Goel N, Rao H, Dinges DF. A new likelihood ratio metric for the psychomotor vigilance test and its sensitivity to sleep loss. J Sleep Res. 2015; 24(6): 702–713.
30. Lim J, Wu WC, Wang J, Detre JA, Dinges DF, Rao H. Imaging brain fatigue from sustained mental workload: an ASL perfusion study of the time-on-task effect. Neuroimage. 2010; 49(4): 3426–3435.

© Sleep Research Society 2017. Published by Oxford University Press on behalf of the Sleep Research Society. All rights reserved. For permissions, please e-mail journals.permissions@oup.com.

Journal

SLEEP, Oxford University Press

Published: Jan 1, 2018
