Time for a quick word? The striking benefits of training speed and accuracy of word retrieval in post-stroke aphasia

Abstract
One-third of stroke survivors experience deficits in word retrieval as a core characteristic of their aphasia, which is frustrating, socially limiting and disabling for their professional and everyday lives. The, as yet, undiscovered ‘holy grail’ of clinical practice is to establish a treatment that not only improves item naming, but also generalizes to patients’ connected speech. Speech production in healthy participants is a remarkable feat of cognitive processing, being both rapid (at least 120 words per minute) and accurate (∼one error per 1000 words). Accordingly, we tested the hypothesis that word-finding treatment will only be successful and generalize to connected speech if word retrieval is both accurate and quick. This study compared a novel combined speed- and accuracy-focused intervention (‘repeated, increasingly-speeded production’) to standard accuracy-focused treatment. Both treatments were evaluated for naming and connected speech outcomes, and these outcomes were related to participants’ neuropsychological and lesion profiles. Twenty participants with post-stroke chronic aphasia of varying severity and subtype took part in 12 computer-based treatment sessions over 6 weeks. Four carefully matched word sets were randomly allocated to the speed- and accuracy-focused treatment, to the standard accuracy-only treatment, or to no treatment (two control sets). In the standard treatment, sound-based naming cues facilitated naming accuracy. The speed- and accuracy-focused treatment encouraged naming to become gradually quicker, aiming towards the naming time of age-matched controls. The novel treatment was significantly more effective in improving and maintaining picture naming accuracy and speed (reduced latencies). Generalization of treated vocabulary to connected speech was significantly increased for all items relative to the baseline.
The speed- and accuracy-focused treatment generated substantial and significantly greater deployment of targeted items in connected speech. These gains were maintained at 1-month post-intervention. There was a significant negative correlation for the speed- and accuracy-focused treatment between the patients’ phonological scores and the magnitude of the therapy effect, which may have reflected the fact that the substantial beneficial effect of the novel treatment generated a ceiling effect in the milder patients. Maintenance of the speed- and accuracy-focused treatment effect correlated positively with executive skills. The neural correlate analyses revealed that participants with the greatest damage to the posterior superior temporal gyrus, extending into the white matter of the inferior longitudinal fasciculus, showed the greatest benefit from the speed- and accuracy-focused treatment. The novel treatment was well tolerated by participants across the range of severity and aphasia subtype, indicating that this type of intervention has considerable clinical utility and broad applicability.
Keywords: aphasia, speed, naming, treatment, stroke
Introduction
Fluent speech requires rapid, errorless retrieval of vocabulary, which occurs at a rate of at least two words per second and less than one error per 1000 words (Levelt, 1989; Bird et al., 2000). Aphasia occurs in at least one-third of stroke survivors [The Stroke Association (UK), 2016]. Failures, errors or delays in word retrieval (anomia) are the most pervasive aphasic symptoms (Laine and Martin, 2006). Anomia treatment typically involves single-item picture naming. There is a strong clinical belief that there is a lack of generalization to connected speech for standard naming therapies (Nickels, 2002; Wisenburn and Mahoney, 2009), yet typically studies (i) have lacked a systematic method for assessing generalization; and (ii) have been underpowered.
Given that connected speech is highly demanding in terms of speed and accuracy, we hypothesized that retrained vocabulary will only generalize if it can be retrieved within the demanding time window required by connected speech (Crerar, 2004; Conroy et al., 2009). This hypothesis aligns with the broader observations that (i) naming speed is an important variable for both assessment and treatment tasks (McCall et al., 1997); and (ii) in mild aphasia, expressive vocabulary may be largely recovered except for delayed naming latencies (Crerar, 2004). To tackle this critical clinical need, we developed a novel treatment to increase picture naming speed and accuracy simultaneously (repeated, increasingly-speeded production, RISP). This intervention was directly compared to (i) a standard treatment that targeted accuracy alone; and (ii) no treatment. We hypothesized that (i) RISP would generate greater improvements in both naming speed and accuracy; and (ii) speedier naming would increase production of treated words in connected speech (evaluated through a newly developed, systematic method). Finally, we related the patients’ variable therapy outcomes to both their background neuropsychological profiles and the distributions of the underlying lesions.
Materials and methods
Participants
The participants were recruited from a post-stroke database within the Neuroscience and Aphasia Research Unit. The database consisted of 70 participants with chronic aphasia following cerebrovascular accident. All were recruited from aphasia support groups or speech therapy services in Greater Manchester and North-West England, UK. Participants covered the full range of aphasia severity and multiple subtypes. All were right-handed, native English speakers, who had sustained one left hemisphere stroke at least 1 year prior to recruitment, had normal or corrected-to-normal hearing and vision, had no co-existing neurological impairments and had no contraindications for MRI scanning.
Nineteen participants had no contraindications to MRI scanning (i.e. no pacemakers, metal implants, claustrophobia, etc.); however, one patient had a metal implant. This meant that neuroimaging data were collected from only 19 patients. Neuroimaging data from a healthy age- and education-matched control group (eight females, 11 males) were used to determine abnormal voxels using the automated lesion identification procedure (Seghier et al., 2008). All participants gave written informed consent with approval from the local ethics committee. From the full database, 20 participants (11 males, nine females; mean age 65.2 years, standard deviation = 11.7) took part in the study. A prerequisite for participation was minimal repetition skill (>40% on an immediate word repetition test; Kay et al., 1996). Participants with global aphasia, severe perceptual problems or very severe naming difficulties [below 8% or 5/60 on the Boston Naming Test (BNT); Goodglass et al., 2000] were excluded from the study. All other levels and types were included so that the newly developed therapy could be trialled across a full range of patients. Demographic details of the participants are given in Supplementary Table 1 together with baseline picture naming accuracy and speed, and production of the same vocabulary items in connected speech (with participants ordered according to their BNT naming accuracy).
Background assessments
Before taking part in this study, participants also completed extensive linguistic and cognitive assessment. The results are summarized in Supplementary Tables 2 and 3. The background assessment battery included the following specific tests. The BNT (Goodglass et al., 2000) was used to assess word-finding difficulties. Four repetition tasks were used (from Kay et al., 1996): (i) word repetition immediate; (ii) word repetition delayed; (iii) non-word repetition immediate; and (iv) non-word repetition delayed.
Two other phonological tasks included word and non-word minimal pairs (Kay et al., 1996). Participants also completed six tests of comprehension and semantic memory: (i) spoken sentence comprehension from the Comprehensive Aphasia Test (Swinburn et al., 2004); (ii) the Synonym Judgement Test (Jefferies et al., 2009); and, from the Cambridge Semantic Battery (Bozeat et al., 2000): (iii) picture naming; (iv) spoken word-to-picture matching; (v) written word-to-picture matching; and (vi) the picture version of the Camel and Cactus Test of semantic association knowledge. To test short-term memory skills, the forward and backward memory span assessments were administered (Wechsler, 1945). Two executive tests were completed: (i) Brixton Spatial Rule Anticipation Test (Burgess and Shallice, 1997); and (ii) Raven’s Coloured Progressive Matrices (Raven, 1962). Speech production deficits were assessed by coding responses to the ‘Cookie theft’ picture in the Boston Diagnostic Aphasia Examination; the coded measures were tokens (TOK), mean length of utterance (MLU), type/token ratio (TTR) and words per minute (WPM). All scores were converted into percentages; if no maximum was available we used the maximum score across the participants. Following previous studies, we used principal component analysis (PCA; SPSS v.22) to express the underlying dimensions of performance variation (Butler et al., 2014; Halai et al., 2017). A PCA with varimax rotation was calculated for these behavioural measures for our full n = 70 chronic aphasia patient dataset. We performed the PCA on the full available dataset in order to: (i) maximize coverage of the multidimensional space; and (ii) achieve robust weighted averages for the scores of the patients on the extracted PCA components. Four principal components with an eigenvalue > 1 were extracted; these corresponded to phonological, semantic, executive and speech quanta dimensions [see Halai et al.
(2017) for the details of these principal components and their lesion correlates]. Patients’ component scores on the four extracted components were reconstructed using regression for the entire dataset (n = 70). To explore the relationship between therapy outcome and background language cognitive skills, the component scores for the subset of 20 patients included in the therapy (one did not have a factor score as we did not have full background neuropsychological data) were correlated with their therapy outcomes (1 week versus baseline and 1 month versus baseline). We note here that it is preferable to compute the PCA and resultant component scores on the full patient dataset as this (i) ensures that the PCA is as robust as possible; and (ii) places the scores for the therapy subgroup in relation to the full range of aphasia severity (as shown in Supplementary Fig. 1).
Acquisition of neuroimaging data
High resolution structural T1-weighted MRI scans were acquired on a 3.0 T Philips Achieva scanner using an 8-element SENSE head coil. A T1-weighted inversion recovery sequence with 3D acquisition was used, with the following parameters: repetition time = 9.0 ms, echo time = 3.93 ms, flip angle = 8°, 150 contiguous slices, slice thickness = 1 mm, acquired voxel size 1.0 × 1.0 × 1.0 mm³, matrix size 256 × 256, field of view = 256 mm, inversion time = 1150 ms, SENSE acceleration factor 2.5, total scan acquisition time = 575 s.
Analysis of neuroimaging data
Structural MRI scans were preprocessed with Statistical Parametric Mapping software (SPM8: http://www.fil.ion.ucl.ac.uk/spm/). The images were normalized into standard Montreal Neurological Institute (MNI) space using a modified unified segmentation-normalization procedure optimized for focal lesioned brains (Seghier et al., 2008).
Data from all participants with stroke aphasia and healthy controls were entered into the segmentation-normalization that combines segmentation, bias correction and spatial normalization through the inversion of a single unified model (Ashburner and Friston, 2005). Each patient’s lesion was identified using an outlier detection algorithm based on fuzzy clustering. The default parameters were used, except we modified the U-threshold from 0.3 to 0.5 after comparing sample results to those from an expert neurologist. The images were individually, visually inspected with respect to the original scan, and were used to create the lesion overlay map in Fig. 1. We note that although referred to as an automated ‘lesion’ segmentation method, the technique detects areas of unexpected tissue class, thus identifying not only missing grey and white matter but also augmented CSF space. We then smoothed the T1-weighted images (8 mm full-width at half-maximum Gaussian kernel) and created separate models in which we correlated the magnitude of the RISP effect with the signal intensity of each voxel in the whole brain using a voxel-based correlational methodology (Tyler et al., 2005), a variant of voxel-based lesion-symptom mapping (Bates et al., 2003). An additional covariate was added to each model to account for lesion volume. Overlays were thresholded at P < 0.005 voxel height and cluster corrected at familywise error of P < 0.05, while including additional covariates of age, years of education, months post-onset and lesion volume. All anatomical labels were based on the Harvard-Oxford atlas in MNI space.
Figure 1. Lesion overlap map based on structural imaging from all participants with stroke aphasia. The colour bar represents frequency of voxel lesion between 1 and 15.
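To make the logic of the voxel-wise analysis concrete, the following is a minimal sketch of correlating a behavioural score with signal intensity at each voxel while partialling out lesion volume. It is an illustration under stated assumptions, not the SPM8 pipeline used in the study: the function names, the voxel coordinates and the toy data are hypothetical, only the lesion-volume covariate is modelled, and no smoothing, voxel-height thresholding or cluster correction is attempted.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_r(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    rxy, rxz, ryz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (rxy - rxz * ryz) / sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def correlate_voxels(intensities_by_voxel, therapy_gain, lesion_volume):
    """Map each voxel to the partial correlation between its intensity
    (one value per participant) and the therapy gain, controlling for
    lesion volume."""
    return {
        voxel: partial_r(values, therapy_gain, lesion_volume)
        for voxel, values in intensities_by_voxel.items()
    }


# Hypothetical toy data for five participants (not values from the study).
therapy_gain = [1.0, 2.0, 3.0, 4.0, 5.0]
lesion_volume = [2.0, 1.0, 4.0, 3.0, 5.0]
intensities_by_voxel = {
    (10, 20, 30): [5.0, 4.0, 3.0, 2.0, 1.0],  # lower intensity, larger gain
    (11, 20, 30): [1.0, 3.0, 2.0, 5.0, 4.0],
}

r_map = correlate_voxels(intensities_by_voxel, therapy_gain, lesion_volume)
```

In this toy example a negative coefficient at a voxel means that lower signal intensity there (i.e. more damage) goes with a larger therapy gain, which is the direction of effect described for the RISP analysis.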
Therapy methods
Stimuli
One reason for the dearth of information with regard to generalization from naming therapy to connected speech is the lack of a systematic assessment method (Maendl, 1998). To measure word retrieval in both picture naming and connected speech, four detailed multi-event pictures were selected (from the ‘Where’s Wally/Waldo?’ publications). These contained detailed depictions of hundreds of items and events (e.g. animals, objects and events at a busy zoo or fairground) from which a small minority of target items were selected. To assess and treat confrontational naming for these targets, we selected new pictures of the same exemplars (presenting each exemplar singly and without any background). The 120 target stimuli were all nouns, selected to meet the following criteria: (i) named spontaneously in control participants’ scene descriptions by more than 3/10 participants (participants and patients were asked to describe freely and were not directed to any areas or items within the scene); (ii) targets could be depicted singly in a new picture with 100% name agreement (thus picturable nouns such as ‘bench’ rather than ‘water’ or actions were selected); and (iii) items did not have alternative names (e.g. ‘dodgems’ and ‘bumper cars’). From these 120 nouns, four matched sets of 20 items were selected; two sets were allocated to the treatment conditions (described below). The remaining two sets served as untreated control items (thus controlling for any non-specific effects, including the small boost in performance that can result from repeated assessment) (Nickels, 2002). One treatment set and its paired untreated set related to two of the four composite pictures. The other treatment set and its paired control related to the remaining two composite pictures. This allowed us to separate the effects of each therapy by avoiding target vocabulary for the two treatments appearing in the same composite picture.
The allocation of picture sets to the two treatments was counterbalanced across participants. The word sets were matched (Van Casteren and Davis, 2007) for (i) the likelihood of retrieval in the spontaneous picture descriptions by the control participants; (ii) frequency from the British National Corpus (BNC Consortium, 2007); and (iii) phoneme length.
Baseline and post-therapy assessment
Baseline performance for the four composite picture descriptions and the 80-item confrontational naming test was assessed twice before therapy commenced (across four separate assessment sessions with composite description assessed before the confrontational naming). There was no significant change in performance across the two baseline assessments (confirming a stable baseline) and thus we compared the post-therapy results to the first assessment. Post-therapy performance was assessed at 1 week and again at 1 month to establish the longer-term benefits of the therapy (no maintenance or practice regimes were used post-therapy). An additional, fifth composite picture description was assessed before and after therapy. No vocabulary from this fifth picture was included in the therapy as treated or untreated items. The fifth picture acted as a control for the target composite pictures to control for non-specific improvements that might arise simply from repeated assessment. For the picture naming assessment, participants were presented with all 80 items in random order. Each picture was presented simultaneously with an auditory beep and remained on the screen for a maximum of 10 s (using E-Prime software; Schneider et al., 2002). Audacity software was used to measure naming latencies by calculating the time elapsed from the beep to the onset of the participant’s correct response. When no correct name was produced, the reaction time for that trial was treated as missing data.
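The latency rule just described (time from the beep to the onset of a correct response, with incorrect or absent responses treated as missing) can be sketched as follows. This is a minimal illustration only: in the study the onsets were measured by hand in Audacity, and the function names and trial values below are hypothetical.

```python
from statistics import mean
from typing import List, Optional


def naming_latency(beep_onset_s: float,
                   response_onset_s: Optional[float],
                   correct: bool) -> Optional[float]:
    """Seconds from the beep to the onset of a correct response.

    Returns None (missing data) when no correct name was produced.
    """
    if not correct or response_onset_s is None:
        return None
    return response_onset_s - beep_onset_s


def mean_latency(latencies: List[Optional[float]]) -> Optional[float]:
    """Mean latency over trials with a correct response; None if there are none."""
    valid = [x for x in latencies if x is not None]
    return mean(valid) if valid else None


# Hypothetical trials: (beep onset, response onset, correct?) in seconds.
trials = [
    (0.50, 2.10, True),
    (0.50, 3.75, False),   # incorrect name: latency is missing
    (0.50, None, False),   # no response: latency is missing
    (0.50, 1.40, True),
]
latencies = [naming_latency(b, r, c) for b, r, c in trials]
average = mean_latency(latencies)
```

Excluding missing trials rather than assigning them a ceiling value means that the speed analyses describe only correctly named items, which is consistent with how the naming speed results are reported later in the paper.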
To elicit connected speech samples, participants were informed that they were going to see four ‘busy’ pictures, one at a time on a computer screen. They were asked to describe what they saw in each picture in as much detail as they could for about 5–10 min. Participants’ responses were digitally audio-recorded. The order of presentation was randomized across participants, thus counterbalancing any effect of relative difficulty.
Treatments
The treatments were delivered in two phases (Fig. 2), each containing two sessions per week for 3 weeks (six treatment sessions per phase). In the first phase, only standard therapy was administered for all items (n = 40) in order to boost naming accuracy before introducing any speed requirement. In the second phase, standard (accuracy-only) treatment was continued for one set, whilst the other was treated with RISP (see below). In both phases, stimuli in each set were randomized and the order of sets was counterbalanced across sessions. See Table 1 for treatment protocols for standard production and RISP. Treatment sessions lasted between 30 and 50 min, depending on participants’ need for comfort breaks.

Table 1. Treatment protocols

| | Standard production | RISP | RISP: beep intervals across treatment sessions |
| --- | --- | --- | --- |
| Step 1 | Practice items, then participants to name each picture presented on a computer screen, within 10 s with no cues. | Participants use tempo-naming for practice items. | Session 1 = 3 s; Session 2 = 2.5 s |
| Step 2 | Feedback was provided both verbally and target shown in writing on the screen. | Participants to name picture presented on computer screen, before the beep at the end of the stimulus presentation. | Session 3 = 2 s; Session 4 = 1.6 s |
| Step 3 | If naming incorrect, minimal verbal cue provided: the initial consonant and vowel of the target word, e.g. ‘wi’ for ‘window’. | Stimulus presentation duration/time-to-the-beep reduced at start of each treatment session for whole of treatment session. | Session 5 = 1.3 s; Session 6 = 1 s |
| Step 4 | If naming still incorrect, part word verbal cue provided, e.g. ‘wind’ for ‘window’. | After beep, target word provided in spoken and written form as feedback. | |
| Step 5 | If naming still incorrect, whole word provided for repetition. | If naming incorrect, participant asked to repeat the correct name after the computer/experimenter three times. | |
| Dosage | Treatment set worked through three times per session. | Treatment set worked through three times per session. | |

Figure 2. Overview of sequence of assessments and treatment phases.

Standard production
This was a standard, increasing cues, naming therapy, which aimed to improve participants’ picture naming accuracy only. Participants were asked to name each picture, presented on a computer screen, within 10 s without support, i.e. with no cues. After each naming attempt, feedback was provided both verbally by the experimenter and presented in writing on the screen. Initially, minimal cues were provided (the initial consonant and vowel of the target word, e.g. ‘wi’ for ‘window’) but the cues were increased if naming was not achieved (e.g. ‘wind’ for ‘window’, and then the whole word ‘window’). Participants worked through all therapy items three times per session. There were no auditory cues presented in this standard therapy to indicate any type of time pressure.
Repeated, increasingly-speeded production treatment
This treatment was a hybrid intervention that combined cued naming with the deadline naming method used in experimental psycholinguistics (Vitkovitch and Humphreys, 1991; Hodgson and Lambon Ralph, 2008).
Participants were instructed that the computer would present the picture for a limited time and their task was to try to name the picture before the beep at the end of the stimulus presentation. In each therapy session, the presentation duration/time-to-the-beep was reduced (see below). During each trial, the target picture was presented on the computer screen. At the end of the allotted time, the picture disappeared and a beep sound was produced by the computer. A blank screen was displayed for 1000 ms. Participants were then presented with the written target word on the screen and the correct spoken name of the picture was played by the computer. Following an incorrect response, participants were asked to repeat the correct name three times. Participants cycled through all therapy items three times per session. This matched the number of item exposures between RISP and standard production within each treatment session. The naming deadline was shortened systematically across the six RISP sessions. The initial picture exposure time was set to the mean of all patients’ baseline picture naming speed (3 s). This ensured that each participant’s first ‘speeded’ naming attempt would feel reasonably natural. The ultimate target deadline in the sixth RISP session was 1 s, which matched the mean naming speed of elderly neurotypical participants (mean naming time: 1002 ms). The target naming speed was reduced in a systematic way: session 1 = 3 s, session 2 = 2.5 s, session 3 = 2 s, session 4 = 1.6 s, session 5 = 1.3 s, and session 6 = 1 s. The same target naming speed was used for the three cycles within each session and was only reduced at the start of the next session. It was not necessary for participants to actually ‘beat the beep’; rather, the attempt to do so was expected to, and did, reduce naming latencies over the course of the treatment.
Scoring
Participants’ performance was scored based on their first response for all picture naming.
Self-corrections were considered correct if the correct name was produced immediately after the first response.
Analysis of the main therapy data
For the three sets of target data (picture naming accuracy, picture naming speed, and word retrieval accuracy in the composite pictures), we carried out the same set of hierarchically structured analyses. First, we conducted a global ANOVA with picture set (the treated and untreated items) and time (pre- versus immediately post-therapy versus 1 month follow-up) as main factors, which allowed us to determine whether there were changes in performance before and after intervention, and whether these varied for treated and untreated sets. We then unpicked the nature of the significant interactions with planned ANOVAs and t-tests: before and after intervention, each treated set was compared to its matched untreated set, and the two treated sets were compared to each other. Our a priori expectations were that performance on the therapy sets would be significantly improved after therapy and better than that observed for the untreated items. Analyses were run in SPSS v22.0.
Results
Picture naming accuracy after the first treatment phase
In the first phase, the standard therapy (standard production) was administered for all items (n = 40). Naming accuracy at the end of this phase is reported in Table 2. Set A progressed to be treated with standard production in the second phase of treatment, and Set B with RISP. The mean accuracy for Set A was 78.0% and for Set B 81.25% (a non-significant difference: two-tailed t = −0.43, P = 0.66). Thus the main study comparison ANOVAs carried out at the end of the second treatment phase were not biased by the (equivalent) performance on the sets after the initial treatment phase.
Table 2. Participant performance on naming accuracy as percentage for treated items at the mid-treatment point

| Patient | KS | JBo | MD | AD | AB | JW | JSc | JS | KAd | DF | GP | PR | EB | JSo | RL | CH | JBr | DCS | DM | JM | Mean, % |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Set A | 20 | 25 | 65 | 65 | 85 | 60 | 65 | 100 | 80 | 90 | 85 | 85 | 100 | 95 | 100 | 80 | 100 | 80 | 85 | 95 | 78.00 |
| Set B | 25 | 30 | 70 | 40 | 85 | 75 | 60 | 100 | 90 | 90 | 100 | 85 | 100 | 95 | 100 | 100 | 100 | 95 | 85 | 100 | 81.25 |
| A − B | −5 | −5 | −5 | 25 | 0 | −15 | 5 | 0 | −10 | 0 | −15 | 0 | 0 | 0 | 0 | −20 | 0 | −15 | 0 | −5 | −3.25 |

Data correspond to the end of the first phase, before the second phase shown in Fig. 2. Set A = standard production items in second phase; Set B = RISP items in second phase. The A − B entries for JW and for the mean are given as implied by the Set A and Set B rows (60 − 75 = −15; 78.00 − 81.25 = −3.25).
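As a check on the set means and the mid-treatment comparison, the per-participant percentages in Table 2 can be entered directly. The sketch below uses only the Python standard library. The source does not state which form of t-test was run; a pooled-variance, independent-samples t statistic is assumed here because it lands close to the reported value, and a paired test on the same data would give a different statistic.

```python
from math import sqrt
from statistics import mean, variance

# Percent correct per participant, in the order listed in Table 2.
set_a = [20, 25, 65, 65, 85, 60, 65, 100, 80, 90,
         85, 85, 100, 95, 100, 80, 100, 80, 85, 95]
set_b = [25, 30, 70, 40, 85, 75, 60, 100, 90, 90,
         100, 85, 100, 95, 100, 100, 100, 95, 85, 100]


def independent_t(x, y):
    """Student's t for two independent samples using the pooled variance.

    Assumption: this form is chosen for illustration; the paper does not
    say which t-test was used.
    """
    nx, ny = len(x), len(y)
    pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    standard_error = sqrt(pooled * (1 / nx + 1 / ny))
    return (mean(x) - mean(y)) / standard_error


mean_a = mean(set_a)                      # Set A mean, percent correct
mean_b = mean(set_b)                      # Set B mean, percent correct
t_stat = independent_t(set_a, set_b)      # negative: Set A below Set B
```

Under this assumption the statistic comes out at roughly −0.44, in line with the reported t = −0.43 once rounding is allowed for.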
Picture naming accuracy after the second treatment phase
A global 3 × 4 ANOVA was conducted with the factors of time and treatment. These analyses were concerned with the cumulative effects of standard production alone (phases 1 + 2, i.e. Set A) versus standard production followed by RISP (phases 1 + 2, i.e. Set B). The three time points were: baseline (pre-first phase of treatment), 1 week post-second phase of treatment, and 1 month post-second phase of treatment. The four treatment conditions were: standard production, RISP, untreated standard production, and untreated RISP. This 3 × 4 ANOVA indicated that there was a main effect of time [F(2,38) = 55.6, P < 0.0005], a main effect of treatments [F(3,57) = 35.7, P < 0.0005], and a significant interaction between time and treatments [F(6,114) = 18.0, P < 0.0005; Fig. 3A], indicating very different effects of therapy on the treated and untreated items.
Figure 3. Main results. (A) Naming accuracy (pre- versus post-treatment). (B) Naming speed (pre- versus post-treatment). (C) Use of target vocabulary in connected speech (pre- versus post-treatment).
We explored the nature of this interaction through three follow-up ANOVAs. First, we compared each treatment condition to its matched control set across the three time points (through two 2 × 3 ANOVAs in which the first factor compared each treatment type to its own control, i.e. standard production versus untreated standard production; RISP versus untreated RISP). These ANOVAs showed that both therapies generated significantly improved accuracy scores relative to their control sets (significant interaction: P < 0.0005 for both therapies). For RISP, a significant interaction between ‘Time Point’ and ‘Treatment’ was found [F(2,38) = 34.643, P < 0.0005, partial η2 = 0.65]. For standard production, a similarly robust significant interaction between Time Point and Treatment was evident [F(2,38) = 14.935, P < 0.0005, partial η2 = 0.44]. Direct comparison of the two treatments, through another 2 × 3 ANOVA (standard production versus RISP; over the three time points), indicated that the interaction between time and treatment did not reach significance [F(2,38) = 2.3, P = 0.117]. Planned t-tests showed that both therapies significantly increased picture naming accuracy between the baseline and post-treatment assessments (P < 0.0005), and that the RISP therapy effect was significantly greater than standard production not only at the 1 week post-treatment assessment (P < 0.0005), but also at the follow-up (1 month) assessment (P = 0.001).
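The partial η2 values quoted above follow directly from the reported F ratios and degrees of freedom, since partial η2 = (F × df_effect) / (F × df_effect + df_error). A short sketch makes the arithmetic explicit; the function name is ours and is used only for illustration.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Time Point x Treatment interactions reported for naming accuracy.
risp_effect = partial_eta_squared(34.643, 2, 38)
standard_effect = partial_eta_squared(14.935, 2, 38)
```

Rounded to two decimal places, these reproduce the reported values of 0.65 for RISP and 0.44 for standard production.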
Picture naming speed after the second treatment phase

Exactly the same set of planned ANOVAs and t-tests was used to examine the naming speed for correctly named items (the overall results are shown in Fig. 3B). In the global 3 (time point) × 4 (picture sets) ANOVA, there was a main effect of Time Point [F(2,36) = 21.1, P < 0.0005], no main effect of the Treatment factor [F(3,54) = 1.7, P = 0.174], but a significant interaction between Time Point and Treatment [F(6,108) = 5.7, P < 0.0005], indicating significantly different changes in naming speed for the treated versus untreated sets. The follow-up 2 × 3 ANOVAs confirmed that the effect of each therapy was significantly different from its control [Time Point × Set interactions were significant: RISP F(2,36) = 8.6, P = 0.001; standard production F(2,36) = 3.9, P = 0.03]. A 2 × 3 ANOVA comparing the two treated sets indicated that there was a significant interaction between Time Point and Treatment [F(2,36) = 3.2, P = 0.05].

Whilst both treatments significantly reduced picture naming latencies between the baseline and both post-treatment assessments (1 week and 1 month), the pairwise t-tests showed a non-significant trend for the RISP treatment to reduce reaction times more than standard production from baseline to the immediate assessment at Week 1 (P = 0.101). Most strikingly, RISP was significantly more effective in maintaining the treatment effect, in terms of quicker naming responses, at the 1 month follow-up assessment (P = 0.001).

In comparing the two untreated conditions, only the main effect of the Time Point factor was significant [F(2,36) = 3.23, P = 0.05], reflecting a small reduction in naming latencies across repeated assessments (presumably reflecting repetition priming). The main effect of Set was not significant [F(1,18) < 1], nor was the interaction between Time Point and Set [F(2,36) = 1.3, P = 0.28].
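Because naming speed was analysed for correctly named items only, latencies from failed naming attempts have to be excluded before averaging. The Python sketch below illustrates that filtering step with invented trial records: the participant labels, items and latencies are all hypothetical, and the actual preprocessing pipeline is not described in this section.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical trial records: (participant, item, named_correctly, latency_ms)
trials = [
    ("P1", "kettle", True, 1450),
    ("P1", "ladder", False, 3900),
    ("P1", "anchor", True, 1250),
    ("P2", "kettle", True, 2100),
    ("P2", "ladder", True, 1900),
    ("P2", "anchor", False, 4200),
]


def mean_correct_latency(records):
    """Mean naming latency per participant, using correct trials only."""
    by_participant = defaultdict(list)
    for participant, _item, correct, latency_ms in records:
        if correct:  # incorrect responses contribute no latency
            by_participant[participant].append(latency_ms)
    return {p: mean(values) for p, values in by_participant.items()}


print(mean_correct_latency(trials))
```

One consequence of this design choice is that a participant with no correct responses in a condition has no latency estimate for it and does not appear in the result.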
Generalization to connected speech: word retrieval in composite picture descriptions

Again, exactly the same set of analyses was conducted on the target word retrieval data in the composite picture descriptions. The global 3 × 4 ANOVA indicated that there was a significant effect of the Time Point factor [F(2,38) = 87.8, P < 0.0005], a main effect of the Treatment factor [F(3,57) = 43.7, P < 0.0005] and a highly significant interaction between Time Point and Treatment [F(6,114) = 19.9, P < 0.0005] (Fig. 3C), indicating very different production of the target versus untreated vocabulary in the patients’ narratives before and after therapy. Directly comparing the two treatments (standard production versus RISP), a 2 × 3 ANOVA indicated that there was a highly significant interaction between Time Point and Treatment [F(2,38) = 19.6, P < 0.0005]. The t-tests showed that the RISP effect on connected speech production was significantly stronger than standard production both at the 1 week and 1 month post-treatment assessments (both P < 0.0005).

Comparing each treatment to its control set, separately, we found significant Time Point × Set interactions for the RISP and standard production sets [F(2,38) = 19.6, P < 0.0005; F(2,38) = 5.2, P = 0.01, respectively]. Thus, although there is a general clinical belief that standard therapy does not induce generalization to connected speech, our newly-developed assessment was able to demonstrate that this is incorrect: there is, in fact, a small but significant generalization to connected speech for standard production both at 1 week and 1 month (though the effect was significantly smaller than for the RISP therapy, see above). Finally, the two untreated conditions were compared.
The main effect of Time Point was significant [F(2,36) = 3.2, P = 0.05], indicating a small improvement in target vocabulary production simply through repeated assessment, but neither the main effect of Set [F(1,18) < 1] nor the interaction between Time Point and Set was significant [F(2,36) = 1.3, P = 0.28].

Content analysis of the connected speech samples

As well as exploring the generalization of trained vocabulary to the connected speech samples, it is also important to investigate the connected speech samples more generally. It is possible, for example, that improved vocabulary promotes connected speech more generally or that the improvement on the trained items comes at the cost of reduced performance on the untrained vocabulary. We examined the connected speech samples in terms of the total number of nouns produced (tokens), the number of unique nouns produced (types), nouns per minute, the type/token ratio (number of unique words divided by the total words), average word frequency and average imageability for the treated and untreated pictures.

The overall secondary effects on the patients’ connected speech samples were entirely positive. Specifically, for the treated pictures, the speech samples including all items showed that significantly more unique items were produced after therapy compared to baseline [mean at 1 week = 103.6, mean at baseline = 85.4; t(18) = −2.30, P = 0.03]. There was also a significant decrease in the average word frequency of the nouns used [mean at 1 week = 1.40, mean at baseline = 1.55; t(18) = 4.21, P < 0.001]. There were no significant changes in nouns per minute, type/token ratio, or average imageability rating. Importantly, there were no significant effects in analyses of the untreated fifth picture, indicating that the improved connected speech samples did not reflect a non-specific effect of repeated assessment. This first analysis included all items, including the target therapy items.
Accordingly, we repeated the analysis with these items removed from consideration. In this second analysis, the increase in unique items from baseline to post-therapy was no longer significant [mean at 1 week = 84.8, mean at baseline = 77.8; t(18) = −0.95, P = 0.3]. The reduction in mean word frequency, however, was still significant [mean at 1 week = 1.48, mean at baseline = 1.58; t(18) = 2.86, P < 0.01].

Correlations with individuals’ background neuropsychological profile

Although there were significant and reliable therapy effects at the group level, the effect varied across individual patients. We correlated the patients’ background neuropsychological profile, expressed as four principal neuropsychological components [phonological, semantic, executive, and speech quanta (fluency); see Table 3 for component loadings], with the magnitude of the therapy effect [1 week versus baseline performance, and 1 month (maintenance) versus baseline performance] in order to reveal which aspects of the patients’ profile were related to the therapy outcome. The PCA identified four components: phonological skill (50.9% variance), semantic ability (11.28% variance), executive ability (8.18% variance) and speech quanta (6.42% variance). In general, the repetition, naming and digit span tests loaded on the phonological component, whereas the picture matching, Camel and Cactus and synonym judgement tests loaded on the semantic component. Raven’s Coloured Progressive Matrices, the Brixton Spatial Rule Anticipation Test and minimal pairs, all of which are demanding tests, loaded on the executive component. Finally, the measures of the amount of speech output loaded on the fourth component, speech quanta. Overall, no correlations were found between any of the components and the outcome on the standard therapy.
For the RISP therapy, however, a significant negative correlation was found between the patients’ phonological component score and the magnitude of their therapy effect, at both 1 week (r = −0.55, P < 0.01) and 1 month (r = −0.61, P < 0.005). This demonstrates that patients with the poorest phonological abilities showed the largest RISP benefit. As can be seen across the case series (Fig. 4), this negative correlation seems to reflect the fact that the RISP therapy was particularly beneficial, leading to a ceiling effect for many of the milder patients (note that if Patient JS, who had poor phonological abilities but a large therapy effect, is removed, the correlation is still significant).

Table 3 Factor loadings from the PCA on the entire dataset (n = 70)

                                        Phonology (50.9%)  Semantics (11.28%)  Executive-demand (8.18%)  Speech quanta (6.42%)
Delayed Repetition – Word               0.888              0.221               0.183                     0.193
Delayed Repetition – Non-word           0.883              0.028               0.236                     0.148
Immediate Repetition – Non-word         0.881              0.061               0.231                     0.143
Immediate Repetition – Word             0.858              0.211               0.125                     0.170
Boston Naming Test                      0.823              0.382               0.077                     0.121
Cambridge Naming Test                   0.813              0.432               0.154                     0.117
Forward Digit Span                      0.746              0.232               0.188                     0.073
Backward Digit Span                     0.595              0.207               0.234                     0.359
Spoken sentence comprehension – CAT     0.521              0.455               0.441                     0.163
Spoken Word-Picture Matching            0.236              0.801               0.267                     0.145
Type/Token Ratio                        0.362              0.719               −0.075                    −0.091
Written Word-Picture Matching           0.182              0.712               0.504                     0.155
Camel and Cactus (pictures)             0.092              0.688               0.484                     0.288
96 Synonym Judgement                    0.381              0.658               0.315                     0.359
Minimal Pairs – Non-word                0.353              0.057               0.815                     −0.013
Raven Coloured Progressive Matrices     0.048              0.274               0.735                     0.155
Minimal Pairs – Word                    0.419              0.167               0.706                     0.132
Brixton Spatial Rule Anticipation Test  0.132              0.179               0.697                     0.231
Tokens                                  0.011              0.034               0.207                     0.885
Mean Length of Utterances               0.314              0.252               0.137                     0.831
Words Per Minute                        0.314              0.096               0.080                     0.768

Each component is subjectively labelled and shows the percentage variance explained. Loadings > 0.5 are marked in bold.

Figure 4 Case series naming accuracy. (A) Pre- versus post-RISP treatment; (B) pre- versus post-standard production treatment.

It was also possible to determine how each component correlated with the maintenance of the therapy effect (i.e. 1 month versus 1 week performance). In this analysis, the maintenance of the RISP effect was found to correlate positively with performance on the executive tasks (r = 0.53, P < 0.01). Thus, the patients with better executive abilities exhibited the best therapy maintenance. No other correlations were significant.

Neural correlates of RISP

To determine the neural correlates of the RISP effect, we correlated each patient’s therapy effect (1 week versus baseline performance, and 1 month versus baseline performance) with their T1-weighted MRI using voxel-based correlational methodology (Tyler et al., 2005). This analysis revealed that patients with the greatest damage to the posterior superior temporal gyrus, extending into the white matter of the inferior longitudinal fasciculus, showed the greatest RISP benefit both at 1 week and 1 month (height threshold P < 0.001, cluster corrected using FWE P < 0.05).
This region is known to play an important role in phonological performance. As illustrated in Fig. 5, the RISP effect overlaps closely with the lesion correlate of the patients’ phonological skill factor found previously by Halai et al. (2017), and thus aligns with the behavioural correlation between phonological ability and therapy effect noted above. It appears, therefore, that the RISP effect may relate particularly to the patients’ phonological abilities. Finally, no voxels were found to correlate significantly with the RISP maintenance effect (1 month versus 1 week performance).

Figure 5 Neural correlates of RISP. RISP Week 1 versus baseline and phonological ability; RISP Month 1 versus baseline and phonological ability. Voxel-based correlational methodology correlations for the size of the therapy effect (green overlay) and degree of phonological impairment (phonological factor score; purple overlay). The overlap is marked in grey. The top row shows the results for the therapy outcome assessment 1 week post-therapy; the bottom row shows the results for the 1-month post-therapy data. Overlays were thresholded at P < 0.005 voxel height and cluster corrected at familywise error of P < 0.05, while including additional covariates of age, years of education, months post-onset and lesion volume.

Discussion

Anomia is an immensely frustrating and disabling feature of aphasia, which is a common disorder post-stroke (around one-third of cases) and in other neurological conditions. Accordingly, it is important to establish effective interventions for remediating word-finding skills and generalizing these improvements to patients’ connected speech. Given the observation that fluent speech requires both quick and accurate word retrieval, we investigated and confirmed the novel hypothesis that a behavioural treatment focusing on both speed and accuracy, rather than accuracy alone (as is the case in standard methods), would generate greater improvements in both confrontation naming and generalization of this improved vocabulary to connected speech. A second key, novel feature of this study was that the interventions were not examined in isolation: we also investigated the neuropsychological and lesion correlates of treatment responsiveness. Although such analyses are a rarity in the literature to date (Abel et al., 2015), increasing our understanding of both the neuropsychological and lesion correlates of variable therapy success will be a critical step towards future neuroscience-led stratification of patients and choice of clinical pathways.

To address these questions, we developed a novel naming treatment that focused on both speed and accuracy (RISP), which we compared to a standard accuracy-only treatment (standard production). As expected, both treatments increased picture naming accuracy (assessed 1 week following the end of the intervention), which was largely retained at the 1 month follow-up assessment even without maintenance practice.
RISP was, however, significantly more effective than standard production in promoting increased accuracy, particularly at the important long-term follow-up assessment. The same pattern was found in naming speed: as intended, RISP was much more effective in speeding successful name retrieval and maintaining these improvements at follow-up assessment. Perhaps most importantly, we found that RISP generalized from naming individual target items into the patients’ connected speech, a ‘holy grail’ for speech and language therapy.

With regard to neuropsychological and neural correlates of therapy effects, we found a significant negative correlation for the RISP therapy between the patients’ degree of phonological impairment and the magnitude of their therapy effect, both immediately after therapy and at follow-up assessment. This initially somewhat counter-intuitive finding probably reflects the fact that RISP appears to be an especially beneficial treatment, such that milder patients show a resultant ceiling effect in their speech production assessment whereas the more severe patients can exhibit a much more dramatic improvement on the target items. This finding may also be consistent with the observation from Best and colleagues’ (2013) meta-analysis that better treatment responsiveness was evident in participants classified as having relatively fewer semantic difficulties and greater phonological output deficits (note that our use of PCA to extract the pattern of underlying language-cognitive deficits means that, over and above phonology per se, the potential additional influences of semantic skills, speech fluency and cognitive-executive factors were already partialled out: see Butler et al., 2014; Halai et al., 2017).
This behavioural correlate for the RISP therapy was also mirrored directly in the lesion correlate analysis: the RISP benefit was most evident in participants with the greatest damage to the posterior superior temporal gyrus extending into the white matter of the inferior longitudinal fasciculus. This region has been implicated in auditory-phonological processing not only through neuropsychological studies (Baldo et al., 2012; Robson et al., 2012, 2013) but also in functional MRI explorations of healthy function (Warren and Griffiths, 2003; Hickok and Poeppel, 2004; Rauschecker and Scott, 2009). Finally, with regard to the long-term maintenance of the RISP treatment, follow-up performance correlated positively with cognitive-executive skills. Specifically, strong performances on neuropsychological assessments like the Brixton Spatial Rule Anticipation Test (Vordenberg et al., 2014) predict good longer-term responsiveness to anomia treatment in general, and RISP in particular. This may reflect the enhanced demands that RISP placed on participants in terms of cognitive flexibility, planning, problem-solving and speed of processing, consistent with the suggestion that both patients’ degree of language impairment and remaining executive skill may be critical in recovery of function and therapy outcome (Lambon Ralph et al., 2010; Sharp et al., 2010; Brownsett et al., 2014; Geranmayeh et al., 2014).

Two possible hypotheses can be made about the mechanisms underlying the speeded treatment effect, both of which can be tested in future investigations. The first, language-specific hypothesis is related to the aim of the RISP treatment to target both accuracy and speed. For optimally easy and efficient word retrieval, the language system requires precise representations that allow the target meaning to be converted to phonological and motor-speech representations (Lupker et al., 1997).
Computational models of speech production and reading have repeatedly shown that as these representations and mappings are refined through learning, performance of models becomes both more accurate and more efficient (Plaut et al., 1996; Ellis and Lambon Ralph, 2000). Accordingly, because the RISP treatment deliberately aims beyond accuracy to improve speed as well, the language representations and mappings may have been pressured not only to reform but also to be ‘sharpened up’ to become more precise. This also supports previous findings which indicated that naming speed is a significant, yet often overlooked, factor not only in assessment but also in treatment tasks (McCall et al., 1997). Indeed, this hypothesis might also explain why, aside from speed, RISP led to significantly better naming accuracy than the accuracy-only focused standard production (given that both speed and accuracy reflect the precision of the underlying language representations).

Another possible hypothesis accounting for the RISP effect is related to a domain-general, cognitive-executive mechanism (Lambon Ralph et al., 2010; Geranmayeh et al., 2014). Not only was the degree of treatment maintenance related to the patients’ cognitive-executive skills, but all participants (irrespective of severity) reported RISP to be especially engaging and motivating. Thus, RISP may be much better than standard production in engaging patients’ executive and attentional skills, in addition to the speech production system, resulting in improved learning and retention. From a neurobiological perspective, increased motivation and reward-seeking behaviour have been strongly associated with dopamine release (Fiorillo, 2013; Morita et al., 2013; Sharp et al., 2016) and dopamine has been associated with improved learning and therapy effects (Berthier and Pulvermüller, 2011; Gill and Leff, 2012).
This observation speaks to the wider potential of ‘gamification’, that is, using the dynamic and engaging aspects of commercial gaming software to ramp up the engagement required for rehabilitation tasks (Ferreira et al., 2014).

Although based on a limited number of items in each condition, the current results suggest that there might be clinically notable differences between the two therapy approaches, particularly at longer-term follow-up. These indications from the current experimental exploration will need to be confirmed in larger-scale studies, including formal clinical trials.

Acknowledgements

We would like to thank the participants with aphasia who took part in this study.

Funding

This research was supported by grants from the Rosetrees Foundation, the Medical Research Council (MRC) (MR/J004146/1) and European Research Council (ERC) (GAP: 670428 - BRAIN2MIND_NEUROCOMP).

Supplementary material

Supplementary material is available at Brain online.

Abbreviations

PCA = principal component analysis; RISP = repeated, increasingly-speeded production

References

Abel S, Weiller C, Huber W, Willmes K, Specht K. Therapy-induced brain reorganization patterns in aphasia. Brain 2015; 138: 1097–12.
Ashburner J, Friston KJ. Unified segmentation. Neuroimage 2005; 26: 839–51.
Baldo JV, Katseff S, Dronkers NF. Brain regions underlying repetition and auditory-verbal short-term memory deficits in aphasia: evidence from voxel-based lesion symptom mapping. Aphasiology 2012; 26: 338–54.
Bates E, Wilson SM, Saygin AP, Dick F, Sereno MI, Knight RT, Dronkers NF. Voxel-based lesion-symptom mapping. Nat Neurosci 2003; 6: 448–50.
Berthier ML, Pulvermüller F. Neuroscience insights improve neurorehabilitation of poststroke aphasia. Nat Rev Neurol 2011; 7: 86–97.
Best W, Greenwood A, Grassly J, Herbert R, Hickin J, Howard D. Aphasia rehabilitation: does generalisation from anomia therapy occur and is it predictable? A case series study. Cortex 2013; 49: 2345–57.
Bird H, Lambon Ralph MA, Patterson K, Hodges JR. The rise and fall of frequency and imageability: noun and verb production in semantic dementia. Brain Lang 2000; 73: 17–49.
BNC Consortium. The British National Corpus, version 3 (BNC XML Edition). Oxford, England: Oxford University Computing Services; 2007.
Bozeat S, Lambon Ralph MA, Patterson K, Garrard P, Hodges JR. Non-verbal semantic impairment in semantic dementia. Neuropsychologia 2000; 38: 1207–15.
Brownsett SLE, Warren JE, Geranmayeh F, Woodhead Z, Leech R, Wise RJS. Cognitive control and its impact on recovery from aphasic stroke. Brain 2014; 137: 242–54.
Burgess P, Shallice T. Hayling and Brixton tests. Oxford, England: Pearson; 1997.
Butler RA, Lambon Ralph MA, Woollams AM. Capturing multidimensionality in stroke aphasia: mapping principal behavioural components to neural structures. Brain J Neurol 2014; 137: 3248–66.
Conroy P, Sage K, Lambon Ralph MA. The effects of decreasing and increasing cue therapy on improving naming speed and accuracy for verbs and nouns in aphasia. Aphasiology 2009; 23: 707–30.
Crerar MA. Aphasia rehabilitation and the strange neglect of speed. Neuropsychol Rehabil 2004; 14: 173–206.
Ellis A, Lambon Ralph MA. Age of acquisition effects in adult lexical processing reflect loss of plasticity in maturing systems: insights from connectionist networks. J Exp Psychol Learn Mem Cogn 2000; 26: 1103–23.
Ferreira C, Guimarães V, Santos A, Sousa I. Gamification of stroke rehabilitation exercises using a smartphone. In: Proceedings of the 8th International Conference on Pervasive Computing Technologies for Healthcare, PervasiveHealth ’14. Brussels, Belgium: ICST (Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering); 2014. p. 282–5.
Fiorillo CD. Two dimensions of value: dopamine neurons represent reward but not aversiveness. Science 2013; 341: 546–9.
Geranmayeh F, Brownsett SLE, Wise RJS. Task-induced brain activity in aphasic stroke patients: what is driving recovery? Brain 2014; 137: 2632–48.
Gill SK, Leff AP. Dopaminergic therapy in aphasia. Aphasiology 2012; 28: 155–70.
Goodglass H, Kaplan E, Barresi B. Boston diagnostic aphasia examination (BDAE-3). 3rd edn. Philadelphia, USA: Pearsons; 2000.
Halai AD, Woollams AM, Lambon Ralph MA. Using principal component analysis to capture individual differences within a unified neuropsychological model of chronic post-stroke aphasia: revealing the unique neural correlates of speech fluency, phonology and semantics. Cortex 2017; 86: 275–89.
Hickok G, Poeppel D. Dorsal and ventral streams: a framework for understanding aspects of the functional anatomy of language. Cognition 2004; 92: 67–99.
Hodgson C, Lambon Ralph M. Mimicking aphasic semantic errors in normal speech production: evidence from a novel experimental paradigm. Brain Lang 2008; 104: 89–101.
Jefferies E, Patterson K, Jones RW, Lambon Ralph MA. Comprehension of concrete and abstract words in semantic dementia. Neuropsychology 2009; 23: 492–9.
Kay J, Lesser R, Coltheart M. Psycholinguistic assessments of language processing in aphasia (PALPA): an introduction. Aphasiology 1996; 10: 159–80.
Laine M, Martin N. Anomia: theoretical and clinical aspects. Hove, England: Psychology Press; 2006.
Lambon Ralph MA, Snell C, Fillingham JK, Conroy P, Sage K. Predicting the outcome of anomia therapy for people with aphasia post CVA: both language and cognitive status are key predictors. Neuropsychol Rehabil 2010; 20: 289–305.
Levelt W. Speaking: from intention to articulation. Cambridge, MA: MIT Press; 1989.
Lupker SJ, Brown P, Colombo L. Strategic control in a naming task: changing routes or changing deadlines? J Exp Psychol Learn Mem Cogn 1997; 23: 570–90.
Maendl L. Word finding difficulties in aphasia. York, UK: University of York; 1998.
McCall D, Cox DM, Shelton JR, Weinrich M. The influence of syntactic and semantic information on picture-naming performance in aphasic patients. Aphasiology 1997; 11: 581–600.
Morita K, Morishima M, Sakai K, Kawaguchi Y. Dopaminergic control of motivation and reinforcement learning: a closed-circuit account for reward-oriented behavior. J Neurosci 2013; 33: 8866–90.
Nickels L. Therapy for naming disorders: revisiting, revising, and reviewing. Aphasiology 2002; 16: 935–79.
Plaut DC, McClelland JL, Seidenberg MS, Patterson K. Understanding normal and impaired word reading: computational principles in quasi-regular domains. Psychol Rev 1996; 103: 56–115.
Rauschecker JP, Scott SK. Maps and streams in the auditory cortex: nonhuman primates illuminate human speech processing. Nat Neurosci 2009; 12: 718–24.
Raven J. Raven’s coloured progressive matrices (CPM). Oxford, England: Pearson; 1962.
Robson H, Grube M, Lambon Ralph MA, Griffiths TD, Sage K. Fundamental deficits of auditory perception in Wernicke’s aphasia. Cortex 2013; 49: 1808–22.
Robson H, Sage K, Ralph MA. Wernicke’s aphasia reflects a combination of acoustic-phonological and semantic control deficits: a case-series comparison of Wernicke’s aphasia, semantic dementia and semantic aphasia. Neuropsychologia 2012; 50: 266–75.
Schneider W, Eschman A, Zuccolotto A. E-prime computer software and manual. Pittsburgh, PA: Psychology Software Tools Inc.; 2002.
Seghier ML, Ramlackhansingh A, Crinion J, Leff AP, Price CJ. Lesion identification using unified segmentation-normalisation models and fuzzy clustering. Neuroimage 2008; 41: 1253–66.
Sharp DJ, Turkheimer FE, Bose SK, Scott SK, Wise RJS. Increased frontoparietal integration after stroke and cognitive recovery. Ann Neurol 2010; 68: 753–6.
Sharp ME, Foerde K, Daw ND, Shohamy D. Dopamine selectively remediates “model-based” reward learning: a computational approach. Brain 2016; 139: 355–64.
Swinburn K, Porter G, Howard D. Comprehensive aphasia test. Hove, England: Psychology Press; 2004.
The Stroke Association (UK). State of the nation stroke statistics January 2016. London, UK: The Stroke Association (UK); 2016.
Tyler LK, Marslen-Wilson W, Stamatakis EA. Dissociating neuro-cognitive component processes: voxel-based correlational methodology. Neuropsychologia 2005; 43: 771–78.
Van Casteren M, Davis MH. Match: a program to assist in matching the conditions of factorial experiments. Behav Res Methods 2007; 39: 973–8.
Vitkovitch M, Humphreys GW. Perseverant responding in speeded naming of pictures: it’s in the links. J Exp Psychol Learn Mem Cogn 1991; 17: 664–80.
Vordenberg JA, Barrett JJ, Doninger NA, Contardo CP, Ozoude KA. Application of the Brixton spatial anticipation test in stroke: ecological validity and performance characteristics. Clin Neuropsychol 2014; 28: 300–16.
Warren JD, Griffiths TD. Distinct mechanisms for processing spatial sequences and pitch sequences in the human auditory brain. J Neurosci 2003; 23: 5799–804.
Wechsler D. A standardized memory scale for clinical use. J Psychol 1945; 19: 87–95.
Wisenburn B, Mahoney K. A meta-analysis of word-finding treatments for aphasia. Aphasiology 2009; 23: 1338–52.

© The Author(s) (2018). Published by Oxford University Press on behalf of the Guarantors of Brain. All rights reserved. For permissions, please email: journals.permissions@oup.com. This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/about_us/legal/notices).

Time for a quick word? The striking benefits of training speed and accuracy of word retrieval in post-stroke aphasia

Publisher
Oxford University Press
Copyright
© The Author(s) (2018). Published by Oxford University Press on behalf of the Guarantors of Brain. All rights reserved. For permissions, please email: journals.permissions@oup.com
ISSN
0006-8950
eISSN
1460-2156
D.O.I.
10.1093/brain/awy087

Abstract

One-third of stroke survivors experience deficits in word retrieval as a core characteristic of their aphasia, which is frustrating, socially limiting and disabling for their professional and everyday lives. The, as yet, undiscovered ‘holy grail’ of clinical practice is to establish a treatment that not only improves item naming, but also generalizes to patients’ connected speech. Speech production in healthy participants is a remarkable feat of cognitive processing being both rapid (at least 120 words per minute) and accurate (∼one error per 1000 words). Accordingly, we tested the hypothesis that word-finding treatment will only be successful and generalize to connected speech if word retrieval is both accurate and quick. This study compared a novel combined speed- and accuracy-focused intervention (‘repeated, increasingly-speeded production’) to standard accuracy-focused treatment. Both treatments were evaluated for naming, connected speech outcomes, and related to participants’ neuropsychological and lesion profiles. Twenty participants with post-stroke chronic aphasia of varying severity and subtype took part in 12 computer-based treatment sessions over 6 weeks. Four carefully matched word sets were randomly allocated either to the speed- and accuracy-focused treatment, standard accuracy-only treatment, or untreated (two control sets). In the standard treatment, sound-based naming cues facilitated naming accuracy. The speed- and accuracy-focused treatment encouraged naming to become gradually quicker, aiming towards the naming time of age-matched controls. The novel treatment was significantly more effective in improving and maintaining picture naming accuracy and speed (reduced latencies). Generalization of treated vocabulary to connected speech was significantly increased for all items relative to the baseline. The speed- and accuracy-focused treatment generated substantial and significantly greater deployment of targeted items in connected speech. These gains were maintained at 1-month post-intervention. There was a significant negative correlation for the speed- and accuracy-focused treatment between the patients’ phonological scores and the magnitude of the therapy effect, which may have reflected the fact that the substantial beneficial effect of the novel treatment generated a ceiling effect in the milder patients. Maintenance of the speed- and accuracy-treatment effect correlated positively with executive skills. The neural correlate analyses revealed that participants with the greatest damage to the posterior superior temporal gyrus extending into the white matter of the inferior longitudinal fasciculus, showed the greatest speed- and accuracy treatment benefit. The novel treatment was well tolerated by participants across the range of severity and aphasia subtype, indicating that this type of intervention has considerable clinical utility and broad applicability.

Keywords: aphasia, speed, naming, treatment, stroke

Introduction

Fluent speech requires rapid, errorless retrieval of vocabulary, which occurs at a rate of at least two words per second and less than one error per 1000 words (Levelt, 1989; Bird et al., 2000). Aphasia occurs in at least one-third of stroke survivors [The Stroke Association (UK), 2016]. Failures, errors or delays in word retrieval (anomia) are the most pervasive aphasic symptoms (Laine and Martin, 2006). Anomia treatment typically involves single-item picture naming. There is a strong clinical belief that there is a lack of generalization to connected speech for standard naming therapies (Nickels, 2002; Wisenburn and Mahoney, 2009), yet typically studies (i) have lacked a systematic method for assessing generalization; and (ii) have been underpowered.
Given that connected speech is highly demanding in terms of speed and accuracy, we hypothesized that retrained vocabulary will only generalize if it can be retrieved within the demanding time window required by connected speech (Crerar, 2004; Conroy et al., 2009). This hypothesis aligns with the broader observations that (i) naming speed is an important variable for both assessment and treatment tasks (McCall et al., 1997); and (ii) in mild aphasia, expressive vocabulary may be largely recovered except for delayed naming latencies (Crerar, 2004). To tackle this critical clinical need, we developed a novel treatment to increase picture naming speed (i.e. reduce naming latencies) and accuracy simultaneously (repeated, increasingly-speeded production, RISP). This intervention was directly compared to (i) a standard treatment that targeted accuracy alone; and (ii) no treatment. We hypothesized that (i) RISP would generate greater improvements in both naming speed and accuracy; and (ii) speedier naming would increase production of treated words in connected speech (evaluated through a newly developed, systematic method). Finally, we related the patients’ variable therapy outcomes to both their background neuropsychological profiles and the distributions of the underlying lesions.

Materials and methods

Participants

The participants were recruited from a post-stroke database within the Neuroscience and Aphasia Research Unit. The database consisted of 70 participants with chronic aphasia following cerebrovascular accident. All were recruited from aphasia support groups or speech therapy services in Greater Manchester and North-West England, UK. Participants covered the full range of aphasia severity and multiple subtypes. All were right-handed, native English speakers who had sustained one left hemisphere stroke at least 1 year prior to recruitment, had normal or corrected-to-normal hearing and vision, and had no co-existing neurological impairments. Nineteen participants had no contraindications to MRI scanning (i.e. no pacemakers, metal implants, claustrophobia, etc.); one patient had a metal implant, so neuroimaging data were collected from 19 patients only. Neuroimaging data from a healthy age- and education-matched control group (eight females, 11 males) were used to determine abnormal voxels using the automated lesion identification procedure (Seghier et al., 2008). All participants gave written informed consent with approval from the local ethics committee.

From the full database, 20 participants (11 males, nine females; mean age 65.2 years, standard deviation = 11.7) took part in the study. A prerequisite for participation was minimal repetition skill (>40% on an immediate word repetition test; Kay et al., 1996). Participants with global aphasia, severe perceptual problems or very severe naming difficulties [below 8% or 5/60 on the Boston Naming Test (BNT); Goodglass et al., 2000] were excluded from the study. All other levels and types were included so that the newly developed therapy could be trialled across a full range of patients. Demographic details of the participants are given in Supplementary Table 1 together with baseline picture naming accuracy and speed, and production of the same vocabulary items in connected speech (with participants ordered according to their BNT naming accuracy).

Background assessments

Before taking part in this study, participants also completed extensive linguistic and cognitive assessment. The results are summarized in Supplementary Tables 2 and 3. The background assessment battery included the following specific tests. The BNT (Goodglass et al., 2000) was used to assess word-finding difficulties. Four repetition tasks were used (from Kay et al., 1996): (i) word repetition immediate; (ii) word repetition delayed; (iii) non-word repetition immediate; and (iv) non-word repetition delayed.
Two other phonological tasks included word and non-word minimal pairs (Kay et al., 1996). Participants also completed six tests of comprehension and semantic memory: (i) spoken sentence comprehension from the Comprehensive Aphasia Test (Swinburn et al., 2004); (ii) Synonym Judgement Test (Jefferies et al., 2009); and, from the Cambridge Semantic Battery (Bozeat et al., 2000): (iii) picture naming; (iv) spoken word-to-picture matching; (v) written word-to-picture matching; and (vi) the picture version of the Camel and Cactus Test of semantic association knowledge. To test short-term memory skills, the forward and backward memory span assessments were administered (Wechsler, 1945). Two executive tests were completed: (i) Brixton Spatial Rule Anticipation Test (Burgess and Shallice, 1997); and (ii) Raven’s Coloured Progressive Matrices (Raven, 1962). Speech production deficits were assessed by coding responses to the ‘Cookie theft’ picture in the Boston Diagnostic Aphasia Examination; the coded measures were tokens (TOK), mean length of utterance (MLU), type/token ratio (TTR) and words per minute (WPM). All scores were converted into percentages; if no maximum was available we used the maximum score across the participants.

Following previous studies, we used principal component analysis (PCA; SPSS v.22) to express the underlying dimensions of performance variation (Butler et al., 2014; Halai et al., 2017). A PCA with varimax rotation was calculated for these behavioural measures for our full n = 70 chronic aphasia patient dataset. We performed the PCA on the full available dataset in order to: (i) maximize coverage of the multidimensional space; and (ii) achieve robust weighted averages for the scores of the patients on the extracted PCA components. Four principal components with an eigenvalue > 1 were extracted; these corresponded to phonological, semantic, executive and speech quanta dimensions [see Halai et al. (2017) for the details of these principal components and their lesion correlates]. Patients’ component scores on the four extracted components were reconstructed using regression for the entire dataset (n = 70). To explore the relationship between therapy outcome and background language and cognitive skills, the component scores for the subset of 20 patients included in the therapy (one did not have a factor score as we did not have full background neuropsychological data) were correlated with their therapy outcomes (1 week versus baseline and 1 month versus baseline). We note here that it is preferable to compute the PCA and resultant component scores on the full patient dataset as this (i) ensures that the PCA is as robust as possible; and (ii) places the scores for the therapy subgroup in relation to the full range of aphasia severity (as shown in Supplementary Fig. 1).

Acquisition of neuroimaging data

High resolution structural T1-weighted MRI scans were acquired on a 3.0 T Philips Achieva scanner using an 8-element SENSE head coil. A T1-weighted inversion recovery sequence with 3D acquisition was used, with the following parameters: repetition time = 9.0 ms, echo time = 3.93 ms, flip angle = 8°, 150 contiguous slices, slice thickness = 1 mm, acquired voxel size 1.0 × 1.0 × 1.0 mm³, matrix size 256 × 256, field of view = 256 mm, inversion time = 1150 ms, SENSE acceleration factor 2.5, total scan acquisition time = 575 s.

Analysis of neuroimaging data

Structural MRI scans were preprocessed with Statistical Parametric Mapping software (SPM8: http://www.fil.ion.ucl.ac.uk/spm/). The images were normalized into standard Montreal Neurological Institute (MNI) space using a modified unified segmentation-normalization procedure optimized for focal lesioned brains (Seghier et al., 2008).
Data from all participants with stroke aphasia and healthy controls were entered into the segmentation-normalization procedure, which combines segmentation, bias correction and spatial normalization through the inversion of a single unified model (Ashburner and Friston, 2005). Each patient’s lesion was identified using an outlier detection algorithm based on fuzzy clustering. The default parameters were used, except that we modified the U-threshold from 0.3 to 0.5 after comparing sample results to those from an expert neurologist. The images were individually, visually inspected with respect to the original scan, and were used to create the lesion overlay map in Fig. 1. We note that although referred to as an automated ‘lesion’ segmentation method, the technique detects areas of unexpected tissue class, thus identifying missing grey and white matter as well as augmented CSF space. We then smoothed the T1-weighted images (8 mm full-width at half-maximum Gaussian kernel) and created separate models in which we correlated the magnitude of the RISP effect with the signal intensity of each voxel across the whole brain, using a voxel-based correlational methodology (Tyler et al., 2005), a variant of voxel-based lesion-symptom mapping (Bates et al., 2003). An additional covariate was added to each model to account for lesion volume. Overlays were thresholded at P < 0.005 voxel height and cluster corrected at familywise error of P < 0.05, while including additional covariates of age, years of education, months post-onset and lesion volume. All anatomical labels were based on the Harvard-Oxford atlas in MNI space.

Figure 1. Lesion overlap map based on structural imaging from all participants with stroke aphasia. The colour bar represents frequency of voxel lesion between 1 and 15.
Therapy methods

Stimuli

One reason for the dearth of information with regard to generalization from naming therapy to connected speech is the lack of a systematic assessment method (Maendl, 1998). To measure word retrieval in both picture naming and connected speech, four detailed multi-event pictures were selected (from the ‘Where’s Wally/Waldo?’ publications). These contained detailed depictions of hundreds of items and events (e.g. animals, objects and events at a busy zoo or fairground) from which a small minority of target items were selected. To assess and treat confrontational naming for these targets, we selected new pictures of the same exemplars (presenting each exemplar singly and without any background). The 120 target stimuli were all nouns, selected to meet the following criteria: (i) named spontaneously in control participants’ scene descriptions by more than 3/10 participants (participants and patients were asked to describe freely and were not directed to any areas or items within the scene); (ii) targets could be depicted singly in a new picture with 100% name agreement (thus picturable nouns such as ‘bench’ rather than ‘water’ or actions were selected); and (iii) items did not have alternative names (e.g. ‘dodgems’ and ‘bumper cars’).

From these 120 nouns, four matched sets of 20 items were selected; two sets were allocated to the treatment conditions (described below). The remaining two sets served as untreated control items (thus controlling for any non-specific effects, including the small boost in performance that can result from repeated assessment) (Nickels, 2002). One treatment set and its paired untreated set related to two of the four composite pictures. The other treatment set and its paired control related to the remaining two composite pictures. This allowed us to separate the effects of each therapy by avoiding target vocabulary for the two treatments appearing in the same composite picture.
The allocation of picture sets to the two treatments was counterbalanced across participants. The word sets were matched (Van Casteren and Davis, 2007) for (i) the likelihood of retrieval in the spontaneous picture descriptions by the control participants; (ii) frequency from the British National Corpus (BNC Consortium, 2007); and (iii) phoneme length.

Baseline and post-therapy assessment

Baseline performance for the four composite picture descriptions and the 80-item confrontational naming test was assessed twice before therapy commenced (across four separate assessment sessions, with composite description assessed before the confrontational naming). There was no significant change in performance across the two baseline assessments (confirming a stable baseline) and thus we compared the post-therapy results to the first assessment. Post-therapy performance was assessed at 1 week and again at 1 month to establish the longer-term benefits of the therapy (no maintenance or practice regimes were used post-therapy). An additional, fifth composite picture description was assessed before and after therapy. No vocabulary from this fifth picture was included in the therapy as treated or untreated items. The fifth picture acted as a control for the target composite pictures to control for non-specific improvements that might arise simply from repeated assessment.

For the picture naming assessment, participants were presented with all 80 items in random order. Each picture was presented simultaneously with an auditory beep and remained on the screen for a maximum of 10 s (using E-Prime software; Schneider et al., 2002). Audacity software was used to measure naming latencies by calculating the time elapsed from the beep to the onset of the participant’s correct response. When no correct name was produced, the reaction time for that trial was treated as missing data.
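The latency rule described above (time from the beep to the onset of a correct response, with incorrect trials treated as missing) can be illustrated with a minimal sketch. The function names and the trial values are hypothetical and chosen only for illustration; the study itself measured onsets from audio recordings in Audacity, not with this code.

```python
from statistics import mean
from typing import List, Optional


def naming_latency(beep_s: float, onset_s: float, correct: bool) -> Optional[float]:
    """Latency in seconds from the beep to response onset.

    Returns None (missing data) when no correct name was produced.
    """
    if not correct:
        return None
    return onset_s - beep_s


def mean_latency(latencies: List[Optional[float]]) -> Optional[float]:
    """Mean over observed latencies only; None if every trial is missing."""
    observed = [x for x in latencies if x is not None]
    return mean(observed) if observed else None


# Hypothetical trials: (beep time, response onset, response correct?)
trials = [(0.50, 1.75, True), (0.50, 4.00, False), (0.25, 2.00, True)]
latencies = [naming_latency(b, o, c) for b, o, c in trials]
print(latencies, mean_latency(latencies))
```

Keeping incorrect trials as missing values, instead of assigning them the 10 s maximum, means that the mean latency reflects only successful retrievals, which is how the text describes the measure.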
To elicit connected speech samples, participants were informed that they were going to see four ‘busy’ pictures, one at a time on a computer screen. They were asked to describe what they saw in each picture in as much detail as they could for about 5–10 min. Participants’ responses were digitally audio-recorded. The order of presentation was randomized across participants, thus counterbalancing any effect of relative difficulty.

Treatments

The treatments were delivered in two phases (Fig. 2), each containing two sessions per week for 3 weeks (six treatment sessions per phase). In the first phase, only standard therapy was administered for all items (n = 40) in order to boost naming accuracy before introducing any speed requirement. In the second phase, standard (accuracy-only) treatment was continued for one set, whilst the other was treated with RISP (see below). In both phases, stimuli in each set were randomized and the order of sets was counterbalanced across sessions. See Table 1 for treatment protocols for standard production and RISP. Treatment sessions lasted between 30 and 50 min, depending on each participant’s need for comfort breaks.

Table 1 Treatment protocols

| Step | Standard production | RISP | RISP: beep intervals across treatment sessions |
| --- | --- | --- | --- |
| Step 1 | Practice items, then participants to name each picture presented on a computer screen, within 10 s with no cues. | Participants use tempo-naming for practice items. | Session 1 = 3 s; Session 2 = 2.5 s |
| Step 2 | Feedback was provided both verbally and target shown in writing on the screen. | Participants to name picture presented on computer screen, before the beep at the end of the stimulus presentation. | Session 3 = 2 s; Session 4 = 1.6 s |
| Step 3 | If naming incorrect, minimal verbal cue provided: the initial consonant and vowel of the target word, e.g. ‘wi’ for ‘window’. | Stimulus presentation duration/time-to-the-beep reduced at start of each treatment session for whole of treatment session. | Session 5 = 1.3 s; Session 6 = 1 s |
| Step 4 | If naming still incorrect, part word verbal cue provided, e.g. ‘wind’ for ‘window’. | After beep, target word provided in spoken and written form as feedback. | |
| Step 5 | If naming still incorrect, whole word provided for repetition. | If naming incorrect, participant asked to repeat the correct name after the computer/experimenter three times. | |
| Dosage | Treatment set worked through three times per session. | Treatment set worked through three times per session. | |

Figure 2. Overview of sequence of assessments and treatment phases.

Standard production

This was a standard, increasing cues, naming therapy, which aimed to improve participants’ picture naming accuracy only. Participants were asked to name each picture, presented on a computer screen, within 10 s without support, i.e. with no cues. After each naming attempt, feedback was provided both verbally by the experimenter and presented in writing on the screen. Initially, minimal cues were provided (the initial consonant and vowel of the target word, e.g. ‘wi’ for ‘window’) but the cues were increased if naming was not achieved (e.g. ‘wind’ for ‘window’, and then the whole word ‘window’). Participants worked through all therapy items three times per session. There were no auditory cues presented in this standard therapy to indicate any type of time pressure.

Repeated, increasingly-speeded production treatment

This treatment was a hybrid intervention that combined cued naming with the deadline naming method used in experimental psycholinguistics (Vitkovitch and Humphreys, 1991; Hodgson and Lambon Ralph, 2008).
Participants were instructed that the computer would present the picture for a limited time and that their task was to try to name the picture before the beep at the end of the stimulus presentation. In each therapy session, the presentation duration/time-to-the-beep was reduced (see below). During each trial, the target picture was presented on the computer screen. At the end of the allotted time, the picture disappeared and a beep sound was produced by the computer. A blank screen was displayed for 1000 ms. Participants were then presented with the written target word on the screen and the correct spoken name of the picture was played by the computer. Following an incorrect response, participants were asked to repeat the correct name three times. Participants cycled through all therapy items three times per session. This matched the number of item exposures between RISP and standard production within each treatment session.

The naming deadline was shortened systematically across the six RISP sessions. The initial picture exposure time was set to the mean of all patients’ baseline picture naming speed (3 s). This ensured that each participant’s first ‘speeded’ naming attempt would feel reasonably natural. The ultimate target deadline in the sixth RISP session was 1 s, which matched the mean naming speed of elderly neurotypical participants (mean naming time: 1002 ms). The target naming speed was reduced in a systematic way: session 1 = 3 s, session 2 = 2.5 s, session 3 = 2 s, session 4 = 1.6 s, session 5 = 1.3 s, and session 6 = 1 s. The same target naming speed was used for the three cycles within each session and was only reduced at the start of the next session. It was not necessary for participants actually to ‘beat the beep’; rather, the attempt to do so was expected to, and did, reduce naming latencies over the course of the treatment.

Scoring

Participants’ performance was scored based on their first response for all picture naming.
Self-corrections were considered correct if the correct name was produced immediately after the first response.

Analysis of the main therapy data

For the three sets of target data (picture naming accuracy, picture naming speed, and word retrieval accuracy in the composite pictures), we carried out the same set of hierarchically structured analyses. First, we conducted a global ANOVA with picture set (the treated and untreated items) and time (pre- versus immediately post-therapy versus 1 month follow-up) as main factors, which allowed us to determine whether there were changes in performance before and after intervention, and whether these varied for treated and untreated sets. We then unpicked the nature of the significant interactions with planned ANOVA and t-tests: before and after intervention, each treated set was compared to its matched untreated set, and the two treated sets were compared to each other. Our a priori expectations were that performance on the therapy sets would be significantly improved after therapy and better than that observed for the untreated items. Analyses were run in SPSS v22.0.

Results

Picture naming accuracy after the first treatment phase

In the first phase, the standard therapy (standard production) was administered for all items (n = 40). Naming accuracy at the end of this phase is reported in Table 2. Set A progressed to be treated with standard production in the second phase of treatment, and Set B with RISP. The mean accuracy was 78.0% for Set A and 81.25% for Set B (a non-significant difference: two-tailed t = −0.43, P = 0.66). Thus the main study comparison ANOVAs carried out at the end of the second treatment phase were not biased by the (equivalent) performance on the sets after the initial treatment phase.
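The mid-treatment comparison of Set A and Set B can be checked against the per-participant percentages listed in Table 2. The sketch below is an illustration under an assumption: the text does not state which t-test variant was run in SPSS, so an independent-samples statistic with pooled variance is used here. On the Table 2 values it gives a statistic of about −0.44, in line with the reported t = −0.43; a paired test on the same values would give a noticeably different statistic.

```python
from math import sqrt
from statistics import mean, variance

# Naming accuracy (%) per participant at the mid-treatment point, from Table 2.
set_a = [20, 25, 65, 65, 85, 60, 65, 100, 80, 90,
         85, 85, 100, 95, 100, 80, 100, 80, 85, 95]
set_b = [25, 30, 70, 40, 85, 75, 60, 100, 90, 90,
         100, 85, 100, 95, 100, 100, 100, 95, 85, 100]


def independent_t(x, y):
    """Two-sample t statistic with pooled variance (equal group sizes).

    Assumption: the source does not name the test variant; this is one
    plausible choice, not necessarily the authors' exact procedure.
    """
    n = len(x)
    if n != len(y):
        raise ValueError("this sketch assumes equal group sizes")
    pooled_var = (variance(x) + variance(y)) / 2
    return (mean(x) - mean(y)) / sqrt(pooled_var * 2 / n)


mean_a, mean_b = mean(set_a), mean(set_b)
t_stat = independent_t(set_a, set_b)
print(f"Set A = {mean_a:.2f}%, Set B = {mean_b:.2f}%, t = {t_stat:.2f}")
```

The group means reproduce the 78.00% and 81.25% reported for the two sets, so the mean difference (Set A minus Set B) is −3.25 percentage points.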
Table 2 Participant performance on naming accuracy as percentage for treated items at the mid-treatment point

Patient    Set A    Set B    A − B
KS         20       25       −5
JBo        25       30       −5
MD         65       70       −5
AD         65       40       25
AB         85       85       0
JW         60       75       −15
JSc        65       60       5
JS         100      100      0
KAd        80       90       −10
DF         90       90       0
GP         85       100      −15
PR         85       85       0
EB         100      100      0
JSo        95       95       0
RL         100      100      0
CH         80       100      −20
JBr        100      100      0
DCS        80       95       −15
DM         85       85       0
JM         95       100      −5
Mean, %    78.00    81.25    −3.25

Data correspond to the end of the first phase, before the second phase shown in Fig. 2. Set A = standard production items in second phase; Set B = RISP items in second phase.
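The set means and the reported t statistic can be cross-checked directly from the Table 2 scores. The sketch below is an illustration only: it assumes the comparison was an independent-samples t-test (the text does not say which variant was run), and the function name `independent_t` is hypothetical. With equal group sizes, the pooled and unpooled forms of the statistic coincide.

```python
from math import sqrt
from statistics import mean, variance

# Percent-correct scores from Table 2, in patient order (KS ... JM).
set_a = [20, 25, 65, 65, 85, 60, 65, 100, 80, 90,
         85, 85, 100, 95, 100, 80, 100, 80, 85, 95]
set_b = [25, 30, 70, 40, 85, 75, 60, 100, 90, 90,
         100, 85, 100, 95, 100, 100, 100, 95, 85, 100]


def independent_t(x, y):
    """Two-sample t statistic using the unpooled standard error.

    Assumption: the two sets are treated as independent samples.
    """
    standard_error = sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(x) - mean(y)) / standard_error


mean_a = mean(set_a)
mean_b = mean(set_b)
t_stat = independent_t(set_a, set_b)

print(f"Set A mean = {mean_a:.2f}, Set B mean = {mean_b:.2f}")
print(f"A - B = {mean_a - mean_b:.2f}, t = {t_stat:.3f}")
```

Under this assumption the statistic comes out close to the value reported in the text. A paired test on the same scores would give a different value, so this check says nothing about which test is preferable.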
Picture naming accuracy after the second treatment phase

A global 3 × 4 ANOVA was conducted with the factors of time and treatment. These analyses were concerned with the cumulative effects of standard production alone (phases 1 + 2, i.e. Set A) versus standard production followed by RISP (phases 1 + 2, i.e. Set B). The three time points were: baseline (before the first phase of treatment), 1 week after the second phase of treatment, and 1 month after the second phase of treatment. The four treatment conditions were: standard production, RISP, untreated standard production, and untreated RISP. This 3 × 4 ANOVA indicated that there was a main effect of time [F(2,38) = 55.6, P < 0.0005], a main effect of treatment [F(3,57) = 35.7, P < 0.0005], and a significant interaction between time and treatment [F(6,114) = 18.0, P < 0.0005; Fig. 3A], indicating very different effects of therapy on the treated and untreated items.
Figure 3 Main results. (A) Naming accuracy (pre- versus post-treatment). (B) Naming speed (pre- versus post-treatment). (C) Use of target vocabulary in connected speech (pre- versus post-treatment).

We explored the nature of this interaction through three follow-up ANOVAs. First, we compared each treatment condition to its matched control set across the three time points, through two 2 × 3 ANOVAs in which the first factor compared each treatment type to its own control (i.e. standard production versus untreated standard production; RISP versus untreated RISP). These ANOVAs showed that both therapies generated significantly improved accuracy scores relative to their control sets (significant interaction: P < 0.0005 for both therapies). For RISP, a significant interaction between Time Point and Treatment was found [F(2,38) = 34.643, P < 0.0005, partial η² = 0.65]. For standard production, a similarly robust interaction between Time Point and Treatment was evident [F(2,38) = 14.935, P < 0.0005, partial η² = 0.44]. Direct comparison of the two treatments, through a further 2 × 3 ANOVA (standard production versus RISP, over the three time points), indicated that the interaction between time and treatment did not reach significance [F(2,38) = 2.3, P = 0.117]. Planned t-tests showed that both therapies significantly increased picture naming accuracy between the baseline and post-treatment assessments (P < 0.0005), and that the RISP therapy effect was significantly greater than that of standard production not only at the 1 week post-treatment assessment (P < 0.0005), but also at the follow-up (1 month) assessment (P = 0.001).
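The partial η² values quoted above follow from the F ratios and their degrees of freedom through the standard identity partial η² = (F × df_effect) / (F × df_effect + df_error). A minimal sketch of that arithmetic is given below; the function name is hypothetical and the inputs are the statistics reported in the text.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F ratio and its degrees of freedom."""
    explained = f_value * df_effect
    return explained / (explained + df_error)


# Time Point x Treatment interactions for each therapy versus its control set.
risp_effect = partial_eta_squared(34.643, 2, 38)
standard_effect = partial_eta_squared(14.935, 2, 38)

print(f"RISP: {risp_effect:.2f}")
print(f"Standard production: {standard_effect:.2f}")
```

Both values round to the effect sizes given in the text (0.65 and 0.44), which confirms that the reported F ratios and effect sizes are mutually consistent.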
Picture naming speed after the second treatment phase

Exactly the same set of planned ANOVAs and t-tests was used to examine the naming speed for correctly named items (the overall results are shown in Fig. 3B). In the global 3 (time point) × 4 (picture sets) ANOVA, there was a main effect of Time Point [F(2,36) = 21.1, P < 0.0005], no main effect of Treatment [F(3,54) = 1.7, P = 0.174], but a significant interaction between Time Point and Treatment [F(6,108) = 5.7, P < 0.0005], indicating significantly different changes in naming speed for the treated versus untreated sets. The follow-up 2 × 3 ANOVAs confirmed that the effect of each therapy was significantly different from its control [Time Point × Set interactions were significant: RISP F(2,36) = 8.6, P = 0.001; standard production F(2,36) = 3.9, P = 0.03]. A 2 × 3 ANOVA comparing the two treated sets indicated that there was a significant interaction between Time Point and Treatment [F(2,36) = 3.2, P = 0.05]. Both treatments significantly reduced picture naming latencies between the baseline and both post-treatment assessments (1 week and 1 month). In the pairwise t-tests, the greater reduction in reaction times for RISP than for standard production from baseline to the immediate assessment at Week 1 did not reach significance (P = 0.101). Most strikingly, however, RISP was significantly more effective in maintaining the treatment effect, in terms of quicker naming responses, at the 1 month follow-up assessment (P = 0.001). In comparing the two untreated conditions, only the main effect of Time Point was significant [F(2,36) = 3.23, P = 0.05], reflecting a small reduction in naming latencies across repeated assessments (presumably reflecting repetition priming). The main effect of Set was not significant [F(1,18) < 1], nor was the interaction between Time Point and Set [F(2,36) = 1.3, P = 0.28].
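The degrees of freedom reported for the accuracy and speed ANOVAs can be checked against the design. The sketch below assumes a fully within-subject two-way ANOVA with no sphericity correction; the helper `rm_anova_df` is hypothetical. Note that the value of 19 participants for the speed analysis is inferred here from the reported degrees of freedom and is not stated in this section.

```python
def rm_anova_df(n_participants, levels_a, levels_b):
    """Return (effect df, error df) for A, B and A x B.

    Assumption: both factors are within-subject and no sphericity
    correction is applied to the degrees of freedom.
    """
    subjects = n_participants - 1
    a = levels_a - 1
    b = levels_b - 1
    return {
        "A": (a, a * subjects),
        "B": (b, b * subjects),
        "AxB": (a * b, a * b * subjects),
    }


# 3 time points x 4 picture sets.
accuracy_df = rm_anova_df(20, 3, 4)  # all 20 participants
speed_df = rm_anova_df(19, 3, 4)     # inferred: 19 participants with usable latencies

print("accuracy:", accuracy_df)
print("speed:", speed_df)
```

With 20 participants the helper reproduces F(2,38), F(3,57) and F(6,114) from the accuracy analysis; with 19 it reproduces F(2,36), F(3,54) and F(6,108) from the speed analysis.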
Generalization to connected speech: word retrieval in composite picture descriptions

Again, exactly the same set of analyses was conducted on the target word retrieval data in the composite picture descriptions. The global 3 × 4 ANOVA indicated that there was a significant effect of the Time Point factor [F(2,38) = 87.8, P < 0.0005], a main effect of the Treatment factor [F(3,57) = 43.7, P < 0.0005] and a highly significant interaction between Time Point and Treatment [F(6,114) = 19.9, P < 0.0005] (Fig. 3C), indicating very different production of the target versus untreated vocabulary in the patients’ narratives before and after therapy. Directly comparing the two treatments (standard production versus RISP), a 2 × 3 ANOVA indicated that there was a highly significant interaction between Time Point and Treatment [F(2,38) = 19.6, P < 0.0005]. The t-tests showed that the RISP effect on connected speech production was significantly stronger than that of standard production at both the 1 week and 1 month post-treatment assessments (both P < 0.0005). Comparing each treatment to its control set, separately, we found significant Time Point × Set interactions for the RISP and standard production sets [F(2,38) = 19.6, P < 0.0005; F(2,38) = 5.2, P = 0.01, respectively]. Thus, although there is a general clinical belief that standard therapy does not induce generalization to connected speech, our newly-developed assessment was able to demonstrate that this is incorrect: there is, in fact, a small but significant generalization to connected speech for standard production at both 1 week and 1 month (though the effect was significantly smaller than for the RISP therapy, see above). Finally, the two untreated conditions were compared.
The main effect of Time Point was significant [F(2,36) = 3.2, P = 0.05], indicating a small improvement in target vocabulary production simply through repeated assessment, but neither the main effect of Set [F(1,18) < 1] nor the interaction between Time Point and Set was significant [F(2,36) = 1.3, P = 0.28].

Content analysis of the connected speech samples

As well as exploring the generalization of trained vocabulary to the connected speech samples, it is also important to investigate the connected speech samples more generally. It is possible, for example, that improved vocabulary promotes connected speech more generally or that the improvement on the trained items comes at the cost of reduced performance on the untrained vocabulary. We examined the connected speech samples in terms of the total number of nouns produced (tokens), the number of unique nouns produced (types), nouns per minute, the type/token ratio (number of unique words divided by the total number of words), average word frequency and average imageability for the treated and untreated pictures. The overall secondary effects on the patients’ connected speech samples were entirely positive. Specifically, for the treated pictures, the speech samples including all items showed that significantly more unique items were produced after therapy compared to baseline [mean at 1 week = 103.6, mean at baseline = 85.4; t(18) = −2.30, P = 0.03]. There was also a significant decrease in the average word frequency of the nouns used [mean at 1 week = 1.40, mean at baseline = 1.55; t(18) = 4.21, P < 0.001]. There were no significant changes in nouns per minute, type/token ratio, or average imageability rating. Importantly, there were no significant effects in analyses of the untreated fifth picture, indicating that the improved connected speech samples did not reflect a non-specific effect of repeated assessment. This first analysis included all items, including the target therapy items.
Accordingly, we repeated the analysis with these items removed from consideration. In this second analysis, the increase in unique items from baseline to post-therapy was no longer significant [mean at 1 week = 84.8, mean at baseline = 77.8; t(18) = −0.95, P = 0.3]. The reduction in mean word frequency, however, was still significant [mean at 1 week = 1.48, mean at baseline = 1.58; t(18) = 2.86, P < 0.01].

Correlations with individuals’ background neuropsychological profile

Although there were significant and reliable therapy effects at the group level, the effect varied across individual patients. We performed correlations between the background neuropsychological profile [with respect to four principal neuropsychological components (see Table 3 for component loadings): phonological, semantic, executive, and speech quanta (fluency)] and the magnitude of the therapy effect [1 week versus baseline performance, and 1 month (maintenance) versus baseline performance] in order to reveal which aspects of the patients’ profile were related to the therapy outcome. The PCA identified four components: phonological skill (50.9% variance), semantic ability (11.28% variance), executive ability (8.18% variance) and speech quanta (6.42% variance). In general, the repetition, naming and digit span tests loaded on the phonological component, whereas the picture matching, Camel and Cactus and synonym judgement tests loaded on the semantic component. Raven’s Coloured Progressive Matrices, the Brixton Spatial Rule Anticipation Test and minimal pairs, all of which are demanding tests, loaded on the executive component. Finally, the measures of the amount of speech output loaded on the fourth component, speech quanta. Overall, no correlations were found between any of the components and the outcome on the standard therapy.
For the RISP therapy, however, a significant negative correlation was found between the patients’ phonological component score and the magnitude of their therapy effect, at both 1 week (r = −0.55, P < 0.01) and 1 month (r = −0.61, P < 0.005). This demonstrates that patients with the poorest phonological abilities showed the largest RISP benefit. As can be seen across the case-series (Fig. 4), this negative correlation seems to reflect the fact that the RISP therapy was particularly beneficial, leading to a ceiling effect for many of the milder patients (note that if Patient JS, with poor phonological abilities but a large therapy effect, is removed, then the correlation is still significant).

Table 3 Factor loadings from the PCA on the entire dataset (n = 70)

Test                                      Phonology   Semantics   Executive-demand   Speech quanta
                                          (50.9%)     (11.28%)    (8.18%)            (6.42%)
Delayed Repetition - Word                 0.888       0.221       0.183              0.193
Delayed Repetition - Non-word             0.883       0.028       0.236              0.148
Immediate Repetition - Non-word           0.881       0.061       0.231              0.143
Immediate Repetition - Word               0.858       0.211       0.125              0.170
Boston Naming Test                        0.823       0.382       0.077              0.121
Cambridge Naming Test                     0.813       0.432       0.154              0.117
Forward Digit Span                        0.746       0.232       0.188              0.073
Backward Digit Span                       0.595       0.207       0.234              0.359
Spoken sentence comprehension - CAT       0.521       0.455       0.441              0.163
Spoken Word-Picture Matching              0.236       0.801       0.267              0.145
Type/Token Ratio                          0.362       0.719       −0.075             −0.091
Written Word-Picture Matching             0.182       0.712       0.504              0.155
Camel and Cactus (pictures)               0.092       0.688       0.484              0.288
96 Synonym Judgement                      0.381       0.658       0.315              0.359
Minimal Pairs - Non-word                  0.353       0.057       0.815              −0.013
Raven Coloured Progressive Matrices       0.048       0.274       0.735              0.155
Minimal Pairs - Word                      0.419       0.167       0.706              0.132
Brixton Spatial Rule Anticipation Test    0.132       0.179       0.697              0.231
Tokens                                    0.011       0.034       0.207              0.885
Mean Length of Utterances                 0.314       0.252       0.137              0.831
Words Per Minute                          0.314       0.096       0.080              0.768
Each component is subjectively labelled and shows the percentage variance explained. Loadings > 0.5 are marked in bold.
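The 0.5 loading criterion in the table note can be applied mechanically to see which tests define each component and which cross-load. The following sketch uses a handful of rows copied from Table 3; the dictionary layout and the function name `salient_components` are illustrative choices, not part of the original analysis pipeline.

```python
COMPONENTS = ("phonology", "semantics", "executive", "speech_quanta")

# Percentage of variance explained by each component (from Table 3).
VARIANCE_EXPLAINED = {
    "phonology": 50.9,
    "semantics": 11.28,
    "executive": 8.18,
    "speech_quanta": 6.42,
}

# A subset of Table 3 rows: loadings in the order given by COMPONENTS.
LOADINGS = {
    "Delayed Repetition - Word": (0.888, 0.221, 0.183, 0.193),
    "Spoken sentence comprehension - CAT": (0.521, 0.455, 0.441, 0.163),
    "Written Word-Picture Matching": (0.182, 0.712, 0.504, 0.155),
    "Raven Coloured Progressive Matrices": (0.048, 0.274, 0.735, 0.155),
    "Words Per Minute": (0.314, 0.096, 0.080, 0.768),
}


def salient_components(loadings, threshold=0.5):
    """List the components on which a test loads above the threshold."""
    return [name for name, value in zip(COMPONENTS, loadings) if value > threshold]


total_variance = sum(VARIANCE_EXPLAINED.values())
print(f"Variance explained by the four components: {total_variance:.2f}%")

for test_name, row in LOADINGS.items():
    print(f"{test_name}: {', '.join(salient_components(row))}")
```

In this subset, Written Word-Picture Matching exceeds the threshold on both the semantic and executive components, which illustrates why the component labels are described as subjective.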
Figure 4 Case series naming accuracy. (A) Pre- versus post-RISP treatment; (B) pre- versus post-standard production treatment.

It was also possible to determine how each component correlated with the maintenance of the therapy effect (i.e. 1 month versus 1 week performance). In this analysis, the maintenance of the RISP effect was found to correlate positively with performance on the executive tasks (r = 0.53, P < 0.01). Thus, the patients with better executive abilities exhibited the best therapy maintenance. No other correlations were significant.

Neural correlates of RISP

To determine the neural correlates of the RISP effect, we correlated each patient’s therapy effect (1 week versus baseline performance, and 1 month versus baseline performance) with their T1-weighted MRI using voxel-based correlational methodology (Tyler et al., 2005). This analysis revealed that patients with the greatest damage to the posterior superior temporal gyrus, extending into the white matter of the inferior longitudinal fasciculus, showed the greatest RISP benefit both at 1 week and 1 month (height threshold P < 0.001, cluster corrected using FWE P < 0.05).
This region is known to play an important role in phonological performance, as illustrated in Fig. 5: the RISP effect overlaps closely with the lesion correlate of the patients’ phonological skill factor found previously by Halai et al. (2017), and thus aligns with the behavioural correlation between phonological ability and therapy effect noted above. It appears, therefore, that the RISP effect may relate particularly to the patients’ phonological abilities. Finally, no voxels were found to correlate significantly with the RISP maintenance effect (1 month versus 1 week performance).

Figure 5 Neural correlates of RISP. RISP Week 1 versus baseline and phonological ability; RISP Month 1 versus baseline and phonological ability. Voxel-based correlational methodology correlations for the size of the therapy effect (green overlay) and degree of phonological impairment (phonological factor score; purple overlay). The overlap is marked in grey. The top row shows the results for the therapy outcome assessment 1 week post-therapy; the bottom row shows the results for the 1-month post-therapy data. Overlays were thresholded at P < 0.005 voxel height and cluster corrected at familywise error of P < 0.05, while including additional covariates of age, years of education, months post-onset and lesion volume.

Discussion

Anomia is an immensely frustrating and disabling feature of aphasia, which is a common disorder post-stroke (around one-third of cases) and in other neurological conditions. Accordingly, it is important to establish effective interventions for remediating word-finding skills and generalizing these improvements to patients’ connected speech. Given the observation that fluent speech requires both quick and accurate word retrieval, we investigated and confirmed the novel hypothesis that a behavioural treatment focusing on both speed and accuracy, rather than accuracy alone (as is the case in standard methods), would generate greater improvements in confrontation naming and greater generalization of this improved vocabulary to connected speech. A second key, novel feature of this study was that the interventions were not examined in isolation: we also investigated the neuropsychological and lesion correlates of treatment responsiveness. Although such analyses are a rarity in the literature to date (Abel et al., 2015), increasing our understanding of both the neuropsychological and lesion correlates of variable therapy success will be a critical step towards future neuroscience-led stratification of patients and choice of clinical pathways.

To address these questions, we developed a novel naming treatment that focussed on both speed and accuracy (RISP), which we compared to a standard accuracy-only treatment (standard production). As expected, both treatments increased picture naming accuracy (assessed 1 week following the end of the intervention), which was largely retained at the 1 month follow-up assessment even without maintenance practice.
RISP was, however, significantly more effective than standard production in promoting increased accuracy, particularly at the important long-term follow-up assessment. The same pattern was found in naming speed: as intended, RISP was much more effective in speeding successful name retrieval and maintaining these improvements at follow-up assessment. Perhaps most importantly, we found that RISP generalized from naming individual target items into the patients’ connected speech, a ‘holy grail’ for speech and language therapy.

With regard to neuropsychological and neural correlates of therapy effects, we found a significant negative correlation for the RISP therapy between the patients’ phonological ability and the magnitude of their therapy effect, both immediately after therapy and at follow-up assessment. This initially somewhat counter-intuitive finding probably reflects that RISP appears to be an especially beneficial treatment, such that milder patients show a resultant ceiling effect in their speech production assessment whereas the more severe patients can exhibit a much more dramatic improvement on the target items. This finding may also be consistent with the observation from Best and colleagues’ (2013) meta-analysis that better treatment responsiveness was evident in participants classified as having relatively fewer semantic difficulties and greater phonological output deficits (note, our use of PCA to extract the pattern of underlying language-cognitive deficits means that, over and above phonology per se, the potential additional influences of semantic skills, speech fluency and cognitive-executive factors were already partialled out: see Butler et al., 2014; Halai et al., 2017).
This behavioural correlate for the RISP therapy was also mirrored directly in the lesion correlate analysis: the RISP benefit was most evident in participants with the greatest damage to the posterior superior temporal gyrus extending into the white matter of the inferior longitudinal fasciculus. This region has been implicated in auditory-phonological processing not only through neuropsychological studies (Baldo et al., 2012; Robson et al., 2012, 2013) but also in functional MRI explorations of healthy function (Warren and Griffiths, 2003; Hickok and Poeppel, 2004; Rauschecker and Scott, 2009). Finally, with regard to the long-term maintenance of the RISP treatment, follow-up performance correlated positively with cognitive-executive skills. Specifically, strong performances on neuropsychological assessments like the Brixton Rule Anticipation Test (Vordenberg et al., 2014) predict good longer-term responsiveness to anomia treatment in general, and RISP in particular. This may reflect the enhanced demands that RISP placed on participants in terms of cognitive flexibility, planning, problem-solving and speed of processing—consistent with the suggestion that both patients’ degree of language impairment and remaining executive skill may be critical in recovery of function and therapy outcome (Lambon Ralph et al., 2010; Sharp et al., 2010; Brownsett et al., 2014; Geranmayeh et al., 2014). Two different possible hypotheses can be made about the mechanisms underlying the speeded treatment effect, which can be tested in future investigations. The first, language-specific hypothesis is related to the aim of the RISP treatment to target both accuracy and speed. For optimally easy and efficient word retrieval, the language system requires precise representations that allow the target meaning to be converted to phonological and motor-speech representations (Lupker et al., 1997). 
Computational models of speech production and reading have repeatedly shown that as these representations and mappings are refined through learning, performance of models becomes both more accurate and more efficient (Plaut et al., 1996; Ellis and Lambon Ralph, 2000). Accordingly, because the RISP treatment deliberately aims beyond accuracy to improve speed as well, the language representations and mappings may have been pressured not only to reform but also to be ‘sharpened up’ to become more precise. This also supports previous findings which indicated that naming speed is a significant, yet often overlooked, factor not only in assessment but also in treatment tasks (McCall et al., 1997). Indeed, this hypothesis might also explain why, aside from speed, RISP led to significantly better naming accuracy than the accuracy-only focused standard production (given that both speed and accuracy reflect the precision of the underlying language representations).

Another possible hypothesis accounting for the RISP effect is related to a domain-general, cognitive-executive mechanism (Lambon Ralph et al., 2010; Geranmayeh et al., 2014). Not only was the degree of treatment maintenance related to the patients’ cognitive-executive skills, but all participants (irrespective of severity) reported RISP to be especially engaging and motivating. Thus, RISP may be much better than standard production in engaging patients’ executive and attentional skills, in addition to the speech production system, resulting in improved learning and retention. From a neurobiological perspective, increased motivation and reward-seeking behaviour have been strongly associated with dopamine release (Fiorillo, 2013; Morita et al., 2013; Sharp et al., 2016) and dopamine has been associated with improved learning and therapy effects (Berthier and Pulvermüller, 2011; Gill and Leff, 2012).
This observation speaks to the wider potential of ‘gamification’, that is, using the dynamic and engaging aspects of commercial gaming software to ramp up the engagement required for rehabilitation tasks (Ferreira et al., 2014). Although based on a limited number of items in each condition, the current results suggest that there might be clinically-notable differences between the two therapy approaches, particularly at longer-term follow-up. These indications from the current experimental exploration will need to be confirmed in larger-scale studies, including formal clinical trials.

Acknowledgements

We would like to thank the participants with aphasia who took part in this study.

Funding

This research was supported by grants from the Rosetrees Foundation, the Medical Research Council (MRC) (MR/J004146/1) and European Research Council (ERC) (GAP: 670428 - BRAIN2MIND_NEUROCOMP).

Supplementary material

Supplementary material is available at Brain online.

Abbreviations

PCA = principal component analysis; RISP = repeated, increasingly-speeded production

References

Abel S, Weiller C, Huber W, Willmes K, Specht K. Therapy-induced brain reorganization patterns in aphasia. Brain 2015; 138: 1097–12.
Ashburner J, Friston KJ. Unified segmentation. Neuroimage 2005; 26: 839–51.
Baldo JV, Katseff S, Dronkers NF. Brain regions underlying repetition and auditory-verbal short-term memory deficits in aphasia: evidence from voxel-based lesion symptom mapping. Aphasiology 2012; 26: 338–54.
Bates E, Wilson SM, Saygin AP, Dick F, Sereno MI, Knight RT, Dronkers NF. Voxel-based lesion-symptom mapping. Nat Neurosci 2003; 6: 448–50.
Berthier ML, Pulvermüller F. Neuroscience insights improve neurorehabilitation of poststroke aphasia. Nat Rev Neurol 2011; 7: 86–97.
Google Scholar CrossRef Search ADS PubMed  Best W, Greenwood A, Grassly J, Herbert R, Hickin J, Howard D. Aphasia rehabilitation: does generalisation from anomia therapy occur and is it predictable? A case series study. Cortex  2013; 49: 2345– 57. Google Scholar CrossRef Search ADS PubMed  Bird H, Lambon Ralph MA, Patterson K, Hodges JR. The rise and fall of frequency and imageability: noun and verb production in semantic dementia. Brain Lang  2000; 73: 17– 49. Google Scholar CrossRef Search ADS PubMed  BNC Consortium. The British National Corpus, version 3 (BNC XML Edition) . Oxford, England: Oxford University Computing Services; 2007. Bozeat S, Lambon Ralph MA, Patterson K, Garrard P, Hodges JR. Non-verbal semantic impairment in semantic dementia. Neuropsychologia  2000; 38: 1207– 15. Google Scholar CrossRef Search ADS PubMed  Brownsett SLE, Warren JE, Geranmayeh F, Woodhead Z, Leech R, Wise RJS. Cognitive control and its impact on recovery from aphasic stroke. Brain  2014; 137: 242– 54. Google Scholar CrossRef Search ADS PubMed  Burgess P, Shallice T. Hayling and Brixton tests . Oxford, England: Pearson; 1997. Butler RA, Lambon Ralph MA, Woollams AM. Capturing multidimensionality in stroke aphasia: mapping principal behavioural components to neural structures. Brain J Neurol  2014; 137: 3248– 66. Google Scholar CrossRef Search ADS   Conroy P, Sage K, Lambon Ralph MA. The effects of decreasing and increasing cue therapy on improving naming speed and accuracy for verbs and nouns in aphasia. Aphasiology  2009; 23: 707– 30. Google Scholar CrossRef Search ADS   Crerar MA. Aphasia rehabilitation and the strange neglect of speed. Neuropsychol Rehabil  2004; 14: 173– 206. Google Scholar CrossRef Search ADS   Ellis A, Lambon Ralph MA. Age of acquisition effects in adult lexical processing reflect loss of plasticity in maturing systems: insights from connectionist networks. J Exp Psychol Learn Mem Cogn  2000; 26: 1103– 23. 
Ferreira C, Guimarães V, Santos A, Sousa I. Gamification of stroke rehabilitation exercises using a smartphone. In: Proceedings of the 8th International Conference on Pervasive Computing Technologies for Healthcare, PervasiveHealth ’14. Brussels, Belgium: ICST (Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering); 2014. p. 282–5.
Fiorillo CD. Two dimensions of value: dopamine neurons represent reward but not aversiveness. Science 2013; 341: 546–9.
Geranmayeh F, Brownsett SLE, Wise RJS. Task-induced brain activity in aphasic stroke patients: what is driving recovery? Brain 2014; 137: 2632–48.
Gill SK, Leff AP. Dopaminergic therapy in aphasia. Aphasiology 2012; 28: 155–70.
Goodglass H, Kaplan E, Barresi B. Boston diagnostic aphasia examination (BDAE-3). 3rd edn. Philadelphia, USA: Pearsons; 2000.
Halai AD, Woollams AM, Lambon Ralph MA. Using principal component analysis to capture individual differences within a unified neuropsychological model of chronic post-stroke aphasia: revealing the unique neural correlates of speech fluency, phonology and semantics. Cortex 2017; 86: 275–89.
Hickok G, Poeppel D. Dorsal and ventral streams: a framework for understanding aspects of the functional anatomy of language. Cognition 2004; 92: 67–99.
Hodgson C, Lambon Ralph M. Mimicking aphasic semantic errors in normal speech production: evidence from a novel experimental paradigm. Brain Lang 2008; 104: 89–101.
Jefferies E, Patterson K, Jones RW, Lambon Ralph MA. Comprehension of concrete and abstract words in semantic dementia. Neuropsychology 2009; 23: 492–9.
Kay J, Lesser R, Coltheart M. Psycholinguistic assessments of language processing in aphasia (PALPA): an introduction. Aphasiology 1996; 10: 159–80.
Laine M, Martin N. Anomia: theoretical and clinical aspects. Hove, England: Psychology Press; 2006.
Lambon Ralph MA, Snell C, Fillingham JK, Conroy P, Sage K. Predicting the outcome of anomia therapy for people with aphasia post CVA: both language and cognitive status are key predictors. Neuropsychol Rehabil 2010; 20: 289–305.
Levelt W. Speaking: from intention to articulation. Cambridge, MA: MIT Press; 1989.
Lupker SJ, Brown P, Colombo L. Strategic control in a naming task: changing routes or changing deadlines? J Exp Psychol Learn Mem Cogn 1997; 23: 570–90.
Maendl L. Word finding difficulties in aphasia. York, UK: University of York; 1998.
McCall D, Cox DM, Shelton JR, Weinrich M. The influence of syntactic and semantic information on picture-naming performance in aphasic patients. Aphasiology 1997; 11: 581–600.
Morita K, Morishima M, Sakai K, Kawaguchi Y. Dopaminergic control of motivation and reinforcement learning: a closed-circuit account for reward-oriented behavior. J Neurosci 2013; 33: 8866–90.
Nickels L. Therapy for naming disorders: revisiting, revising, and reviewing. Aphasiology 2002; 16: 935–79.
Plaut DC, McClelland JL, Seidenberg MS, Patterson K. Understanding normal and impaired word reading: computational principles in quasi-regular domains. Psychol Rev 1996; 103: 56–115.
Rauschecker JP, Scott SK. Maps and streams in the auditory cortex: nonhuman primates illuminate human speech processing. Nat Neurosci 2009; 12: 718–24.
Raven J. Raven’s coloured progressive matrices (CPM). Oxford, England: Pearson; 1962.
Robson H, Grube M, Lambon Ralph MA, Griffiths TD, Sage K. Fundamental deficits of auditory perception in Wernicke’s aphasia. Cortex 2013; 49: 1808–22.
Robson H, Sage K, Lambon Ralph MA. Wernicke’s aphasia reflects a combination of acoustic-phonological and semantic control deficits: a case-series comparison of Wernicke’s aphasia, semantic dementia and semantic aphasia. Neuropsychologia 2012; 50: 266–75.
Schneider W, Eschman A, Zuccolotto A. E-prime computer software and manual. Pittsburgh, PA: Psychology Software Tools Inc.; 2002.
Seghier ML, Ramlackhansingh A, Crinion J, Leff AP, Price CJ. Lesion identification using unified segmentation-normalisation models and fuzzy clustering. Neuroimage 2008; 41: 1253–66.
Sharp DJ, Turkheimer FE, Bose SK, Scott SK, Wise RJS. Increased frontoparietal integration after stroke and cognitive recovery. Ann Neurol 2010; 68: 753–6.
Sharp ME, Foerde K, Daw ND, Shohamy D. Dopamine selectively remediates “model-based” reward learning: a computational approach. Brain 2016; 139: 355–64.
Swinburn K, Porter G, Howard D. Comprehensive aphasia test. Hove, England: Psychology Press; 2004.
The Stroke Association (UK). State of the nation stroke statistics January 2016. London, UK: The Stroke Association (UK); 2016.
Tyler LK, Marslen-Wilson W, Stamatakis EA. Dissociating neuro-cognitive component processes: voxel-based correlational methodology. Neuropsychologia 2005; 43: 771–78.
Van Casteren M, Davis MH. Match: a program to assist in matching the conditions of factorial experiments. Behav Res Methods 2007; 39: 973–8.
Vitkovitch M, Humphreys GW. Perseverant responding in speeded naming of pictures: it’s in the links. J Exp Psychol Learn Mem Cogn 1991; 17: 664–80.
Vordenberg JA, Barrett JJ, Doninger NA, Contardo CP, Ozoude KA. Application of the Brixton spatial anticipation test in stroke: ecological validity and performance characteristics. Clin Neuropsychol 2014; 28: 300–16.
Warren JD, Griffiths TD. Distinct mechanisms for processing spatial sequences and pitch sequences in the human auditory brain. J Neurosci 2003; 23: 5799–804.
Wechsler D. A standardized memory scale for clinical use. J Psychol 1945; 19: 87–95.
Wisenburn B, Mahoney K. A meta-analysis of word-finding treatments for aphasia. Aphasiology 2009; 23: 1338–52.

© The Author(s) (2018). Published by Oxford University Press on behalf of the Guarantors of Brain. All rights reserved. For permissions, please email: journals.permissions@oup.com. This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/about_us/legal/notices)

Journal

Brain, Oxford University Press

Published: Apr 17, 2018
