Amphetamine disrupts haemodynamic correlates of prediction errors in nucleus accumbens and orbitofrontal cortex

Emilie Werlen¹, Soon-Lim Shin¹, Francois Gastambide², Jennifer Francois², Mark D. Tricklebank³, Hugh M. Marston², John R. Huxter², Gary Gilmour² and Mark E. Walton¹,⁴

In an uncertain world, the ability to predict and update the relationships between environmental cues and outcomes is a fundamental element of adaptive behaviour. This type of learning is typically thought to depend on prediction error, the difference between expected and experienced events, which in the reward domain has been closely linked to mesolimbic dopamine. There is also increasing behavioural and neuroimaging evidence that disruption to this process may be a cross-diagnostic feature of several neuropsychiatric and neurological disorders in which dopamine is dysregulated. However, the precise relationship between haemodynamic measures, dopamine and reward-guided learning remains unclear. To help address this issue, we used a translational technique, oxygen amperometry, to record haemodynamic signals in the nucleus accumbens (NAc) and orbitofrontal cortex (OFC), while freely moving rats performed a probabilistic Pavlovian learning task. Using a model-based analysis approach to account for individual variations in learning, we found that the oxygen signal in the NAc correlated with a reward prediction error, whereas in the OFC it correlated with an unsigned prediction error or salience signal. Furthermore, an acute dose of amphetamine, creating a hyperdopaminergic state, disrupted rats’ ability to discriminate between cues associated with either a high or a low probability of reward and concomitantly corrupted prediction error signalling.
These results demonstrate parallel but distinct prediction error signals in NAc and OFC during learning, both of which are affected by psychostimulant administration. Furthermore, they establish the viability of tracking and manipulating haemodynamic signatures of reward-guided learning observed in human fMRI studies by using a proxy signal for BOLD in a freely behaving rodent.

Neuropsychopharmacology (2019) 0:1–11; https://doi.org/10.1038/s41386-019-0564-8

¹Department of Experimental Psychology, University of Oxford, Tinsley Building, Mansfield Road, Oxford OX1 3SR, UK; ²Eli Lilly & Co Ltd, Erl Wood Manor, Windlesham GU20 6PH, UK; ³Department of Neuroimaging Sciences, Institute of Psychiatry, Kings College London, London, UK and ⁴Wellcome Centre for Integrative Neuroimaging, University of Oxford, Oxford, UK
Correspondence: Gary Gilmour (gilmour_gary@lilly.com) or Mark E. Walton (mark.walton@psy.ox.ac.uk)
Deceased: Soon-Lim Shin
Received: 29 March 2019 Revised: 2 October 2019 Accepted: 29 October 2019
© The Author(s) 2019

INTRODUCTION

The world is an uncertain place, where behaviour of animals must continuously change to promote optimal survival. Learning to predict the relationship between environmental cues and significant events is a critical element of adaptive behaviour. It is hypothesised that adaptive behaviour depends upon comparisons of neural representations of cue-evoked expectations of events with events that actually occurred. Mismatch between these two representations is defined as a prediction error, and is likely a vital substrate by which accuracy of ensuing predictions about cue-event relationships can be improved. Prediction errors related to receipt of reward have been strongly associated with dopaminergic neurons and their projections to frontostriatal circuits [1–4]. In rodents and humans, presentation of reward-predicting cues causes an increase in dopaminergic neuron activity and dopamine release in terminal regions, not only in proportion to the expected value of the upcoming reward but also to the deviation from that expectation when the reward is actually delivered [5–12]. Furthermore, experimental disruption of dopaminergic transmission can impair formation of appropriate cue–reward associations [13–15].

From a human perspective, elements of reward learning can be disrupted in a variety of neuropsychiatric conditions where dopaminergic dysfunction may play a central role [16–19]. For example, patients with major depressive disorder or schizophrenia can be insensitive to reward and display impairments in reward-learning behaviours [20–24]. Neuroimaging studies suggest that activation of parts of ventral striatum and frontal cortex, or changes in functional connectivity with these regions, may be an important neurophysiological correlate of reward-learning impairments [24–29]. However, not all studies show the same patterns of changes (e.g., ref. [30]), and there is still uncertainty over whether the blunting of neural responses reflects a primary aetiology in these disorders. While the potential links between dopaminergic dysregulation, disrupted neural signatures of reward-guided learning and neuropsychiatric symptoms are manifest, strong direct evidence is currently lacking.

To help bridge this gap, we used constant-potential amperometry to monitor haemodynamic responses simultaneously in the nucleus accumbens (NAc) and orbitofrontal cortex (OFC) in rats performing a reward-driven, probabilistic Pavlovian learning task. Both NAc and OFC regions receive dopaminergic input, and have been previously implicated in representing the expected value of a cue to guide reward-learning behaviour [31, 32]. Amperometric tissue oxygen [T_O2] signals likely originate from physiological mechanisms equivalent to those underlying fMRI BOLD signals [33–35], allowing cross-species comparisons of behaviourally driven haemodynamic signals in awake animals. The aim of this study was to define amperometric signatures of cue-evoked expectation of reward and prediction error in these regions and then investigate how they are modulated by administration of amphetamine. Amphetamine is known to modify physiological dopamine signalling [36], and in humans, even a single dose of a stimulant like methamphetamine can cause an increase in mild psychotic symptoms [37, 38]. While amphetamine can promote behavioural approach to rewarded cues [39], it can also impair conditional discrimination performance [40] and the influence of probabilistic cue–reward associations on subsequent decision-making [41]. We hypothesised that amphetamine would disrupt discriminative responses to cues during performance of a probabilistic Pavlovian task, with concomitant changes to the haemodynamic correlates of reward expectation and prediction error in the NAc and OFC.

METHODS

See SI for detailed methods.

Animals. All experiments were conducted in accordance with the United Kingdom Animals (Scientific Procedures) Act 1986. Adult male Sprague Dawley rats (Charles River, UK) were used in the present studies (n = 36). Four animals did not contribute to the behavioural dataset owing to poor O2 calibration responses, and the data from an additional two animals could not be included owing to a computer error. During testing, they were maintained at >85% of their free-feeding weight relative to their normal growth curve. Prior to the start of any training or testing, all animals underwent surgical procedures under general anaesthesia to implant carbon paste electrodes targeted bilaterally at the NAc and OFC.

O2 amperometry data recording. O2 signals were recorded from the NAc and OFC by using constant-potential amperometry (–650 mV applied for the duration of the session) as described previously [35, 42].

Probabilistic Pavlovian conditioning task. The task was a probabilistic Pavlovian learning task performed in standard operant chambers. Each trial consisted of a 10-s presentation of one of two auditory cues (3-kHz pure tone at 77 dB or 100-Hz clicker at 76 dB) followed immediately by either delivery or omission of reward (4 × 45 mg of sucrose food pellets). One of the auditory cues (CS_High) was followed by reward delivery on 75% of trials; the other (CS_Low) was rewarded on 25% of trials. Each session consisted of a total of 56 cue presentations, with an average intertrial interval of 45 s (range 30–60 s). Standard training took place over nine sessions and session 10 consisted of the drug challenge (see Fig. S1).

Pharmacological manipulations. D-amphetamine sulfate (Sigma, UK) was dissolved in 5% (w/v) glucose solution, and pH adjusted towards neutral with the dropwise addition of 1 M NaOH as necessary. Amphetamine was dosed at 1 mg/kg (free weight) via the intraperitoneal route.

Behavioural modelling. Head entries during the 10-s cue presentation were modelled by using variations of a Rescorla–Wagner model (Rescorla & Wagner 1972). We started with a model with a single free parameter, the learning rate α, and compared this against other models that also included free parameters specifying (a) cue-specific learning rates (i.e., a cue salience term, β); (b) separate learning rates for rewarded (α_pos) and nonrewarded (α_neg) trials and either (c) cue-independent unconditioned magazine responding, k, or (d) cue-specific unconditioned magazine responding, k_Clicker and k_Tone. To capture additional trial-by-trial variance, we also included either trial-specific or recency-weighted pre-cue response rates. To compare the models, we used the Bayesian information criterion (BIC), which penalises the likelihood of a model by the number of parameters and the natural logarithm of the number of data points.

Data analysis

Behaviour: We analysed the average number of head entries into the food magazine during presentation of either the CS_High or CS_Low cue across the 9 days of training, and then compared the pre-drug day (day 9 of training) with the drug challenge day.

Amperometry: We performed two sets of complementary analyses: (i) model-free analyses, where we investigated the average signals in NAc and OFC during cue presentation or in the 30 s after outcome delivery over the course of learning and after amphetamine administration, and (ii) model-based analyses, where we regressed the same signals against estimates from our computational model using the model with the lowest BIC score (Fig. 1d).

RESULTS

Behavioural performance during probabilistic learning

We trained rats on a two-cue probabilistic Pavlovian learning paradigm. One cue (CS_High) was associated with reward delivery on 75% of trials and the other (CS_Low) on 25% of trials (Fig. 1a). As can be observed, animals learned to discriminate between the cues, increasingly making magazine responses during presentation of the CS_High but showing little change in behaviour upon presentation of CS_Low as training progressed (main effect of CS: F(1,27) = 39.92, p < 0.001; CS × day interaction: F(2.63,70.92) = 6.12, p = 0.001) (Fig. 1b, Fig. S2). Unexpectedly, however, there was also a substantial and consistent influence of the counterbalancing assignment on responding (CS × cue identity interaction: F(1,27) = 48.17, p < 0.001). Specifically, follow-up pairwise comparisons showed that the animals in Group 1, where CS_High was assigned to be the clicker cue and CS_Low the pure tone (“CL1-T2”), exhibited strong discrimination between the cues throughout training (p < 0.001; Fig. 1c). By contrast, rats in Group 2 with the opposite CS–auditory cue assignment (“T1-CL2”) did not show differential responding to the cues in spite of the different reward associations (p = 0.66).

To better understand how the cue identity was influencing this pattern of responding, we formally analysed how well different simple reinforcement learning models could describe individual rats’ Pavlovian behaviour. The preferred model included a cue salience parameter and a cue-specific unconditioned magazine responding term, as well as a recency-weighted pre-cue responding parameter (Fig. 1d). In particular, the constant term attributable to unconditioned cue-elicited magazine responding was higher on clicker than tone trials (Z = 3.98, p < 0.001, Wilcoxon signed-rank test; p < 0.015 for Group 1 or 2 analysed separately). Therefore, once the difference in cue attributes was accounted for, rats’ behaviour could be well explained by using this modified simple reinforcement learning model (Fig. 1e).

Both NAc and OFC haemodynamic signals track Pavlovian responding

We examined how T_O2 responses in NAc and OFC (Fig. 2) tracked animals’ learning of the appetitive associations and violations of their expectations. After exclusions for misplaced electrodes and poor quality of signals (see Supplementary Methods, Fig. S3), 40 electrodes in 20 rats were included for analysis (NAc = 25 electrodes from 15 rats, OFC = 15 electrodes from 11 rats).

We initially performed model-free analyses to investigate the T_O2 signals in response to presentation of the CS_High and CS_Low cues as the rats learned the reward associations.
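
The model family described under Behavioural modelling, and the BIC used to compare its members, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the additive form of the unconditioned responding term, the way cue salience multiplies the learning rate and the Gaussian likelihood are all assumptions made for illustration, and the recency-weighted pre-cue term is omitted.

```python
import math


def rescorla_wagner(cues, rewards, alpha, salience=None, baseline=None):
    """Trial-by-trial Rescorla-Wagner predictions (illustrative sketch).

    cues     : cue label per trial, e.g. "clicker" or "tone"
    rewards  : outcome per trial (1 = reward, 0 = no reward)
    alpha    : learning rate
    salience : optional dict of cue-specific multipliers on alpha
               (assumed form of a cue salience term)
    baseline : optional dict of cue-specific constants added to the
               prediction (assumed form of unconditioned responding)
    Returns (predicted responses, prediction errors), one entry per trial.
    """
    salience = salience or {}
    baseline = baseline or {}
    values = {}  # current associative value V for each cue
    predicted, errors = [], []
    for cue, reward in zip(cues, rewards):
        v = values.get(cue, 0.0)
        predicted.append(v + baseline.get(cue, 0.0))
        delta = reward - v  # prediction error at outcome
        errors.append(delta)
        values[cue] = v + alpha * salience.get(cue, 1.0) * delta
    return predicted, errors


def gaussian_log_likelihood(observed, predicted, sigma):
    """Log-likelihood of observations under Gaussian noise (an assumption)."""
    n = len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return -0.5 * n * math.log(2 * math.pi * sigma ** 2) - sse / (2 * sigma ** 2)


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: lower values indicate a preferred model."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


# Three consecutive rewarded presentations of one cue, learning rate 0.5.
predicted, errors = rescorla_wagner(["clicker"] * 3, [1, 1, 1], alpha=0.5)
```

In a sketch like this, each candidate model would be fitted to a rat's head-entry counts, and the candidate with the lowest BIC would supply the trial-by-trial value estimates used in the model-based analyses.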
[Figure 1. Panels: a task schematic (10-s cues; CS_High followed by reward on 75% of trials, CS_Low on 25%); b, c head entries by day (days 1–9) for CS_High, CS_Low and baseline, for all rats and for Group CL1-T2 (CS_High = clicker, CS_Low = tone) and Group T1-CL2 (CS_High = tone, CS_Low = clicker); d BIC for models a–t, with a table of model components (reward learning rate; different learning rates for rewarded and no reward; cue salience; general cue arousal; specific cue arousal; baseline entries); e head entries by day, actual and predicted CS_High and CS_Low, for an example rat from each group. Numbers within the bars in d (second value in parentheses as printed): a 7 (5), c 5 (5), d 1 (0), e 3 (2), f 8 (5), g 1 (0), j 1 (1), m 4 (2); all other models 0 (0).]

Fig. 1 Task, behavioural performance and modelling. a Schematic of the Pavlovian task. b, c Average head entries (mean ± SEM) to the magazine during presentation of each cue or during the pre-cue baseline period across the nine sessions (b, all animals; c, shown separately for Group CL1-T2, where the CS_High was the clicker and CS_Low was the tone, and Group T1-CL2, where the CS_High was the tone and CS_Low the clicker). d Bayesian information criterion (BIC, a measure of the goodness of fit of the model) estimates for different learning models. The BIC penalises the likelihood of a model by the number of parameters and the natural logarithm of the number of data points. The model with the lowest BIC score was deemed to give the best fit to the data. Note, however, the patterns of results in the model-based analyses of amperometric signals remained unchanged if we used any of the three models that fitted best for a number of individual rats (models a, c and f) or even if we used a standard Rescorla-Wagner-type learning model (model s). An “x” in the table indicates the presence of the given component in the model. Numbers within each bar indicate the number of animals for which the given model had the lowest BIC. e Example of the model fits for two animals from the two counterbalancing groups

T_O2 responses during presentation of the cues changed markedly over training in a similar manner in both brain regions (main effects of cue and session: both F > 9.49, p < 0.001) (Fig. 3a, b). In fact, analysis of the subset of animals with functional electrodes recorded simultaneously in NAc and OFC (n = 6 rats) showed a significant positive correlation between the signals recorded in each area (r = 0.41, p < 0.01). Moreover, mirroring the behavioural data, the patterns of responses differed substantially according to the cue identity (CL1-T2 or T1-CL2). While the T_O2 response following the CS_High developed similarly in both groups, there was a substantial difference in the CS_Low response, with the average signals when the clicker was the CS_Low being significantly higher than when the tone was the CS_Low (cue × cue identity interaction: F(1,36) = 18.67, p < 0.001; CL1-T2 vs. T1-CL2, CS_High: p = 0.32, CS_Low: p < 0.001).

[Figure 2. Panels: a OFC sections at +3.7, +3.2, +2.7 and +2.2 mm; b NAc sections at +1.6, +1.2, +1.0, +0.7 and +0.5 mm.]

Fig. 2 Electrode location in OFC (a) and NAc (b). The NAc electrodes were clustered mainly in the ventromedial NAc, including ventromedial core and shell, while the OFC electrodes were in the ventral orbital sector

To establish the relationship between development of magazine responding and the T_O2 signals, we regressed the model-derived estimates of cue value from the model that best fitted the behavioural data, V(t), against the trial-by-trial T_O2 responses and found a significant positive relationship in both regions in both groups (Fig. 3c). This was not simply a correlate of invigorated responding as cue value was a significantly better predictor than trial-by-trial magazine head entries (Fig. S4). Therefore, once differences in cue identity are accounted for, it is possible to demonstrate that T_O2 signals in both NAc and OFC track the expected value associated with each cue.

Separate haemodynamic signatures of signed and unsigned prediction errors in NAc and OFC

We next investigated how the probabilistic delivery or omission of reward shaped T_O2 responses in NAc and OFC, and how these signals were shaped by cue-elicited reward expectations as learning progressed. As there were significant interactions between brain region with cue and training stage and their combination (all F > 3.76, p < 0.028), we here analysed responses in the two regions separately.

We again first performed model-free analyses, by focusing on how the average outcome-evoked changes in T_O2 responses were influenced by the preceding cue and how these adapted over training. As can be seen in Fig. 4, the primary determinant of the signal change in the NAc was whether a reward was received (main effect of reward: F(1,36) = 47.72, p < 0.001). However, the size of reward and no-reward signals, normalised to the time of outcome, depended on which cue had preceded the outcome, and these patterns altered as learning progressed (significant cue × outcome and cue × outcome × training-stage interactions, both F > 18.669, p < 0.001), suggesting a strong influence of expectation on T_O2 responses. Follow-up comparisons showed that there was a reduction in reward-elicited T_O2 responses on CS_High trials as training progressed (CS_High rew, stage 1 vs. stage 3: p = 0.016; stage 2 vs. stage 3: p = 0.065). Unexpectedly, there was also a diminution of omission-elicited reductions in T_O2 responses on these trials (CS_High no reward, stage 1 vs. stage 3: p = 0.014), which, from Fig. 4b, can be seen to be particularly prominent in the CL1-T2 group. By contrast, on CS_Low trials, there was no meaningful change in T_O2 responses to delivery or omission of reward throughout training (all p > 0.37). As might be expected given its effect on behaviour and cue-elicited neural signals, cue identity again influenced outcome signals, resulting in a four-way interaction of all the factors, as well as two-way interactions between cue × identity and training stage × identity (all F > 4.33, p < 0.019). Importantly, however, when analysed separately, both counterbalance groups showed the key cue × outcome × training-stage interaction (F > 4.84, p < 0.019).

In the OFC, outcome was also a strong influence on T_O2 responses (F(1,13) = 51.18, p < 0.001) and this was again shaped by the preceding cue (cue × outcome interaction: F(1,13) = 5.37, p = 0.037). However, unlike in NAc, there came to be an increasingly strong T_O2 response when reward was omitted, particularly after CS_High (Fig. 4c, d). This resulted in a cue × training-stage interaction (F(1,13) = 5.37, p = 0.037; CS_High vs. CS_Low, p = 0.51 for the early training stage, p < 0.001 for mid- and late stages). While there were qualitative differences in responses in the two counterbalance groups, none of the interactions with this factor or the main effect reached significance (all p > 0.069).

While these model-free analyses illustrate that the T_O2 responses changed dynamically and differently over the course of training in the two brain regions, they do not clearly show whether either response might encode a teaching signal useful for learning, such as a reward PE: δ(t) = V(t) + r(t) − V(t−1). Therefore, we next used model-based analyses to examine whether there was a relationship between T_O2 responses across all sessions and the fundamental components of a reward PE: (i) a positive influence of outcome, r(t), and (ii) a negative influence of model-derived cue value, −V(t−1). While both NAc and OFC T_O2 responses showed a strong positive influence of outcome, only the NAc signals fulfilled both criteria of a reward PE by also exhibiting a significant negative influence of cue value; in OFC, by contrast, the cue value effect was positive (Fig. 4e). We also examined whether reward PE-like T_O2 responses in NAc were present throughout training. This showed that while correlates of both NAc positive and negative reward PEs can be observed in rats that are still learning the cue–reward associations, once these are appropriately established only positive reward PEs remain evident (Fig. S5).

Although the OFC T_O2 responses do not correspond to a reward PE, the patterns of signals nonetheless still dynamically change over learning. As previous work has suggested that OFC neurons may signal the salience of the outcome for learning, we examined whether T_O2 responses instead correlated with how unexpected each outcome was, corresponding to an unsigned PE. This analysis showed that each animal’s trial-by-trial unsigned PE had a strong positive influence on OFC signals (Fig. S6). Again, this was present in both counterbalance groups.

Therefore, while the changes in NAc T_O2 responses reflect how much better or worse an outcome was than expected, OFC T_O2 responses indicate how surprising either was.

[Figure 3. Panels: a traces on Day 1, Day 5 and Day 9 aligned to cue onset for CS_High and CS_Low in Group CL1-T2 and Group T1-CL2, NAc and OFC (scale bars: 1 nA, 10 s); b AUC (nA × s) by session; c effect size (a.u.) of expected value relative to cue onset, NAc and OFC, with insets for Group CL1-T2 and Group T1-CL2.]

Fig. 3 Haemodynamic correlates during cue presentation. a T_O2 responses on 3 sample days time-locked to cue presentation in the two counterbalance groups recorded from either NAc (upper panels) or OFC (lower panels). b Average area-under-the-curve responses (mean ± SEM) extracted from 5 to 10 s after cue onset for each cue across the nine sessions. c Average effect sizes in NAc (left panel) and OFC (right panel) from a general linear model relating T_O2 responses to trial-by-trial estimates of the expected value associated with each cue. Main plots include all animals; insets show the analyses divided up into the two cue identity groups

Amphetamine disrupts cue-specific value encoding and prediction errors

Having established haemodynamic PE correlates in NAc and OFC, we next wanted to investigate how an acute dose of amphetamine (1 mg/kg), an indirect sympathomimetic known to potentiate dopamine release, influenced cue value and prediction error T_O2 responses.

We first analysed baseline magazine responding in the pre-drug and the drug administration sessions. Although amphetamine caused a numeric increase in baseline responding, this was variable between animals (7/15 rats given amphetamine showed a substantial increase in baseline magazine response rates, whereas the other 8/15 animals showed a decrease in response rates) and the drug × session interaction did not reach significance (F(1,26) = 3.71, p = 0.065). By contrast, there was a substantial and consistent change in cue-elicited responses (cue × session × drug interaction: F(1,26) = 16.31, p < 0.001). This was not caused by differences between the drug groups on the pre-drug session (no main effect or interaction with drug group: all F < 1.21, p > 0.28). Instead, as can be observed in Fig. 5a, while both the vehicle and amphetamine groups responded more on average to the CS_High than the CS_Low on the pre-drug day (p < 0.003), this discrimination was abolished after administration of the drug (CS_High vs. CS_Low: p = 0.35), but not the vehicle (p < 0.001). Note that while there were again some differences between the counterbalance groups (cue × session × drug × identity interaction: F(1,26) = 6.30, p = 0.019), the effects of amphetamine administration were comparable in both groups (amphetamine group: significant cue × session interaction, F(1,13) = 9.047, p = 0.01; no significant cue × session × identity interaction, F(1,13) = 2.71, p = 0.12).

[Figure 4. Panels: a, c traces on Day 1, Day 5 and Day 9 aligned to outcome for CS_High Rew, CS_High X, CS_Low Rew and CS_Low X in Group CL1-T2 and Group T1-CL2 (scale bars: 4 nA, 30 s); b, d AUC (nA × s) by session; e effect size (a.u.) of expected value and reward relative to outcome, NAc and OFC, with insets for Group CL1-T2 and Group T1-CL2.]

Fig. 4 Haemodynamic correlates following outcome presentation. a, c T_O2 responses on 3 sample days, time-locked to outcome presentation (reward or no reward) after each cue in the two cue identity groups recorded from either NAc (a) or OFC (c). b, d Average area-under-the-curve responses (mean ± SEM) extracted from 30 s following the outcome after each cue across the nine sessions (NAc panel b, OFC panel d). e Average effect sizes in NAc (left panel) or OFC (right panel) from a general linear model relating T_O2 responses to trial-by-trial estimates of the expected value associated with each cue and trial outcome (reward or no reward). Main plots include all animals; insets show the analyses divided into the two cue identity groups

Administration of amphetamine also had a pronounced but specific effect on T_O2 responses. During the cue period, the effect in both NAc and OFC mirrored the effect of the drug on behaviour, with amphetamine abolishing the distinction between the average T_O2 response elicited by presentation of the CS_High or CS_Low (cue × session × drug: F(1,32) = 6.22, p = 0.018; CS_High vs. CS_Low, amphetamine group drug session, p = 0.25; all other p < 0.006) (Fig. 5b, c, S7A). While there were still notable effects of cue identity on signals, follow-up comparisons found that there were no reliable differences in T_O2 responses between the different cue configurations in either group or session (all p > 0.27).

Based on the differences between outcome-elicited signals in NAc and OFC observed during training, we split the outcome-elicited T_O2 data by region. In the NAc, there was a significant cue × session × outcome × drug interaction (F(1,21) = 4.82, p = 0.04). We focused follow-up analyses on each drug group separately, without cue identity as a between-subjects factor, as the NAc electrode exclusion criteria inadvertently biased the distribution of rats assigned to the drug and vehicle groups as a function of cue identity (χ² = 9.4, df = 3 and p = 0.024) (see Fig. S7 for breakdown by counterbalance group).

While vehicle injections caused no changes in NAc signals (no main effect or interaction with session: all F < 1.4, p > 0.33), amphetamine had a marked influence on outcome-elicited T_O2 responses, selectively blunting CS_Low outcome responses (cue × session × outcome interaction: F(1,12) = 22.07, p = 0.001; CS_Low reward or no reward: pre-drug vs. drug session, p < 0.005; CS_High, all p > 0.22). This meant that on amphetamine, there was now no reliable distinction between reward-evoked T_O2 signals based on the preceding cue (p = 0.08; all other CS_High vs. CS_Low comparisons, p < 0.015) (Fig. 5d, e). Consistent with this, we also found a significant reduction in the relationship between T_O2 responses and positive reward prediction errors selectively after amphetamine (comparison of peak effect size on and off drug: session × drug interaction: F(1,21) = 8.02, p = 0.01; pre-drug vs. drug session, amphetamine group: p = 0.003; vehicle: p = 0.66) (note, we did not analyse the negative RPE as this was already largely absent in the pre-drug session in animals showing strong discrimination between the CS_High and CS_Low).

[Figure 5. Panels: a head entries in the pre-drug and drug sessions for vehicle and amphetamine groups (CS_High, CS_Low, baseline), for all rats, Group CL1-T2 and Group T1-CL2; b traces aligned to cue onset, NAc and OFC; c AUC (nA × s), cue period; d traces aligned to outcome for CS_High Rew, CS_High X, CS_Low Rew and CS_Low X; e AUC (nA × s), outcome period. Group sizes as printed: OFC, veh (n = 8, r = 6), amph (n = 7, r = 5); NAc, veh (n = 12, r = 8), amph (n = 13, r = 8).]

Fig. 5 Effect of acute amphetamine administration on cue-elicited behaviour and haemodynamic signals. a Average head entries (mean ± SEM) to the magazine during presentation of each cue or during the pre-cue baseline period in the session before (“Pre”) and just after drug administration (“Drug”) in the group receiving vehicle or amphetamine (1 mg/kg). b T_O2 responses time-locked to cue presentation recorded from either NAc (upper panels) or OFC (lower panels) in the pre-drug or drug administration sessions. Note that differences in the pre-drug patterns of signals in the vehicle and amphetamine group mainly reflect the unbalanced assignment of included animals from the two cue identity groups. c Average area-under-the-curve responses (mean ± SEM) extracted from 5 to 10 s after cue onset for each cue in the two sessions. d T_O2 responses time-locked to outcome presentation (reward or no reward) after each cue in the two cue identity groups recorded from either NAc (upper panels) or OFC (lower panels) in the pre-drug or drug administration sessions. e Average area-under-the-curve responses (mean ± SEM) extracted from 30 s following the outcome after each cue in the two sessions

In OFC, there was also a significant change in T_O2 responses when comparing outcome-elicited signals on the drug session with the pre-drug day (significant cue × session × drug and cue × session × outcome × drug interactions: both F > 5.35, p < 0.042). In the control group, vehicle injections caused a general increase in all the OFC T_O2 responses (main effect of testing session: F(1,6) = 7.51, p = 0.034). By contrast, in the amphetamine group, there was a striking reduction in OFC T_O2 signals, particularly those elicited by CS_High cues (main effect of testing session: F(1,5) = 5.12, p = 0.073; significant cue × session interaction: F(1,21) = 29.527, p = 0.003). An analysis of the unsigned prediction error signal also resulted in a session × drug interaction (F(1,11) = 6.63, p = 0.026), though this was driven both by a numeric decrease in the regression weight in the amphetamine group and an increase in the regression weight in the vehicle group.

Taken together, therefore, amphetamine impaired the discriminative influence of CS_High and CS_Low cues on behaviour and also corrupted the influence of these cue-based predictions on NAc and OFC T_O2 responses.

DISCUSSION

The results presented here show that haemodynamic signals in NAc and OFC dynamically track expectation of reward as rats form associations between cues with high or low probability of reward outcome. Importantly, both regions also displayed distinct forms of haemodynamic prediction error signal. NAc signals were shaped by reward expectation and the specific valence of the reward outcome, while in contrast OFC signals did not discriminate the valence of reward outcome, but rather reflected how surprising either reward outcome was. A single dose of amphetamine, sufficient to modulate dopamine activity, caused a loss of discrimination between cues that was evident both behaviourally and in the haemodynamic signatures of reward expectation and prediction error in both regions.

These results extend a previous study of instrumental learning where increases in NAc T_O2 were observed as rats learned to associate a deterministic cue with receipt of reward upon pressing a lever [35].

worse than expected fits well with the hypothesised roles of the extensive dopaminergic projections to this region, and suggests a fundamental role for NAc in sustaining approach responses to reward-associated cues [48, 49].

It has been demonstrated that dopamine release in the core region of the NAc correlates with a RPE and, similar to that observed here, dynamically changes over the course of learning [9, 10, 14, 50, 51] (see ref. [52] for a different interpretation). The amperometry electrodes in the current study were largely in caudal parts of ventral NAc, spanning the core and ventral shell regions. Given that the electrodes are estimated to be sensitive to changes in signal over approximately a 400-μm sphere around the electrode [53, 54], it is plausible that the signals we recorded here could have been influenced by RPE-like patterns of dopamine release [55]. Several recent papers have shown that optogenetic stimulation of dopamine neurons can have widespread influence on forebrain BOLD signals [56–58]. However, direct evidence for this link is currently lacking, and a recent study comparing patterns of BOLD signals with dopamine release in humans found indications of uncoupling between the measures [59]. Therefore, it is also conceivable that the NAc haemodynamic signals we observed here instead reflect afferent input from regions such as medial frontal cortex, where RPE-like signals have also been recorded [44, 60, 61].

By contrast, OFC T_O2 signals did not respond to prediction error events in quite the same way as the NAc did and did not meet all three formal criteria to be considered formal correlates of an RPE. While some studies have found RPE-like outcome signals in OFC [62, 63], several (including those fMRI studies that have adopted the stringent criteria applied here) have not, e.g., refs. [44, 64, 65]. Like NAc, OFC T_O2 signals signalled reward expectations when the cues were presented. This is consistent with previous fMRI and electrophysiological studies suggesting that central or lateral OFC may represent stimulus–reward mappings during cue presentation, e.g., refs. [66–68]. However, unlike NAc, the OFC signals
The present probabilistic learning study allowed a measured in the present study tended to increase following formal assessment of whether the measured T signals displayed reward omission as well as after reward delivery and this increase O2 features that would categorise them as encoding reward scaled with how surprising the reward omission was. Electro- prediction errors (RPEs). To be considered an RPE signal, three physiological studies suggest that similar proportions of OFC cells cardinal features should be measurable: (i) a positive influence of exhibit either positive or negative relationships with value, and expected reward value on cue-elicited signals (i.e., a greater individual neurons can encode both positive and negative response to a cue that is thought to predict a higher reward), (ii) a valenced information at outcome [69, 70]. positive influence of actual reward delivered (i.e., a greater These outcome-driven signals did however correlate with an response when a high-value reward is actually delivered unsigned prediction error: how surprising or salient any outcome compared with when it is omitted) and (iii) a negative influence is based on current expectations. There are an increasing number of expected reward value on outcome-elicited signals (i.e., a larger of studies implicating OFC in modulating salience for the purposes response to reward delivery the less that reward is expected and/ of learning [71–74]. However, given the precise pattern of OFC or a smaller response to reward omission the more that reward is signals observed in the current study, the OFC T responses O2 expected) [43, 44]. Using behavioural modelling, all three of those might reflect the acquired salience of an outcome, e.g., refs. features were evident in NAc T signal, consistent with a number [71, 75], which represents both how surprising and how rewarding O2 of human fMRI studies of reward-guided learning in healthy it is. 
Even though such responses do not signal whether an subjects [43–46]. Although human fMRI studies predominantly use outcome is better or worse than expected, they are still important secondary reinforcers such as money to incentivise performance, to guide the rate of learning or reallocate attention. Single-site similar RPE-like activations in NAc are also observed in studies by lesion studies demonstrate a role for OFC, as well as for NAc, in using primary fluid reinforcers in lightly food/water-restricted aspects of stimulus–reward learning [76, 77]. However, a specific participants, which more closely mimic the means by which rats interaction between these regions to support these behaviours are motivated to perform the present task, see ref. [47]. must depend upon another mediating region, as there is no direct While both positive and negative RPE-like T signals were projection between the two [78]. O2 evident across the whole learning period, it was clear that the One unexpected but fortuitous finding was that T signals O2 influence of each signal changed over time. Both positive and related to magazine responding were strongly influenced by cue negative RPEs were evident early in learning. However, as identity. By comparing different behavioural models, this effect discrimination between high and low reward probability cues was best explained by including two additional parameters to a was learned, negative RPEs had an increasingly negligible simple reinforcement learning model: (i) a cue salience parameter, influence on NAc haemodynamic signals. Such adaptation has which scaled the influence of the RPE on future value estimates as resonance with a previous finding in humans that NAc BOLD a function of which auditory cue had been presented, and (ii) a signals are not observed for every RPE event, but only those cue “arousal” parameter, which was a constant term applied from currently relevant to guide future behaviour [46]. 
The selective the start of training. It has long been established that cue salience involvement of NAc in signalling whether an event is better or can be an important determinant of learning rates, e.g., refs. Neuropsychopharmacology (2019) 0:1 – 11 Amphetamine disrupts haemodynamic correlates of prediction errors in. . . E Werlen et al. [79–81], and this is a standard term in the Rescorla–Wagner and there has been increased interest in using these types of finding other influential association learning models, capturing the effect as foundations for theoretical approaches to link the underlying that more salient or intense cues are learned about faster and are biological dysfunctions to observed symptoms in patients (e.g., more readily discriminated than weaker or less salient cues. refs. [21, 87–89]). However, an extra parameter was also needed to account for the While there is general consensus about the promise of such fact that in the majority of animals, responding to the clicker was approaches, the literature is complicated by the diversity of the greater than the tone at the start of testing, irrespective of disorders and the drug regimens that patients have taken, which whether it was assigned to be CS or CS . While we do not makes testing specific causal hypotheses about the relationship High Low know the precise reason for this effect and are not aware of this between altered brain function and psychiatric symptoms difficult. being reported in previous studies, one speculation is that the rats By contrast, in an animal model, it is possible to have precise partially generalised the clicker cue to the sound of the pellet control over and measurement of induced changes in neurobiol- dispenser. Regardless of its precise cause, importantly, by using ogy. 
Therefore, establishing the feasibility of observing these model-derived estimates of the value, we were nonetheless able signatures in a freely behaving rodent, measured by using a valid to observe identical neural correlates of reward prediction and proxy for fMRI, and demonstrating how clinically relevant prediction error in both cue-counterbalanced groups. pharmacological perturbations affect these responses, may be Moreover, across both groups, a single “moderate/high” dose of an important step to bridge the gaps in our understanding. This amphetamine (1 mg/kg) [82] at the end of learning, mimicking a foundation, if combined with other causal manipulations (such as dopamine hyperactivity state, was sufficient to impair discrimina- pharmacological or genetic animal models relevant to psychiatric tive behavioural responses to the high and low probability cues disorders) and more sophisticated behavioural tasks that allow us and also to disrupt haemodynamic signals in both NAc and OFC. to take into account different learning strategies, e.g., refs. [90, 91] This was not simply a blunt pharmacological influence on and value parameters [7, 92], could therefore provide new neurovascular coupling as the signal change was specificto opportunities for understanding how dysfunctional neurotrans- certain conditions, for instance selectively reducing the NAc mission is reflected as changes in haemodynamic signatures and reward signals on CS trials but leaving the reward and omission how both relate to behavioural performance. Low responses on CS trials unaltered. 
High This may at first appear at odds with previous studies that have shown that this dose of amphetamine can selectively augment FUNDING AND DISCLOSURE responding to reward-predictive cues over neutral cues and This work was funded by a Lilly Research Award Program grant to enhance phasic dopamine release and neuronal activity in the MEW and GG, and a Wellcome Trust Senior Research Fellowship to NAc [36, 39] and neuronal activity in OFC [83]. However, there are MEW (202831/Z/16/Z). JRH, HMM and GG declare being employ- a number of important differences between these previous ees of Eli Lilly & Co Ltd.; JF, FG and MDT were employees of Eli Lilly studies and the one reported here. & Co at the time of research. JF is now an employee of Vertex First, in our paradigm, the cues were probabilistically Pharmaceuticals (Europe) and FG is an employee of H. Lundbeck rewarded, meaning that both were associated with a certain A/S. EW, SLS and MEW have no competing interests to declare. level of reward expectation and elicit conditioned approach. A previous study has shown that the same dose of amphetamine as used in the current study can disrupt conditional discrimina- ACKNOWLEDGEMENTS MEW, MDT, HMM and GG conceived the project, MEW, GG, FG and JF designed the tion performance [40]. Second, as discussed above, increased experiment, JF and FG collected the data, JF performed the surgeries, EW, SS, JRH and dopamine release may not necessarily map directly onto MEW analysed the data and MEW prepared the paper with input from the other comparable haemodynamic changes. Indeed, although it would authors. 
We would like to thank David Bannerman for valuable advice throughout the be expected that thedoseofamphetamine used in thecurrent project, Mike Conway for assistance with surgeries and Thomas Akam, Miriam Klein- experiment would boost phasic dopamine release to reward- Flugge, Stephen McHugh and Marios Panayi for discussions about the modelling, predicting cues [36], fMRI studies have tended to observe a interpretation and analyses. The study is dedicated to the memory of one of the blunting of haemodynamic responses to such cues in NAc and authors, Soon-Lim Shin, a key contributor to the project who sadly passed away to frontal cortex following administration of a single dose of cancer in 2017. amphetamine or methamphetamine, comparable to what we observed in both NAc and OFC [37, 84]. In addition, one of these studies [37] reported the loss of RPE encoding in NAc, an effect ADDITIONAL INFORMATION that was also evident in the current study. Amphetamine is Supplementary information is available for this paper at (https://doi.org/10.1038/ known to increase levels of dopamine—and other monoamines s41386-019-0564-8). —in a stimulus-independent as well as a stimulus-driven manner. Therefore, the critical factor for appropriate responding Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims is likely to rest on the balance of these two elements in in published maps and institutional affiliations. frontal–striatal–limbic circuits, and the disruption of haemody- namic signalling of incentive predictions and prediction errors we recorded may reflect this. From the present data, it is unclear REFERENCES whether amphetamine is directly corrupting calculations of RPEs 1. Watabe-Uchida M, Eshel N, Uchida N. Neural circuitry of reward prediction error. or is instead primarily disrupting the inputs then used to Annu Rev Neurosci 2017;40:373–94. 2. Montague PR, Dayan P, Sejnowski TJ. 
A framework for mesencephalic dopamine calculate the RPE such as cue-elicited reward expectation. systems based on predictive Hebbian learning. J Neurosci Off J Soc Neurosci The changes in haemodynamic responses observed after 1996;16:1936–47. amphetamine administration here are also consistent with an 3. Schultz W, Dayan P, Montague PR. A neural substrate of prediction and reward. increasingly large body of fMRI studies of reward-guided learning Science 1997;275:1593–9. in patients displaying symptoms that are believed to arise in part 4. Walton ME, Bouret S. What is the relationship between dopamine and effort? from dysregulated dopamine, such as psychosis and anhedonia Trends Neurosci 2018;42:79–91. (see refs. [16, 19, 25, 85] for reviews). This has raised the possibility 5. Cohen JY, Haesler S, Vong L, Lowell BB, Uchida N. Neuron-type-specific signals for that changes in behaviour and brain responses during reward reward and punishment in the ventral tegmental area. Nature 2012;482:85–8. anticipation and reinforcement learning might act as a cross- 6. Tobler PN, Fiorillo CD, Schultz W. Adaptive coding of reward value by dopamine neurons. Science 2005;307:1642–5. diagnostic preclinical translational biomarker [16, 86]. In parallel, Neuropsychopharmacology (2019) 0:1 – 11 Amphetamine disrupts haemodynamic correlates of prediction errors in. . . E Werlen et al. 7. Gan JO, Walton ME, Phillips PE. Dissociable cost and benefit encoding of future 33. Lowry JP, Griffin K, McHugh SB, Lowe AS, Tricklebank M, Sibson NR. Real-time rewards by mesolimbic dopamine. Nat Neurosci 2010;13:25–7. electrochemical monitoring of brain tissue oxygen: a surrogate for functional 8. Kishida KT, Saez I, Lohrenz T, Witcher MR, Laxton AW, Tatter SB, et al. Subsecond magnetic resonance imaging in rodents. NeuroImage 2010;52:549–55. dopamine fluctuations in human striatum encode superposed error signals about 34. Li J, Schwarz AJ, Gilmour G. 
Relating translational neuroimaging and ampero- actual and counterfactual reward. Proc Natl Acad Sci USA 2016;113:200–5. metric endpoints: utility for neuropsychiatric drug discovery. Curr Top Behav 9. Hart AS, Rutledge RB, Glimcher PW, Phillips PE. Phasic dopamine release in the rat Neurosci 2016;28:397–421. nucleus accumbens symmetrically encodes a reward prediction error term. J 35. Francois J, Huxter J, Conway MW, Lowry JP, Tricklebank MD, Gilmour G. Differ- Neurosci Off J Soc Neurosci 2014;34:698–704. ential contributions of infralimbic prefrontal cortex and nucleus accumbens 10. Saddoris MP, Cacciapaglia F, Wightman RM, Carelli RM. Differential dopamine during reward-based learning and extinction. J Neurosci: Off J Soc Neurosci release dynamics in the nucleus accumbens core and shell reveal complementary 2014;34:596–607. signals for error prediction and incentive motivation. J Neurosci Off J Soc Neu- 36. Daberkow DP, Brown HD, Bunner KD, Kraniotis SA, Doellman MA, Ragozzino ME, rosci 2015;35:11572–82. et al. Amphetamine paradoxically augments exocytotic dopamine release and 11. Ellwood IT, Patel T, Wadia V, Lee AT, Liptak AT, Bender KJ, et al. Tonic or phasic phasic dopamine signals. J Neurosci 2013;33:452–63. stimulation of dopaminergic projections to prefrontal cortex causes mice to 37. Bernacer J, Corlett PR, Ramachandra P, McFarlane B, Turner DC, Clark L, et al. maintain or deviate from previously learned behavioral strategies. J Neurosci Off Methamphetamine-induced disruption of frontostriatal reward learning signals: J Soc Neurosci 2017;37:8315–29. relation to psychotic symptoms. Am J Psychiatry 2013;170:1326–34. 12. Zaghloul KA, Blanco JA, Weidemann CT, McGill K, Jaggi JL, Baltuch GH, et al. 38. Curran C, Byrappa N, McBride A. Stimulant psychosis: systematic review. Br J Human substantia nigra neurons encode unexpected financial rewards. Science Psychiatry 2004;185:196–204. 2009;323:1496–9. 39. Wan X, Peoples LL. 
Amphetamine exposure enhances accumbal responses to 13. Saunders BT, Robinson TE. The role of dopamine in the accumbens core in the reward-predictive stimuli in a pavlovian conditioned approach task. J Neurosci expression of Pavlovian-conditioned responses. Eur J Neurosci 2012;36:2521–32. 2008;28:7501–12. 14. Flagel SB, Clark JJ, Robinson TE, Mayo L, Czuj A, Willuhn I, et al. A selective role for 40. Dunn MJ, Futter D, Bonardi C, Killcross S. Attenuation of d-amphetamine-induced dopamine in stimulus-reward learning. Nature 2011;469:53–7. disruption of conditional discrimination performance by alpha-flupenthixol. 15. Parkinson JA, Dalley JW, Cardinal RN, Bamford A, Fehnert B, Lachenal G, et al. Psychopharmacology 2005;177:296–306. Nucleus accumbens dopamine depletion impairs both acquisition and perfor- 41. St Onge JR, Chiu YC, Floresco SB. Differential effects of dopaminergic manip- mance of appetitive Pavlovian approach behaviour: implications for mesoac- ulations on risky choice. Psychopharmacology 2010;211:209–21. cumbens dopamine function. Behav Brain Res 2002;137:149–63. 42. Francois J, Conway MW, Lowry JP, Tricklebank MD, Gilmour G. Changes in 16. Deserno L, Schlagenhauf F, Heinz A. Striatal dopamine, reward, and decision reward-related signals in the rat nucleus accumbens measured by in vivo oxygen making in schizophrenia. Dialogues Clin Neurosci 2016;18:77–89. amperometry are consistent with fMRI BOLD responses in man. NeuroImage 17. Maia TV, Frank MJ. An integrative perspective on the role of dopamine in schi- 2012;60:2169–81. zophrenia. Biol Psychiatry 2017;81:52–66. 43. Behrens TE, Hunt LT, Woolrich MW, Rushworth MF. Associative learning of social 18. Garcia-Garcia I, Zeighami Y, Dagher A. Reward prediction errors in drug addiction value. Nature 2008;456:245–9. and Parkinson's disease: from neurophysiology to neuroimaging. Curr Neurol 44. Rutledge RB, Dean M, Caplin A, Glimcher PW. Testing the reward prediction error Neurosci Rep 2017;17:46. 
hypothesis with an axiomatic model. J Neurosci 2010;30:13525–36. 19. Zald DH, Treadway MT. Reward processing, neuroeconomics, and psycho- 45. Niv Y, Edlund JA, Dayan P, O'Doherty JP. Neural prediction errors reveal a risk- pathology. Annu Rev Clin Psychol 2017;13:471–95. sensitive reinforcement-learning process in the human brain. J Neurosci 20. Murray GK, Corlett PR, Clark L, Pessiglione M, Blackwell AD, Honey G, et al. 2012;32:551–62. Substantia nigra/ventral tegmental reward prediction error disruption in psy- 46. Klein-Flugge MC, Hunt LT, Bach DR, Dolan RJ, Behrens TE. Dissociable reward and chosis. Mol Psychiatry 2008;13:239. 67-76 timing signals in human midbrain and ventral striatum. Neuron 2011;72:654–64. 21. Huys QJ, Pizzagalli DA, Bogdan R, Dayan P. Mapping anhedonia onto 47. Chase HW, Kumar P, Eickhoff SB, Dombrovski AY. Reinforcement learning models reinforcement learning: a behavioural meta-analysis. Biol Mood Anxiety Disord. and their neural correlates: an activation likelihood estimation meta-analysis. 2013;3:12. Cognitive, affective &. Behav Neurosci 2015;15:435–59. 22. Pizzagalli DA, Iosifescu D, Hallett LA, Ratner KG, Fava M. Reduced hedonic 48. Parkinson JA, Olmstead MC, Burns LH, Robbins TW, Everitt BJ. Dissociation in capacity in major depressive disorder: evidence from a probabilistic reward task. J effects of lesions of the nucleus accumbens core and shell on appetitive pavlo- Psychiatr Res 2008;43:76–87. vian approach behavior and the potentiation of conditioned reinforcement and 23. Dowd EC, Frank MJ, Collins A, Gold JM, Barch DM. Probabilistic reinforcement locomotor activity by D-amphetamine. J Neurosci 1999;19:2401–11. learning in patients with schizophrenia: relationships to anhedonia and avolition. 49. du Hoffmann J, Nicola SM. Dopamine invigorates reward seeking by promoting Biol Psychiatry Cogn Neurosci Neuroimaging 2016;1:460–73. cue-evoked excitation in the nucleus accumbens. J Neurosci 2014;34:14349–64. 24. 
Kumar P, Goer F, Murray L, Dillon DG, Beltzer ML, Cohen AL, et al. Impaired 50. Hart AS, Clark JJ, Phillips PEM. Dynamic shaping of dopamine signals during reward prediction error encoding and striatal-midbrain connectivity in depres- probabilistic Pavlovian conditioning. Neurobiol Learn Mem 2015;117:84–92. sion. Neuropsychopharmacology. 2018;43:1581–88. 51. Day JJ, Roitman MF, Wightman RM, Carelli RM. Associative learning mediates 25. Radua J, Schmidt A, Borgwardt S, Heinz A, Schlagenhauf F, McGuire P, et al. dynamic shifts in dopamine signaling in the nucleus accumbens. Nat Neurosci Ventral striatal activation during reward processing in psychosis: a neurofunc- 2007;10:1020–8. tional meta-analysis. JAMA Psychiatry 2015;72:1243–51. 52. Hamid AA, Pettibone JR, Mabrouk OS, Hetrick VL, Schmidt R, Vander Weele CM, 26. Rausch F, Mier D, Eifler S, Esslinger C, Schilling C, Schirmbeck F, et al. Reduced et al. Mesolimbic dopamine signals the value of work. Nat Neurosci activation in ventral striatum and ventral tegmental area during probabilistic 2016;19:117–26. decision-making in schizophrenia. Schizophrenia Res 2014;156:143–9. 53. McHugh SB, Marques-Smith A, Li J, Rawlins JN, Lowry J, Conway M, et al. 27. Rothkirch M, Tonn J, Kohler S, Sterzer P. Neural mechanisms of reinforcement Hemodynamic responses in amygdala and hippocampus distinguish between learning in unmedicated patients with major depressive disorder. Brain aversive and neutral cues during Pavlovian fear conditioning in behaving rats. 2017;140:1147–57. Eur J Neurosci 2013;37:498–507. 28. Gradin VB, Kumar P, Waiter G, Ahearn T, Stickle C, Milders M, et al. Expected value 54. Li J, Bravo DS, Louise Upton A, Gilmour G, Tricklebank MD, Fillenz M, et al. Close and prediction error abnormalities in depression and schizophrenia. Brain temporal coupling of neuronal activity and tissue oxygen responses in rodent 2011;134:1751–64. whisker barrel cortex. Eur J Neurosci 2011;34:1983–96. 29. 
Morris RW, Vercammen A, Lenroot R, Moore L, Langton JM, Short B, et al. Dis- 55. Knutson B, Gibbs SE. Linking nucleus accumbens dopamine and blood oxyge- ambiguating ventral striatum fMRI-related BOLD signal during reward prediction nation. Psychopharmacology 2007;191:813–22. in schizophrenia. Mol Psychiatry 2012;17:235. 80-9 56. Lohani S, Poplawsky AJ, Kim SG, Moghaddam B. Unexpected global impact of 30. Rutledge RB, Moutoussis M, Smittenaar P, Zeidman P, Taylor T, Hrynkiewicz L, VTA dopamine neuron activation as measured by opto-fMRI. Mol Psychiatry et al. Association of neural and emotional impacts of reward prediction errors 2017;22:585–94. with major depression. JAMA Psychiatry 2017;74:790–97. 57. Decot HK, Namboodiri VM, Gao W, McHenry JA, Jennings JH, Lee SH, et al. 31. Oades RD, Halliday GM. Ventral tegmental (A10) system: neurobiology. 1. Anat- Coordination of brain-wide activity dynamics by dopaminergic neurons. Neu- omy and connectivity. Brain Res 1987;434:117–65. ropsychopharmacology. 2017;42:615–27. 32. Swanson LW. The projections of the ventral tegmental area and adjacent regions: 58. Ferenczi EA, Zalocusky KA, Liston C, Grosenick L, Warden MR, Amatya D, et al. a combined fluorescent retrograde tracer and immunofluorescence study in the Prefrontal cortical regulation of brainwide circuit dynamics and reward-related rat. Brain Res Bull 1982;9:321–53. behavior. Science 2016;351:aac9698. Neuropsychopharmacology (2019) 0:1 – 11 Amphetamine disrupts haemodynamic correlates of prediction errors in. . . E Werlen et al. 59. Lohrenz T, Kishida KT, Montague PR. BOLD and its connection to dopamine 78. Schilman EA, Uylings HB, Galis-de Graaf Y, Joel D, Groenewegen HJ. The orbital release in human striatum: a cross-cohort comparison. Philos Trans R Soc Lond B cortex in rats topographically projects to central parts of the caudate-putamen Biol Sci 2016;371:pii: 20150352. complex. Neurosci Lett 2008;432:40–5. 60. Kennerley SW, Behrens TE, Wallis JD. 
Double dissociation of value computations 79. Pavlov IP. Conditioned Reflexes. Oxford: Oxford University Press; 1927. in orbitofrontal and anterior cingulate neurons. Nat Neurosci 2011;14:1581–9. 80. Mackintosh NJ. Overshadowing and stimulus intensity. Anim Learn Behav. 61. Warren CM, Hyman JM, Seamans JK, Holroyd CB. Feedback-related negativity 1976;4:186–92. observed in rodent anterior cingulate cortex. J Physiol Paris 2015;109:87–94. 81. Jakubowska E, Zielinski K. Differentiation learning as a function of stimulus 62. Sul JH, Kim H, Huh N, Lee D, Jung MW. Distinct roles of rodent orbitofrontal and intensity and previous experience with the CS. Acta Neurobiol Exp medial prefrontal cortex in decision making. Neuron 2010;66:449–60. 1976;36:427–46. 63. O'Doherty JP, Dayan P, Friston K, Critchley H, Dolan RJ. Temporal difference 82. Grilly DM, Loveland A. What is a "low dose" of d-amphetamine for inducing models and reward-related learning in the human brain. Neuron 2003;38:329–37. behavioral effects in laboratory rats? Psychopharmacology 2001;153:155–69. 64. Rohe T, Weber B, Fliessbach K. Dissociation of BOLD responses to reward pre- 83. Homayoun H, Moghaddam B. Orbitofrontal cortex neurons as a common target diction errors and reward receipt by a model comparison. Eur J Neurosci for classic and glutamatergic antipsychotic drugs. Proc. Natl Acad. Sci. USA 2012;36:2376–82. 2008;105(46):18041–6. 65. Stalnaker TA, Liu TL, Takahashi YK, Schoenbaum G. Orbitofrontal neurons signal 84. Knutson B, Bjork JM, Fong GW, Hommer D, Mattay VS, Weinberger DR. Amphe- reward predictions, not reward prediction errors. Neurobiol Learn Mem tamine modulates human incentive processing. Neuron 2004;43:261–9. 2018;153:137–43. 85. Heinz A, Schlagenhauf F. Dopaminergic dysfunction in schizophrenia: salience 66. Klein-Flugge MC, Barron HC, Brodersen KH, Dolan RJ, Behrens TE. Segregated attribution revisited. Schizophr Bull 2010;36:472–85. 
encoding of reward-identity and stimulus-reward associations in human orbito- 86. Gilmour G, Gastambide F, Marston HM, Walton ME. Using intermediate cognitive frontal cortex. J Neurosci 2013;33:3202–11. endpoints to facilitate translational research in psychosis. Curr Opin Behav Sci 67. Kahnt T, Heinzle J, Park SQ, Haynes JD. Decoding the formation of reward pre- 2015;4:128–35. dictions across learning. J Neurosci 2011;31:14624–30. 87. Corlett PR, Fletcher PC. Computational psychiatry: a Rosetta Stone linking the 68. Schoenbaum G, Chiba AA, Gallagher M. Orbitofrontal cortex and basolateral brain to mental illness. Lancet Psychiatry 2014;1:399–402. amygdala encode expected outcomes during learning. Nat Neurosci 88. Maia TV, Huys QJM, Frank MJ. Theory-based computational psychiatry. Biol Psy- 1998;1:155–9. chiatry 2017;82:382–84. 69. Morrison SE, Salzman CD. The convergence of information about rewarding and 89. Valton V, Romaniuk L, Douglas Steele J, Lawrie S, Series P. Comprehensive review: aversive stimuli in single neurons. J Neurosci 2009;29:11471–83. computational modelling of schizophrenia. Neurosci Biobehav Rev 70. Kennerley SW, Wallis JD. Evaluating choices by single neurons in the frontal lobe: 2017;83:631–46. outcome value encoded across multiple decision variables. Eur J Neurosci 90. Akam T, Costa R, Dayan P. Simple plans or sophisticated habits? state, transition 2009;29:2061–73. and learning interactions in the two-step task. PLoS Comput Biol 2015;11: 71. Ogawa M, van der Meer MA, Esber GR, Cerri DH, Stalnaker TA, Schoenbaum G. e1004648. Risk-responsive orbitofrontal neurons track acquired salience. Neuron 91. Miller KJ, Botvinick MM, Brody CD. Dorsal hippocampus contributes to model- 2013;77:251–8. based planning. Nat Neurosci 2017;20:1269–76. 72. Harle KM, Zhang S, Ma N, Yu AJ, Paulus MP. Reduced neural recruitment for 92. Hollon NG, Arnold MM, Gan JO, Walton ME, Phillips PE. 
Dopamine-associated bayesian adjustment of inhibitory control in methamphetamine dependence. Biol Psychiatry Cogn Neurosci Neuroimaging 2016;1:448–59.
cached values are not sufficient as the basis for action selection. Proc Natl Acad Sci USA 2014;111(51):18357–62.
73. McDannald MA, Esber GR, Wegener MA, Wied HM, Liu TL, Stalnaker TA, et al. Orbitofrontal neurons acquire responses to ‘valueless' Pavlovian cues during unblocking. Elife. 2014;3:e02653.
74. Schiller D, Weiner I. Lesions to the basolateral amygdala and the orbitofrontal cortex but not to the medial prefrontal cortex produce an abnormally persistent latent inhibition in rats. Neuroscience 2004;128:15–25.
75. Esber GR, Haselgrove M. Reconciling the influence of predictiveness and uncertainty on stimulus salience: a model of attention in associative learning. Proc. Biol. Sci. 2011;278(1718):2553–61.
76. Chudasama Y, Robbins TW. Dissociable contributions of the orbitofrontal and infralimbic cortex to pavlovian autoshaping and discrimination reversal learning: further evidence for the functional heterogeneity of the rodent frontal cortex. J Neurosci 2003;23:8771–80.
77. Everitt BJ, Parkinson JA, Olmstead MC, Arroyo M, Robledo P, Robbins TW. Associative processes in addiction and reward. The role of amygdala-ventral striatal subsystems. Ann N Y Acad Sci 1999;877:412–38.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

© The Author(s) 2019

Neuropsychopharmacology, Springer Journals

Amphetamine disrupts haemodynamic correlates of prediction errors in nucleus accumbens and orbitofrontal cortex

Publisher
Springer Journals
Copyright
Copyright © 2019 by The Author(s)
Subject
Medicine & Public Health; Medicine/Public Health, general; Psychiatry; Neurosciences; Behavioral Sciences; Pharmacotherapy; Biological Psychology
ISSN
0893-133X
eISSN
1740-634X
DOI
10.1038/s41386-019-0564-8

Abstract

www.nature.com/npp
ARTICLE OPEN

Amphetamine disrupts haemodynamic correlates of prediction errors in nucleus accumbens and orbitofrontal cortex

Emilie Werlen¹, Soon-Lim Shin¹, Francois Gastambide², Jennifer Francois², Mark D. Tricklebank³, Hugh M. Marston², John R. Huxter², Gary Gilmour² and Mark E. Walton¹,⁴

In an uncertain world, the ability to predict and update the relationships between environmental cues and outcomes is a fundamental element of adaptive behaviour. This type of learning is typically thought to depend on prediction error, the difference between expected and experienced events, which in the reward domain has been closely linked to mesolimbic dopamine. There is also increasing behavioural and neuroimaging evidence that disruption to this process may be a cross-diagnostic feature of several neuropsychiatric and neurological disorders in which dopamine is dysregulated. However, the precise relationship between haemodynamic measures, dopamine and reward-guided learning remains unclear. To help address this issue, we used a translational technique, oxygen amperometry, to record haemodynamic signals in the nucleus accumbens (NAc) and orbitofrontal cortex (OFC), while freely moving rats performed a probabilistic Pavlovian learning task. Using a model-based analysis approach to account for individual variations in learning, we found that the oxygen signal in the NAc correlated with a reward prediction error, whereas in the OFC it correlated with an unsigned prediction error or salience signal. Furthermore, an acute dose of amphetamine, creating a hyperdopaminergic state, disrupted rats’ ability to discriminate between cues associated with either a high or a low probability of reward and concomitantly corrupted prediction error signalling. These results demonstrate parallel but distinct prediction error signals in NAc and OFC during learning, both of which are affected by psychostimulant administration. Furthermore, they establish the viability of tracking and manipulating haemodynamic signatures of reward-guided learning observed in human fMRI studies by using a proxy signal for BOLD in a freely behaving rodent.

Neuropsychopharmacology (2019) 0:1–11; https://doi.org/10.1038/s41386-019-0564-8

¹Department of Experimental Psychology, University of Oxford, Tinsley Building, Mansfield Road, Oxford OX1 3SR, UK; ²Eli Lilly & Co Ltd, Erl Wood Manor, Windlesham GU20 6PH, UK; ³Department of Neuroimaging Sciences, Institute of Psychiatry, Kings College London, London, UK and ⁴Wellcome Centre for Integrative Neuroimaging, University of Oxford, Oxford, UK
Correspondence: Gary Gilmour (gilmour_gary@lilly.com) or Mark E. Walton (mark.walton@psy.ox.ac.uk)
Deceased: Soon-Lim Shin
Received: 29 March 2019 Revised: 2 October 2019 Accepted: 29 October 2019

INTRODUCTION
The world is an uncertain place, where behaviour of animals must continuously change to promote optimal survival. Learning to predict the relationship between environmental cues and significant events is a critical element of adaptive behaviour. It is hypothesised that adaptive behaviour depends upon comparisons of neural representations of cue-evoked expectations of events with events that actually occurred. Mismatch between these two representations is defined as a prediction error, and is likely a vital substrate by which accuracy of ensuing predictions about cue-event relationships can be improved. Prediction errors related to receipt of reward have been strongly associated with dopaminergic neurons and their projections to frontostriatal circuits [1–4]. In rodents and humans, presentation of reward-predicting cues causes an increase in dopaminergic neuron activity and dopamine release in terminal regions, not only in proportion to the expected value of the upcoming reward but also to the deviation from that expectation when the reward is actually delivered [5–12]. Furthermore, experimental disruption of dopaminergic transmission can impair formation of appropriate cue–reward associations [13–15].

From a human perspective, elements of reward learning can be disrupted in a variety of neuropsychiatric conditions where dopaminergic dysfunction may play a central role [16–19]. For example, patients with major depressive disorder or schizophrenia can be insensitive to reward and display impairments in reward-learning behaviours [20–24]. Neuroimaging studies suggest that activation of parts of ventral striatum and frontal cortex, or changes in functional connectivity with these regions, may be an important neurophysiological correlate of reward-learning impairments [24–29]. However, not all studies show the same patterns of changes (e.g., ref. [30]), and there is still uncertainty over whether the blunting of neural responses reflects a primary aetiology in these disorders. While the potential links between dopaminergic dysregulation, disrupted neural signatures of reward-guided learning and neuropsychiatric symptoms are manifest, strong direct evidence is currently lacking.

To help bridge this gap, we used constant-potential amperometry to monitor haemodynamic responses simultaneously in the nucleus accumbens (NAc) and orbitofrontal cortex (OFC) in rats performing a reward-driven, probabilistic Pavlovian learning task. Both NAc and OFC regions receive dopaminergic input, and have been previously implicated in representing the expected value of a cue to guide reward-learning behaviour [31, 32]. Amperometric tissue oxygen (T_O2) signals likely originate from
equivalent physiological mechanisms as fMRI BOLD signals [33–35], allowing cross-species comparisons of behaviourally driven haemodynamic signals in awake animals. The aim of this study was to define amperometric signatures of cue-evoked expectation of reward and prediction error in these regions and then investigate how they are modulated by administration of amphetamine. Amphetamine is known to modify physiological dopamine signalling [36], and in humans, even a single dose of a stimulant like methamphetamine can cause an increase in mild psychotic symptoms [37, 38]. While amphetamine can promote behavioural approach to rewarded cues [39], it can also impair conditional discrimination performance [40] and the influence of probabilistic cue–reward associations on subsequent decision-making [41]. We hypothesised that amphetamine would disrupt discriminative responses to cues during performance of a probabilistic Pavlovian task, with concomitant changes to the haemodynamic correlates of reward expectation and prediction error in the NAc and OFC.

METHODS
See SI for detailed methods

Animals. All experiments were conducted in accordance with the United Kingdom Animals (Scientific Procedures) Act 1986. Adult male Sprague Dawley rats (Charles River, UK) were used in the present studies (n = 36). Four animals did not contribute to the behavioural dataset owing to poor O2 calibration responses, and the data from an additional two animals could not be included owing to a computer error. During testing, they were maintained at >85% of their free-feeding weight relative to their normal growth curve. Prior to the start of any training or testing, all animals underwent surgical procedures under general anaesthesia to implant carbon paste electrodes targeted bilaterally at the NAc and OFC.

O2 amperometry data recording. O2 signals were recorded from the NAc and OFC by using constant-potential amperometry (–650 mV applied for the duration of the session) as described previously [35, 42].

Probabilistic Pavlovian conditioning task. The task was a probabilistic Pavlovian learning task performed in standard operant chambers. Each trial consisted of a 10-s presentation of one of two auditory cues (3-kHz pure tone at 77 dB or 100-Hz clicker at 76 dB) followed immediately by either delivery or omission of reward (4 × 45 mg of sucrose food pellets). One of the auditory cues (CS_High) was followed by reward delivery on 75% of trials, the other (CS_Low) was rewarded on 25% of trials. Each session consisted of a total of 56 cue presentations, with an average intertrial interval of 45 s (range 30–60 s). Standard training took place over nine sessions and session 10 consisted of the drug challenge (see Fig. S1).

Pharmacological manipulations. D-amphetamine sulfate (Sigma, UK) was dissolved in 5% (w/v) glucose solution, and pH adjusted towards neutral with the dropwise addition of 1 M NaOH as necessary. Amphetamine was dosed at 1 mg/kg (free weight) via the intraperitoneal route.

Behavioural modelling. Head entries during the 10-s cue presentation were modelled by using variations of a Rescorla–Wagner model (Rescorla & Wagner 1972). We started with a model with a single free parameter, the learning rate α, and compared this against other models that also included free parameters specifying (a) cue-specific learning rates (i.e., a cue salience term, β); (b) separate learning rates for rewarded (α_pos) and nonrewarded (α_neg) trials and either (c) cue-independent unconditioned magazine responding, k, or (d) cue-specific unconditioned magazine responding, k_Clicker and k_Tone. To capture additional trial-by-trial variance, we also included either trial-specific or recency-weighted pre-cue response rates. To compare the models, we used the Bayesian information criterion (BIC), which penalises the likelihood of a model by the number of parameters and the natural logarithm of the number of data points.

Data analysis
Behaviour: We analysed the average number of head entries into the food magazine during presentation of either the CS_High or CS_Low cue across the 9 days of training, and then compared the pre-drug day (day 9 of training) with the drug challenge day.

Amperometry: We performed two sets of complementary analyses: (i) model-free analyses, where we investigated the average signals in NAc and OFC during cue presentation or in the 30 s after outcome delivery over the course of learning and after amphetamine administration, and (ii) model-based analyses where we regressed the same signals against estimates from our computational model using the model with the lowest BIC score (Fig. 1d).

RESULTS
Behavioural performance during probabilistic learning
We trained rats on a two-cue probabilistic Pavlovian learning paradigm. One cue (CS_High) was associated with reward delivery on 75% of trials and the other (CS_Low) on 25% of trials (Fig. 1a). As can be observed, animals learned to discriminate between the cues, increasingly making magazine responses during presentation of the CS_High but showing little change in behaviour upon presentation of CS_Low as training progressed (main effect of CS: F(1,27) = 39.92, p < 0.001; CS × day interaction: F(2.63,70.92) = 6.12, p = 0.001) (Fig. 1b, Fig. S2). Unexpectedly, however, there was also a substantial and consistent influence of the counterbalancing assignment on responding (CS × cue identity interaction: F(1,27) = 48.17, p < 0.001). Specifically, follow-up pairwise comparisons showed that the animals in Group 1, where CS_High was assigned to be the clicker cue and CS_Low the pure tone (“CL1-T2”), exhibited strong discrimination between the cues throughout training (p < 0.001; Fig. 1c). By contrast, rats in Group 2, with the opposite CS-auditory cue assignment (“T1-CL2”), did not show differential responding to the cues in spite of the different reward associations (p = 0.66).

To better understand how the cue identity was influencing this pattern of responding, we formally analysed how well different simple reinforcement learning models could describe individual rats’ Pavlovian behaviour. The preferred model included a cue salience parameter and a cue-specific unconditioned magazine responding term, as well as a recency-weighted pre-cue responding parameter (Fig. 1d). In particular, the constant term attributable to unconditioned cue-elicited magazine responding was higher on clicker than tone trials (Z = 3.98, p < 0.001, Wilcoxon signed-rank test; p < 0.015 for Group 1 or 2 analysed separately). Therefore, once the difference in cue attributes was accounted for, rats’ behaviour could be well explained by using this modified simple reinforcement learning model (Fig. 1e).

Both NAc and OFC haemodynamic signals track Pavlovian responding
We examined how T_O2 responses in NAc and OFC (Fig. 2) tracked animals’ learning of the appetitive associations and violations of their expectations. After exclusions for misplaced electrodes and poor quality of signals (see Supplementary Methods, Fig. S3), 40 electrodes in 20 rats were included for analysis (NAc = 25 electrodes from 15 rats, OFC = 15 electrodes from 11 rats).

We initially performed model-free analyses to investigate the T_O2 signals in response to presentation of the CS_High and CS_Low cues as the rats learned the reward associations.
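The model family described under Behavioural modelling can be illustrated with a minimal sketch. It is not the authors' fitting code: it assumes the textbook Rescorla–Wagner update (prediction error = outcome minus current cue value), treats cue salience as a hypothetical multiplier on the learning rate, and omits the unconditioned responding terms (k) and pre-cue response rates, whose exact form is given only in the Supplementary Information. All function and variable names are illustrative.

```python
import math
import random


def rescorla_wagner(cues, rewards, alpha, salience=None):
    """Return per-trial cue values V and signed prediction errors.

    cues     : sequence of cue labels, one per trial
    rewards  : sequence of outcomes (1 = reward, 0 = omission)
    alpha    : learning rate
    salience : optional dict of cue-specific multipliers on alpha
               (assumed stand-in for the cue salience term, beta)
    """
    salience = salience or {}
    current = {}                      # latest value estimate for each cue
    values, errors = [], []
    for cue, r in zip(cues, rewards):
        v = current.get(cue, 0.0)     # unseen cues start at V = 0
        delta = r - v                 # signed prediction error
        values.append(v)
        errors.append(delta)
        current[cue] = v + alpha * salience.get(cue, 1.0) * delta
    return values, errors


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion; the lower score is preferred."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def simulate_trials(n_trials, p_high=0.75, p_low=0.25, seed=0):
    """Draw a random cue on each trial and reward it with its probability."""
    rng = random.Random(seed)
    cues, rewards = [], []
    for _ in range(n_trials):
        cue = rng.choice(["CS_High", "CS_Low"])
        p = p_high if cue == "CS_High" else p_low
        cues.append(cue)
        rewards.append(1 if rng.random() < p else 0)
    return cues, rewards


# Nine training sessions of 56 cue presentations, as in the task description.
cues, rewards = simulate_trials(9 * 56)
values, errors = rescorla_wagner(cues, rewards, alpha=0.1)
unsigned = [abs(d) for d in errors]   # unsigned prediction error (surprise)
```

In a model-based analysis of the kind described, the per-trial `values`, `errors` and `unsigned` series would serve as regressors against the recorded signals, and `bic` would be evaluated for each candidate model once a likelihood for the observed head entries has been defined.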
Fig. 1 Task, behavioural performance and modelling. a Schematic of the Pavlovian task. b, c Average head entries (mean ± SEM) to the magazine during presentation of each cue or during the pre-cue baseline period across the nine sessions (b, all animals; c, Group CL1–T2 only, where the CS_High was the clicker and CS_Low was the tone, and Group T1–CL2 only, where the CS_High was the tone and CS_Low the clicker). d Bayesian information criterion (BIC, a measure of the goodness of fit of the model) estimates for different learning models. The BIC penalises the likelihood of a model by the number of parameters and the natural logarithm of the number of data points. The model with the lowest BIC score was deemed to give a better fit of the data. Note, however, the patterns of results in the model-based analyses of amperometric signals remained unchanged if we used any of the three models that fitted best for a number of individual rats (models a, c and f) or even if we used a standard Rescorla-Wagner-type learning model (model s). An “x” in the table indicates the presence of the given component in the model. Numbers within each bar indicate the number of animals for which the given model had the lowest BIC. e Example of the model fits for two animals from the two counterbalancing groups

T_O2 responses during presentation of the cues changed markedly over training in a similar manner in both brain regions (main effects of cue and session: both F > 9.49, p < 0.001) (Fig. 3a, b). In fact, analysis of the subset of animals with functional electrodes recorded simultaneously in NAc and OFC (n = 6 rats) showed a significant positive correlation between the signals recorded in each area (r = 0.41, p < 0.01). Moreover, mirroring the behavioural data, the patterns of responses differed substantially according to the cue identity (CL1–T2 or T1–CL2). While the T_O2 response following the CS_High developed similarly in both groups, there was a substantial difference in the CS_Low response, with the average signals when the clicker was the CS_Low being significantly higher than when the tone was the CS_Low (cue × cue identity interaction: F(1,36) = 18.67, p < 0.001; CL1–T2 vs. T1–CL2, CS_High: p = 0.32, CS_Low: p < 0.001).

Fig. 2 Electrode location in OFC (a) and NAc (b). The NAc electrodes were clustered mainly in the ventromedial NAc, including ventromedial core and shell, while the OFC electrodes were in the ventral orbital sector

To establish the relationship between development of magazine responding and the T_O2 signals, we regressed the model-derived estimates of cue value from the model that best fitted the behavioural data, V(t), against the trial-by-trial T_O2 responses and found a significant positive relationship in both regions in both groups (Fig. 3c). This was not simply a correlate of invigorated responding as cue value was a significantly better predictor than trial-by-trial magazine head entries (Fig. S4). Therefore, once differences in cue identity are accounted for, it is possible to demonstrate that T_O2 signals in both NAc and OFC track the expected value associated with each cue.

Fig. 3 Haemodynamic correlates during cue presentation. a T_O2 responses on 3 sample days time-locked to cue presentation in the two counterbalance groups recorded from either NAc (upper panels) or OFC (lower panels). b Average area-under-the-curve responses (mean ± SEM) extracted from 5 to 10 s after cue onset for each cue across the nine sessions. c Average effect sizes in NAc (left panel) and OFC (right panel) from a general linear model relating T_O2 responses to trial-by-trial estimates of the expected value associated with each cue. Main plots include all animals; insets show the analyses divided up into the two cue identity groups

Separate haemodynamic signatures of signed and unsigned prediction errors in NAc and OFC
We next investigated how the probabilistic delivery or omission of reward shaped T_O2 responses in NAc and OFC, and how these signals were shaped by cue-elicited reward expectations as learning progressed. As there were significant interactions between brain region with cue and training stage and their combination (all F > 3.76, p < 0.028), we here analysed responses in the two regions separately.

We again first performed model-free analyses, by focusing on how the average outcome-evoked changes in T_O2 responses were influenced by the preceding cue and how these adapted over training. As can be seen in Fig. 4, the primary determinant of the signal change in the NAc was whether a reward was received (main effect of reward: F(1,36) = 47.72, p < 0.001). However, the size of reward and no-reward signals, normalised to the time of outcome, depended on which cue had preceded the outcome, and these patterns altered as learning progressed (significant cue × outcome and cue × outcome × training-stage interactions, both F > 18.669, p < 0.001), suggesting a strong influence of expectation on T_O2 responses. Follow-up comparisons showed that there was a reduction in reward-elicited T_O2 responses on CS_High trials as training progressed (CS_High reward, stage 1 vs. stage 3: p = 0.016; stage 2 vs. stage 3: p = 0.065). Unexpectedly, there was also a diminution of omission-elicited reductions in T_O2 responses on these trials (CS_High no reward, stage 1 vs. stage 3: p = 0.014), which, from Fig. 4b, can be seen to be particularly prominent in the CL1–T2 group. By contrast, on CS_Low trials, there was no meaningful change in T_O2 responses to delivery or omission of reward throughout training (all p > 0.37). As might be expected given its effect on behaviour and cue-elicited neural signals, cue identity again influenced outcome signals, resulting in a four-way interaction of all the factors, as well as two-way interactions between cue × identity and training stage × identity (all F > 4.33, p < 0.019). Importantly, however, when analysed separately, both counterbalance groups showed the key cue × outcome × training-stage interaction (F > 4.84, p < 0.019).

In the OFC, outcome was also a strong influence on T_O2 responses (F(1,13) = 51.18, p < 0.001) and this was again shaped by the preceding cue (cue × outcome interaction: F(1,13) = 5.37, p = 0.037). However, unlike in NAc, there came to be an increasingly strong T_O2 response when reward was omitted, particularly after CS_High (Fig. 4c, d). This resulted in a cue × training-stage interaction (F(1,13) = 5.37, p = 0.037; CS_High vs. CS_Low, p = 0.51 for the early training stage, p < 0.001 for mid- and late stages). While there were qualitative differences in responses in the two counterbalance groups, none of the interactions with this factor or the main effect reached significance (all p > 0.069).

While these model-free analyses illustrate that the T_O2 responses changed dynamically and differently over the course of training in the two brain regions, they do not clearly show whether either response might encode a teaching signal useful for learning, such as a reward PE: δ(t) = V(t) + r(t) − V(t−1). Therefore, we next used model-based analyses to examine whether there was a relationship between T_O2 responses across all sessions and the fundamental components of a reward PE: (i) a positive influence of outcome, r(t), and (ii) a negative influence of model-derived cue value, −V(t−1). While both NAc and OFC T_O2 responses showed a strong positive influence of outcome, only the NAc signals fulfilled both criteria of a reward PE by also exhibiting a significant negative influence of cue value; in OFC, by contrast, the cue value effect was positive (Fig. 4e). We also examined whether reward PE-like T_O2 responses in NAc were present throughout training. This showed that while correlates of both NAc positive and negative reward PEs can be observed in rats that are still learning the cue–reward associations, once these are appropriately established only positive reward PEs remain evident (Fig. S5).

Although the OFC T_O2 responses do not correspond to a reward PE, the patterns of signals nonetheless still dynamically change over learning. As previous work has suggested that OFC neurons may signal the salience of the outcome for learning, we examined whether T_O2 responses instead correlated with how unexpected each outcome was, corresponding to an unsigned PE. This analysis showed that each animal’s trial-by-trial unsigned PE had a strong positive influence on OFC signals (Fig. S6). Again, this was present in both counterbalance groups.

Therefore, while the changes in NAc T_O2 responses reflect how much better or worse an outcome was than expected, OFC T_O2 responses indicate how surprising either was.

Fig. 4 Haemodynamic correlates following outcome presentation. a, c T_O2 responses on 3 sample days, time-locked to outcome presentation (reward or no reward) after each cue in the two cue identity groups recorded from either NAc (a) or OFC (c). b, d Average area-under-the-curve responses (mean ± SEM) extracted from 30 s following the outcome after each cue across the nine sessions (NAc, panel b; OFC, panel d). e Average effect sizes in NAc (left panel) or OFC (right panel) from a general linear model relating T_O2 responses to trial-by-trial estimates of the expected value associated with each cue and trial outcome (reward or no reward). Main plots include all animals; insets show the analyses divided into the two cue identity groups

Amphetamine disrupts cue-specific value encoding and prediction errors
Having established haemodynamic PE correlates in NAc and OFC, we next wanted to investigate how an acute dose of amphetamine (1 mg/kg), an indirect sympathomimetic known to potentiate dopamine release, influenced cue value and prediction error T_O2 responses.

We first analysed baseline magazine responding in the pre-drug and the drug administration sessions. Although amphetamine caused a numeric increase in baseline responding, this was variable between animals (7/15 rats given amphetamine showed a substantial increase in baseline magazine response rates, whereas the other 8/15 animals showed a decrease in response rates) and the drug × session interaction did not reach significance (F(1,26) = 3.71, p = 0.065). By contrast, there was a substantial and consistent change in cue-elicited responses (cue × session × drug interaction: F(1,26) = 16.31, p < 0.001). This was not caused by differences between the drug groups on the pre-drug session (no main effect or interaction with drug group: all F < 1.21, p > 0.28). Instead, as can be observed in Fig. 5a, while both the vehicle and amphetamine groups responded more on average to the CS_High than the CS_Low on the pre-drug day (p < 0.003), this discrimination was abolished after administration of the drug (CS_High vs. CS_Low: p = 0.35), but not the vehicle (p < 0.001).
Note that while there were again some differences between the counterbalance groups (cue × session × drug × identity interaction: F(1,26) = 6.30, p = 0.019), the effects of amphetamine administration were comparable in both groups (amphetamine group: significant cue × session interaction, F(1,13) = 9.047, p = 0.01; no significant cue × session × identity interaction, F(1,13) = 2.71, p = 0.12).

Administration of amphetamine also had a pronounced but specific effect on T_O2 responses. During the cue period, the effect in both NAc and OFC mirrored the effect of the drug on behaviour, with amphetamine abolishing the distinction between the average T_O2 response elicited by presentation of the CS_High or CS_Low (cue × session × drug: F(1,32) = 6.22, p = 0.018; CS_High vs. CS_Low, amphetamine group drug session, p = 0.25; all other p < 0.006) (Fig. 5b, c, S7A). While there were still notable effects of cue identity on signals, follow-up comparisons found that there were no reliable differences in T_O2 responses between the different cue configurations in either group or session (all p > 0.27).

Fig. 5 Effect of acute amphetamine administration on cue-elicited behaviour and haemodynamic signals. a Average head entries (mean ± SEM) to the magazine during presentation of each cue or during the pre-cue baseline period in the session before (“Pre”) and just after drug administration (“Drug”) in the group receiving vehicle or amphetamine (1 mg/kg). b T_O2 responses time-locked to cue presentation recorded from either NAc (upper panels) or OFC (lower panels) in the pre-drug or drug administration sessions. Note that differences in the pre-drug patterns of signals in the vehicle and amphetamine group mainly reflect the unbalanced assignment of included animals from the two cue identity groups. c Average area-under-the-curve responses (mean ± SEM) extracted from 5 to 10 s after cue onset for each cue in the two sessions. d T_O2 responses time-locked to outcome presentation (reward or no reward) after each cue in the two cue identity groups recorded from either NAc (upper panels) or OFC (lower panels) in the pre-drug or drug administration sessions. e Average area-under-the-curve responses (mean ± SEM) extracted from 30 s following the outcome after each cue in the two sessions

Based on the differences between outcome-elicited signals in NAc and OFC observed during training, we split the outcome-elicited T_O2 data by region. In the NAc, there was a significant cue × session × outcome × drug interaction (F(1,21) = 4.82, p = 0.04). We focused follow-up analyses on each drug group separately without cue identity as a between-subjects’ factor as the NAc electrode exclusion criteria inadvertently biased the distribution of rats assigned to the drug and vehicle groups as a function of cue identity (χ² = 9.4, df = 3 and p = 0.024) (see Fig. S7 for breakdown by counterbalance group).

While vehicle injections caused no changes in NAc signals (no main effect or interaction with session: all F < 1.4, p > 0.33), amphetamine had a marked influence on outcome-elicited T_O2 responses, selectively blunting CS_Low outcome responses (cue × session × outcome interaction: F(1,12) = 22.07, p = 0.001; CS_Low reward or no reward: pre-drug vs. drug session, p < 0.005; CS_High, all p > 0.22). This meant that on amphetamine, there was now no reliable distinction between reward-evoked T_O2 signals based on the preceding cue (p = 0.08; all other CS_High vs. CS_Low comparisons, p < 0.015) (Fig. 5d, e). Consistent with this, we also found a significant reduction in the relationship between T_O2 responses and positive reward prediction errors selectively after amphetamine (comparison of peak effect size on and off drug: session × drug interaction: F(1,21) = 8.02, p = 0.01; pre-drug vs. drug session, amphetamine group: p = 0.003; vehicle: p = 0.66) (note, we did not analyse the negative RPE as this was already largely absent in the pre-drug session in animals showing strong discrimination between the CS_High and CS_Low).

In OFC, there was also a significant change in T_O2 responses when comparing outcome-elicited signals on the drug session with the pre-drug day (significant cue × session × drug and cue × session × outcome × drug interactions: both F > 5.35, p < 0.042). In the control group, vehicle injections caused a general increase in all the OFC T_O2 responses (main effect of testing session: F(1,6) = 7.51, p = 0.034). By contrast, in the amphetamine group, there was a striking reduction in OFC T_O2 signals, particularly elicited by CS_High cues (main effect of testing session: F(1,5) = 5.12, p = 0.073; significant cue × session interaction: F(1,21) = 29.527, p = 0.003). An analysis of the unsigned prediction error signal also resulted in a session × drug interaction (F(1,11) = 6.63, p = 0.026), though this was driven both by a numeric decrease in the regression weight in the amphetamine group and an increase in the regression weight in the vehicle group.

Taken together, therefore, amphetamine impaired the discriminative influence of CS_High and CS_Low cues on behaviour and also corrupted the influence of these cue-based predictions on NAc and OFC T_O2 responses.

DISCUSSION
The results presented here show that haemodynamic signals in NAc and OFC dynamically track expectation of reward as rats form associations between cues with high or low probability of reward outcome. Importantly, both regions also displayed distinct forms of haemodynamic prediction error signal. NAc signals were shaped by reward expectation and the specific valence of the reward outcome, while in contrast OFC signals did not discriminate the valence of reward outcome, but rather reflected how surprising either reward outcome was. A single dose of amphetamine, sufficient to modulate dopamine activity, caused a loss of discrimination between cues that was evident both behaviourally and in the haemodynamic signatures of reward expectation and prediction error in both regions.

These results extend a previous study of instrumental learning where increases in NAc T_O2 were observed as rats learned to associate a deterministic cue with receipt of reward upon pressing a lever [35].

worse than expected fits well with the hypothesised roles of the extensive dopaminergic projections to this region, and suggests a fundamental role for NAc in sustaining approach responses to reward-associated cues [48, 49].

It has been demonstrated that dopamine release in the core region of the NAc correlates with an RPE and, similar to that observed here, dynamically changes over the course of learning [9, 10, 14, 50, 51] (see ref. [52] for a different interpretation). The amperometry electrodes in the current study were largely in caudal parts of ventral NAc, spanning the core and ventral shell regions. Given that the electrodes are estimated to be sensitive to changes in signal over approximately a 400-μm sphere around the electrode [53, 54], it is plausible that the signals we recorded here could have been influenced by RPE-like patterns of dopamine release [55]. Several recent papers have shown that optogenetic stimulation of dopamine neurons can have widespread influence on forebrain BOLD signals [56–58]. However, direct evidence for this link is currently lacking, and a recent study comparing patterns of BOLD signals with dopamine release in humans found indications of uncoupling between the measures [59]. Therefore, it is also conceivable that the NAc haemodynamic signals we observed here instead reflect afferent input from regions such as medial frontal cortex, where RPE-like signals have also been recorded [44, 60, 61].

By contrast, OFC T_O2 signals did not respond to prediction error events in quite the same way as the NAc did and did not meet all three formal criteria to be considered formal correlates of an RPE. While some studies have found RPE-like outcome signals in OFC [62, 63], several (including those fMRI studies that have adopted the stringent criteria applied here) have not, e.g., refs. [44, 64, 65]. Like NAc, OFC T_O2 signals signalled reward expectations when the cues were presented. This is consistent with previous fMRI and electrophysiological studies suggesting that central or lateral OFC may represent stimulus–reward mappings during cue presentation, e.g., refs. [66–68]. However, unlike NAc, the OFC signals
The present probabilistic learning study allowed a measured in the present study tended to increase following formal assessment of whether the measured T signals displayed reward omission as well as after reward delivery and this increase O2 features that would categorise them as encoding reward scaled with how surprising the reward omission was. Electro- prediction errors (RPEs). To be considered an RPE signal, three physiological studies suggest that similar proportions of OFC cells cardinal features should be measurable: (i) a positive influence of exhibit either positive or negative relationships with value, and expected reward value on cue-elicited signals (i.e., a greater individual neurons can encode both positive and negative response to a cue that is thought to predict a higher reward), (ii) a valenced information at outcome [69, 70]. positive influence of actual reward delivered (i.e., a greater These outcome-driven signals did however correlate with an response when a high-value reward is actually delivered unsigned prediction error: how surprising or salient any outcome compared with when it is omitted) and (iii) a negative influence is based on current expectations. There are an increasing number of expected reward value on outcome-elicited signals (i.e., a larger of studies implicating OFC in modulating salience for the purposes response to reward delivery the less that reward is expected and/ of learning [71–74]. However, given the precise pattern of OFC or a smaller response to reward omission the more that reward is signals observed in the current study, the OFC T responses O2 expected) [43, 44]. Using behavioural modelling, all three of those might reflect the acquired salience of an outcome, e.g., refs. features were evident in NAc T signal, consistent with a number [71, 75], which represents both how surprising and how rewarding O2 of human fMRI studies of reward-guided learning in healthy it is. 
Even though such responses do not signal whether an subjects [43–46]. Although human fMRI studies predominantly use outcome is better or worse than expected, they are still important secondary reinforcers such as money to incentivise performance, to guide the rate of learning or reallocate attention. Single-site similar RPE-like activations in NAc are also observed in studies by lesion studies demonstrate a role for OFC, as well as for NAc, in using primary fluid reinforcers in lightly food/water-restricted aspects of stimulus–reward learning [76, 77]. However, a specific participants, which more closely mimic the means by which rats interaction between these regions to support these behaviours are motivated to perform the present task, see ref. [47]. must depend upon another mediating region, as there is no direct While both positive and negative RPE-like T signals were projection between the two [78]. O2 evident across the whole learning period, it was clear that the One unexpected but fortuitous finding was that T signals O2 influence of each signal changed over time. Both positive and related to magazine responding were strongly influenced by cue negative RPEs were evident early in learning. However, as identity. By comparing different behavioural models, this effect discrimination between high and low reward probability cues was best explained by including two additional parameters to a was learned, negative RPEs had an increasingly negligible simple reinforcement learning model: (i) a cue salience parameter, influence on NAc haemodynamic signals. Such adaptation has which scaled the influence of the RPE on future value estimates as resonance with a previous finding in humans that NAc BOLD a function of which auditory cue had been presented, and (ii) a signals are not observed for every RPE event, but only those cue “arousal” parameter, which was a constant term applied from currently relevant to guide future behaviour [46]. 
The selective the start of training. It has long been established that cue salience involvement of NAc in signalling whether an event is better or can be an important determinant of learning rates, e.g., refs. Neuropsychopharmacology (2019) 0:1 – 11 Amphetamine disrupts haemodynamic correlates of prediction errors in. . . E Werlen et al. [79–81], and this is a standard term in the Rescorla–Wagner and there has been increased interest in using these types of finding other influential association learning models, capturing the effect as foundations for theoretical approaches to link the underlying that more salient or intense cues are learned about faster and are biological dysfunctions to observed symptoms in patients (e.g., more readily discriminated than weaker or less salient cues. refs. [21, 87–89]). However, an extra parameter was also needed to account for the While there is general consensus about the promise of such fact that in the majority of animals, responding to the clicker was approaches, the literature is complicated by the diversity of the greater than the tone at the start of testing, irrespective of disorders and the drug regimens that patients have taken, which whether it was assigned to be CS or CS . While we do not makes testing specific causal hypotheses about the relationship High Low know the precise reason for this effect and are not aware of this between altered brain function and psychiatric symptoms difficult. being reported in previous studies, one speculation is that the rats By contrast, in an animal model, it is possible to have precise partially generalised the clicker cue to the sound of the pellet control over and measurement of induced changes in neurobiol- dispenser. Regardless of its precise cause, importantly, by using ogy. 
Therefore, establishing the feasibility of observing these model-derived estimates of the value, we were nonetheless able signatures in a freely behaving rodent, measured by using a valid to observe identical neural correlates of reward prediction and proxy for fMRI, and demonstrating how clinically relevant prediction error in both cue-counterbalanced groups. pharmacological perturbations affect these responses, may be Moreover, across both groups, a single “moderate/high” dose of an important step to bridge the gaps in our understanding. This amphetamine (1 mg/kg) [82] at the end of learning, mimicking a foundation, if combined with other causal manipulations (such as dopamine hyperactivity state, was sufficient to impair discrimina- pharmacological or genetic animal models relevant to psychiatric tive behavioural responses to the high and low probability cues disorders) and more sophisticated behavioural tasks that allow us and also to disrupt haemodynamic signals in both NAc and OFC. to take into account different learning strategies, e.g., refs. [90, 91] This was not simply a blunt pharmacological influence on and value parameters [7, 92], could therefore provide new neurovascular coupling as the signal change was specificto opportunities for understanding how dysfunctional neurotrans- certain conditions, for instance selectively reducing the NAc mission is reflected as changes in haemodynamic signatures and reward signals on CS trials but leaving the reward and omission how both relate to behavioural performance. Low responses on CS trials unaltered. 
High This may at first appear at odds with previous studies that have shown that this dose of amphetamine can selectively augment FUNDING AND DISCLOSURE responding to reward-predictive cues over neutral cues and This work was funded by a Lilly Research Award Program grant to enhance phasic dopamine release and neuronal activity in the MEW and GG, and a Wellcome Trust Senior Research Fellowship to NAc [36, 39] and neuronal activity in OFC [83]. However, there are MEW (202831/Z/16/Z). JRH, HMM and GG declare being employ- a number of important differences between these previous ees of Eli Lilly & Co Ltd.; JF, FG and MDT were employees of Eli Lilly studies and the one reported here. & Co at the time of research. JF is now an employee of Vertex First, in our paradigm, the cues were probabilistically Pharmaceuticals (Europe) and FG is an employee of H. Lundbeck rewarded, meaning that both were associated with a certain A/S. EW, SLS and MEW have no competing interests to declare. level of reward expectation and elicit conditioned approach. A previous study has shown that the same dose of amphetamine as used in the current study can disrupt conditional discrimina- ACKNOWLEDGEMENTS MEW, MDT, HMM and GG conceived the project, MEW, GG, FG and JF designed the tion performance [40]. Second, as discussed above, increased experiment, JF and FG collected the data, JF performed the surgeries, EW, SS, JRH and dopamine release may not necessarily map directly onto MEW analysed the data and MEW prepared the paper with input from the other comparable haemodynamic changes. Indeed, although it would authors. 
We would like to thank David Bannerman for valuable advice throughout the be expected that thedoseofamphetamine used in thecurrent project, Mike Conway for assistance with surgeries and Thomas Akam, Miriam Klein- experiment would boost phasic dopamine release to reward- Flugge, Stephen McHugh and Marios Panayi for discussions about the modelling, predicting cues [36], fMRI studies have tended to observe a interpretation and analyses. The study is dedicated to the memory of one of the blunting of haemodynamic responses to such cues in NAc and authors, Soon-Lim Shin, a key contributor to the project who sadly passed away to frontal cortex following administration of a single dose of cancer in 2017. amphetamine or methamphetamine, comparable to what we observed in both NAc and OFC [37, 84]. In addition, one of these studies [37] reported the loss of RPE encoding in NAc, an effect ADDITIONAL INFORMATION that was also evident in the current study. Amphetamine is Supplementary information is available for this paper at (https://doi.org/10.1038/ known to increase levels of dopamine—and other monoamines s41386-019-0564-8). —in a stimulus-independent as well as a stimulus-driven manner. Therefore, the critical factor for appropriate responding Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims is likely to rest on the balance of these two elements in in published maps and institutional affiliations. frontal–striatal–limbic circuits, and the disruption of haemody- namic signalling of incentive predictions and prediction errors we recorded may reflect this. From the present data, it is unclear REFERENCES whether amphetamine is directly corrupting calculations of RPEs 1. Watabe-Uchida M, Eshel N, Uchida N. Neural circuitry of reward prediction error. or is instead primarily disrupting the inputs then used to Annu Rev Neurosci 2017;40:373–94. 2. Montague PR, Dayan P, Sejnowski TJ. 
A framework for mesencephalic dopamine calculate the RPE such as cue-elicited reward expectation. systems based on predictive Hebbian learning. J Neurosci Off J Soc Neurosci The changes in haemodynamic responses observed after 1996;16:1936–47. amphetamine administration here are also consistent with an 3. Schultz W, Dayan P, Montague PR. A neural substrate of prediction and reward. increasingly large body of fMRI studies of reward-guided learning Science 1997;275:1593–9. in patients displaying symptoms that are believed to arise in part 4. Walton ME, Bouret S. What is the relationship between dopamine and effort? from dysregulated dopamine, such as psychosis and anhedonia Trends Neurosci 2018;42:79–91. (see refs. [16, 19, 25, 85] for reviews). This has raised the possibility 5. Cohen JY, Haesler S, Vong L, Lowell BB, Uchida N. Neuron-type-specific signals for that changes in behaviour and brain responses during reward reward and punishment in the ventral tegmental area. Nature 2012;482:85–8. anticipation and reinforcement learning might act as a cross- 6. Tobler PN, Fiorillo CD, Schultz W. Adaptive coding of reward value by dopamine neurons. Science 2005;307:1642–5. diagnostic preclinical translational biomarker [16, 86]. In parallel, Neuropsychopharmacology (2019) 0:1 – 11 Amphetamine disrupts haemodynamic correlates of prediction errors in. . . E Werlen et al. 7. Gan JO, Walton ME, Phillips PE. Dissociable cost and benefit encoding of future 33. Lowry JP, Griffin K, McHugh SB, Lowe AS, Tricklebank M, Sibson NR. Real-time rewards by mesolimbic dopamine. Nat Neurosci 2010;13:25–7. electrochemical monitoring of brain tissue oxygen: a surrogate for functional 8. Kishida KT, Saez I, Lohrenz T, Witcher MR, Laxton AW, Tatter SB, et al. Subsecond magnetic resonance imaging in rodents. NeuroImage 2010;52:549–55. dopamine fluctuations in human striatum encode superposed error signals about 34. Li J, Schwarz AJ, Gilmour G. 
Relating translational neuroimaging and ampero- actual and counterfactual reward. Proc Natl Acad Sci USA 2016;113:200–5. metric endpoints: utility for neuropsychiatric drug discovery. Curr Top Behav 9. Hart AS, Rutledge RB, Glimcher PW, Phillips PE. Phasic dopamine release in the rat Neurosci 2016;28:397–421. nucleus accumbens symmetrically encodes a reward prediction error term. J 35. Francois J, Huxter J, Conway MW, Lowry JP, Tricklebank MD, Gilmour G. Differ- Neurosci Off J Soc Neurosci 2014;34:698–704. ential contributions of infralimbic prefrontal cortex and nucleus accumbens 10. Saddoris MP, Cacciapaglia F, Wightman RM, Carelli RM. Differential dopamine during reward-based learning and extinction. J Neurosci: Off J Soc Neurosci release dynamics in the nucleus accumbens core and shell reveal complementary 2014;34:596–607. signals for error prediction and incentive motivation. J Neurosci Off J Soc Neu- 36. Daberkow DP, Brown HD, Bunner KD, Kraniotis SA, Doellman MA, Ragozzino ME, rosci 2015;35:11572–82. et al. Amphetamine paradoxically augments exocytotic dopamine release and 11. Ellwood IT, Patel T, Wadia V, Lee AT, Liptak AT, Bender KJ, et al. Tonic or phasic phasic dopamine signals. J Neurosci 2013;33:452–63. stimulation of dopaminergic projections to prefrontal cortex causes mice to 37. Bernacer J, Corlett PR, Ramachandra P, McFarlane B, Turner DC, Clark L, et al. maintain or deviate from previously learned behavioral strategies. J Neurosci Off Methamphetamine-induced disruption of frontostriatal reward learning signals: J Soc Neurosci 2017;37:8315–29. relation to psychotic symptoms. Am J Psychiatry 2013;170:1326–34. 12. Zaghloul KA, Blanco JA, Weidemann CT, McGill K, Jaggi JL, Baltuch GH, et al. 38. Curran C, Byrappa N, McBride A. Stimulant psychosis: systematic review. Br J Human substantia nigra neurons encode unexpected financial rewards. Science Psychiatry 2004;185:196–204. 2009;323:1496–9. 39. Wan X, Peoples LL. 
Amphetamine exposure enhances accumbal responses to 13. Saunders BT, Robinson TE. The role of dopamine in the accumbens core in the reward-predictive stimuli in a pavlovian conditioned approach task. J Neurosci expression of Pavlovian-conditioned responses. Eur J Neurosci 2012;36:2521–32. 2008;28:7501–12. 14. Flagel SB, Clark JJ, Robinson TE, Mayo L, Czuj A, Willuhn I, et al. A selective role for 40. Dunn MJ, Futter D, Bonardi C, Killcross S. Attenuation of d-amphetamine-induced dopamine in stimulus-reward learning. Nature 2011;469:53–7. disruption of conditional discrimination performance by alpha-flupenthixol. 15. Parkinson JA, Dalley JW, Cardinal RN, Bamford A, Fehnert B, Lachenal G, et al. Psychopharmacology 2005;177:296–306. Nucleus accumbens dopamine depletion impairs both acquisition and perfor- 41. St Onge JR, Chiu YC, Floresco SB. Differential effects of dopaminergic manip- mance of appetitive Pavlovian approach behaviour: implications for mesoac- ulations on risky choice. Psychopharmacology 2010;211:209–21. cumbens dopamine function. Behav Brain Res 2002;137:149–63. 42. Francois J, Conway MW, Lowry JP, Tricklebank MD, Gilmour G. Changes in 16. Deserno L, Schlagenhauf F, Heinz A. Striatal dopamine, reward, and decision reward-related signals in the rat nucleus accumbens measured by in vivo oxygen making in schizophrenia. Dialogues Clin Neurosci 2016;18:77–89. amperometry are consistent with fMRI BOLD responses in man. NeuroImage 17. Maia TV, Frank MJ. An integrative perspective on the role of dopamine in schi- 2012;60:2169–81. zophrenia. Biol Psychiatry 2017;81:52–66. 43. Behrens TE, Hunt LT, Woolrich MW, Rushworth MF. Associative learning of social 18. Garcia-Garcia I, Zeighami Y, Dagher A. Reward prediction errors in drug addiction value. Nature 2008;456:245–9. and Parkinson's disease: from neurophysiology to neuroimaging. Curr Neurol 44. Rutledge RB, Dean M, Caplin A, Glimcher PW. Testing the reward prediction error Neurosci Rep 2017;17:46. 
hypothesis with an axiomatic model. J Neurosci 2010;30:13525–36. 19. Zald DH, Treadway MT. Reward processing, neuroeconomics, and psycho- 45. Niv Y, Edlund JA, Dayan P, O'Doherty JP. Neural prediction errors reveal a risk- pathology. Annu Rev Clin Psychol 2017;13:471–95. sensitive reinforcement-learning process in the human brain. J Neurosci 20. Murray GK, Corlett PR, Clark L, Pessiglione M, Blackwell AD, Honey G, et al. 2012;32:551–62. Substantia nigra/ventral tegmental reward prediction error disruption in psy- 46. Klein-Flugge MC, Hunt LT, Bach DR, Dolan RJ, Behrens TE. Dissociable reward and chosis. Mol Psychiatry 2008;13:239. 67-76 timing signals in human midbrain and ventral striatum. Neuron 2011;72:654–64. 21. Huys QJ, Pizzagalli DA, Bogdan R, Dayan P. Mapping anhedonia onto 47. Chase HW, Kumar P, Eickhoff SB, Dombrovski AY. Reinforcement learning models reinforcement learning: a behavioural meta-analysis. Biol Mood Anxiety Disord. and their neural correlates: an activation likelihood estimation meta-analysis. 2013;3:12. Cognitive, affective &. Behav Neurosci 2015;15:435–59. 22. Pizzagalli DA, Iosifescu D, Hallett LA, Ratner KG, Fava M. Reduced hedonic 48. Parkinson JA, Olmstead MC, Burns LH, Robbins TW, Everitt BJ. Dissociation in capacity in major depressive disorder: evidence from a probabilistic reward task. J effects of lesions of the nucleus accumbens core and shell on appetitive pavlo- Psychiatr Res 2008;43:76–87. vian approach behavior and the potentiation of conditioned reinforcement and 23. Dowd EC, Frank MJ, Collins A, Gold JM, Barch DM. Probabilistic reinforcement locomotor activity by D-amphetamine. J Neurosci 1999;19:2401–11. learning in patients with schizophrenia: relationships to anhedonia and avolition. 49. du Hoffmann J, Nicola SM. Dopamine invigorates reward seeking by promoting Biol Psychiatry Cogn Neurosci Neuroimaging 2016;1:460–73. cue-evoked excitation in the nucleus accumbens. J Neurosci 2014;34:14349–64. 24. 
Kumar P, Goer F, Murray L, Dillon DG, Beltzer ML, Cohen AL, et al. Impaired 50. Hart AS, Clark JJ, Phillips PEM. Dynamic shaping of dopamine signals during reward prediction error encoding and striatal-midbrain connectivity in depres- probabilistic Pavlovian conditioning. Neurobiol Learn Mem 2015;117:84–92. sion. Neuropsychopharmacology. 2018;43:1581–88. 51. Day JJ, Roitman MF, Wightman RM, Carelli RM. Associative learning mediates 25. Radua J, Schmidt A, Borgwardt S, Heinz A, Schlagenhauf F, McGuire P, et al. dynamic shifts in dopamine signaling in the nucleus accumbens. Nat Neurosci Ventral striatal activation during reward processing in psychosis: a neurofunc- 2007;10:1020–8. tional meta-analysis. JAMA Psychiatry 2015;72:1243–51. 52. Hamid AA, Pettibone JR, Mabrouk OS, Hetrick VL, Schmidt R, Vander Weele CM, 26. Rausch F, Mier D, Eifler S, Esslinger C, Schilling C, Schirmbeck F, et al. Reduced et al. Mesolimbic dopamine signals the value of work. Nat Neurosci activation in ventral striatum and ventral tegmental area during probabilistic 2016;19:117–26. decision-making in schizophrenia. Schizophrenia Res 2014;156:143–9. 53. McHugh SB, Marques-Smith A, Li J, Rawlins JN, Lowry J, Conway M, et al. 27. Rothkirch M, Tonn J, Kohler S, Sterzer P. Neural mechanisms of reinforcement Hemodynamic responses in amygdala and hippocampus distinguish between learning in unmedicated patients with major depressive disorder. Brain aversive and neutral cues during Pavlovian fear conditioning in behaving rats. 2017;140:1147–57. Eur J Neurosci 2013;37:498–507. 28. Gradin VB, Kumar P, Waiter G, Ahearn T, Stickle C, Milders M, et al. Expected value 54. Li J, Bravo DS, Louise Upton A, Gilmour G, Tricklebank MD, Fillenz M, et al. Close and prediction error abnormalities in depression and schizophrenia. Brain temporal coupling of neuronal activity and tissue oxygen responses in rodent 2011;134:1751–64. whisker barrel cortex. Eur J Neurosci 2011;34:1983–96. 29. 
Morris RW, Vercammen A, Lenroot R, Moore L, Langton JM, Short B, et al. Dis- 55. Knutson B, Gibbs SE. Linking nucleus accumbens dopamine and blood oxyge- ambiguating ventral striatum fMRI-related BOLD signal during reward prediction nation. Psychopharmacology 2007;191:813–22. in schizophrenia. Mol Psychiatry 2012;17:235. 80-9 56. Lohani S, Poplawsky AJ, Kim SG, Moghaddam B. Unexpected global impact of 30. Rutledge RB, Moutoussis M, Smittenaar P, Zeidman P, Taylor T, Hrynkiewicz L, VTA dopamine neuron activation as measured by opto-fMRI. Mol Psychiatry et al. Association of neural and emotional impacts of reward prediction errors 2017;22:585–94. with major depression. JAMA Psychiatry 2017;74:790–97. 57. Decot HK, Namboodiri VM, Gao W, McHenry JA, Jennings JH, Lee SH, et al. 31. Oades RD, Halliday GM. Ventral tegmental (A10) system: neurobiology. 1. Anat- Coordination of brain-wide activity dynamics by dopaminergic neurons. Neu- omy and connectivity. Brain Res 1987;434:117–65. ropsychopharmacology. 2017;42:615–27. 32. Swanson LW. The projections of the ventral tegmental area and adjacent regions: 58. Ferenczi EA, Zalocusky KA, Liston C, Grosenick L, Warden MR, Amatya D, et al. a combined fluorescent retrograde tracer and immunofluorescence study in the Prefrontal cortical regulation of brainwide circuit dynamics and reward-related rat. Brain Res Bull 1982;9:321–53. behavior. Science 2016;351:aac9698. Neuropsychopharmacology (2019) 0:1 – 11 Amphetamine disrupts haemodynamic correlates of prediction errors in. . . E Werlen et al. 59. Lohrenz T, Kishida KT, Montague PR. BOLD and its connection to dopamine 78. Schilman EA, Uylings HB, Galis-de Graaf Y, Joel D, Groenewegen HJ. The orbital release in human striatum: a cross-cohort comparison. Philos Trans R Soc Lond B cortex in rats topographically projects to central parts of the caudate-putamen Biol Sci 2016;371:pii: 20150352. complex. Neurosci Lett 2008;432:40–5. 60. Kennerley SW, Behrens TE, Wallis JD. 
Double dissociation of value computations 79. Pavlov IP. Conditioned Reflexes. Oxford: Oxford University Press; 1927. in orbitofrontal and anterior cingulate neurons. Nat Neurosci 2011;14:1581–9. 80. Mackintosh NJ. Overshadowing and stimulus intensity. Anim Learn Behav. 61. Warren CM, Hyman JM, Seamans JK, Holroyd CB. Feedback-related negativity 1976;4:186–92. observed in rodent anterior cingulate cortex. J Physiol Paris 2015;109:87–94. 81. Jakubowska E, Zielinski K. Differentiation learning as a function of stimulus 62. Sul JH, Kim H, Huh N, Lee D, Jung MW. Distinct roles of rodent orbitofrontal and intensity and previous experience with the CS. Acta Neurobiol Exp medial prefrontal cortex in decision making. Neuron 2010;66:449–60. 1976;36:427–46. 63. O'Doherty JP, Dayan P, Friston K, Critchley H, Dolan RJ. Temporal difference 82. Grilly DM, Loveland A. What is a "low dose" of d-amphetamine for inducing models and reward-related learning in the human brain. Neuron 2003;38:329–37. behavioral effects in laboratory rats? Psychopharmacology 2001;153:155–69. 64. Rohe T, Weber B, Fliessbach K. Dissociation of BOLD responses to reward pre- 83. Homayoun H, Moghaddam B. Orbitofrontal cortex neurons as a common target diction errors and reward receipt by a model comparison. Eur J Neurosci for classic and glutamatergic antipsychotic drugs. Proc. Natl Acad. Sci. USA 2012;36:2376–82. 2008;105(46):18041–6. 65. Stalnaker TA, Liu TL, Takahashi YK, Schoenbaum G. Orbitofrontal neurons signal 84. Knutson B, Bjork JM, Fong GW, Hommer D, Mattay VS, Weinberger DR. Amphe- reward predictions, not reward prediction errors. Neurobiol Learn Mem tamine modulates human incentive processing. Neuron 2004;43:261–9. 2018;153:137–43. 85. Heinz A, Schlagenhauf F. Dopaminergic dysfunction in schizophrenia: salience 66. Klein-Flugge MC, Barron HC, Brodersen KH, Dolan RJ, Behrens TE. Segregated attribution revisited. Schizophr Bull 2010;36:472–85. 
encoding of reward-identity and stimulus-reward associations in human orbito- 86. Gilmour G, Gastambide F, Marston HM, Walton ME. Using intermediate cognitive frontal cortex. J Neurosci 2013;33:3202–11. endpoints to facilitate translational research in psychosis. Curr Opin Behav Sci 67. Kahnt T, Heinzle J, Park SQ, Haynes JD. Decoding the formation of reward pre- 2015;4:128–35. dictions across learning. J Neurosci 2011;31:14624–30. 87. Corlett PR, Fletcher PC. Computational psychiatry: a Rosetta Stone linking the 68. Schoenbaum G, Chiba AA, Gallagher M. Orbitofrontal cortex and basolateral brain to mental illness. Lancet Psychiatry 2014;1:399–402. amygdala encode expected outcomes during learning. Nat Neurosci 88. Maia TV, Huys QJM, Frank MJ. Theory-based computational psychiatry. Biol Psy- 1998;1:155–9. chiatry 2017;82:382–84. 69. Morrison SE, Salzman CD. The convergence of information about rewarding and 89. Valton V, Romaniuk L, Douglas Steele J, Lawrie S, Series P. Comprehensive review: aversive stimuli in single neurons. J Neurosci 2009;29:11471–83. computational modelling of schizophrenia. Neurosci Biobehav Rev 70. Kennerley SW, Wallis JD. Evaluating choices by single neurons in the frontal lobe: 2017;83:631–46. outcome value encoded across multiple decision variables. Eur J Neurosci 90. Akam T, Costa R, Dayan P. Simple plans or sophisticated habits? state, transition 2009;29:2061–73. and learning interactions in the two-step task. PLoS Comput Biol 2015;11: 71. Ogawa M, van der Meer MA, Esber GR, Cerri DH, Stalnaker TA, Schoenbaum G. e1004648. Risk-responsive orbitofrontal neurons track acquired salience. Neuron 91. Miller KJ, Botvinick MM, Brody CD. Dorsal hippocampus contributes to model- 2013;77:251–8. based planning. Nat Neurosci 2017;20:1269–76. 72. Harle KM, Zhang S, Ma N, Yu AJ, Paulus MP. Reduced neural recruitment for 92. Hollon NG, Arnold MM, Gan JO, Walton ME, Phillips PE. 
Dopamine-associated bayesian adjustment of inhibitory control in methamphetamine dependence. cached values are not sufficient as the basis for action selection. Proc Natl Acad Biol Psychiatry Cogn Neurosci Neuroimaging 2016;1:448–59. Sci USA 2014;111(51):18357–62. 73. McDannald MA, Esber GR, Wegener MA, Wied HM, Liu TL, Stalnaker TA, et al. Orbitofrontal neurons acquire responses to ‘valueless' Pavlovian cues during unblocking. Elife. 2014;3:e02653. Open Access This article is licensed under a Creative Commons 74. Schiller D, Weiner I. Lesions to the basolateral amygdala and the orbitofrontal Attribution 4.0 International License, which permits use, sharing, cortex but not to the medial prefrontal cortex produce an abnormally persistent adaptation, distribution and reproduction in any medium or format, as long as you give latent inhibition in rats. Neuroscience 2004;128:15–25. appropriate credit to the original author(s) and the source, provide a link to the Creative 75. Esber GR, Haselgrove M. Reconciling the influence of predictiveness and uncer- Commons license, and indicate if changes were made. The images or other third party tainty on stimulus salience: a model of attention in associative learning. Proc. Biol. material in this article are included in the article’s Creative Commons license, unless Sci. 2011;278(1718):2553–61. indicated otherwise in a credit line to the material. If material is not included in the 76. Chudasama Y, Robbins TW. Dissociable contributions of the orbitofrontal and article’s Creative Commons license and your intended use is not permitted by statutory infralimbic cortex to pavlovian autoshaping and discrimination reversal learning: regulation or exceeds the permitted use, you will need to obtain permission directly further evidence for the functional heterogeneity of the rodent frontal cortex. J from the copyright holder. To view a copy of this license, visit http://creativecommons. Neurosci 2003;23:8771–80. org/licenses/by/4.0/. 77. 
Everitt BJ, Parkinson JA, Olmstead MC, Arroyo M, Robledo P, Robbins TW. Asso- ciative processes in addiction and reward. The role of amygdala-ventral striatal © The Author(s) 2019 subsystems. Ann N Y Acad Sci 1999;877:412–38. Neuropsychopharmacology (2019) 0:1 – 11

Journal

Neuropsychopharmacology, Springer Journals

Published: Nov 8, 2019
