Assessing the Impacts of Correlated Variability with Dissociated Timescales

Despite their profound influence on the coding capacity of sensory neurons, measurements of noise correlations have been inconsistent. This is possibly because nonstationarity, i.e., drifting baselines, engenders spurious long-term correlations even when no actual short-term correlation exists. Although attempts to separate them have been made previously, they were ad hoc for specific cases or computationally too demanding. Here we proposed an information-geometric method to unbiasedly estimate pure short-term noise correlations irrespective of the background brain activities without demanding computational resources. First, the benchmark simulations demonstrated that the proposed estimator is more accurate and computationally efficient than the conventional correlograms and the residual correlations with Kalman filters or moving averages of length three or more, while the best moving average of length two coincided with the proposed method regarding correlation estimates. Next, we analyzed the cat V1 neural responses to demonstrate that the statistical test accompanying the proposed method, combined with the existing nonstationarity test, enabled us to dissociate short-term and long-term noise correlations. When we excluded the spurious noise correlations of purely long-term nature, only a small fraction of neuron pairs showed significant short-term correlations, possibly reconciling the previous inconsistent observations on the existence of significant noise correlations. The decoding accuracy was slightly improved by the short-term correlations. Although the long-term correlations deteriorated the generalizability, the generalizability was recovered by the decoder with trend removal, suggesting that brains could overcome nonstationarity. Thus, the proposed method enables us to elucidate the impacts of short-term and long-term noise correlations in a dissociated manner.
Key words: decoding analysis; information geometry; noise correlations; population codes; primary visual cortex; spontaneous activity

Significance Statement

The proposed measure for spike-count noise correlations, based on local temporal detrending, enables us to decompose the correlated responses into long-timescale and short-timescale components. The proposed method is essential for elucidating population codes in the era of large-scale electrophysiology, as it works for large numbers of simultaneously recorded neurons whereas existing methods do not. With the additional help of machine learning that classifies stimuli from neural activities, we demonstrate proper ways to assess, separately, the impacts on decoding of the presence of short-term or long-term noise correlations. The well-designed decoding analysis with dissociated correlated activities will help to gain insight into the brain’s decoding strategies under changing environments.

January/February 2019, 6(1) e0395-18.2019 1–16
Methods/New Tools

Received October 14, 2018; accepted February 5, 2019; First published February 11, 2019.
The authors declare no competing financial interests.
Author contributions: T.T. and K.M. performed research; T.T. and K.M. analyzed data; Y.M., H.I., and K.M. designed research; K.M. wrote the paper.
This work was supported by Japan Society for the Promotion of Science, Grants-in-Aid for Scientific Research Grants 18K11485, 18K13251, 16K01966, and 15H05878.
Correspondence should be addressed to Keiji Miura at miura@kwansei.ac.jp.
https://doi.org/10.1523/ENEURO.0395-18.2019
Copyright © 2019 Takahashi et al.
This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license, which permits unrestricted use, distribution and reproduction in any medium provided that the original work is properly attributed.

Introduction

The impacts and mechanisms of correlations in noises, i.e., trial-to-trial variations in neural responses to the same stimulus, have been issues in neuroscience (Cohen and Kohn, 2011; Doiron et al., 2016). Information-theoretic studies showed that correlation in response noises can be a major determinant of the coding capacity for sensory information by neurons (Abbott and Dayan, 1999; Sompolinsky et al., 2001; Miura et al., 2012; Latham and Roudi, 2013; Moreno-Bote et al., 2014). In some cases, even in a simple homogeneous network with tiny noise correlations, having more neurons does not help at all (Zohary et al., 1994; but see also Abbott and Dayan, 1999; Sompolinsky et al., 2001; Miura, 2012; Moreno-Bote et al., 2014). Therefore, it is extremely important to estimate noise correlations accurately in the era of large-scale electrophysiology (Steinmetz et al., 2018).

Although significant noise correlations have been observed in almost all recorded cortical areas, it has been pointed out that nonstationarity such as drifts in signals can engender artificial correlations even if no actual correlation exists (Bair et al., 2001; Ecker et al., 2010; Renart et al., 2010). Therefore, it is desirable to dissociate the observed noise correlations into short-term and long-term components, where the latter is possibly caused by the background trends or fluctuations of the baseline activity (Fiser et al., 2004; Ikegaya et al., 2004; Sasaki et al., 2007; Luczak et al., 2015; Okun et al., 2015). Although attempts to separate them and estimate purely short-term noise correlations under changing environments have been made previously, they were ad hoc and applicable only to specific cases (Bair et al., 2001; Mitchell et al., 2009; Ecker et al., 2010; Renart et al., 2010). Even the latest Bayesian method requires considerable numbers of simultaneously recorded neurons as well as exponential computational costs to estimate instantaneous activities (Ecker et al., 2014; Rosenbaum et al., 2017). Thus, an estimation method that requires only the recording of a pair of neurons and works for arbitrary baseline drifts nonparametrically (Amari and Cardoso, 1997) is desired.

In addition to measuring the noise correlations, assessing their impacts is also very important. The degree to which sensory information is represented reliably by neural responses has been characterized by applying a decoding approach in a stochastic stimulus–response framework (Dayan and Abbott, 2001; Averbeck et al., 2006; Sharpee, 2017). That is, the decoding success rates can be used as a measure of accuracy of neural representations. One can take different features of neural activities as clues for decoding to see which feature carries information. Therefore, it is ideal, within this framework, if the dissociation of short-term and long-term correlations gives us a novel way to assess their respective impacts on information representations.

In this article, we propose an information-geometric method to unbiasedly estimate pure short-term noise correlations irrespective of the background brain activities. One effective use of information geometry, which generally finds orthogonal statistical parameters (Amari and Nagaoka, 2001; Miura, 2011; Amari, 2016), is to estimate only a finite number of parameters of interest irrespective of the infinitely many other parameters (Miura et al., 2006a,b, 2007; Miura and Uchida, 2008). Here, we used this infinite-dimensional scheme (Amari and Kawanabe, 1997; Miura, 2013) to dissociate the parameter for short-term correlation from the infinitely many parameters for (all possible) long-term baseline drifts nonparametrically. This allows us to estimate pure short-term correlations whatever the baseline drift is, without demanding considerable numbers of simultaneously recorded neurons and high computational costs. Then, the accompanying statistical test as well as the existing nonstationarity test enabled us to dissociate short-term and long-term correlations. First, as benchmark simulations, we demonstrated that the proposed estimator is more accurate and computationally efficient than the conventional correlograms and the residual correlations with Kalman filters or moving averages of length three or more, while the best moving average of length two coincided with the proposed method regarding correlation estimates. Next, when we excluded the spurious noise correlations of purely long-term nature, only a small fraction of V1 neuron pairs showed significant short-term noise correlations, possibly reconciling the previous inconsistent observations on the existence of significant noise correlations. Finally, with the additional help of machine learning that classifies stimuli from neural activities, we assessed the impacts on decoding of the presence of short-term or long-term noise correlations, separately. The presence of pure short-term correlations slightly improved the decoding accuracy, while the pure long-term correlations deteriorated the generalization ability. However, the decrease in decoding accuracy by the long-term correlations was recoverable by using the decoder with offset, suggesting that the brain could overcome nonstationarity by detrending. Thus, our method enables us to elucidate the impacts of short-term and long-term correlations in a dissociated manner, advancing a modern, component-wise information-theoretic analysis (Schneidman et al., 2003; Latham and Nirenberg; Averbeck et al., 2006; Sharpee et al., 2006).

Materials and Methods

All the simulations and data analyses in this article were done by using R.
Throughout the analyses in the article, the firing rate for each trial was used as an activity feature. The firing rate was computed as the spike count divided by the trial duration with a visual stimulus, which varied across trials from 1.0 to 1.7 s. Thus, when we say correlation coefficients or (trial-shifted) correlograms, we solely consider spike count noise correlations.

Proposed estimator for short-term noise correlation

As a measure of short-term noise correlations, we proposed and used the following estimator,

$$\hat{\rho} = \frac{\hat{\sigma}_{12}}{\sqrt{\hat{\sigma}_{11}\hat{\sigma}_{22}}}, \tag{1}$$

where the covariances $\hat{\sigma}_{ij}$ are estimated as

$$
\begin{aligned}
\hat{\sigma}_{11} &= \sum_{t=1}^{N/2}\left[(x_{2t-1}-\bar{x}_{(2t)})^{2}+(x_{2t}-\bar{x}_{(2t)})^{2}\right],\\
\hat{\sigma}_{22} &= \sum_{t=1}^{N/2}\left[(y_{2t-1}-\bar{y}_{(2t)})^{2}+(y_{2t}-\bar{y}_{(2t)})^{2}\right],\\
\hat{\sigma}_{12} &= \sum_{t=1}^{N/2}\left[(x_{2t-1}-\bar{x}_{(2t)})(y_{2t-1}-\bar{y}_{(2t)})+(x_{2t}-\bar{x}_{(2t)})(y_{2t}-\bar{y}_{(2t)})\right].
\end{aligned}
\tag{2}
$$

Therein, $x_t$ and $y_t$ denote the neural responses in spike counts within a few seconds in the $t$-th trial, while the local mean activities were defined by

$$\bar{x}_{(2t)} = \frac{x_{2t-1}+x_{2t}}{2} \quad\text{and}\quad \bar{y}_{(2t)} = \frac{y_{2t-1}+y_{2t}}{2}. \tag{3}$$

The proposed measure in Equation 1 is comparable to the conventional correlation coefficient. When we plotted it in the form of correlograms, we first shifted one of the two time series by a given number of trials and then computed the proposed measure for them.

Code accessibility

The R code for computing the proposed correlation coefficient and its p value, as defined below in Statistical tests for short-term noise correlations, is freely available online at https://github.com/toshi-0415/eNeuro. The code is ready to run just by replacing the example data for Figure 4 with users’ own data.

As there can be a minor style difference in coding the proposed measure, we unified the rule and adopted the one with the minimum errors throughout the article and the downloadable code. That is, there are two possible ways of pairing two neighboring trials: (1) starting at the first trial as {1,2}, {3,4}, {5,6}, ..., and (2) starting at the second trial as {2,3}, {4,5}, {6,7}, .... In the adopted style, we took the average of the two estimated covariances, because we found it had smaller variances (estimation errors). This style difference only negligibly modifies the results and the overall conclusions never change.

Assumption and derivation of proposed estimator

The proposed estimator in Equation 1 was derived for estimating parameters in a semiparametric statistical model. That is, the activities of two neurons were hypothesized to obey the following statistical model (Eq. 4) and the proposed estimator estimates the Gaussian covariance therein. Although the derivation and concise benchmark simulations were already shown elsewhere (Miura, 2013), the application to the real experimental data has not been done yet.

In this article, we solely consider the spike count within a trial, where the spiking activity of a neuron is integrated over a couple of seconds and, thus, well approximated by a Gaussian distribution. This leads us to consider a bivariate normal distribution for activities of two neurons, $q(x, y; \mu_x, \mu_y, \Sigma)$, where $\mu_x$ and $\mu_y$ denote the means for two neurons’ activities and $\Sigma$ denotes the covariance matrix. The activities $x$ and $y$ denote the spike counts of two neurons for a trial. These analyses address the situation in which the covariance matrix $\Sigma$ is constant whereas the signals can change over time. In particular, when the signals are distributed randomly but two consecutive signals are the same owing to the continuity condition, the distribution of activities at times $2t-1$ and $2t$ ($t = 1, 2, \ldots$) can be described as a mixed model,

$$p(x_{2t-1}, y_{2t-1}, x_{2t}, y_{2t}; \Sigma, k(\mu_x, \mu_y)) = \iint k(\mu_x, \mu_y)\, q(x_{2t-1}, y_{2t-1}; \mu_x, \mu_y, \Sigma)\, q(x_{2t}, y_{2t}; \mu_x, \mu_y, \Sigma)\, d\mu_x\, d\mu_y, \tag{4}$$

where $k(\mu_x, \mu_y)$ denotes an unknown distribution of the signals. The only assumption made here is that the consecutive signals have equal value, at least approximately (see the practical discussion below, at the end of Optimality of proposed estimator from statistical viewpoint). That assumption is minimal and realistic as it is satisfied, e.g., when the signal drift is continuous and, preferably, sufficiently slow. From another viewpoint, this definition of noise as the component of the activities that is not locally flat over time is quite convenient for estimation.

Furthermore, Equation 4 is a semiparametric model (Bickel et al., 1993; van der Vaart, 1998) because it has both a vector $\Sigma$ and a function $k(\mu_x, \mu_y)$ as parameters. It is generally not easy to estimate parameters in semiparametric models because a function space is fundamentally infinite dimensional (Neyman and Scott, 1948). However, it is known that, for some cases, only parameters of interest can be estimated efficiently through differential geometric methods on the manifolds of a family of probability distributions (Amari and Kawanabe, 1997; Amari and Nagaoka, 2001; Miura et al., 2006a,b, 2007; Miura and Uchida, 2008). For this model, it is possible to estimate the three constant parameters $\Sigma = \{\sigma_{11}, \sigma_{12}\,(=\sigma_{21}), \sigma_{22}\}$ whatever the signal drift $k(\mu_x, \mu_y)$ is.

After a lengthy calculation in Miura (2013), the estimator was obtained as in Equation 2. As the proposed estimator looks so simple, one might think that one can easily construct an arbitrary local smoother similar to the proposed estimator. However, because arbitrarily invented estimators have larger estimation errors (biases and variances) in general, it is actually very difficult to discover an optimal estimator from scratch. As far as we know, other than information geometry, there is no systematic way to analytically derive an optimal estimator that works under arbitrary trends nonparametrically. Fortunately, it is very easy to prove the optimality of the derived estimator once it has been derived. Therefore, we take advantage of this fact for educational purposes in what follows. That is, we do not repeat the derivation but rather only check the answer and demonstrate the performance of the proposed estimator concisely in the following section.

Optimality of proposed estimator from statistical viewpoint

Here, we summarize and prove the optimality property from the statistical viewpoint. Specifically, we show that the proposed estimator has no bias (i.e., is correct on average) and minimum variance (i.e., smallest errors) among the estimators which work unbiasedly for arbitrary baseline drifts.

The unbiased nature of the proposed estimators is clear from the fact that the estimators in Equation 2 are normalized by dividing not by 2 (= M) but by 1 (= M − 1). Normalization of this type is widely known to guarantee the unbiased estimation for the covariances of Gaussian distributions. In fact, with $X_t = (x_t, y_t)$ and $\mu = (\mu_x, \mu_y)$, the expectation of $\hat{\sigma}_{12}$ can be calculated as an integral over the probability distribution in Equation 4 as

$$E[\hat{\sigma}_{12}] := \int \left( \iint \hat{\sigma}_{12}(X_{2t-1}, X_{2t})\, q(X_{2t-1}, \mu)\, q(X_{2t}, \mu)\, dX_{2t-1}\, dX_{2t} \right) k(\mu)\, d\mu = \int \sigma_{12}\, k(\mu)\, d\mu = \sigma_{12}. \tag{5}$$

This means that the estimator is unbiased or the estimator works (at least) “on average.” The variance of the estimate can be similarly computed as

$$\mathrm{Var}[\hat{\sigma}_{12}] := \int \left( \iint \left(\hat{\sigma}_{12}(X_{2t-1}, X_{2t}) - \sigma_{12}\right)^{2} q(X_{2t-1}, \mu)\, q(X_{2t}, \mu)\, dX_{2t-1}\, dX_{2t} \right) k(\mu)\, d\mu = \sigma_{11}\sigma_{22} + \sigma_{12}\sigma_{21}. \tag{6}$$

Surprisingly, $\hat{\sigma}_{12}$ has the minimum variance (= estimation error) among all the estimators. To prove this, assume that $\tilde{\sigma}_{12}(X_{2t-1}, X_{2t})$ is an arbitrary estimator of $\sigma_{12}$, that is,

$$E(\tilde{\sigma}_{12}(X_{2t-1}, X_{2t})) := \iint \tilde{\sigma}_{12}(X_{2t-1}, X_{2t})\, p(X_{2t-1}, X_{2t}; \Sigma, k(\mu_x, \mu_y))\, dX_{2t-1}\, dX_{2t} = \sigma_{12}.$$

Note that we assumed that the expectation is equal to the statistical parameter of interest because any estimator should work at least “on average.” By using the Cauchy–Schwarz inequality in functional space ($\lVert f\rVert\,\lVert g\rVert \ge |f\cdot g|$) with $f = \tilde{\sigma}_{12}(X) - \sigma_{12}$ and $g = d\log p(X)/d\sigma_{12}$, we get

$$\mathrm{Var}[f]\,\mathrm{Var}[g] = \mathrm{Var}[\tilde{\sigma}_{12}]\,(\sigma_{11}\sigma_{22} + \sigma_{12}\sigma_{21})^{-1} \tag{7}$$

$$\ge \left(\int f g\, p(X)\, dX\right)^{2} = \left(\int \frac{d\left((\tilde{\sigma}_{12}(X) - \sigma_{12})\, p(X)\right)}{d\sigma_{12}}\, dX - \int \frac{d(\tilde{\sigma}_{12}(X) - \sigma_{12})}{d\sigma_{12}}\, p(X)\, dX\right)^{2} = 1, \tag{8}$$

where $X = (X_{2t-1}, X_{2t})$. This shows that any estimator $\tilde{\sigma}_{12}$ of $\sigma_{12}$ has a variance at least as large as the right-hand side of

$$\mathrm{Var}[\tilde{\sigma}_{12}] \ge \sigma_{11}\sigma_{22} + \sigma_{12}\sigma_{21}. \tag{9}$$

Actually, the proposed estimator $\hat{\sigma}_{12}$ attains this minimum variance as shown in Equation 6. Thus, for any estimator $\tilde{\sigma}_{12}$,

$$\mathrm{Var}[\tilde{\sigma}_{12}] \ge \mathrm{Var}[\hat{\sigma}_{12}]. \tag{10}$$

Similar relations hold for $\hat{\sigma}_{11}$ and $\hat{\sigma}_{22}$.

We have demonstrated that the proposed estimator is optimal as far as the assumption on the statistical model holds. Practically, biases can arise from violations of the assumption that two consecutive signals (means) are exactly the same. However, it can be shown by simple calculus that the biases are generally small. In fact, if the consecutive signals are

$$E[x_{2t-1}] = \mu - \epsilon \quad\text{and}\quad E[x_{2t}] = \mu + \epsilon, \tag{11}$$

differing by an amount of order $\epsilon$, then the biases are of the second order in $\epsilon$:

$$E[\hat{\sigma}_{11}] = E\!\left[\frac{(x_{2t-1} - x_{2t})^{2}}{2}\right] = \sigma_{11} + 2\epsilon^{2}. \tag{12}$$

Thus, even if one assumes that the biases accumulate over a number of time points of order $1/\epsilon$, the total bias is still negligible, being of order $(1/\epsilon)\cdot\epsilon^{2} = \epsilon$. This suggests that even if the signal drifts slowly, $O(\epsilon)$ as in Equation 11, keeping the difference between the first and the last activities finite, $O(1)$, after a long time sequence, $O(1/\epsilon)$, the total bias is negligibly small, $O(\epsilon)$. In fact, Figure 5D demonstrates that the proposed statistical test detects no spurious short-term correlations even if signals drift in the real V1 data.

Simulation of activities of two neurons with drifting baselines

The simulations of bivariate Gaussian noises added to the baselines generated by the ARIMA models for activities of two neurons in Figure 1A were performed with the mvrnorm() and arima.sim() functions in R.

Figure 1. Comparison of conventional cross-correlogram and proposed method. A, Artificial activities of two neurons, simulated as sums of baselines and trial-to-trial noises. The thick gray smooth curves denote time-dependent baselines $\mu$ generated by the ARIMA(0,2,1) model, to which the bivariate Gaussian noises were added to generate the neural activities. The added noises have significant spatial or interneuronal correlations but no temporal correlation because intertrial intervals are assumed to be fairly long (3 s). B, The estimated cross-correlations for the simulated activities in A by the proposed method (red) and the conventional correlogram (black). Only the proposed method works and shows a proper peak at the origin. C, Schematic illustrations of how the proposed method works for the cases with pure long-term or short-term correlations. The cross-correlations computed within each local window, where the baselines are instantaneously constant, are averaged across sliding windows to capture only short-term correlations whatever the baseline drift is.

Conventional cross-correlograms

The conventional cross-correlograms were computed with the cor() function in R for the manually trial-shifted data. As the function returns NA (i.e., not available) when either of two neurons shows no spike across all 40 trials, we excluded those pairs from the analyses in the article. Note that the proposed method also returns NA for those pairs.
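The estimator of Equations 1–3, together with the averaging over the two pairing styles described under Code accessibility, can be sketched in a few lines. The following is a minimal illustration in Python (standard library only), not the authors' R code; the function names (`paired_sums`, `short_term_correlation`, `pearson`, `simulate_pair`) and the linear drift used for the toy data are assumptions made for this sketch. Dividing the sums by the number of pairs is also a choice of the sketch, made so that the two pairings are averaged on the same scale; the constant cancels in the ratio of Equation 1.

```python
import math
import random


def paired_sums(x, y, start):
    """Sums of Eq. 2 over the pairs (start, start+1), (start+2, start+3), ...

    For a pair of trials, the squared deviations from the local mean (Eq. 3)
    add up to (difference)^2 / 2, so no explicit local mean is needed.
    """
    s11 = s22 = s12 = 0.0
    n_pairs = 0
    for i in range(start, len(x) - 1, 2):
        dx = x[i] - x[i + 1]
        dy = y[i] - y[i + 1]
        s11 += dx * dx / 2.0
        s22 += dy * dy / 2.0
        s12 += dx * dy / 2.0
        n_pairs += 1
    if n_pairs == 0:
        return float("nan"), float("nan"), float("nan")
    return s11 / n_pairs, s22 / n_pairs, s12 / n_pairs


def short_term_correlation(x, y):
    """Eq. 1, with covariances averaged over both pairing styles."""
    a = paired_sums(x, y, 0)  # pairs {1,2}, {3,4}, ... in 1-based trial numbers
    b = paired_sums(x, y, 1)  # pairs {2,3}, {4,5}, ...
    s11, s22, s12 = ((u + v) / 2.0 for u, v in zip(a, b))
    denom = s11 * s22
    if not denom > 0.0:  # constant series or too few trials: analogous to NA
        return float("nan")
    return s12 / math.sqrt(denom)


def pearson(x, y):
    """Conventional correlation coefficient, for comparison."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_pair(n_trials, rho, rng, drift=5.0):
    """Toy data: a shared linear drift (hypothetical) plus correlated unit noise."""
    xs, ys = [], []
    for t in range(n_trials):
        base = drift * t / n_trials
        e1 = rng.gauss(0.0, 1.0)
        e2 = rho * e1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        xs.append(base + e1)
        ys.append(base + e2)
    return xs, ys


rng = random.Random(0)
x, y = simulate_pair(4000, 0.3, rng)
print("conventional:", round(pearson(x, y), 2))
print("proposed:    ", round(short_term_correlation(x, y), 2))
```

With a shared drift, the conventional coefficient is expected to be inflated well above the noise correlation of 0.3, whereas the pairwise-detrended estimate should stay close to it; a trial-shifted correlogram would apply `short_term_correlation` to `x[k:]` and `y[:len(y) - k]`.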
We computed time-shifted noise correlations or cross-correlation functions separately for different stimuli, because recent works indicated the stimulus dependency of noise correlations (Kohn and Smith, 2005; Maruyama and Ito, 2013; Ruff and Cohen, 2016).

Kalman filter method

The smoothing by the Kalman filter to obtain the baseline trend of the simulated neural activities was computed with the dlmFilter() function in the dlm package for R (Petris et al., 2009). The noise correlations in the residuals were obtained by the maximum likelihood method for data fitting with the dlmMLE() function in the same package. The statistical model for the baseline trend $\mu$ that we assumed for decoding with the Kalman filter was

$$\mu_{t+1} = F\mu_{t} + \text{Gaussian noise}, \qquad X_{t}^{(i)} = G^{(i)}\mu_{t} + \text{Gaussian noise}, \tag{13}$$

where $X^{(i)}$ denotes the $i$-th neuron’s activity and $F$ and $G$ are to be estimated by data fitting.

The computational time was measured by the proc.time() function in R on an iMac with a 3.3 GHz Intel Core i5 and 32-GB memory.

Statistical tests for short-term noise correlations

We detected neuron pairs that have significant short-term noise correlations by using the statistical test accompanying our estimator. As is usual with statistical tests, we computed p values under the null hypothesis of no correlation.

One possible way, which we did not adopt, was to assume the asymptotic normality for the distribution of the proposed estimator, whose mean and variance can be computed from Equations 5, 6 (or from simulations). However, for the current case, each neuron has only 40 trials per stimulus, and thus the normality assumption holds only approximately. Therefore, for example, the control p value distribution for the one-time-shifted data is not as flat as in Figure 5D, although it is approximately flat. Although this method saves computational time, it seems to lack accuracy in p values.

To pursue full accuracy, we resorted to the computational method with white Gaussian Monte Carlo simulations for reference activities of neuron pairs. Here, the test was based on the idea that even if there is no short-term correlation, its estimate from a finite number of trials (40) takes a non-zero value (error), which varies according to some statistical distribution. First, we obtained the shape of the distribution as accurately as possible by repeating the Monte Carlo simulations a million times. Next, the p value for a given estimate is defined as its percentile in this numerically obtained distribution. That is, the p value is defined so that the p value distribution is completely flat for white Gaussian noises. To be precise, the p value varies by realization of the activities of two neurons, but, with many realizations, one obtains the uniform distribution for the p values. Note that the uniform p value distribution is a hallmark of a good statistical test. Finally, if an estimate is too high or too low within the numerically obtained distribution (typically top 2.5% for both sides, corresponding to positive and negative correlations), it is detected as significant or violating the null hypothesis.

When we computed the control p value distribution for “one-time-shifted” data in Figure 5D, we actually shifted by two trials. This is because our proposed estimator treats the time series by pairs of time points as in Equations 2, 4. This is also why we shifted by 2, 4, 6, ... trials in Figures 1, 2.

The R code for computing the proposed short-term correlation and the accompanying statistical test was handwritten.

As the level of significance, $0.01/\{16\,N(N-1)/2\}$, where $N$ denotes the number of neurons in the session and 16 is for 16 stimuli, was used throughout the article (specifically in Fig. 6). That is, we employed Bonferroni’s multiple comparison technique, because we wanted to keep the number of neuron pairs moderate. Note that if we remove a neuron, we lose many pairs in the same session.

Statistical tests for nonstationarity

We selected neurons with and without nonstationarity by using the serial correlation test for randomness of fluctuations (CASE64 in Kanji, 2006). To remove the effect of stimulus presentation from the time series of neural activities, we averaged the 16 local trials within a single block where 16 different stimuli are presented pseudo-randomly. In this way, the length of the time series was reduced from the original 640 to 40 trials, to which we applied the test. The R code for the test was handwritten. The validity of the test was confirmed by the observation that the test returns uniformly distributed p values for Gaussian white noises, i.e., completely random time series in which a random number is generated according to the normal distribution at every time. Note that the resulting p value varies by (random) time series and, here, we confirmed that the distribution became flat with many realizations.

As the level of significance, 0.01 was used throughout the article (specifically in Figs. 5, 6). We did not employ the multiple comparison techniques, as we wanted to categorize suspicious neurons into the nonstationary neuron pool, conservatively.

Classification analysis and principal component analysis

For the classifications of 16 visual stimuli based on the firing rates of neurons, we solely used the lda() function in R in this article, although the result did not change significantly when we used the support vector machine. The classification was done session by session to use the simultaneity of the recorded data. For the statistical significance, the means of classification success rates for all sessions were compared between different conditions by the paired t test. Only the sessions with more than five neurons remaining after the selections by short-term or long-term correlations were included in the classification analyses for reliability.

Figure 2. Comparison of conventional Kalman filter method and proposed method. A, The simulated activities of two neurons (red and blue) for 100 trials with the common sinusoidal baseline trend. The thick gray line denotes the model trend used for the data generation. The activities of two neurons at each time are generated as the sum of the baseline trend and the bivariate Gaussian noises with unit variances and 0.3 correlation coefficient. When we simulated more than two neurons simultaneously, the additional neurons shared the trend but did not have noise correlation (data not shown). Thus, among N simulated neurons, only neurons 1 and 2 have a non-zero correlation coefficient, which is to be estimated. B, The residual activities after the removal of the estimated trend by the Kalman filter from the activities in A. C, The noise correlations in the residuals averaged across 100 realizations of the simulated data. The horizontal dotted gray line for the true correlation coefficient (0.3) indicates that the conventional Kalman filter method does not work when the number of simultaneously simulated neurons is small. The error bars representing the SD demonstrate the large trial-to-trial variability in the results. D, Noise correlations estimated by the proposed method from the same data. The horizontal dotted gray line for the true correlation coefficient (0.3) indicates that the proposed method always works. The error bars representing the SD demonstrate the small variability in the results. E, The computational time for the conventional Kalman filter method. F, The computational time for the proposed method.
averaged the neural responses to each stimulus, in order not to include the trial-to-trial variability in the visualization by principal components. That is, we essentially visualized the tuning curves. In addition, here we did not standardize the activity of each neuron or tuning curve, because we did not want to enlarge small noises within bad neurons that do not respond to any stimuli at all. That is, to avoid giving too much weight to purely noisy neurons, we did not enlarge the tuning curves even if their amplitudes were small. In Figure 6B, the same neuron pool as in Figure 6C, right, i.e., the neurons with pure long-term correlations, was used (189 neurons from 23 sessions).

V1 neuronal spikes
The experimental details for the anesthetized cat V1 recordings that we reanalyzed have been described previously (Maruyama and Ito, 2013, 2017). Briefly, 566 neurons were recorded in 48 sessions with 640 trials (40 repeats of 16 visual stimuli) from five adult male cats. Two types of electrode arrays were adopted for the recordings: a four-tetrode array and an array of eight single microelectrodes, both of which were fabricated in the laboratory.

The eyes were focused on the tangential screen at a distance of 57 cm using the tapetal reflection technique and an appropriate set of gas-permeable contact lenses. The pupils were dilated using phenylephrine hydrochloride (Neosynesin eye solution). All animal procedures were performed in accordance with the Kyoto Sangyo University animal care committee's regulations.

Once stable recordings were obtained, the receptive field properties (location) of the multi-unit activity recorded by each electrode were mapped, using a mouse-controlled moving light bar presented on a 21-inch color monitor (1024 × 768 resolution, vertical refresh rate of 80 Hz) at a distance of 57 cm from the eyes. Because the receptive fields of the units recorded by the high-density electrode arrays had significant overlap, the units were stimulated by moving the light bars on a dark background crossing over the region covering all of the receptive fields. The stimuli consisted of light bars of 16 equally spaced orientations (i.e., with an angular separation of 22.5°) that moved along the direction of the normal. We ran 40 trial blocks in which each of the 16 stimuli was presented in a pseudo-random order with an intertrial interval of 3 s. The bars traveled an angular distance of 3–5° over a period of 1.0–1.7 s (speed 3°/s).

Multi-unit activities recorded by each electrode were sorted to recover the activities of individual single units using custom spike sorting software (Gray et al., 1995).

Results
For the purpose of measuring the spike-count noise correlations in different timescales and assessing their respective impacts on neural representations, we used the novel information geometric estimators of pure short-term correlations, which can be dissociated from long-term correlations in a nonparametric manner, that is, whatever the baseline drifts are. Before we applied this proposed method to the neural responses in V1, we checked whether and how it worked for the simulated time series as a benchmark.

Proposed estimator works irrespective of baseline drifts
First, we randomly generated artificial time series that mimic the activities of two neurons whose baselines drift across many trials. Note that nonstationarity, often observed experimentally in an unreproducible manner, was indispensable for the simulation, as we wanted to see whether the proposed method can overcome it.

In the numerical simulation in Figure 1A, the activities of two neurons were created by adding bivariate Gaussian noises to smoothly drifting trends, which, in turn, were independently generated for the two neurons by an ARIMA(0,2,1) model whose moving average coefficient was 0.6 (Harvey, 1993). Here, a significant short-term noise correlation (ρ = 0.3) was induced only between simultaneous noises for the two neurons, mimicking typical neuroscience experiments in which intertrial intervals on the order of seconds wash out intertrial temporal correlations in spike counts. An example realization of the simulation in Figure 1A, which mimics one recording session, shows the hallmark drifting baselines, which are unreproducible and hard to estimate with a limited number of samples or from this "single snapshot" data. Note that here we exclusively consider trials as the unit of the time axis, instead of fine-scale windows such as 1-ms bins.

Figure 1B shows the cross-correlation functions for the realization of simulated activities for two neurons in Figure 1A computed by both the conventional correlogram and the proposed method (Eqs. 1, 2). Here, the correlation coefficient ρ was estimated for each time-shifted data set, in which the activities of one neuron were time-shifted while those of the other neuron were kept fixed. Because of the wrong assumption of constant baselines, the conventional correlation coefficients produced a broad cross-correlation function attributable to the temporal correlations in the baselines. That is, the correlation coefficient is positive because when the activity of neuron 1 is higher (lower) than its average at a late (early) trial, that of neuron 2 is also higher (lower). Adding a time shift does not affect this situation as there is a global trend in Figure 1A. Note that broad cross-correlation functions have been observed for experimental data (Bair et al., 2001). On the other hand, the proposed method gave a satisfactory result, correctly yielding 0 for the time-shifted data and the short-term correlation ρ (0.3) for the simultaneous data, as demonstrated by a clear peak in Figure 1B. Note that the estimated correlation coefficient ρ̂ (about 0.3) is not only useful for statistical tests but also interpretable as a simultaneous covariation of Gaussian noises because our method is statistical model-based.

January/February 2019, 6(1) e0395-18.2019 eNeuro.org Methods/New Tools 9 of 16

Figure 2A shows the activities of two neurons, simulated as time series of length 100 with a common sinusoidal baseline trend. The activities of the two neurons at each time are generated as the sum of the baseline trend and bivariate Gaussian noises with unit variances and a correlation coefficient of 0.3. When we simulated more than two neurons simultaneously, the additional neurons shared the trend but did not have noise correlations. Thus, among N simulated neurons, only neurons 1 and 2 have a non-zero correlation coefficient, which is to be estimated.

Figure 2B shows the residual activities after the removal of the trend estimated by the Kalman filter from the activities in Figure 2A. The dark horizontal line indicates the estimated trend, which has already been removed from the activities.

Figure 2C shows the noise correlations in the residuals averaged across 100 realizations of the simulated data.
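As a rough illustration of how a shared baseline trend inflates the conventional coefficient while a local estimate does not, the sketch below simulates pairs of neurons with a common sinusoidal trend and unit-variance Gaussian noises correlated at 0.3. Equations 1, 2 are not reproduced in this section, so the function `local_correlation` is only an assumed form: it takes half the averaged product of consecutive-trial differences, which is consistent with the article's statements that the estimator sums over trial pairs and matches a window-two moving average in its correlation coefficient. The unit trend amplitude, the seven cycles per 100 trials, and all function names are illustrative choices, not the authors' code.

```python
import math
import random


def pearson(x, y):
    """Conventional correlation coefficient, which assumes constant baselines."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def local_correlation(x, y):
    """Assumed local estimator built from consecutive-trial differences.

    If the baseline barely changes between trials t and t+1, the difference
    x[t+1] - x[t] contains only noise, whose covariance is twice the noise
    covariance, hence the factor 1/2.
    """
    dx = [b - a for a, b in zip(x, x[1:])]
    dy = [b - a for a, b in zip(y, y[1:])]
    m = len(dx)
    s11 = sum(d * d for d in dx) / (2 * m)
    s22 = sum(d * d for d in dy) / (2 * m)
    s12 = sum(a * b for a, b in zip(dx, dy)) / (2 * m)
    return s12 / math.sqrt(s11 * s22)


def simulate_pair(rng, n_trials=100, rho=0.3, cycles=7, amplitude=1.0):
    """Two neurons sharing a sinusoidal trend, plus noise with correlation rho."""
    x, y = [], []
    for t in range(n_trials):
        trend = amplitude * math.sin(2 * math.pi * cycles * t / n_trials)
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x.append(trend + z1)
        y.append(trend + rho * z1 + math.sqrt(1 - rho ** 2) * z2)
    return x, y


rng = random.Random(0)
pairs = [simulate_pair(rng) for _ in range(400)]
mean_conventional = sum(pearson(x, y) for x, y in pairs) / len(pairs)
mean_local = sum(local_correlation(x, y) for x, y in pairs) / len(pairs)
print(f"conventional: {mean_conventional:.2f}, local: {mean_local:.2f}")
```

Under these assumptions the conventional coefficient absorbs the variance of the shared trend, whereas the local estimate stays near the true value as long as the trend changes little between consecutive trials. A faster or larger trend would leave a residual bias in this simple form.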
The reason for the flexible estimation by the proposed method is that it estimates the covariance for two neurons within each local window, where the background activity is assumed to be almost constant, and then averages the local estimates across sliding windows as in Figure 1C. Note that our method is based on the assumption that the short-term correlation (or the covariance parameter of Gaussian noises) is constant over time. Consequently, the proposed method enables estimation of the short-term correlations existing in the simultaneous activities independently of the drifting baselines. Figure 1C shows how this method works for the cases with pure long-term (Fig. 1C, top) or short-term (Fig. 1C, bottom) correlations. In the case of pure long-term correlations in Figure 1C, top, the estimate of the correlation in the short window is zero (on average), as there is no real short-term correlation and the baseline drift is negligible in this short timescale. Note that an implicit assumption in the proposed method is that within a short window, the baseline drift is absent or negligible, although the violation of this assumption, if small enough, actually does not matter (Materials and Methods). In the case of pure short-term correlations in Figure 1C, bottom, the estimate of the correlation in the short window is non-zero (on average), as there is a real short-term correlation although the baseline drift is absent. In this way, the proposed "local" estimates, which are unaffected by the slow, long-term trends, work fairly well even if the baseline activities drift arbitrarily over time.

Proposed estimator requires fewer neurons and less computational power than conventional Kalman filters
The key idea for the proposed estimator of noise correlations resides in the local detrending. However, there are other types of detrending methods such as Kalman filters. The latest studies also computed the correlations in residuals after the neural activities were smoothed and detrended by Kalman filter-like methods (Ecker et al., 2014; Rosenbaum et al., 2017). Therefore, we performed another benchmark simulation to compare the conventional Kalman filter method with the proposed method. Specifically, we checked whether the two methods work in the presence of sinusoidal baseline drifts in simulations.

The horizontal dotted gray line for the true correlation coefficient (0.3) indicates that the conventional Kalman filter method does not work when the number of simultaneously simulated neurons is small. Naturally, recording from more neurons helps to estimate the current baseline trend, which is essentially the average activity of the neurons in this easiest situation. If one does not know the baseline trends accurately, the estimation of noise correlations fails as well. In more realistic situations, in which neurons do not necessarily share baseline trends, more neurons would be required to estimate the noise correlation by Kalman filter-like methods.

Figure 2D shows the noise correlations estimated by the proposed method from the same data. The horizontal dotted gray line for the true correlation coefficient (0.3) indicates that the proposed method always works. Note that the proposed method only requires the activities of the two relevant neurons, as is evident in Equations 1, 2.

Furthermore, Figure 2E shows that the Kalman filter can be fairly expensive in computational time with as few as 15 neurons. Given that the number of simultaneously recorded neurons is increasing rapidly, the computational costs can easily constitute a limiting factor. Thus, the proposed method is advantageous not only in the estimation accuracy, but also in the computational cost, as demonstrated in Figure 2F.

The results obtained here are fairly general. Although the sinusoidal trend with seven cycles was used here, qualitatively the same results were obtained for a wide range of numbers of cycles (4–10; data not shown). Imagine that sinusoidal waves with different periods can exhaust the different possible timescales. In fact, it has been numerically demonstrated that the proposed method also worked for linear as well as stepwise trends in the previous work (Miura, 2013), although all these numerical simulations just confirmed the mathematical statement that the proposed method is robust against arbitrary drifts. Although the proposed method might look too easy at first glance, any other ad hoc estimators of covariances cannot achieve the unbiasedness (i.e., correctness) under arbitrary drifts. Moreover, although the latest best Bayesian methods can be regarded as variants of Kalman filter methods and some of them might improve the estimation accuracy slightly, we believe that the problem in computational costs is unavoidable in any case.

Figure 3. Comparison of conventional moving average method and proposed method. As in Figure 2, the simulated neural activities had the sinusoidal trend with five waves (A) or four waves in 100 trials (B). For the moving average method, the neural activities were first smoothed by the moving average with various window sizes and then the correlation coefficients were computed for the residuals. The mean ± SD of the estimated noise correlations across 100 realizations of the simulated data is plotted. The horizontal dotted gray lines for the true correlation coefficient (0.3) indicate that the biases are prominent for longer window sizes and for rapidly changing trends.

Proposed estimator has smaller errors than conventional moving averages
As some of the previous works (Cohen and Newsome, 2008; Mitchell et al., 2009) simply used the moving average for detrending, we next compared the conventional moving average method with the proposed method (Fig. 3).

In the comparison, as in Figure 2, the simulated neural activities had the sinusoidal trend with five waves (Fig. 3A) or four waves in 100 trials (Fig. 3B). For the moving average method, the neural activities were first smoothed by the moving average with various window sizes, and then the correlation coefficients were computed for the residuals. The horizontal dotted gray lines for the true correlation coefficient (0.3) indicate that the biases are prominent for longer window sizes and for rapidly changing trends.

Although the moving average method is uniquely defined for odd window sizes, some variants can be considered when the window size is two (and even lengths in general). When the window size is two, however, one can carefully define the moving average method so that it coincides with the proposed method regarding the correlation coefficients. To be precise, the moving average method actually fails and underestimates both the variances (σ₁₁, σ₂₂) and the covariance (σ₁₂) by half, although the correlation coefficient, as their ratio σ₁₂/√(σ₁₁σ₂₂), is intact. For example, when the true variances for the activities of two neurons are both 1 and the true covariance is 0.2, the moving average method on average estimates them as 0.5, 0.5, and 0.1, while the correlation coefficient estimated as their ratio coincides with that of the proposed estimator, which is always near 0.2.

Some previous works used longer window lengths for detrending (previous and future 20 trials for Cohen and Newsome, 2008; and Gaussian kernels with σ = 5 trials for Mitchell et al., 2009). Although it is not clear whether the actual drift is as drastic as in Figure 3, our message in this article is that, in fact, one can safely shorten the window length to the minimum size, i.e., two.

Examples of noise correlations in V1 neuron pairs
Here, we applied the proposed method for estimating pure short-term noise correlations to pairs of neural activities in the primary visual cortex. Figure 4 shows the interneuronal noise correlations of two example pairs of neurons estimated by the proposed method as well as the conventional cross-correlogram. We computed time-shifted noise correlations, or cross-correlation functions. Note that we solely computed noise correlations for a fixed stimulus in this article, because recent works indicated the stimulus dependency of noise correlations (Kohn and Smith, 2005; Maruyama and Ito, 2013; Ruff and Cohen, 2016). For the putatively nonstationary neuron pair in Figure 4A, the time series for the activities of both neurons showed significant drifts. The conventional correlogram showed spurious correlations across wide shifts of trials, while the proposed method correctly indicated no short-term correlation. Note that similar broad cross-correlation functions have been observed previously (Bair et al., 2001). For the putatively stationary neuron pair in Figure 4B, the time series for both neurons did not show significant drifts but the simultaneous activities tended to synchronize. Both the conventional correlogram and the proposed method correctly detected the short-term noise correlation at the origin. Thus, the proposed method succeeded in clarifying the fine structure of noises in real V1 data by detecting purely short-term correlations.

Figure 4. Examples of noise correlations for two V1 neuron pairs. A, For the nonstationary case where the time series for both neurons show significant drifts (left), the broad cross-correlation was estimated by the conventional cross-correlogram (p = 0.00012 at the origin) but no short-term correlation by the proposed method (p = 0.92 at the origin).
B, For the stationary case where the time series for both neurons do not show significant drifts but the simultaneous activities tend to synchronize, the narrow cross-correlation at the origin was estimated by both the conventional correlogram (p < 10⁻⁵) and the proposed method (p < 10⁻⁵).

Both long-term and short-term correlations are widely observed in V1
Next, we investigate the noise correlations for the entire population of pairs of simultaneously recorded neurons. Figure 5A,B plots the short-term noise correlations estimated by the proposed method against the conventional correlation coefficient for all the pairs within the stationary or nonstationary neurons. The stationary or nonstationary neurons were selected by the statistical serial correlation test for nonstationarity. In Figure 5A, for the stationary neuron pool, the correlations are highly reproducible, located along the diagonal line. Meanwhile, in Figure 5B, for the nonstationary neuron pool, they are not reproducible, scattered away from the diagonal line, with smaller absolute values for the proposed method. The result suggests that the proposed method successfully removes long-term components of noise correlations essentially by detrending. Note that some of the smallest noise correlations reported in the previous works were obtained for the detrended time series (Bair et al., 2001; Ecker et al., 2010; Renart et al., 2010), consistent with our observation. Thus, nonstationarity, or a baseline drift, may engender spurious correlations even if no actual short-term correlation exists.

Figure 5C shows the p value histogram for the statistical significance of the proposed short-term noise correlations for V1 data. The non-uniformity of the distribution indicates that the significant short-term correlations for some pairs are not obtained by chance. Furthermore, all types of pairs, irrespective of stationary and nonstationary neurons, show significant short-term correlations. As a control to check the validity of our statistical test, Figure 5D shows the p value histogram for the same test obtained for the one-time-shifted V1 data that cannot have short-term correlations. The resulting uniform distribution demonstrates that, desirably, the statistical test detects no spurious short-term correlation even if the signals drift in the V1 data. Remember, in contrast, that the conventional correlogram in Figure 1B resulted in non-zero correlations even for time-shifted data.

In total, significant fractions of noise correlations seem to be explainable by the long-term components while there are some pairs with significant short-term correlations as well. We next pursue whether each component is helpful or harmful for the sensory information representation in the brain.

Figure 5. Population summary of noise correlations for all recorded V1 neurons. A, The proposed short-term noise correlations plotted against the conventional correlation coefficients for all the simultaneously recorded pairs of stationary V1 neurons (n = 12,931 pairs). The stationary or nonstationary neurons were selected by the statistical serial correlation test for nonstationarity. The numbers along the axes denote the mean ± SEM. B, Same plot for all the simultaneously recorded pairs of nonstationary V1 neurons (n = 18,891 pairs). Note that the correlations are highly reproducible, located along the diagonal, for the stationary neuron pool but not reproducible for the nonstationary neuron pool, suggesting that the proposed method successfully removes long-term noise correlations by detrending. C, The distribution of the p values for the statistical significance of the proposed short-term noise correlations for V1 data. s-s denotes the pair of two stationary neurons. s-n denotes the pair of stationary and nonstationary neurons. n-n denotes the pair of two nonstationary neurons. The non-uniformity of the distribution indicates that the significant short-term correlations for some pairs are not obtained by chance. D, The control distribution of the p values for the same test obtained for the one-time-shifted V1 data that cannot have short-term correlations. The uniform distribution demonstrates that, desirably, the statistical test detects no spurious short-term correlation even if the signals drift in the V1 data. Note that, in contrast, the conventional correlogram in Figure 1B resulted in non-zero correlations even for time-shifted data.

Impacts of short-term and long-term noise correlations are dissociable
Finally, we assessed the impacts on decoding of the presence of short-term or long-term correlations, separately. Our estimator enables us to elucidate the impacts of short-term and long-term correlations in a dissociated manner, as we will see. Here, we performed the linear discriminant analysis of stimuli based on the neural responses and used the classification success rates as a measure of the accuracy of neural coding. That is, the higher the classification success rate is, the more accurate the neural coding should be.

Next, as the origin of long-term correlations, we visualized the baseline drifts in Figure 6B. For the neurons with significant baseline drifts (to be precise, the same neuron pool as used in Fig. 6C, right), we performed the principal component analysis for the average responses to 16 visual stimuli (i.e., tuning curves).
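The split into stationary and nonstationary neuron pools relies on a serial correlation test that is specified in Materials and Methods, outside this section. As a hedged illustration only, the sketch below assumes one common form of such a test: the lag-1 serial correlation coefficient of a trial series, compared with its large-sample normal approximation under randomness. The function name `serial_correlation_test`, the linear drift, and the 0.05 threshold are hypothetical choices and may differ from the test the authors actually used.

```python
import math
import random


def serial_correlation_test(x):
    """Lag-1 serial correlation with a large-sample normal approximation.

    Assumed form of the test: under randomness, r1 has mean about -1/n and
    variance about 1/n, so the standardized value is treated as N(0, 1).
    """
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    r1 = sum((a - mean) * (b - mean) for a, b in zip(x, x[1:])) / denom
    z = (r1 + 1.0 / n) * math.sqrt(n)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided p value
    return r1, p_value


rng = random.Random(2)
n_trials = 40  # repeats per stimulus, as in the V1 data set

# Hypothetical drifting neuron: linear baseline trend plus unit-variance noise.
drifting = [0.2 * t + rng.gauss(0, 1) for t in range(n_trials)]
r1_drift, p_drift = serial_correlation_test(drifting)

# Control: stationary noise should be flagged at roughly the nominal rate.
n_control = 2000
false_alarms = sum(
    serial_correlation_test([rng.gauss(0, 1) for _ in range(n_trials)])[1] < 0.05
    for _ in range(n_control)
)
false_alarm_rate = false_alarms / n_control
print(f"r1 (drifting) = {r1_drift:.2f}, false alarm rate = {false_alarm_rate:.3f}")
```

A neuron whose series rejects randomness in such a test would go to the nonstationary pool, and the others to the stationary pool.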
To elucidate the impact of short-term correlations, we compared the classification success rates in the absence and presence of pure short-term correlations in Figure 6A. For that purpose, we first selected the neurons that have no long-term correlation. That is, we selected the neurons whose baselines did not drift significantly by using the serial correlation statistical test for randomness of fluctuations (Materials and Methods). For those selected neurons, which cannot have long-term correlations, we compared the classification success rates before and after trial shuffling, which was supposed to remove short-term correlations. We computed the classification success rate session by session, as we wanted to include only simultaneously recorded pairs. We found that the impact of pure short-term correlations was small but significantly positive in Figure 6A.

The activities of the 189 neurons with baseline drifts were concatenated and transformed ("rotated") to the same number (189) of principal components, from which we chose the first two as the (most informative) axes for visualization. Figure 6B plots the average responses (tuning curves) for 1st–20th trials (turquoise blue) and 21st–40th trials (green) separately and demonstrates that the baseline drifts over trials shifted the entire activities of neurons. However, it is still unclear, from the simple visualization, whether this drift, taking the form of long-term correlations, is significant in decoding.

To elucidate the impact of long-term correlations on decoding, we compared the classification success rates in the absence and presence of pure long-term correlations in Figure 6C. Specifically, we compared the cross-validated classification success rates for four types of learning: (1) when trained by former trials and tested by former trials, (2) when trained by former trials and tested by latter trials, (3) when trained by former trials and tested by latter trials after the respective global means were subtracted for detrending (i.e., centering and equating the means of former and latter trials in Fig. 6B), (4) when trained by even-numbered trials and tested by odd-numbered trials. Note that the conventional sampling of odd-numbered 20 trials (1st, 3rd, 5th, ..., 39th) included both former and latter trials as a part and, thus, can be inhomogeneous under baseline drifts. For that purpose, we first selected the neurons that have no short-term correlation. That is, we selected the neurons whose short-term correlation is not significant by using the statistical test accompanying our estimator (Materials and Methods). Note that although the test applies to a pair, we eventually selected the neurons that have no short-term correlation in any pair in the session. For those selected neurons, which cannot have short-term correlations, we compared the classification success rates for the neurons with and without long-term correlations (i.e., statistically significant baseline drifts) as in Figure 6C. As a control, no significant difference was observed among the four types of learning in the absence of pure long-term correlations, that is, when both short-term and long-term correlations were absent (left, not significant for all pairs, paired t test, 11 sessions with 77 neurons). On the other hand, the significant decrease at the green bar in the presence of pure long-term correlations demonstrates that the long-term correlations harm generalization (right, p < 0.05, paired t test, 23 sessions with 189 neurons). The recovery of the classification success by the detrending or the conventional inhomogeneous sampling (trained by even-numbered and tested by odd-numbered trials) suggests that the decrease in decoding accuracy is due to the baseline drift (p < 0.001, paired t test). Note that the last two types of learning may mimic the brain's possible decoding strategies under changing environments, suggesting that the brain could overcome nonstationarity by detrending.

Figure 6. Impacts of short-term and long-term components of correlated activities of V1 neurons. A, The classification success rates in the absence and presence of pure short-term correlations (mean ± SEM). In the presence of pure short-term correlations, the decoding accuracy was slightly improved (p = 0.0044, paired t test, 15 sessions with 134 neurons). Note that the chance level is 1/16 = 6.25%, as 16 stimuli were decoded. B, The baseline drifts, which cause long-term correlations, were visualized by the principal component analysis for the average responses to 16 visual stimuli (tuning curves) of the neurons with significant baseline drifts (to be precise, the same neuron pool as in C, right). The average responses for 1st–20th trials (turquoise blue) and 21st–40th trials (green) demonstrate that the entire activities of neurons shift over trials. C, Decoding accuracy in the absence and presence of pure long-term correlations. The cross-validated classification success rates for four types of learning were compared: (1) when trained by former trials and tested by former trials, (2) when trained by former trials and tested by latter trials, (3) when trained by former trials and tested by latter trials after the respective global means were subtracted for detrending (i.e., centering and equating the means of former and latter trials in B), (4) when trained by even-numbered trials and tested by odd-numbered trials. Note that the conventional sampling of odd-numbered 20 trials (1st, 3rd, 5th, ..., 39th) included both former and latter trials as a part and, thus, can be inhomogeneous under baseline drifts. No significant difference was observed among the four types of learning in the absence of pure long-term correlations, that is, when both short-term and long-term correlations were absent (left, not significant for all pairs, paired t test, 11 sessions with 77 neurons). The significant decrease at the green bar in the presence of pure long-term correlations demonstrates that the long-term correlations harm generalization (right, p < 0.05, paired t test, 23 sessions with 189 neurons). The recovery of the classification success by the detrending or the conventional inhomogeneous sampling (trained by even-numbered and tested by odd-numbered trials) suggests that the brain can decode stimulus information under changing environments by using a sophisticated decoder (p < 0.001, paired t test).

Here, we solely compared the classification success rates obtained for the same neuron pool with different types of learning. This is because we believe that it is dangerous to compare different pools even if the numbers of neurons are equated, as the sensitivity to stimuli varies across neurons, leading to considerable sampling biases. For example, if we compare the two green bars in Figure 6C, the classification success rate per neuron trained by former and tested by latter trials is higher in the presence of long-term correlations (data not shown), suggesting that the overall high classification success in the presence of the long-term correlations can be explained by the sampling biases, i.e., simply because the neuron pool with long-term correlations has more stimulus-sensitive neurons.

Taken together, the proposed method enables us to elucidate the impacts of short-term and long-term noise correlations in a dissociated manner. The well-designed decoding analysis with dissociated correlated activities may help to gain insight into the brain's decoding strategies under changing environments.

Discussion
In this article, we proposed an information-geometric method to unbiasedly estimate pure short-term noise correlations irrespective of arbitrarily drifting baselines. The simulation demonstrated the robustness of the proposed estimator against the slow, long-term drift. The accompanying statistical test as well as the existing nonstationarity test enabled us to dissociate short-term and long-term correlations. When we excluded the spurious noise correlations of purely long-term nature, only a small fraction of V1 neuron pairs showed significant short-term correlations, possibly reconciling the previous inconsistent observations on the existence of significant noise correlations. Finally, with the additional help of the machine learning that classifies stimuli from neural activities, we assessed the impacts on decoding of the presence of short-term or long-term correlations, separately. The presence of pure short-term correlations slightly improved the decoding accuracy, while the pure long-term correlations deteriorated the generalization ability. However, the decrease in decoding accuracy by the long-term correlations was recoverable by using the decoder with offset, suggesting that the brain could overcome nonstationarity by detrending. Thus, our method enables us to elucidate the functions of short-term and long-term correlations in a dissociated manner, and the well-designed decoding analysis with dissociated correlated activities may help to gain insight into the brain's decoding strategies under changing environments.

Our observation that only a small fraction of neuron pairs has short-term noise correlations after detrending may, at first glance, seem inconsistent with previous works, which reported significant noise correlations. However, the previous works which detrended the time series before calculating noise correlations reported small short-term noise correlations (Bair et al., 2001; Ecker et al., 2010; Renart et al., 2010). In this sense, our result is consistent with the previous results. The previous modeling studies implied that even if short-term noise correlations are small, they can have a big impact in a large network (Zohary et al., 1994; Sompolinsky et al., 2001; Miura, 2012). As far as our V1 dataset is concerned, the impact of short-term noise correlations was small but significantly positive.

The classification analysis in this article demonstrated that the presence of baseline drifts decreased the generalizability of the classifier. However, the further analysis showed that the classification success rate can be recovered by detrending data or including more inhomogeneous training data.

big spontaneous fluctuations? Can the downstream neurons separate stimuli in the high dimensional space or, alternatively, cancel out the baseline drifts suitably? These questions remain for future work, possibly with well-designed decoding analyses.

In this article, we solely treated spike count correlations as a measure of synchrony. We did not use spike-timing cross-correlograms with millisecond bins (Toyama et al., 1981a,b; Ito et al., 2010) as we took advantage of our proposed method, which is limited to spike count correlations. The limitation comes from the assumption of no temporal auto-correlation in the time series. The assumption is necessary to dissociate short-term and long-term cross-correlations successfully. If we consider millisecond bins, temporal auto-correlations exist, which violates the assumption. Here, we rather focused on spike count correlations to use our proposed method in depth, to the extent of elucidating the componentwise functions of short-term and long-term correlations. Thus, we did not say anything on temporal coding in this article, although previous papers suggested that the spike count correlations increase with coupling strengths (Cossell et al., 2015; Bharmauria et al., 2016).

There are considerable merits for our proposed estimator of short-term correlations. It guarantees the smallest estimation error among all the estimators which "work" for arbitrary baseline drifts. It essentially utilizes differential geometry; otherwise it is generally impossible to cope with infinitely many cases programmatically even with the fastest computers. As a practical advantage, it enables us to perform a statistical test from a single trial or a snapshot of time series with baseline drifts, which is usually unreproducible. The estimating equation, given in an analytically closed form, as well as the accompanying statistical test, are quite simple and implementable within a few lines of code, easier than shuffling-based methods, which need longer code and more computational time. The underlying statistical model allows us not only to test statistical significance but also to interpret the correlation coefficients quantitatively, which is not possible for other ad hoc or shuffling-based methods.

Meanwhile, to fully exploit the temporal order and continuity of trials nonparametrically, without assuming specific statistical models for trends, we had to consider a simplified additive Gaussian noise model. However, some previous works used more realistic models such as mixed Poisson distributions and, for example, estimated the contributions of additive as well as multiplicative noises to explore the underlying biological processes (Goris et al., 2014; Arandia-Romero et al., 2016). Thus, it is desirable to pursue temporal structures with more realistic statistical models in future work.
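The decoding comparison discussed above (training on former trials and testing on former trials, on latter trials, or on latter trials after subtracting the respective global means) can be sketched on synthetic data. This is only an illustration under loud assumptions: a nearest-centroid classifier stands in for the linear discriminant analysis, and the cosine tuning, the drift pattern across neurons, the noise level, and all names are hypothetical rather than taken from the V1 recordings. The fourth type of learning (even versus odd trials) is not covered.

```python
import math
import random

N_STIM, N_NEURON, N_TRIAL = 16, 8, 40


def simulate_session(rng, drift=3.0, noise_sd=0.1):
    """session[trial][stimulus] is a list of N_NEURON activities.

    Hypothetical model: cosine tuning plus Gaussian noise; trials 20-39
    additionally carry a neuron-specific baseline offset (the drift).
    """
    session = []
    for trial in range(N_TRIAL):
        block = []
        for s in range(N_STIM):
            theta = 2 * math.pi * s / N_STIM
            resp = []
            for j in range(N_NEURON):
                phi = 2 * math.pi * j / N_NEURON
                offset = drift * math.cos(phi) if trial >= N_TRIAL // 2 else 0.0
                resp.append(1 + math.cos(theta - phi) + offset + rng.gauss(0, noise_sd))
            block.append(resp)
        session.append(block)
    return session


def mean_vector(vectors):
    n = len(vectors)
    return [sum(v[j] for v in vectors) / n for j in range(len(vectors[0]))]


def accuracy(train, test):
    """Nearest-centroid decoding (a simple stand-in for LDA)."""
    cents = [mean_vector([trial[s] for trial in train]) for s in range(N_STIM)]
    hits = total = 0
    for trial in test:
        for s, resp in enumerate(trial):
            dists = [sum((a - b) ** 2 for a, b in zip(resp, c)) for c in cents]
            if dists.index(min(dists)) == s:
                hits += 1
            total += 1
    return hits / total


def center(trials):
    """Detrend a block of trials by subtracting its global mean response."""
    g = mean_vector([resp for trial in trials for resp in trial])
    return [[[a - b for a, b in zip(resp, g)] for resp in trial] for trial in trials]


rng = random.Random(1)
session = simulate_session(rng)
former, latter = session[:20], session[20:]
acc_within = accuracy(former[:10], former[10:])            # former -> former
acc_across = accuracy(former, latter)                      # former -> latter
acc_detrended = accuracy(center(former), center(latter))   # after mean subtraction
print(acc_within, acc_across, acc_detrended)
```

In this toy setting the drift is deliberately large, so decoding latter trials with a decoder trained on former trials falls toward the chance level of 1/16, and subtracting the block means restores it. How strongly real drifts affect decoding depends on their size relative to the tuning, which the sketch does not attempt to match.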
Note that the generalizability should depend on the training data set: the more different conditions are learned, the higher the classification success rate becomes. In other words, if the future (test) conditions are completely different from the past (learned) conditions, the baseline drifts do harm. Thus, we essentially showed two possible decoding strategies that can overcome nonstationarity. It is interesting to know how the brain decodes visual stimuli from small responses under

It is important to check whether the spike count data to be analyzed satisfy the model assumptions of the proposed method. For example, the normality assumption is satisfied by high firing neurons in general due to the central limit theorem. However, strictly speaking, when we checked whether the spike count data used in the article obey the normal distribution by using the Shapiro–Wilk test, only 70.0% of the neuron-stimulus pairs that are stationary (i.e., without drifts) and modest-firing (i.e., more than 5 Hz) satisfied the normality assumption. One possible solution might be to apply the proposed method only to the high firing neurons, as low firing neurons tend to violate the normality assumption. If the data do not satisfy the assumption of normal noises, the proposed noise correlation is no longer an optimal parameter estimate of the statistical model. It is also important to check whether the assumption of constant covariances is satisfied, at least for some time range. For example, strongly nonstationary neurons, whose firing rates grow twofold over an hour, might violate the assumption. We leave the detailed examination of the model assumptions with statistical model selection procedures for future work. However, if the violation is weak, the proposed measure could still be used as a rough measure. For example, even if the data were actually non-Gaussian spike counts with multiplicative drifts (Goris et al., 2014), the sign of the proposed measure, excitatory or inhibitory, could still be meaningful.

Another assumption for the proposed estimator was that the baseline activities for two consecutive trials are (almost) the same. This assumption in our analysis was the clue to separate short timescales and long timescales. Strictly speaking, however, as we computed noise correlations separately for different stimuli, the intervals between the trials for the same stimulus are variable. Note that stimuli were presented in a pseudo-random order. In fact, for the worst case, the effective trial interval can be as large as 90 s (3 s × 15 stimuli × 2). Although it is generally hard to characterize the effects of drifts on these medium timescales, no difference was observed between randomized and repeated orders of stimulus presentations (Kohn and Smith, 2005). Thus, we assumed that the drifts on these medium timescales were ignorable. Practically, if the assumption of the constant baseline is doubtful for a trial pair due to the long interval between them, one could remove the pair from the calculation of the proposed estimator. That is, one could exclude unreliable trial pairs from the summation in Equation 2. This type of

with population activity affects encoded information. Neuron 89:1305–1316.
Averbeck BB, Latham PE, Pouget A (2006) Neural correlations, population coding and computation. Nat Rev Neurosci 7:358–366.
Bair W, Zohary E, Newsome WT (2001) Correlated firing in macaque visual area MT: time scales and relationship to behavior. J Neurosci 21:1676–1697.
Bharmauria V, Bachatene L, Cattan S, Chanauria N, Rouat J, Molotchnikoff S (2016) High noise correlation between the functionally connected neurons in emergent v1 microcircuits. Exp Brain Res 234:523–532.
Bickel PJ, Klaassen CAJ, Ritov Y, Wellner JA (1993) Efficient and adaptive estimation for semiparametric models. Baltimore, MD: Johns Hopkins University Press.
Cohen MR, Newsome WT (2008) Context-dependent changes in functional circuitry in visual area MT. Neuron 60:162–173.
Cohen MR, Kohn A (2011) Measuring and interpreting neuronal correlations. Nat Neurosci 14:811–819.
Cossell L, Iacaruso MF, Muir DR, Houlton R, Sader EN, Ko H, Hofer SB, Mrsic-Flogel TD (2015) Functional organization of excitatory synaptic strength in primary visual cortex. Nature 518:399–403.
Dayan P, Abbott LF (2001) Theoretical neuroscience. Cambridge: MIT Press.
Doiron B, Litwin-Kumar A, Rosenbaum R, Ocker GK, Josić K (2016) The mechanics of state-dependent neural correlations. Nat Neurosci 19:383–393.
Ecker AS, Berens P, Keliris GA, Bethge M, Logothetis NK, Tolias AS (2010) Decorrelated neuronal firing in cortical microcircuits. Science 327:584–587.
Ecker AS, Berens P, Cotton RJ, Subramaniyan M, Denfield GH, Cadwell CR, Smirnakis SM, Bethge M, Tolias AS (2014) State dependence of noise correlations in macaque primary visual cortex. Neuron 82:235–248.
Fiser J, Chiu C, Weliky M (2004) Small modulation of ongoing cortical dynamics by sensory input during natural vision. Nature 431:573–578.
Goris RL, Movshon JA, Simoncelli EP (2014) Partitioning neuronal variability. Nat Neurosci 17:858–865.
Gray CM, Maldonado PE, Wilson M, McNaughton B (1995) Tetrodes markedly improve the reliability and yield of multiple single-unit isolation from multi-unit recordings in cat striate cortex. J Neurosci Methods 63:43–54.
Harvey AC (1993) Time series models. Cambridge: MIT Press.
Ikegaya Y, Aaron G, Cossart R, Aronov D, Lampl I, Ferster D, Yuste exception handling could also work for avoiding the R (2004) Synfire chains and cortical songs: temporal modules of change point where the baselines jump suddenly. Devel- cortical activity. Science 304:559–564. CrossRef Medline oping a more flexible algorithm for the proposed method Ito H, Maldonado PE, Gray CM (2010) Dynamics of stimulus-evoked can be a future work. spike timing correlations in the cat lateral geniculate nucleus. J Neurophysiol 104:3276–3292. CrossRef Medline References Kanji GK (2006) 100 statistical tests. London: Sage. Abbott LF, Dayan P (1999) The effect of correlated variability on the Kohn A, Smith MA (2005) Stimulus dependence of neuronal corre- accuracy of a population code. Neural Comput 11:91–101. Med- lation in primary visual cortex of the macaque. J Neurosci 25: 3661–3673. CrossRef Medline line Latham PE, Nirenberg S (2005) Synergy, redundancy, and indepen- Amari S (2016) Information geometry and its applications. Tokyo: dence in population codes, revisited. J Neurosci 25:5195–5206. Springer Japan. Amari S, Cardoso J (1997) Blind source separation-semiparametric CrossRef Medline statistical approach. IEEE Trans Signal Process 45:2692–2700. Latham PE, Roudi Y (2013) Role of correlations in population coding. In: Principles of neural coding (Quiroga RQ, Panzeri S, eds), pp CrossRef Amari S, Kawanabe M (1997) Information geometry of estimating 121–138. Boca Raton, FL: CRC Press. functions in semi-parametric statistical models. Bernoulli 3:29–54. Luczak A, McNaughton BL, Harris KD (2015) Packet-based commu- CrossRef nication in the cortex. Nat Rev Neurosci 16:745–755. CrossRef Amari S, Nagaoka H (2001) Methods of information geometry. Prov- Medline Maruyama Y, Ito H (2013) Diversity, heterogeneity and orientation- idence, RI: American Mathematical Society. 
Arandia-Romero I, Tanabe S, Drugowitsch J, Kohn A, Moreno-Bote dependent variation of spike count correlation in the cat visual R (2016) Multiplicative and additive modulation of neuronal tuning cortex. Eur J Neurosci 38:3611–3627. CrossRef Medline January/February 2019, 6(1) e0395-18.2019 eNeuro.org Methods/New Tools 16 of 16 Maruyama Y, Ito H (2017) Design of multielectrode arrays for uniform Petris G, Petrone S, Campagnoli P (2009) Dynamic linear models sampling of different orientations of tuned unit populations in the with R. New York, NY: Springer. cat visual cortex. Neurosci Res 122:51–63. CrossRef Medline Renart A, de la Rocha J, Bartho P, Hollender L, Parga N, Reyes A, Mitchell JF, Sundberg KA, Reynolds JH (2009) Spatial attention Harris KD (2010) The asynchronous state in cortical circuits. Sci- decorrelates intrinsic activity fluctuations in macaque area V4. ence 327:587–590. CrossRef Medline Neuron 63:879–888. CrossRef Medline Rosenbaum R, Smith MA, Kohn A, Rubin JE, Doiron B (2017) The Miura K (2011) An introduction to maximum likelihood estimation and spatial structure of correlated neuronal variability. Nat Neurosci information geometry. Interdiscip Informat Sci 17:155–174. Cross- 20:107–114. CrossRef Medline Ref Ruff DA, Cohen MR (2016) Stimulus dependence of correlated vari- Miura K (2012) Effects of noise correlations on population coding. ability across cortical areas. J Neurosci 36:7546–7556. CrossRef Proceedings of Soft Computing and Intelligent Systems- Medline International Symposium on Advanced Intelligent Systems, pp Sasaki T, Matsuki N, Ikegaya Y (2007) Metastability of active ca3 1072–1075. networks. J Neurosci 27:517–528. CrossRef Miura K (2013) A semiparametric covariance estimator immune to Schneidman E, Bialek W, Berry MJ (2003) Synergy, redundancy, and arbitrary signal drift. Interdiscip Informat Sci 19:35–41. CrossRef independence in population codes. J Neurosci 23:11539–11553. 
Miura K, Uchida N (2008) A rate-independent measure of irregularity Medline for event series and its application to neural spiking activity. 47th Sharpee TO (2017) Optimizing neural information capacity through IEEE Conference on Decision and Control, pp 2006–2011. discretization. Neuron 94:954–960. CrossRef Medline Miura K, Okada M, Amari S (2006a) Estimating spiking irregularities Sharpee TO, Sugihara H, Kurgansky AV, Rebrik SP, Stryker MP, under changing environments. Neural Comput 18:2359–2386. Miller KD (2006) Adaptive filtering enhances information transmis- CrossRef Medline sion in visual cortex. Nature 439:936–942. CrossRef Medline Miura K, Okada M, Amari S (2006b) Unbiased estimator of shape Sompolinsky H, Yoon H, Kang K, Shamir M (2001) Population coding parameter for spiking irregularities under changing environments. in neuronal systems with correlated noise. Phys Rev E Stat Nonlin Adv Neural Inf Process Syst 18:891–898. [CrossRef] Soft Matter Phys 64:051904. CrossRef Medline Miura K, Tsubo Y, Okada M, Fukai T (2007) Balanced excitatory and Steinmetz NA, Koch C, Harris KD, Carandini M (2018) Challenges inhibitory inputs to cortical neurons decouple firing irregularity and opportunities for large-scale electrophysiology with neu- from rate modulations. J Neurosci 27:13802–13812. CrossRef CrossRef Med- ropixels probes. Curr Opin Neurobiol 50:92–100. Medline line Miura K, Mainen ZF, Uchida N (2012) Odor representations in olfac- Toyama K, Kimura M, Tanaka K (1981a) Cross-correlation analysis of tory cortex: distributed rate coding and decorrelated population interneuronal connectivity in cat visual cortex. J Neurophysiol activity. Neuron 74:1087–1098. CrossRef Medline CrossRef Medline 46:191–201. Moreno-Bote R, Beck J, Kanitscheider I, Pitkow X, Latham P, Pouget Toyama K, Kimura M, Tanaka K (1981b) Organization of cat visual A (2014) Information-limiting correlations. Nat Neurosci 17:1410– cortex as investigated by cross-correlation technique. 
J Neuro- 1427. CrossRef Medline physiol 46:202–214. CrossRef Medline Neyman J, Scott EL (1948) Consistent estimates based on partially van der Vaart AW (1998) Asymptotic statistics. Cambridge: Cam- consistent observations. Econometrica 16:1–32. CrossRef bridge University Press. Okun M, Steinmetz NA, Cossell L, Iacaruso MF, Ko H, Barthó P, Moore T, Hofer SB, Mrsic-Flogel TD, Carandini M, Harris KD (2015) Zohary E, Shadlen MN, Newsome WT (1994) Correlated neuronal Diverse coupling of neurons to populations in sensory cortex. discharge rate and its implications for psychophysical perfor- Nature 521:511–515. CrossRef Medline CrossRef Medline mance. Nature 370:140–143. January/February 2019, 6(1) e0395-18.2019 eNeuro.org http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png eNeuro Unpaywall
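As a rough illustration of the pair-exclusion idea raised in the Discussion (dropping doubtful trial pairs from the summation in Equation 2), the following Python sketch skips any pair whose two trials are separated by more than a chosen interval. It is not the authors' R code: the function name `short_term_correlation_with_exclusion`, the arguments `onset_times` and `max_interval`, and the toy numbers are all assumptions made for this example.

```python
import math


def short_term_correlation_with_exclusion(x, y, onset_times, max_interval):
    """Pairwise local-detrending correlation that skips doubtful trial pairs.

    Trials are paired as {1,2}, {3,4}, ...; a pair is dropped from the sums
    when its two onset times are more than max_interval apart (hypothetical
    exclusion rule). Returns (estimate, number_of_pairs_used).
    """
    s11 = s22 = s12 = 0.0
    used = 0
    for i in range(0, len(x) - 1, 2):
        if onset_times[i + 1] - onset_times[i] > max_interval:
            continue  # the baseline may have moved between these two trials
        # Deviation of the first trial from the pair mean; the second trial
        # deviates by the same amount with the opposite sign.
        dx = (x[i] - x[i + 1]) / 2.0
        dy = (y[i] - y[i + 1]) / 2.0
        s11 += 2.0 * dx * dx
        s22 += 2.0 * dy * dy
        s12 += 2.0 * dx * dy
        used += 1
    if used == 0 or s11 == 0.0 or s22 == 0.0:
        return float("nan"), used
    return s12 / math.sqrt(s11 * s22), used


# Toy data (made up): the middle pair spans an 84 s gap with a baseline jump.
onset_times = [0.0, 3.0, 6.0, 90.0, 93.0, 96.0]
x = [10.0, 12.0, 11.0, 31.0, 30.0, 32.0]
y = [5.0, 4.0, 6.0, 26.0, 25.0, 24.0]

print(short_term_correlation_with_exclusion(x, y, onset_times, math.inf))
print(short_term_correlation_with_exclusion(x, y, onset_times, 10.0))
```

In this toy series the middle pair straddles a long gap during which both baselines jump. Including it drives the estimate strongly positive, whereas excluding it leaves only the two well-behaved pairs, which are negatively correlated. How to choose the interval threshold is left open here, as it is in the text.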

Assessing the Impacts of Correlated Variability with Dissociated Timescales


References (56)

Publisher
Unpaywall
ISSN
2373-2822
DOI
10.1523/eneuro.0395-18.2019

Abstract

Despite the profound influence on coding capacity of sensory neurons, the measurements of noise correlations have been inconsistent. This is, possibly, because nonstationarity, i.e., drifting baselines, engendered the spurious long-term correlations even if no actual short-term correlation existed. Although attempts to separate them have been made previously, they were ad hoc for specific cases or computationally too demanding. Here we proposed an information-geometric method to unbiasedly estimate pure short-term noise correlations irrespective of the background brain activities without demanding computational resources. First, the benchmark simulations demonstrated that the proposed estimator is more accurate and computationally efficient than the conventional correlograms and the residual correlations with Kalman filters or moving averages of length three or more, while the best moving average of length two coincided with the proposed method regarding correlation estimates. Next, we analyzed the cat V1 neural responses to demonstrate that the statistical test accompanying the proposed method combined with the existing nonstationarity test enabled us to dissociate short-term and long-term noise correlations. When we excluded the spurious noise correlations of purely long-term nature, only a small fraction of neuron pairs showed significant short-term correlations, possibly reconciling the previous inconsistent observations on the existence of significant noise correlations. The decoding accuracy was slightly improved by the short-term correlations. Although the long-term correlations deteriorated the generalizability, the generalizability was recovered by the decoder with trend removal, suggesting that brains could overcome nonstationarity. Thus, the proposed method enables us to elucidate the impacts of short-term and long-term noise correlations in a dissociated manner.
Key words: decoding analysis; information geometry; noise correlations; population codes; primary visual cortex; spontaneous activity

Significance Statement

The proposed measure for spike-count noise correlations, based on the local temporal detrending, enables us to decompose the correlated responses into long-timescale and short-timescale components. The proposed method is essential to elucidate the population codes in the era of large-scale electrophysiology as it works for large numbers of simultaneously recorded neurons while existing methods do not. With the additional help of the machine learning that classifies stimuli from neural activities, we demonstrate proper ways to assess the impacts on decoding of the presence of short-term or long-term noise correlations, separately. The well-designed decoding analysis with dissociated correlated activities will help to gain insight into the brain’s decoding strategies under changing environments.

January/February 2019, 6(1) e0395-18.2019 1–16 Methods/New Tools

Received October 14, 2018; accepted February 5, 2019; First published February 11, 2019.
The authors declare no competing financial interests.
Author contributions: T.T. and K.M. performed research; T.T. and K.M. analyzed data; Y.M., H.I., and K.M. designed research; K.M. wrote the paper.
This work was supported by Japan Society for the Promotion of Science, Grants-in-Aid for Scientific Research Grants 18K11485, 18K13251, 16K01966, and 15H05878.
Correspondence should be addressed to Keiji Miura at miura@kwansei.ac.jp.
https://doi.org/10.1523/ENEURO.0395-18.2019
Copyright © 2019 Takahashi et al.
This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license, which permits unrestricted use, distribution and reproduction in any medium provided that the original work is properly attributed.

Introduction

The impacts and mechanisms of correlations in noises, i.e., trial-to-trial variations in neural responses to the same stimulus, have been issues in neuroscience (Cohen and Kohn, 2011; Doiron et al., 2016). The information theoretic studies showed that correlation in response noises can be a major determinant for coding capacities of sensory information by neurons (Abbott and Dayan, 1999; Sompolinsky et al., 2001; Miura et al., 2012; Latham and Roudi, 2013; Moreno-Bote et al., 2014). In some cases, even in a simple homogeneous network with tiny noise correlations, having more neurons does not help at all (Zohary et al., 1994; but see also Abbott and Dayan, 1999; Sompolinsky et al., 2001; Miura, 2012; Moreno-Bote et al., 2014). Therefore, it is extremely important to estimate noise correlations accurately in the era of large-scale electrophysiology (Steinmetz et al., 2018).

Although significant noise correlations have been observed in almost all recorded cortical areas, it has been pointed out that nonstationarity such as drifts in signals can engender artificial correlations even if no actual correlation exists (Bair et al., 2001; Ecker et al., 2010; Renart et al., 2010). Therefore, it is desired to dissociate the observed noise correlations into short-term and long-term components, where the latter is possibly caused by the background trends or fluctuations of the baseline activity (Fiser et al., 2004; Ikegaya et al., 2004; Sasaki et al., 2007; Luczak et al., 2015; Okun et al., 2015). Although attempts to separate them and estimate purely short-term noise correlations under changing environments have been made previously, they were ad hoc and applicable only to specific cases (Bair et al., 2001; Mitchell et al., 2009; Ecker et al., 2010; Renart et al., 2010). Even the latest Bayesian method requires considerable numbers of simultaneously recorded neurons as well as exponential computational costs to estimate instantaneous activities (Ecker et al., 2014; Rosenbaum et al., 2017). Thus, the estimation method, which requires only the recording of a pair of neurons and works for arbitrary baseline drifts nonparametrically (Amari and Cardoso, 1997), is desired.

In addition to measuring the noise correlations, assessing their impacts is also very important. The degree to which sensory information is represented reliably by neural responses has been characterized by applying a decoding approach in a stochastic stimulus–response framework (Dayan and Abbott, 2001; Averbeck et al., 2006; Sharpee, 2017). That is, the decoding success rates can be used as a measure of accuracy of neural representations. One can take different features of neural activities as clues for decoding to see which feature carries information. Therefore, it is ideal, within this framework, if the dissociation of short-term and long-term correlations gives us a novel way to assess their respective impacts on information representations.

In this article, we propose an information-geometric method to unbiasedly estimate pure short-term noise correlations irrespective of the background brain activities. One effective way to use the information geometry, that generally finds orthogonal statistical parameters (Amari and Nagaoka, 2001; Miura, 2011; Amari, 2016), is to estimate only finite parameters of interest irrespectively of the other infinite numbers of parameters (Miura et al., 2006a,b, 2007; Miura and Uchida, 2008). Here, we used this infinite-dimensional scheme (Amari and Kawanabe, 1997; Miura, 2013) to dissociate the parameter for short-term correlation from the infinitely many parameters for (all possible) long-term baseline drifts nonparametrically. This allows us to estimate pure short-term correlations whatever the baseline drift is without demanding considerable numbers of simultaneously recorded neurons and high computational costs. Then, the accompanying statistical test as well as the existing nonstationarity test enabled us to dissociate short-term and long-term correlations.

First, as benchmark simulations, we demonstrated that the proposed estimator is more accurate and computationally efficient than the conventional correlograms and the residual correlations with Kalman filters or moving averages of length three or more, while the best moving average of length two coincided with the proposed method regarding correlation estimates. Next, when we excluded the spurious noise correlations of purely long-term nature, only a small fraction of V1 neuron pairs showed significant short-term noise correlations, possibly reconciling the previous inconsistent observations on the existence of significant noise correlations. Finally, with the additional help of the machine learning that classifies stimuli from neural activities, we assessed the impacts on decoding of the presence of short-term or long-term noise correlations, separately. The presence of pure short-term correlations slightly improved the decoding accuracy, while the pure long-term correlations deteriorated the generalization ability. However, the decrease in decoding accuracy by the long-term correlations was recoverable by using the decoder with offset, suggesting that the brain could overcome nonstationarity by detrending. Thus, our method enables us to elucidate the impacts of short-term and long-term correlations in a dissociated manner, advancing a modern, component-wise information theoretic analysis (Schneidman et al., 2003; Latham and Nirenberg, 2005; Averbeck et al., 2006; Sharpee et al., 2006).

Materials and Methods

All the simulations and data analyses in this article were done by using R.
Throughout the analyses in the article, the firing rate for each trial was used as an activity feature. The firing rate was computed as the spike count divided by the trial duration with a visual stimulus, which varied by trials from 1.0 to 1.7 s. Thus, when we say correlation coefficients or (trial-shifted) correlograms, we solely consider spike count noise correlations.

Proposed estimator for short-term noise correlation

As a measure of short-term noise correlations, we proposed and used the following estimator,

$$\hat{\rho} = \frac{\hat{\Sigma}_{12}}{\sqrt{\hat{\Sigma}_{11}\hat{\Sigma}_{22}}} \tag{1}$$

where the covariances $\Sigma_{ij}$ are estimated as

$$\hat{\Sigma}_{11} = \sum_{t=1}^{N/2}\left[(x_{2t-1}-\bar{x}_{(2t)})^2 + (x_{2t}-\bar{x}_{(2t)})^2\right],$$
$$\hat{\Sigma}_{22} = \sum_{t=1}^{N/2}\left[(y_{2t-1}-\bar{y}_{(2t)})^2 + (y_{2t}-\bar{y}_{(2t)})^2\right],$$
$$\hat{\Sigma}_{12} = \sum_{t=1}^{N/2}\left[(x_{2t-1}-\bar{x}_{(2t)})(y_{2t-1}-\bar{y}_{(2t)}) + (x_{2t}-\bar{x}_{(2t)})(y_{2t}-\bar{y}_{(2t)})\right]. \tag{2}$$

Therein, $x_t$ and $y_t$ denote the neural responses in spike counts within a few seconds in the $t$-th trial, while the local mean activities were defined by

$$\bar{x}_{(2t)} = \frac{x_{2t-1}+x_{2t}}{2} \quad\text{and}\quad \bar{y}_{(2t)} = \frac{y_{2t-1}+y_{2t}}{2}. \tag{3}$$

The proposed measure in Equation 1 is comparable to the conventional correlation coefficient. When we plotted in the form of correlograms, we first shifted one of two time series by a given number of trials and then computed the proposed measure for them.

Code accessibility

The R code for computing the proposed correlation coefficient and its p value, as defined below in Statistical tests for short-term noise correlations, is freely available online at https://github.com/toshi-0415/eNeuro. The code is ready to run just by replacing the example data for Figure 4 with users’ own data.

As there can be a minor style difference in coding the proposed measure, we unified the rule and adopted the one with the minimum errors throughout the article and the downloadable code. That is, there are two possible ways for pairing two neighboring trials, (1) starting at the first trial as {1,2}, {3,4}, {5,6}, ... and (2) starting at the second trial as {2,3}, {4,5}, {6,7}, .... In the adopted style, we took the average of the two estimated covariances, because we found it had smaller variances (estimation errors). This style difference only negligibly modifies the results and the overall conclusions never change.

Assumption and derivation of proposed estimator

The proposed estimator in Equation 1 was derived for estimating parameters in a semiparametric statistical model. That is, the activities of two neurons were hypothesized to obey the following statistical model (Eq. 4) and the proposed estimator estimates the Gaussian covariance therein. Although the derivation and concise benchmark simulations were already shown elsewhere (Miura, 2013), the application to the real experimental data has not been done yet.

In this article, we solely consider the spike count within a trial, where the spiking activity of a neuron is integrated over a couple of seconds and, thus, well approximated by a Gaussian distribution. This leads us to consider a bivariate normal distribution for activities of two neurons, $q(x, y; \mu_x, \mu_y, \Sigma)$, where $\mu_x$ and $\mu_y$ denote the means for two neurons’ activities and $\Sigma$ denotes the covariance matrix. The activities $x$ and $y$ denote the spike counts of two neurons for a trial. These analyses address the situation in which the covariance matrix $\Sigma$ is constant whereas the signals can change over time. Especially, when the signals are distributed randomly, but two consecutive signals are the same from continuity condition, the distribution of activities at time $2t-1$ and $2t$ ($t = 1, 2, \ldots$) can be described as a mixed model,

$$p(x_{2t-1}, y_{2t-1}, x_{2t}, y_{2t}; \Sigma, k(\mu_x, \mu_y)) = \iint k(\mu_x, \mu_y)\, q(x_{2t-1}, y_{2t-1}; \mu_x, \mu_y, \Sigma)\, q(x_{2t}, y_{2t}; \mu_x, \mu_y, \Sigma)\, d\mu_x\, d\mu_y \tag{4}$$

where $k(\mu_x, \mu_y)$ denotes an unknown distribution of the signals. The only assumption made here is that the consecutive signals have equal value, at least approximately (see the practical discussion below, at the end of Optimality of proposed estimator from statistical viewpoint). That assumption is minimal and realistic as it is satisfied, e.g., when the signal drift is continuous, and preferably, sufficiently slow. From another viewpoint, this definition of noises as the activities which are not locally flat over time is quite convenient for estimation.

Furthermore, Equation 4 is a semiparametric model (Bickel et al., 1993; van der Vaart, 1998) because it has both a vector $\Sigma$ and a function $k(\mu_x, \mu_y)$ as parameters. It is generally not easy to estimate parameters in semiparametric models because a function space is fundamentally infinite dimensional (Neyman and Scott, 1948). However, it is known that, for some cases, only parameters of interest can be estimated efficiently through differential geometric methods on the manifolds of a family of probability distributions (Amari and Kawanabe, 1997; Amari and Nagaoka, 2001; Miura et al., 2006a,b, 2007; Miura and Uchida, 2008). For this model, it is possible to estimate the three constant parameters $\Sigma = \{\Sigma_{11}, \Sigma_{12} (= \Sigma_{21}), \Sigma_{22}\}$ whatever the signal drift $k(\mu_x, \mu_y)$ is.

After a lengthy calculation in Miura (2013), the estimator was obtained as in Equation 2. As the proposed estimator looks so simple, one might think that one can easily construct an arbitrary local smoother similar to the proposed estimator. However, because any arbitrarily invented estimators have larger estimation errors (biases and variances) in general, it is actually very difficult to discover an optimal estimator from scratch. As far as we know, other than information geometry, there is no systematic way to analytically derive an optimal estimator that works under arbitrary trends nonparametrically. Fortunately, it is very easy to just prove the optimality of the derived estimator, once it was derived. Therefore, we take advantage of this fact for the educational purpose in what follows. That is, we do not repeat the derivation but rather only check the answer and demonstrate the performance of the proposed estimator concisely in the following section.

Optimality of proposed estimator from statistical viewpoint

Here, we summarize and prove the optimality property from the statistical viewpoint. Specifically, we show that the proposed estimator has no bias (i.e., correct on average) and minimum variances (i.e., smallest errors) among the estimators which work unbiasedly for arbitrary baseline drifts.

The unbiased nature of the proposed estimators is clear from the fact that the estimators in Equation 2 are normalized by dividing not by 2 ($= M$) but by 1 ($= M-1$). Normalization of this type is widely known to guarantee the unbiased estimation for the covariances of Gaussian distributions. In fact, with $X_t = (x_t, y_t)$ and $\mu = (\mu_x, \mu_y)$, the expectation of $\hat{\Sigma}_{12}$ can be calculated as an integral over the probability distribution in Equation 4 as

$$E[\hat{\Sigma}_{12}] := \int \left( \iint \hat{\Sigma}_{12}(X_{2t-1}, X_{2t})\, q(X_{2t-1}; \mu, \Sigma)\, q(X_{2t}; \mu, \Sigma)\, dX_{2t-1}\, dX_{2t} \right) k(\mu)\, d\mu = \int \Sigma_{12}\, k(\mu)\, d\mu = \Sigma_{12}. \tag{5}$$

This means that the estimator is unbiased or the estimator works (at least) “on average.” The variance of the estimate can be similarly computed as

$$\mathrm{Var}[\hat{\Sigma}_{12}] := \int \left( \iint \left(\hat{\Sigma}_{12}(X_{2t-1}, X_{2t}) - \Sigma_{12}\right)^2 q(X_{2t-1}; \mu, \Sigma)\, q(X_{2t}; \mu, \Sigma)\, dX_{2t-1}\, dX_{2t} \right) k(\mu)\, d\mu = \Sigma_{11}\Sigma_{22} + \Sigma_{12}\Sigma_{21}. \tag{6}$$

Surprisingly, $\hat{\Sigma}_{12}$ has the minimum variance (= estimation error) among all the estimators. To prove this, assume that $\tilde{\Sigma}_{12}(X_{2t-1}, X_{2t})$ is an arbitrary estimator of $\Sigma_{12}$, that is,

$$E(\tilde{\Sigma}_{12}(X_{2t-1}, X_{2t})) := \iint \tilde{\Sigma}_{12}(X_{2t-1}, X_{2t})\, p(X_{2t-1}, X_{2t}; \Sigma, k(\mu_x, \mu_y))\, dX_{2t-1}\, dX_{2t} = \Sigma_{12}.$$

Note that we assumed that the expectation is equal to the statistical parameter of interest because any estimator should work at least “on average.” By using the Cauchy–Schwarz inequality in functional space ($\langle f, g\rangle^2 \le \|f\|^2 \cdot \|g\|^2$) with $f = \tilde{\Sigma}_{12}(X) - \Sigma_{12}$ and $g = d \log p(X) / d\Sigma_{12}$, we get

$$\mathrm{Var}[f]\,\mathrm{Var}[g] = \mathrm{Var}[\tilde{\Sigma}_{12}]\,(\Sigma_{11}\Sigma_{22} + \Sigma_{12}\Sigma_{21})^{-1} \tag{7}$$
$$\ge \left(\int f g\, p(X)\, dX\right)^2 = \left(\int \frac{d\left((\tilde{\Sigma}_{12}(X) - \Sigma_{12})\, p(X)\right)}{d\Sigma_{12}}\, dX - \int \frac{d(\tilde{\Sigma}_{12}(X) - \Sigma_{12})}{d\Sigma_{12}}\, p(X)\, dX\right)^2 = 1, \tag{8}$$

where $X = (X_{2t-1}, X_{2t})$. This shows that any estimator of $\Sigma_{12}$ has at least the minimum variance on the right-hand side:

$$\mathrm{Var}[\tilde{\Sigma}_{12}] \ge \Sigma_{11}\Sigma_{22} + \Sigma_{12}\Sigma_{21}. \tag{9}$$

Actually, the proposed estimator $\hat{\Sigma}_{12}$ attains this minimum variance as shown in Equation 6. Thus, for any estimator $\tilde{\Sigma}_{12}$,

$$\mathrm{Var}[\tilde{\Sigma}_{12}] \ge \mathrm{Var}[\hat{\Sigma}_{12}]. \tag{10}$$

Similar relations hold for $\Sigma_{11}$ and $\Sigma_{22}$.

We have demonstrated that the proposed estimator is optimal as far as the assumption on the statistical model holds. Practically, due to the violation of the assumption that the consecutive two signals (means) are exactly the same, the biases can arise. However, it can be shown from a simple calculus that the biases are generally small. In fact, if the consecutive signals are

$$E[x_{2t-1}] = \mu - \epsilon \quad\text{and}\quad E[x_{2t}] = \mu + \epsilon, \tag{11}$$

differing of order of $\epsilon$, then, the biases are of the second order of $\epsilon$:

$$E[\hat{\Sigma}_{11}] = E\left[\frac{(x_{2t-1} - x_{2t})^2}{2}\right] = \Sigma_{11} + 2\epsilon^2. \tag{12}$$

Thus, even if one assumes that the biases accumulate over the time points whose size is of order $1/\epsilon$, the total bias is still negligible, being of order $1/\epsilon \times \epsilon^2 = \epsilon$. This suggests that even if the signal drifts slowly, $O(\epsilon)$ as in Equation 11, keeping the difference between the first and the last activities finite, $O(1)$, after a long time sequence, $O(1/\epsilon)$, the total bias is negligibly small, $O(\epsilon)$. In fact, Figure 5D demonstrates that the proposed statistical test detects no spurious short-term correlations even if signals drift in the real V1 data.

Simulation of activities of two neurons with drifting baselines

The simulations of bivariate Gaussian noises added to the baselines generated by the ARIMA models for activities of two neurons in Figure 1A were performed with mvrnorm() and arima.sim() functions in R.

Figure 1. Comparison of conventional cross-correlogram and proposed method. A, Artificial activities of two neurons, simulated as sums of baselines and trial-to-trial noises. The thick gray smooth curves denote time-dependent baselines $\mu$ generated by the ARIMA(0,2,1) model, on which the bivariate Gaussian noises were added to generate the neural activities. The added noises have significant spatial or interneuronal correlations but no temporal correlation because intertrial intervals are assumed to be fairly long (3 s). B, The estimated cross-correlations for the simulated activities in A by the proposed method (red) and the conventional correlogram (black). Only the proposed method works and shows a proper peak at the origin. C, Schematic illustrations of how the proposed method works for the cases with pure long-term or short-term correlations. The cross-correlations computed within each local window, where the baselines are instantaneously constant, are averaged across sliding windows to capture only short-term correlations whatever the baseline drift is.

Conventional cross-correlograms

The conventional cross-correlograms were computed with cor() function in R for the manually trial-shifted data. As the function returns NA (i.e., not available) when either of two neurons shows no spike across all 40 trials, we excluded those pairs from the analyses in the article. Note that the proposed method also returns NA for those pairs.
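The estimator of Equations 1 to 3 and the trial-shifted correlogram described above can be sketched in a few lines. The following Python example is an independent illustration, not the authors' R code: it uses only the standard library, replaces the ARIMA baseline produced by arima.sim() with a hypothetical shared sinusoidal drift, and draws the correlated Gaussian noise by hand instead of calling mvrnorm(). All function names and simulation settings are assumptions made for this example.

```python
import math
import random


def short_term_correlation(x, y):
    """Proposed measure (Equation 1) from non-overlapping trial pairs.

    Deviations are taken from the local mean of each pair {1,2}, {3,4}, ...
    (Equation 3) and summed as in Equation 2. Any common normalization of
    the three sums cancels in the ratio, so none is applied here.
    """
    s11 = s22 = s12 = 0.0
    for i in range(0, len(x) - 1, 2):
        x_bar = (x[i] + x[i + 1]) / 2.0
        y_bar = (y[i] + y[i + 1]) / 2.0
        for j in (i, i + 1):
            dx, dy = x[j] - x_bar, y[j] - y_bar
            s11 += dx * dx
            s22 += dy * dy
            s12 += dx * dy
    if s11 == 0.0 or s22 == 0.0:
        return float("nan")  # e.g., a neuron whose count never changes
    return s12 / math.sqrt(s11 * s22)


def pearson(x, y):
    """Conventional correlation coefficient over all trials."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def simulate(rho, n_trials=2000, seed=0):
    """Two series = shared slow drift + unit-variance Gaussian noise with
    interneuronal correlation rho and no temporal correlation.
    The sinusoidal drift is a hypothetical stand-in for an ARIMA baseline."""
    rng = random.Random(seed)
    x, y = [], []
    for t in range(n_trials):
        drift = 5.0 * math.sin(2.0 * math.pi * t / n_trials)
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x.append(drift + z1)
        y.append(drift + rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
    return x, y


def shifted(x, y, lag):
    """Shift y against x by an even number of trials so pairs stay aligned."""
    return x[: len(x) - lag], y[lag:]


for rho in (0.0, 0.5):
    x, y = simulate(rho)
    correlogram = [short_term_correlation(*shifted(x, y, lag)) for lag in (0, 2, 4)]
    print(rho, round(pearson(x, y), 3), [round(c, 3) for c in correlogram])
```

With a shared drift and uncorrelated noise, the conventional coefficient is dominated by the drift, whereas the pairwise estimate stays near zero; with correlated noise, the pairwise estimate should be close to the simulated value at lag 0 and near zero at the shifted lags. Even lags are used because the estimator treats the series in pairs of trials.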
We computed time-shifted noise correlations or cross-correlation functions separately for different stimuli, because recent works indicated the stimulus dependency of noise correlations (Kohn and Smith, 2005; Maruyama and Ito, 2013; Ruff and Cohen, 2016).

Kalman filter method

The smoothing by the Kalman filter to obtain the baseline trend of the simulated neural activities was computed with dlmFilter() function in dlm package for R (Petris et al., 2009). The noise correlations in residuals were obtained by the maximum likelihood method for data fitting with dlmMLE() function in the same package. The statistical model for the baseline trend $\theta$ that we assumed for decoding with the Kalman filter was

$$\theta_{t+1} = F\,\theta_{t} + \text{Gaussian noise}, \qquad X^{(i)}_{t} = G^{(i)}\,\theta_{t} + \text{Gaussian noise}, \tag{13}$$

where $X^{(i)}$ denotes the i-th neuron’s activity and $F$ and $G^{(i)}$ are to be estimated by data fitting.

The computational time was measured by proc.time() function in R on an iMac with a 3.3 GHz Intel Core i5 and 32-GB memory.

Statistical tests for short-term noise correlations

We detected neuron pairs that have significant short-term noise correlations by using the statistical test accompanying our estimator. As is usual with statistical tests, we computed p values under the null hypothesis of no correlation.

One possible way, which we did not adopt, was to assume the asymptotic normality for the distribution of the proposed estimator, whose mean and variance can be computed from Equations 5, 6 (or from simulations). However, for the current case, each neuron has only 40 trials per stimulus, and thus the normality assumption holds only approximately. Therefore, for example, the control p value distribution for the one-time-shifted data is not as flat as in Figure 5D, although it is approximately flat. Although this method saves computational time, it seems to lack accuracy in p values.

To pursue the full accuracy, we resorted to the computational method with the white Gaussian Monte Carlo simulations for reference activities of neuron pairs. Here, the test was based on the idea that even if there is no short-term correlation, its estimate from only 40 trials takes a non-zero value (error), which varies according to some statistical distribution. First, we obtained the shape of the distribution as accurately as possible by repeating the Monte Carlo simulations a million times. Next, the p value for a given estimate is defined as its percentile in this numerically obtained distribution. That is, the p value is defined so that the p value distribution is completely flat for white Gaussian noises. To be precise, the p value varies by realization of the activities of two neurons, but, with many realizations, one obtains the uniform distribution for the p values. Note that the uniform p value distribution is a hallmark of a good statistical test. Finally, if an estimate is too high or too low within the numerically obtained distribution (typically top 2.5% for both sides, corresponding to positive and negative correlations), it is detected as significant or violating the null hypothesis.

When we computed the control p value distribution for “one-time-shifted” data in Figure 5D, we actually shifted by two trials. This is because our proposed estimator treats the time series by pairs of time points as in Equations 2, 4. This is also why we shifted 2, 4, 6, ..., trials in Figures 1, 2.

The R code for computing the proposed short-term correlation and the accompanying statistical test was custom-written.

As the level of significance, $0.01 / (16 \cdot N(N-1)/2)$, where $N$ denotes the number of neurons in the session and 16 is for 16 stimuli, was used throughout the article (specifically in Fig. 6). That is, we employed Bonferroni’s multiple comparison technique, because we wanted to keep a moderate number of neuron pairs. Note that if we remove a neuron, we lose many pairs in the same session.

Statistical tests for nonstationarity

We selected neurons with and without nonstationarity by using the serial correlation test for randomness of fluctuations (CASE64 in Kanji, 2006). To remove the effect of stimulus presentation from the time series of neural activities, we averaged the local 16 trials within a single block where 16 different stimuli are presented pseudo-randomly. In this way, the length of the time series was reduced from the original 640 to 40 trials, to which we applied the test. The R code for the test was custom-written. The validity of the test was confirmed by the observation that the test returns uniformly distributed p values for Gaussian white noise, that is, a completely random time series in which a random number is generated according to the normal distribution at every time. Note that the resulting p value varies by (random) time series and, here, we confirmed that the distribution became flat with many realizations.

As the level of significance, 0.01 was used throughout the article (specifically in Figs. 5, 6). We did not employ the multiple comparison techniques, as we wanted to categorize suspicious neurons into the nonstationary neuron pool, conservatively.

Classification analysis and principal component analysis

For the classifications of 16 visual stimuli based on the firing rates of neurons, we solely used lda() function in R in this article, although the result did not change significantly when we used the support vector machine. The classification was done session by session to use the simultaneity of the recorded data. For the statistical significance, the means of classification success rates for all sessions were compared between different conditions by the paired t test. Only the sessions with more than five neurons remaining after the selections by short-term or long-term correlations were included in the classification analyses for reliability.

For the principal component analysis, we used prcomp() command in R. As a preprocessing, we first averaged the neural responses to each stimulus, in order not to include the trial-to-trial variability in the visualization by principal components. That is, we essentially visualized the tuning curves. In addition, here we did not standardize the activity of each neuron or tuning curve, because we did not want to enlarge small noises within bad neurons that do not respond to any stimuli at all. That is, so as not to give too much weight to purely noisy neurons, we did not enlarge the tuning curves even if their amplitudes are small. In Figure 6B, the same neuron pool as in Figure 6C, right, i.e., the neurons with pure long-term correlations, was used (189 neurons from 23 sessions).

V1 neuronal spikes

The experimental details for the cat V1 anesthetized recordings we reanalyzed have been previously described (Maruyama and Ito, 2013, 2017). Briefly, 566 neurons were recorded in 48 sessions with 640 trials (40 repeats of 16 visual stimuli) from five adult male cats. Two types of electrode arrays were adopted for the recordings: a four-tetrode array and an array of eight single microelectrodes, both of which were fabricated in the laboratory.

The eyes were focused on the tangential screen at a distance of 57 cm using the tapetal reflection technique and an appropriate set of gas-permeable contact lenses. The pupils were dilated using phenylephrine hydrochloride (Neosynesin eye solution). All animal procedures were performed in accordance with the Kyoto Sangyo University animal care committee’s regulations.

Once stable recordings were obtained, the receptive field properties (location) of the multi-unit activity recorded by each electrode were mapped, using a mouse-controlled moving light bar presented on a 21-inch color monitor (1024 × 768 resolution, vertical refresh rate of 80 Hz) at a distance of 57 cm from the eyes. Because the receptive fields of the units recorded by the high-density electrode arrays had significant overlap, the units were stimulated by moving the light bars on a dark background crossing over the region covering all of the receptive fields. The stimuli consist of the light bars of 16 orientations equally spaced (i.e., with an angular separation of 22.5°) that move along the direction of the normal. We ran 40 trial blocks in which each of the 16 stimuli was presented in a pseudo-random order with an intertrial interval of 3 s. The bars traveled an angular distance of 3–5° over a period of 1.0–1.7 s (speed 3°/s).

Multi-unit activities recorded by each electrode were sorted to recover the activities of individual single units using custom spike sorting software (Gray et al., 1995).

Figure 2A shows the activities of two neurons, simulated as the time series of length 100 with the common sinusoidal baseline trend.

Figure 2. Comparison of conventional Kalman filter method and proposed method. A, The simulated activities of two neurons (red and blue) for 100 trials with the common sinusoidal baseline trend. The thick gray line denotes the model trend used for the data generation. The activities of two neurons at each time are generated as the sum of the baseline trend and the bivariate Gaussian noises with unit variances and 0.3 correlation coefficient. When we simulated more than two neurons simultaneously, the additional neurons shared the trend but did not have noise correlation (data not shown). Thus, among N simulated neurons, only neurons 1 and 2 have non-zero correlation coefficient, which is to be estimated. B, The residual activities after the removal of the estimated trend by the Kalman filter from the activities in A. C, The noise correlations in the residuals averaged across 100 realizations of the simulated data. The horizontal dotted gray line for the true correlation coefficient (0.3) indicates that the conventional Kalman filter method does not work when the number of simultaneously simulated neurons is small. The error bars representing the SD demonstrate the large trial-to-trial variability in the results. D, Noise correlations estimated by the proposed method from the same data. The horizontal dotted gray line for the true correlation coefficient (0.3) indicates that the proposed method always works. The error bars representing the SD demonstrate the small variability in the results. E, The computational time for the conventional Kalman filter method. F, The computational time for the proposed method.

Results

For the purpose of measuring the spike-count noise correlations in different timescales and assessing their respective impacts on neural representations, we used the novel information geometric estimators of pure short-term correlations, which can be dissociated from long-term correlations in a nonparametric manner, that is, whatever the baseline drifts are. Before we applied this proposed method to the neural responses in V1, we checked whether and how it worked for the simulated time series as a benchmark.

Proposed estimator works irrespective of baseline drifts

First, we randomly generated the artificial time series that mimic the activities of two neurons whose baselines drift across many trials. Note that nonstationarity, often observed experimentally in an unreproducible manner, was indispensable for the simulation, as we wanted to see whether the proposed method can overcome it.

In the numerical simulation in Figure 1A, the activities of two neurons were created by adding the bivariate Gaussian noises to the smoothly drifting trends, which, in turn, were independently generated for the two neurons by ARIMA(0,2,1) model whose moving average coefficient was 0.6 (Harvey, 1993). Here, a significant short-term noise correlation ($\rho = 0.3$) was induced only between simultaneous noises for two neurons, mimicking typical neuroscience experiments where trial intervals on the order of seconds wash out intertrial temporal correlations in spike counts. An example realization of the simulation in Figure 1A, which mimics one recording session, shows hallmark drifting baselines, which are definitely unreproducible and hard to estimate with a limited number of samples or from this “single snapshot” data. Note that here we exclusively consider trials as a unit for time axes, instead of fine scale windows such as 1-ms bins.

Figure 1B shows the cross-correlation functions for the realization of simulated activities for two neurons in Figure 1A computed by both the conventional correlogram and the proposed method (Eqs. 1, 2). Here, the correlation coefficient $\rho$ was estimated for each time-shifted data, where the activities of one neuron were time-shifted while those for the other neuron were kept. Because of the wrong assumption of the constant baselines, the conventional correlation coefficients caused a broad cross-correlation function attributable to the temporal correlations in the baselines.
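The time-shifted comparison behind Figure 1B can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the article's R code: the estimator form (differences within non-overlapping pairs of consecutive trials), the shared linear drift, and all names are hypothetical stand-ins, and one series is shifted by even numbers of trials as described in Materials and Methods.

```python
import math
import random


def pair_diff_corr(x, y):
    """Assumed estimator form: correlation of differences within non-overlapping trial pairs."""
    n_pairs = min(len(x), len(y)) // 2
    dx = [x[2 * t] - x[2 * t + 1] for t in range(n_pairs)]
    dy = [y[2 * t] - y[2 * t + 1] for t in range(n_pairs)]
    s11 = sum(a * a for a in dx)
    s22 = sum(b * b for b in dy)
    return sum(a * b for a, b in zip(dx, dy)) / math.sqrt(s11 * s22)


def pearson_corr(x, y):
    """Conventional correlation coefficient (constant-baseline assumption)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def shift_second(x, y, shift):
    """Pair x[t] with y[t + shift] for shift >= 0, trimming both to the overlap."""
    return x[: len(x) - shift], y[shift:]


def cross_correlation_function(x, y, shifts, estimator):
    """Estimate the correlation at each trial shift with the given estimator."""
    return {s: estimator(*shift_second(x, y, s)) for s in shifts}


def simulate(n_trials=4000, rho=0.3, total_drift=20.0, seed=1):
    """Hypothetical shared linear drift plus noise correlated only at zero lag."""
    rng = random.Random(seed)
    x, y = [], []
    for t in range(n_trials):
        baseline = total_drift * t / n_trials
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x.append(baseline + z1)
        y.append(baseline + rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
    return x, y


x, y = simulate()
shifts = (0, 2, 4, 6)
proposed = cross_correlation_function(x, y, shifts, pair_diff_corr)
conventional = cross_correlation_function(x, y, shifts, pearson_corr)
for s in shifts:
    print(s, round(proposed[s], 3), round(conventional[s], 3))
```

In this sketch the pairwise estimate is expected to peak near 0.3 at zero shift and stay near zero at the even shifts, while the conventional coefficient stays high at every shift. A shift of one trial would leave one shared trial inside each pair of differences and bias the estimate toward about minus half the true correlation, which is one way to see why even shifts are used.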
The conventional correlation coefficient is positive because when the activity of neuron 1 is higher (lower) than its average at a late (early) trial, that of neuron 2 is also higher (lower). Adding a time shift does not affect this situation as there is a global trend in Figure 1A. Note that broad cross-correlation functions have been observed for the experimental data (Bair et al., 2001). On the other hand, the proposed method gave a satisfactory result, correctly yielding 0 for the time-shifted data and the short-term correlation $\rho$ (0.3) for the simultaneous data, as demonstrated by a clear peak in Figure 1B. Note that the estimated correlation coefficient $\hat{\rho} \approx 0.3$ is not only useful for statistical tests but also interpretable as a simultaneous covariation of Gaussian noises because our method is statistical model-based.

The reason for the flexible estimation by the proposed method is that it estimates the covariance for two neurons within each local window, where the background activity is assumed to be almost constant, and then averages the local estimates across sliding windows as in Figure 1C. Note that our method is based on the assumption that the short-term correlation (or the covariance parameter of Gaussian noises) is constant over time. Consequently, the proposed method enables estimation of the short-term correlations existing in the simultaneous activities independently of the drifting baselines. Figure 1C shows how this method works for the cases with pure long-term (Fig. 1C, top) or short-term (Fig. 1C, bottom) correlations. In the case of pure long-term correlations in Figure 1C, top, the estimate of the correlation in the short window is zero (on average), as there is no real short-term correlation and the baseline drift is negligible in this short timescale. Note that an implicit assumption in the proposed method is that within a short window, the baseline drift is absent or negligible, although the violation of this assumption, if small enough, actually does not matter (Materials and Methods). In the case of pure short-term correlations in Figure 1C, bottom, the estimate of the correlation in the short window is non-zero (on average), as there is a real short-term correlation although the baseline drift is absent. In this way, the proposed “local” estimates, which are unaffected by the slow, long-term trends, work fairly well even if the baseline activities drift arbitrarily over time.

Proposed estimator requires fewer neurons and less computational power than conventional Kalman filters

The key idea for the proposed estimator of noise correlations resides in the local detrending. However, there are other types of detrending methods such as Kalman filters. The latest studies also computed the correlations in residuals after the neural activities were smoothed and detrended by the Kalman filter-like methods (Ecker et al., 2014; Rosenbaum et al., 2017). Therefore, we performed another benchmark simulation to compare the conventional Kalman filter method with the proposed method. Specifically, we checked whether the two methods work in the presence of sinusoidal baseline drifts in simulations.

In Figure 2A, the activities of two neurons at each time are generated as the sum of the baseline trend and the bivariate Gaussian noises with unit variances and 0.3 correlation coefficient. When we simulated more than two neurons simultaneously, the additional neurons shared the trend but did not have noise correlations. Thus, among N simulated neurons, only neurons 1 and 2 have a non-zero correlation coefficient, which is to be estimated.

Figure 2B shows the residual activities after the removal of the estimated trend by the Kalman filter from the activities in Figure 2A. The dark horizontal line indicates the estimated trend, which has been already removed from the activities.

Figure 2C shows the noise correlations in the residuals averaged across 100 realizations of the simulated data. The horizontal dotted gray line for the true correlation coefficient (0.3) indicates that the conventional Kalman filter method does not work when the number of simultaneously simulated neurons is small. Naturally, recording from more neurons helps to estimate the current baseline trend, which is essentially the average activity of neurons in this easiest situation. If one does not know baseline trends accurately, the estimation of noise correlations fails as well. In more realistic situations, in which neurons do not necessarily share baseline trends, more neurons would be required to estimate the noise correlation by the Kalman filter-like methods.

Figure 2D shows the noise correlations estimated by the proposed method from the same data. The horizontal dotted gray line for the true correlation coefficient (0.3) indicates that the proposed method always works. Note that the proposed method only requires the activities of the two relevant neurons, as evident in Equations 1, 2.

Furthermore, Figure 2E shows that the Kalman filter can be fairly expensive in computational time with as few as 15 neurons. Given that the number of simultaneously recorded neurons is increasing rapidly, the computational costs can easily constitute a limiting factor. Thus, the proposed method is advantageous not only in the estimation accuracy, but also in the computational cost, as demonstrated in Figure 2F.

The results obtained here are fairly general. Although the sinusoidal trend with seven cycles was used throughout this article, qualitatively the same results were obtained for a wide range of numbers of cycles (4–10; data not shown). Sinusoidal waves with different periods can be thought of as covering the different possible timescales. In fact, it has been numerically demonstrated that the proposed method worked also for linear as well as stepwise trends in the previous work (Miura, 2013), although all these numerical simulations just confirmed the mathematical statement that the proposed method is robust against arbitrary drifts. Although the proposed method might look too easy at first glance, any other ad hoc estimators of covariances cannot achieve the unbiasedness (i.e., correctness) under arbitrary drifts. Moreover, although the latest best Bayesian methods can be regarded as variants of Kalman filter methods and some of them might improve the estimation accuracy slightly, we believe that the problem in computational costs is unavoidable in any case.

Proposed estimator has smaller errors than conventional moving averages

As some of the previous works (Cohen and Newsome, 2008; Mitchell et al., 2009) simply used the moving average for detrending, we next compared the conventional moving average method with the proposed method (Fig. 3).

In the comparison, as in Figure 2, the simulated neural activities had the sinusoidal trend with five waves (Fig. 3A) or four waves in 100 trials (Fig. 3B). For the moving average method, the neural activities were first smoothed by the moving average with various window sizes, and then the correlation coefficients were computed for the residuals. The horizontal dotted gray lines for the true correlation coefficient (0.3) indicate that the biases are prominent for longer window sizes and for rapidly changing trends.

Although the moving average method is uniquely defined for odd window sizes, some variants can be considered when the window size is two (and even lengths in general). When the window size is two, however, one can carefully define the moving average method so that it coincides with the proposed method regarding the correlation coefficients. To be precise, the moving average method actually fails and underestimates both the variances ($\sigma_{11}$, $\sigma_{22}$) and the covariance ($\sigma_{12}$) by half, although the correlation coefficient as their ratio, $\rho = \sigma_{12} / \sqrt{\sigma_{11}\sigma_{22}}$, is intact. For instance, if the window covers the current and the next trial, the residual is half the difference between the two trials, whose variance is half that of a single trial. For example, when the true variances for the activities of two neurons are both 1 and the true covariance is 0.2, the moving average method on average estimates them as 0.5, 0.5, and 0.1, while the correlation coefficient estimated as their ratio coincides with that of the proposed estimator, which is always near 0.2.

Some previous works used longer window lengths for detrending (previous and future 20 trials for Cohen and Newsome, 2008; and Gaussian kernels with $\sigma = 5$ trials for Mitchell et al., 2009). Although it is not clear whether the actual drift is as drastic as in Figure 3, our message in this article is that, in fact, one can safely shorten the window length to the minimum size, i.e., two.

Figure 3. Comparison of conventional moving average method and proposed method. As in Figure 2, the simulated neural activities had the sinusoidal trend with five waves (A) or four waves in 100 trials (B). For the moving average method, the neural activities were first smoothed by the moving average with various window sizes and then the correlation coefficients were computed for the residuals. The mean ± SD of the estimated noise correlations across 100 realizations of the simulated data is plotted. The horizontal dotted gray lines for the true correlation coefficient (0.3) indicate that the biases are prominent for longer window sizes and for rapidly changing trends.

Examples of noise correlations in V1 neuron pairs

Here, we applied the proposed method for estimating pure short-term noise correlations to the pairs of the neural activities in the primary visual cortex. Figure 4 shows the interneuronal noise correlations of two example pairs of neurons estimated by the proposed method as well as the conventional cross-correlogram. We computed time-shifted noise correlations or cross-correlation functions. Note that we solely computed noise correlations for a fixed stimulus in this article, because recent works indicated the stimulus dependency of noise correlations (Kohn and Smith, 2005; Maruyama and Ito, 2013; Ruff and Cohen, 2016). For the putatively nonstationary neuron pairs in Figure 4A, the time series for the activities of both neurons showed significant drifts. The conventional correlogram showed the spurious correlations across wide shifts of trials, while the proposed method successfully indicated no short-term correlation. Note that similar broad cross-correlation functions have been observed previously (Bair et al., 2001). For the putatively stationary neuron pairs in Figure 4B, the time series for both neurons did not show significant drifts but the simultaneous activities tended to synchronize. Both the conventional correlogram and the proposed method correctly detected the short-term noise correlation at the origin. Thus, the proposed method succeeded in clarifying the fine structure of noises in real V1 data by detecting purely short-term correlations.

Figure 4. Examples of noise correlations for two V1 neuron pairs. A, For the nonstationary case where the time series for both neurons show significant drifts (left), the broad cross-correlation was estimated by the conventional cross-correlogram (p = 0.00012 at the origin) but no short-term correlation by the proposed method (p = 0.92 at the origin).
B, For the stationary case where the time series for both neurons do not show significant drifts but the simultaneous activities tend to synchronize, the narrow cross-correlation at the origin was estimated by both the conventional correlogram ($p < 10^{-5}$) and the proposed method ($p < 10^{-5}$).

Both long-term and short-term correlations are widely observed in V1

Next, we investigate the noise correlations for the entire population of pairs of simultaneously recorded neurons. Figure 5A,B plots the short-term noise correlations estimated by the proposed method against the conventional correlation coefficient for all the pairs within the stationary or nonstationary neurons. The stationary or nonstationary neurons were selected by the statistical serial correlation test for nonstationarity. In Figure 5A, for the stationary neuron pool, the correlations are highly reproducible, located along the diagonal line. Meanwhile, in Figure 5B, for the nonstationary neuron pool, they are not reproducible, scattered apart from the diagonal line, with smaller absolute values for the proposed method. The result suggests that the proposed method successfully removes long-term components of noise correlations essentially by detrending. Note that some of the smallest noise correlations reported in the previous works were obtained for the detrended time series (Bair et al., 2001; Ecker et al., 2010; Renart et al., 2010), consistent with our observation. Thus, the nonstationarity or a baseline drift may engender spurious correlations even if no actual short-term correlation exists.

Figure 5C shows the p value histogram for the statistical significance of the proposed short-term noise correlations for V1 data. The non-uniformity of the distribution indicates that the significant short-term correlations for some pairs are not obtained by chance. Furthermore, all types of pairs, irrespective of stationary and nonstationary neurons, show significant short-term correlations. As a control to check the validity of our statistical test, Figure 5D shows the p value histogram for the same test obtained for the one-time-shifted V1 data that cannot have short-term correlations. The resulting uniform distribution demonstrates that, desirably, the statistical test detects no spurious short-term correlation even if the signals drift in the V1 data. Recall that, in contrast, the conventional correlogram in Figure 1B resulted in non-zero correlations even for time-shifted data.

In total, significant fractions of noise correlations seem to be explainable by the long-term components while there are some pairs with significant short-term correlations as well. We next examine whether each component is helpful or harmful for the sensory information representation in the brain.

Impacts of short-term and long-term noise correlations are dissociable

Finally, we assessed the impacts on decoding of the presence of short-term or long-term correlations, separately. Our estimator enables us to elucidate the impacts of short-term and long-term correlations in a dissociated manner, as we will see. Here, we performed the linear discriminant analysis of stimuli based on the neural responses and used the classification success rates as a measure of the accuracy of neural coding. That is, the higher the classification success rate is, the more accurate the neural coding should be.

Next, as the origin of long-term correlations, we visualized the baseline drifts in Figure 6B. For the neurons with significant baseline drifts (to be precise, the same neuron pool as used in Fig. 6C, right), we performed the principal component analysis for the average responses to 16 visual stimuli (i.e., tuning curves).

Figure 5. Population summary of noise correlations for all recorded V1 neurons. A, The proposed short-term noise correlations plotted against the conventional correlation coefficients for all the simultaneously recorded pairs of stationary V1 neurons (n = 12,931 pairs). The stationary or nonstationary neurons were selected by the statistical serial correlation test for nonstationarity. The numbers along the axes denote the mean ± SEM. B, Same plot for all the simultaneously recorded pairs of nonstationary V1 neurons (n = 18,891 pairs). Note that the correlations are highly reproducible, located along the diagonal, for the stationary neuron pool but not reproducible for the nonstationary neuron pool, suggesting that the proposed method successfully removes long-term noise correlations by detrending. C, The distribution of the p values for the statistical significance of the proposed short-term noise correlations for V1 data. s-s denotes the pair of two stationary neurons. s-n denotes the pair of stationary and nonstationary neurons. n-n denotes the pair of two nonstationary neurons. The non-uniformity of the distribution indicates that the significant short-term correlations for some pairs are not obtained by chance. D, The control distribution of the p values for the same test obtained for the one-time-shifted V1 data that cannot have short-term correlations. The uniform distribution demonstrates that, desirably, the statistical test detects no spurious short-term correlation even if the signals drift in the V1 data. Note that, in contrast, the conventional correlogram in Figure 1B resulted in non-zero correlations even for time-shifted data.
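The control in Figure 5D relies on the Monte Carlo test described in Materials and Methods, in which the p value of an estimate is its percentile in a null distribution simulated from white Gaussian noise with 40 trials. A minimal Python sketch of that idea follows. It is not the article's R code: the estimator form, the conversion of the percentile into a two-sided p value, the reduced number of repetitions (5000 instead of a million), and all names are assumptions made for illustration.

```python
import bisect
import math
import random


def pair_diff_corr(x, y):
    """Assumed estimator form: correlation of differences within non-overlapping trial pairs."""
    n_pairs = min(len(x), len(y)) // 2
    dx = [x[2 * t] - x[2 * t + 1] for t in range(n_pairs)]
    dy = [y[2 * t] - y[2 * t + 1] for t in range(n_pairs)]
    s11 = sum(a * a for a in dx)
    s22 = sum(b * b for b in dy)
    return sum(a * b for a, b in zip(dx, dy)) / math.sqrt(s11 * s22)


def white_noise_estimate(rng, n_trials):
    """Estimate from two independent white Gaussian series (the null hypothesis)."""
    x = [rng.gauss(0.0, 1.0) for _ in range(n_trials)]
    y = [rng.gauss(0.0, 1.0) for _ in range(n_trials)]
    return pair_diff_corr(x, y)


def build_null_distribution(n_trials=40, n_sim=5000, seed=0):
    """Sorted null estimates; far fewer repetitions than in the article, for speed."""
    rng = random.Random(seed)
    return sorted(white_noise_estimate(rng, n_trials) for _ in range(n_sim))


def percentile(null_sorted, estimate):
    """Fraction of null estimates strictly below the observed estimate."""
    return bisect.bisect_left(null_sorted, estimate) / len(null_sorted)


def two_sided_p(null_sorted, estimate):
    """Two-sided p value from the percentile (an assumed convention)."""
    pct = percentile(null_sorted, estimate)
    return min(1.0, 2.0 * min(pct, 1.0 - pct))


null_sorted = build_null_distribution()
for est in (0.0, 0.3, 0.6):
    print(est, round(two_sided_p(null_sorted, est), 4))
```

Feeding fresh white-noise estimates through `two_sided_p` should give roughly uniform p values, which is the flatness property the control histogram is meant to check.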
The activities of the 189 neurons with baseline drifts were concatenated and transformed (“rotated”) to the same number of 189 principal components, from which we chose the first two as the (most informative) axes for visualization. Figure 6B plots the average responses (tuning curves) for 1st–20th trials (turquoise blue) and 21st–40th trials (green) separately and demonstrates that the baseline drifts over trials shifted the entire activities of neurons. However, it is still unclear, from the simple visualization, whether this drift, taking the form of long-term correlations, is significant in decoding.

To elucidate the impact of short-term correlations, we compared the classification success rates in the absence and presence of pure short-term correlations in Figure 6A. For that purpose, we first selected the neurons that have no long-term correlation. That is, we selected the neurons whose baselines did not drift significantly by using the serial correlation statistical test for randomness of fluctuations (Materials and Methods). For those selected neurons, which cannot have long-term correlations, we compared the classification success rates before and after trial shuffling, which is expected to remove short-term correlations. We computed the classification success rate session by session, as we wanted to include only simultaneously recorded pairs. We found that the impact of pure short-term correlations was small but significantly positive in Figure 6A.

To elucidate the impact of long-term correlations on decoding, we compared the classification success rates in the absence and presence of pure long-term correlations in Figure 6C. Specifically, we compared the cross-validated classification success rates for four types of learning: (1) when trained by former trials and tested by former trials, (2) when trained by former trials and tested by latter trials, (3) when trained by former trials and tested by latter trials after the respective global means were subtracted for detrending (i.e., centering and equating the means of former and latter trials in Fig. 6B), (4) when trained by even-numbered trials and tested by odd-numbered trials. Note that the conventional sampling of odd-numbered 20 trials (1st, 3rd, 5th, ..., 39th) included both former and latter trials as a part and, thus, can be inhomogeneous under baseline drifts. For that purpose, we first selected the neurons that have no short-term correlation. That is, we selected the neurons whose short-term correlation is not significant by using the statistical test accompanying our estimator (Materials and Methods). Note that although the test applies to a pair, we eventually selected the neurons that have no short-term correlation in any pair in the session. For those selected neurons, which cannot have short-term correlations, we compared the classification success rates for the neurons with and without long-term correlations (i.e., statistically significant baseline drifts) as in Figure 6C. As a control, no significant difference was observed among four types of learning in the absence of pure long-term correlations, that is, when both short-term and long-term correlations were absent (left, not significant for all pairs, paired t test, 11 sessions with 77 neurons). On the other hand, the significant decrease at the green bar in the presence of pure long-term correlations demonstrates that the long-term correlations harm generalization (right, p < 0.05, paired t test, 23 sessions with 189 neurons). The recovery of the classification success by the detrending or the conventional inhomogeneous sampling (trained by even-numbered and tested by odd-numbered trials) suggests that the decrease in decoding accuracy is due to the baseline drift (p < 0.001, paired t test). Note that the last two types of learning may mimic brains’ possible decoding strategies under changing environments, suggesting that the brain could overcome nonstationarity by detrending.

Here, we solely compared the classification success rates obtained for the same neuron pool with different types of learning. This is because we believe that it is dangerous to compare different pools even if the numbers of neurons are equated, as the sensitivity to stimuli varies by neurons, leading to considerable sampling biases. For example, if we compare the two green bars in Figure 6C, the classification success rate per neuron trained by former and tested by latter is higher in the presence of long-term correlations (data not shown), suggesting that the overall high classification success in the presence of the long-term correlations can be explained by the sampling biases, i.e., simply because the neuron pool with long-term correlations has more stimulus-sensitive neurons.

Figure 6. Impacts of short-term and long-term components of correlated activities of V1 neurons. A, The classification success rates in the absence and presence of pure short-term correlations (mean ± SEM). In the presence of pure short-term correlations, the decoding accuracy was slightly improved (p = 0.0044, paired t test, 15 sessions with 134 neurons). Note that the chance level is 1/16 = 6.25%, as 16 stimuli were decoded. B, The baseline drifts, which cause long-term correlations, were visualized by the principal component analysis for the average responses to 16 visual stimuli (tuning curves) of the neurons with significant baseline drifts (to be precise, the same neuron pool as in C, right). The average responses for 1st–20th trials (turquoise blue) and 21st–40th trials (green) demonstrate that the entire activities of neurons shift over trials. C, Decoding accuracy in the absence and presence of pure long-term correlations. The cross-validated classification success rates for four types of learning were compared: (1) when trained by former trials and tested by former trials, (2) when trained by former trials and tested by latter trials, (3) when trained by former trials and tested by latter trials after the respective global means were subtracted for detrending (i.e., centering and equating the means of former and latter trials in B), (4) when trained by even-numbered trials and tested by odd-numbered trials. Note that the conventional sampling of odd-numbered 20 trials (1st, 3rd, 5th, ..., 39th) included both former and latter trials as a part and, thus, can be inhomogeneous under baseline drifts. No significant difference was observed among four types of learning in the absence of pure long-term correlations, that is, when both short-term and long-term correlations were absent (left, not significant for all pairs, paired t test, 11 sessions with 77 neurons). The significant decrease at the green bar in the presence of pure long-term correlations demonstrates that the long-term correlations harm generalization (right, p < 0.05, paired t test, 23 sessions with 189 neurons). The recovery of the classification success by the detrending or the conventional inhomogeneous sampling (trained by even-numbered and tested by odd-numbered trials) suggests that the brain can decode stimulus information under changing environments by using a sophisticated decoder (p < 0.001, paired t test).

Taken together, the proposed method enables us to elucidate the impacts of short-term and long-term noise correlations in a dissociated manner. The well-designed decoding analysis with dissociated correlated activities may help to gain insight into the brains’ decoding strate- big spontaneous fluctuations?
Can the downstream neu- gies under changing environments. rons separate stimuli in the high dimensional space or, alternatively, cancel out the baseline drifts suitably? These questions remained and leave future work, possi- Discussion bly, with well-designed decoding analyses. In this article, we proposed an information-geometric In this article, we solely treated spike count correlations method to unbiasedly estimate pure short-term noise cor- as a measure of synchrony. We did not use spike-timing relations irrespective of arbitrarily drifting baselines. The cross-correlograms with milli-second bins (Toyama et al., simulation demonstrated the robustness of the proposed 1981a,b; Ito et al., 2010) as we took advantage of our estimator against the slow, long-term drift. The accompa- proposed method, which is limited to spike count corre- nying statistical test as well as the existing nonstationarity lations. The limitation comes from the assumption of no test enabled us to dissociate short-term and long-term temporal auto-correlation in the time series. The assump- correlations. When we exclude the spurious noise corre- tion is necessary to dissociate short-term and long-term lations of purely long-term nature, only a small fraction of cross-correlations successfully. If we consider millisec- V1 neuron pairs showed significant short-term correla- ond bins, temporal auto-correlations exist, which violates tions, possibly reconciling the previous inconsistent ob- the assumption. Here, we rather focused on spike count servations on existence of significant noise correlations. correlations to use our proposed method in depth to the Finally, with the additional help of the machine learning extent to elucidate the componentwise functions of short- that classifies stimuli from neural activities, we assessed term and long-term correlations. 
Thus, we did not say the impacts on decoding of the presence of short-term or anything on temporal coding in this article, although pre- long-term correlations, separately. The presence of pure vious papers suggested the relationship that the spike short-term correlations slightly improved the decoding count correlations increase with coupling strengths (Cos- accuracy, while the pure long-term correlations deterio- sell et al., 2015; Bharmauria et al., 2016). rated the generalization ability. However, the decrease in There are considerable merits for our proposed estima- decoding accuracy by the long-term correlations was tor of short-term correlations. It guarantees the smallest recoverable by using the decoder with offset, suggesting estimation error among all the estimators which “works” that the brain could overcome nonstationarity by detrend- for arbitrary baseline drifts. It utilizes differential geometry ing. Thus, our method enables us to elucidate the func- essentially and otherwise it is generally impossible to tions of short-term and long-term correlations in a cope with infinitely many cases programmatically even dissociated manner and the well-designed decoding anal- with the fastest computers. As a practical advantage, it ysis with dissociated correlated activities may help to gain enables us to perform a statistical test from a single trial or insight into the brain’s decoding strategies under chang- a snapshot of time series with baseline drifts, which is ing environments. usually unreproducible. The estimating equation given in Our observation that only a small fraction of neuron an analytically closed form as well as the accompanying pairs has short-term noise correlations after detrending statistical test, are quite simple and implementable within may, at first glance, inconsistent with previous works, a few lines of programming codes, easier than shuffling- which reported significant noise correlations. 
However, based methods which have longer lines and computa- the previous works which detrended the time series be- tional time. The underlying statistical model allows us not fore calculating noise correlations reported small short- only to test statistical significance but also to interpret the term noise correlations (Bair et al., 2001; Ecker et al., correlation coefficients quantitatively, which is unrealiz- 2010; Renart et al., 2010). In this sense, our result is able for other ad hoc or shuffling-based methods. consistent with the previous results. The previous model- Meanwhile, to fully exploit the temporal order and con- ing studies implied that even if short-term noise correla- tinuity of trials without assuming specific statistical mod- tions are small, it can have a big impact in a large network els for trends nonparametrically, we had to consider a (Zohary et al., 1994; Sompolinsky et al., 2001; Miura, simplified additive Gaussian noise model. However, some 2012). As far as our V1 dataset, the impact of short-term previous works used more realistic models such as mixed noise correlations was small but significantly positive. Poisson distributions and, for example, estimated the The classification analysis in this article demonstrated contributions of additive as well as multiplicative noises to that the presence of baseline drifts decreased the gener- explore the underlying biological processes (Goris et al., alizability of the classifier. However, the further analysis 2014; Arandia-Romero et al., 2016). Thus, it is desired to showed that the classification success rate can be recov- pursue temporal structures with more realistic statistical ered by detrending data or including more inhomoge- models in the future work. neous training data. 
Note that the generalizability should It is important to check whether the spike count data to depend on the training data set: the more different con- be analyzed satisfy the model assumptions of the pro- ditions are learned, the higher the classification success posed method. For example, the normality assumption is rate becomes. In other words, if the future (test) condi- tions are completely different from the past (learned) con- satisfied by high firing neurons in general due to the ditions, the baseline drifts do harm. Thus, we essentially central limit theorem. However, strictly speaking, when showed two possible decoding strategies that can over- we checked whether the spike count data used in the come nonstationarity. It is interesting to know how the article obey the normal distribution by using the Shapiro– brain decodes visual stimuli from small responses under Wilk test, only 70.0% of the neuron-stimulus pairs that are January/February 2019, 6(1) e0395-18.2019 eNeuro.org Methods/New Tools 15 of 16 with population activity affects encoded information. Neuron 89: stationary (i.e., without drifts) and modest-firing (i.e., more 1305–1316. CrossRef Medline than 5 Hz) satisfied the normality assumption. One pos- Averbeck BB, Latham PE, Pouget A (2006) Neural correlations, sible solution might be to apply the proposed method only population coding and computation. Nat Rev Neurosci 7:358–366. to the high firing neurons as low firing neurons tend to CrossRef Medline violate the normality assumption. If the data do not satisfy Bair W, Zohary E, Newsome WT (2001) Correlated firing in macaque the assumption of the normal noises, the proposed noise visual area MT: time scales and relationship to behavior. J Neuro- correlation is no more an optimal parameter estimate of sci 21:1676–1697. Medline Bharmauria V, Bachatene L, Cattan S, Chanauria N, Rouat J, Mo- the statistical model. 
It is also important to check whether lotchnikoff S (2016) High noise correlation between the functionally the assumption of the constant covariances is satisfied, at connected neurons in emergent v1 microcircuits. Exp Brain Res least for some time range. For example, strongly nonsta- 234:523–532. CrossRef Medline tionary neurons, whose firing rates grow twofold over an Bickel PJ, Klaassen CAJ, Ritov Y, Wellner JA (1993) Efficient and hour, might violate the assumption. We leave the detailed adaptive estimation for semiparametric models. Baltimore, MD: examination of the model assumptions with statistical Johns Hopkins University Press. Cohen MR, Newsome WT (2008) Context-dependent changes in model selection procedures for the future works. How- functional circuitry in visual area MT. Neuron 60:162–173. Cross- ever, if the violation is weak, the proposed measure could Ref Medline still be used as a rough measure. For example, even if the Cohen MR, Kohn A (2011) Measuring and interpreting neuronal data were actually non-Gaussian spike counts with mul- correlations. Nat Neurosci 14:811–819. CrossRef Medline tiplicative drifts (Goris et al., 2014), the sign of the pro- Cossell L, Iacaruso MF, Muir DR, Houlton R, Sader EN, Ko H, Hofer posed measure, excitatory or inhibitory, could still be SB, Mrsic-Flogel TD (2015) Functional organization of excitatory meaningful. synaptic strength in primary visual cortex. Nature 518:399–403. CrossRef Medline Another assumption for the proposed estimator was Dayan P, Abbott LF (2001) Theoretical neuroscience. Cambridge: that the baseline activities for the consecutive two trials MIT Press. are (almost) the same. This assumption in our analysis Doiron B, Litwin-Kumar A, Rosenbaum R, Ocker GK, Josic´ K (2016) was the clue to separate short timescales and long time- The mechanics of state-dependent neural correlations. Nat Neu- scales. Strictly speaking, however, as we computed noise rosci 19:383–393. 
CrossRef Medline correlations separately for different stimuli, the intervals Ecker AS, Berens P, Keliris GA, Bethge M, Logothetis NK, Tolias AS between the trials for the same stimulus are variable. Note (2010) Decorrelated neuronal firing in cortical microcircuits. Sci- ence 327:584–587. CrossRef Medline that stimuli were presented in a pseudo-random order. In Ecker AS, Berens P, Cotton RJ, Subramaniyan M, Denfield GH, fact, for the worst case, the effective trial interval can be Cadwell CR, Smirnakis SM, Bethge M, Tolias AS (2014) State as large as 90 s (3 s  15 stimulus  2). Although it is dependence of noise correlations in macaque primary visual cor- generally hard to characterize the effects of drifts on these tex. Neuron 82:235–248. CrossRef Medline medium timescales, no difference was observed between Fiser J, Chiu C, Weliky M (2004) Small modulation of ongoing cortical randomized and repeated orders of stimulus presenta- dynamics by sensory input during natural vision. Nature 431:573– tions (Kohn and Smith, 2005). Thus, we assumed that the 578. CrossRef Medline Goris RL, Movshon JA, Simoncelli EP (2014) Partitioning neuronal drifts on these medium timescales were ignorable. Prac- variability. Nat Neurosci 17:858–865. CrossRef Medline tically, if the assumption of the constant baseline is doubt- Gray CM, Maldonado PE, Wilson M, McNaughton B (1995) Tetrodes ful for a trial pair due to the long interval between them, markedly improve the reliability and yield of multiple single-unit one could remove the pair from the calculation of the isolation from multi-unit recordings in cat striate cortex. J Neurosci proposed estimator. That is, one could exclude unreliable CrossRef Medline 63:43–54. trial pairs from the summation in Equation 2. This type of Harvey AC (1993) Time series models. Cambridge: MIT Press. 
Ikegaya Y, Aaron G, Cossart R, Aronov D, Lampl I, Ferster D, Yuste exception handling could also work for avoiding the R (2004) Synfire chains and cortical songs: temporal modules of change point where the baselines jump suddenly. Devel- cortical activity. Science 304:559–564. CrossRef Medline oping a more flexible algorithm for the proposed method Ito H, Maldonado PE, Gray CM (2010) Dynamics of stimulus-evoked can be a future work. spike timing correlations in the cat lateral geniculate nucleus. J Neurophysiol 104:3276–3292. CrossRef Medline References Kanji GK (2006) 100 statistical tests. London: Sage. Abbott LF, Dayan P (1999) The effect of correlated variability on the Kohn A, Smith MA (2005) Stimulus dependence of neuronal corre- accuracy of a population code. Neural Comput 11:91–101. Med- lation in primary visual cortex of the macaque. J Neurosci 25: 3661–3673. CrossRef Medline line Latham PE, Nirenberg S (2005) Synergy, redundancy, and indepen- Amari S (2016) Information geometry and its applications. Tokyo: dence in population codes, revisited. J Neurosci 25:5195–5206. Springer Japan. Amari S, Cardoso J (1997) Blind source separation-semiparametric CrossRef Medline statistical approach. IEEE Trans Signal Process 45:2692–2700. Latham PE, Roudi Y (2013) Role of correlations in population coding. In: Principles of neural coding (Quiroga RQ, Panzeri S, eds), pp CrossRef Amari S, Kawanabe M (1997) Information geometry of estimating 121–138. Boca Raton, FL: CRC Press. functions in semi-parametric statistical models. Bernoulli 3:29–54. Luczak A, McNaughton BL, Harris KD (2015) Packet-based commu- CrossRef nication in the cortex. Nat Rev Neurosci 16:745–755. CrossRef Amari S, Nagaoka H (2001) Methods of information geometry. Prov- Medline Maruyama Y, Ito H (2013) Diversity, heterogeneity and orientation- idence, RI: American Mathematical Society. 
Arandia-Romero I, Tanabe S, Drugowitsch J, Kohn A, Moreno-Bote dependent variation of spike count correlation in the cat visual R (2016) Multiplicative and additive modulation of neuronal tuning cortex. Eur J Neurosci 38:3611–3627. CrossRef Medline January/February 2019, 6(1) e0395-18.2019 eNeuro.org Methods/New Tools 16 of 16 Maruyama Y, Ito H (2017) Design of multielectrode arrays for uniform Petris G, Petrone S, Campagnoli P (2009) Dynamic linear models sampling of different orientations of tuned unit populations in the with R. New York, NY: Springer. cat visual cortex. Neurosci Res 122:51–63. CrossRef Medline Renart A, de la Rocha J, Bartho P, Hollender L, Parga N, Reyes A, Mitchell JF, Sundberg KA, Reynolds JH (2009) Spatial attention Harris KD (2010) The asynchronous state in cortical circuits. Sci- decorrelates intrinsic activity fluctuations in macaque area V4. ence 327:587–590. CrossRef Medline Neuron 63:879–888. CrossRef Medline Rosenbaum R, Smith MA, Kohn A, Rubin JE, Doiron B (2017) The Miura K (2011) An introduction to maximum likelihood estimation and spatial structure of correlated neuronal variability. Nat Neurosci information geometry. Interdiscip Informat Sci 17:155–174. Cross- 20:107–114. CrossRef Medline Ref Ruff DA, Cohen MR (2016) Stimulus dependence of correlated vari- Miura K (2012) Effects of noise correlations on population coding. ability across cortical areas. J Neurosci 36:7546–7556. CrossRef Proceedings of Soft Computing and Intelligent Systems- Medline International Symposium on Advanced Intelligent Systems, pp Sasaki T, Matsuki N, Ikegaya Y (2007) Metastability of active ca3 1072–1075. networks. J Neurosci 27:517–528. CrossRef Miura K (2013) A semiparametric covariance estimator immune to Schneidman E, Bialek W, Berry MJ (2003) Synergy, redundancy, and arbitrary signal drift. Interdiscip Informat Sci 19:35–41. CrossRef independence in population codes. J Neurosci 23:11539–11553. 
Miura K, Uchida N (2008) A rate-independent measure of irregularity Medline for event series and its application to neural spiking activity. 47th Sharpee TO (2017) Optimizing neural information capacity through IEEE Conference on Decision and Control, pp 2006–2011. discretization. Neuron 94:954–960. CrossRef Medline Miura K, Okada M, Amari S (2006a) Estimating spiking irregularities Sharpee TO, Sugihara H, Kurgansky AV, Rebrik SP, Stryker MP, under changing environments. Neural Comput 18:2359–2386. Miller KD (2006) Adaptive filtering enhances information transmis- CrossRef Medline sion in visual cortex. Nature 439:936–942. CrossRef Medline Miura K, Okada M, Amari S (2006b) Unbiased estimator of shape Sompolinsky H, Yoon H, Kang K, Shamir M (2001) Population coding parameter for spiking irregularities under changing environments. in neuronal systems with correlated noise. Phys Rev E Stat Nonlin Adv Neural Inf Process Syst 18:891–898. [CrossRef] Soft Matter Phys 64:051904. CrossRef Medline Miura K, Tsubo Y, Okada M, Fukai T (2007) Balanced excitatory and Steinmetz NA, Koch C, Harris KD, Carandini M (2018) Challenges inhibitory inputs to cortical neurons decouple firing irregularity and opportunities for large-scale electrophysiology with neu- from rate modulations. J Neurosci 27:13802–13812. CrossRef CrossRef Med- ropixels probes. Curr Opin Neurobiol 50:92–100. Medline line Miura K, Mainen ZF, Uchida N (2012) Odor representations in olfac- Toyama K, Kimura M, Tanaka K (1981a) Cross-correlation analysis of tory cortex: distributed rate coding and decorrelated population interneuronal connectivity in cat visual cortex. J Neurophysiol activity. Neuron 74:1087–1098. CrossRef Medline CrossRef Medline 46:191–201. Moreno-Bote R, Beck J, Kanitscheider I, Pitkow X, Latham P, Pouget Toyama K, Kimura M, Tanaka K (1981b) Organization of cat visual A (2014) Information-limiting correlations. Nat Neurosci 17:1410– cortex as investigated by cross-correlation technique. 
J Neuro- 1427. CrossRef Medline physiol 46:202–214. CrossRef Medline Neyman J, Scott EL (1948) Consistent estimates based on partially van der Vaart AW (1998) Asymptotic statistics. Cambridge: Cam- consistent observations. Econometrica 16:1–32. CrossRef bridge University Press. Okun M, Steinmetz NA, Cossell L, Iacaruso MF, Ko H, Barthó P, Moore T, Hofer SB, Mrsic-Flogel TD, Carandini M, Harris KD (2015) Zohary E, Shadlen MN, Newsome WT (1994) Correlated neuronal Diverse coupling of neurons to populations in sensory cortex. discharge rate and its implications for psychophysical perfor- Nature 521:511–515. CrossRef Medline CrossRef Medline mance. Nature 370:140–143. January/February 2019, 6(1) e0395-18.2019 eNeuro.org
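As a rough illustration of two points made in the Discussion (that consecutive trials are assumed to share almost the same baseline, and that unreliable trial pairs can be excluded from the summation), the following Python sketch computes a correlation from within-pair differences of consecutive trials on simulated data. It is only a sketch under stated assumptions and is not the estimating equation of the article (Equation 2 is not reproduced in this section). The function name `paired_difference_correlation`, the `keep` mask, and all simulation parameters are hypothetical choices made for this example.

```python
import numpy as np


def paired_difference_correlation(x, y, keep=None):
    """Correlation of two spike-count series from differences within trial pairs.

    Hypothetical helper for illustration; this is NOT the paper's Equation 2.
    Assumption: the baseline is (almost) the same within each non-overlapping
    pair of consecutive trials (0, 1), (2, 3), ..., so it cancels in the
    within-pair difference.

    x, y : 1-D spike counts of two neurons over trials, in temporal order.
    keep : optional boolean mask, one entry per trial pair; False drops a
           pair judged unreliable (long interval, suspected change point).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n_pairs = min(x.size, y.size) // 2
    dx = x[1:2 * n_pairs:2] - x[0:2 * n_pairs:2]
    dy = y[1:2 * n_pairs:2] - y[0:2 * n_pairs:2]
    if keep is not None:
        keep = np.asarray(keep, dtype=bool)
        dx, dy = dx[keep], dy[keep]
    # No mean subtraction: the differences are treated as zero-mean.
    return np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2))


# Simulated example (all numbers are arbitrary): two neurons share a slow
# baseline drift but have independent trial-to-trial noise, so the true
# short-term correlation is zero.
rng = np.random.default_rng(0)
n_trials = 2000
drift = np.linspace(0.0, 10.0, n_trials)
noise = rng.standard_normal((n_trials, 2))
x = drift + noise[:, 0]
y = drift + noise[:, 1]

naive = np.corrcoef(x, y)[0, 1]              # inflated by the shared drift
short = paired_difference_correlation(x, y)  # drift cancels within pairs

# A sudden baseline jump between trials 1000 and 1001 (0-based) falls inside
# trial pair 500; excluding that single pair removes its contribution.
jump = np.where(np.arange(n_trials) >= 1001, 50.0, 0.0)
keep = np.ones(n_trials // 2, dtype=bool)
keep[500] = False
with_jump = paired_difference_correlation(x + jump, y + jump)
excluded = paired_difference_correlation(x + jump, y + jump, keep)

print(f"naive Pearson correlation:        {naive:.3f}")
print(f"paired-difference correlation:    {short:.3f}")
print(f"with change point, pair included: {with_jump:.3f}")
print(f"with change point, pair excluded: {excluded:.3f}")
```

In this toy setting the ordinary correlation coefficient is dominated by the shared drift, whereas the difference-based value stays near zero as long as the pair straddling the jump is masked out. Non-overlapping pairs are used so that the differences remain independent across pairs; other pairing schemes are possible and would change the statistical properties.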

Journal

eNeuro

Published: Jan 1, 2019
