A Sordid Affair: Spike Sorting and Data Reproducibility

Two important and rapidly developing scientific movements, data reproducibility and machine learning, are central to a recent Neuron paper by Chung et al.[1] The paper covers a rather innocuous topic in electrophysiology, spike sorting, in which extracellularly recorded action potentials are ascribed to individual neurons.

When extracellular microelectrodes (tens of microns in diameter) are placed within the brain, they record the extracellular electric field generated by multiple nearby spiking neurons. This is the basis of the microelectrode recording technique used daily by many functional neurosurgeons, and it is core to the development of various brain-computer interfaces.[2-4] There is also a vast literature that uses extracellular recordings to infer basic properties of neural information processing.[5] The shape of the recorded waveform depends on the local geometry of the neuron and its proximity to the electrode, so each nearby neuron produces an action potential waveform with a slightly different shape. Detecting these slight differences allows researchers to say which recorded action potential belonged to which neuron, in turn allowing statements about how individual cells are affected by disease, or how often individual cells respond to different stimuli. At least, that is how it should work.

Classic spike sorting methods (of which there are many) rely on manual intervention at several stages. Error rates approach 20%, and interobserver variability is high.[6,7] This raises reproducibility concerns for a large number of studies. Automated, publicly available methods would help eliminate opportunities to introduce bias and ensure that the provenance of data can be readily ascertained. In their work, Chung et al[1] present an open-source software package designed to sort and identify spikes from distinct neurons without user intervention, using a mix of standard and novel techniques from machine learning.
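To make the sorting problem concrete, the following is a minimal sketch of a conventional pipeline (threshold detection, waveform extraction, principal components, two-cluster k-means) run on simulated data. It is not the method of Chung et al and not MountainSort. The waveform templates, noise level, threshold multiplier, window sizes, and all function names are assumptions chosen for illustration, and each of those hand-picked values is the kind of setting a human operator would otherwise tune.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate one electrode that records two neurons (all values are illustrative).
n_samples, spacing, noise_sd = 40_000, 200, 0.03
t = np.arange(30)
templates = np.stack([
    -1.0 * np.exp(-0.5 * ((t - 10) / 2.0) ** 2),  # neuron 0: large, narrow spike
    -0.6 * np.exp(-0.5 * ((t - 10) / 4.0) ** 2),  # neuron 1: smaller, wider spike
])
onsets = np.arange(spacing, n_samples - spacing, spacing)
true_labels = rng.integers(0, 2, size=onsets.size)
signal = rng.normal(0.0, noise_sd, n_samples)
for onset, label in zip(onsets, true_labels):
    signal[onset:onset + 30] += templates[label]
true_peaks = onsets + 10  # each template reaches its trough at offset 10


def detect_spikes(x, k=5.0, refractory=30, search=20):
    """Locate troughs after downward crossings of -k times a robust noise estimate."""
    sigma = np.median(np.abs(x)) / 0.6745  # median-based estimate of noise s.d.
    below = x < -k * sigma
    crossings = np.flatnonzero(below[1:] & ~below[:-1]) + 1
    peaks, last = [], -refractory
    for c in crossings:
        if c - last >= refractory:  # ignore re-crossings within the same spike
            peaks.append(c + int(np.argmin(x[c:c + search])))
            last = c
    return np.array(peaks)


def extract_waveforms(x, peaks, pre=10, post=20):
    """Cut a fixed window around each trough, dropping troughs too near the edges."""
    keep = peaks[(peaks >= pre) & (peaks + post <= x.size)]
    return keep, np.stack([x[p - pre:p + post] for p in keep])


def pca_scores(waveforms, n_components=2):
    """Project mean-centered waveforms onto their leading principal components."""
    centered = waveforms - waveforms.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def two_means(points, n_iter=20):
    """Plain k-means with k=2, seeded at the extremes of the first component."""
    centers = points[[np.argmin(points[:, 0]), np.argmax(points[:, 0])]]
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        centers = np.stack([points[labels == j].mean(axis=0) for j in range(2)])
    return labels


peaks, waveforms = extract_waveforms(signal, detect_spikes(signal))
labels = two_means(pca_scores(waveforms))

# Compare with the simulated truth; cluster numbering is arbitrary, so allow a swap.
nearest = np.abs(peaks[:, None] - true_peaks[None, :]).argmin(axis=1)
matches = labels == true_labels[nearest]
agreement = max(matches.mean(), 1.0 - matches.mean())
print(f"detected {peaks.size} of {true_peaks.size} spikes; label agreement {agreement:.2f}")
```

The simulation is deliberately easy: spikes never overlap and the two waveforms differ clearly. Real recordings violate both assumptions, which is where manual judgment, and with it interobserver variability, tends to enter.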
Their package, MountainSort, streamlines various signal processing stages before employing fully automated clustering and curation. The algorithm can be applied to a broad spectrum of data despite difficult dataset properties such as non-Gaussian cluster distributions, overlapping spikes, bursting neurons, and differing electrode types.[1] Many of the preprocessing steps are similar to existing standards, but what makes MountainSort unique is its clustering algorithm, ISO-SPLIT. The algorithm is built on a series of statistical tests for unimodality, which it uses to sort, cluster, and consolidate redundant clusters on individual electrode neighborhoods. Most importantly, ISO-SPLIT has very few adjustable parameters, which removes manual intervention from the clustering of spikes.

Chung et al[1] validated their algorithm on multiple datasets (all available online for corroboration), including tetrode recordings from rodent hippocampus and a training set with paired juxtacellular recordings providing a “ground truth.”[8] According to their report, MountainSort detects more single units than manual sorting, suggesting it may be more accurate than existing standards. To evaluate whether these additional units are true events, Chung et al[1] compared the timing of spiking events within the hippocampus to the animal's behavior. Hippocampal neurons that encode “place fields” fire when an animal enters a specific location in the environment.[9] If the units detected by MountainSort are legitimate spiking events, then this activity should occur and cluster with other units that are spatially nearby, which is what was found. Further validation against paired juxtacellular recordings also showed favorable false-positive and false-negative rates, again suggesting that the sorted spikes are legitimate events that have been properly clustered.

The reproducibility “crisis” in neuroscience will not have an easy solution.
But efforts like that of Chung et al,[1] which favor machine learning techniques over manual parameter tweaking, help. These issues are by no means limited to spike sorting; they also apply to evoked potentials, interictal spikes, seizure propagation pathways, and many other electrophysiological phenomena that rely on manual curation rather than automated detection. Further efforts to remove the opportunity for human bias, ensure provenance of data processing steps, and make software and data publicly available all help to make results more readily replicable and trustworthy.

REFERENCES

1. Chung JE, Magland JF, Barnett AH et al. A fully automated approach to spike sorting. Neuron. 2017;95(6):1381-1394.e6.
2. Ajiboye AB, Willett FR, Young DR et al. Restoration of reaching and grasping movements through brain-controlled muscle stimulation in a person with tetraplegia: a proof-of-concept demonstration. Lancet. 2017;389(10081):1821-1830.
3. Flesher SN, Collinger JL, Foldes ST et al. Intracortical microstimulation of human somatosensory cortex. Sci Transl Med. 2016;8(361):361ra141.
4. Gross RE, Krack P, Rodriguez-Oroz MC, Rezai AR, Benabid A-L. Electrophysiological mapping for the implantation of deep brain stimulators for Parkinson's disease and tremor. Mov Disord. 2006;21 Suppl 14(S14):S259-S283.
5. Cash SS, Hochberg LR. The emergence of single neurons in clinical neurology. Neuron. 2015;86(1):79-91.
6. Wood F, Black MJ, Vargas-Irwin C, Fellows M, Donoghue JP. On the variability of manual spike sorting. IEEE Trans Biomed Eng. 2004;51(6):912-918.
7. Harris KD, Henze DA, Csicsvari J, Hirase H, Buzsáki G. Accuracy of tetrode spike separation as determined by simultaneous intracellular and extracellular measurements. J Neurophysiol. 2000;84(1):401-414.
8. Neto JP, Lopes G, Frazão J et al. Validating silicon polytrodes with paired juxtacellular recordings: method and dataset. J Neurophysiol. 2016;116(2):892-903.
9. O’Keefe J, Dostrovsky J. The hippocampus as a spatial map. Preliminary evidence from unit activity in the freely-moving rat. Brain Res. 1971;34(1):171-175.

Copyright © 2017 by the Congress of Neurological Surgeons

Neurosurgery, Oxford University Press

Publisher
Congress of Neurological Surgeons
Copyright
Copyright © 2017 by the Congress of Neurological Surgeons
ISSN
0148-396X
eISSN
1524-4040
D.O.I.
10.1093/neuros/nyx590


Journal

Neurosurgery, Oxford University Press

Published: Mar 1, 2018
