Spike Sorting for Beginners: How to Isolate Single Neurons from Multi-Electrode Data

What Is Spike Sorting?

When a multi-electrode array records neural activity, each electrode channel captures a mixture of electrical signals from multiple nearby neurons. Spike sorting is the computational process of identifying individual action potentials (spikes) within this noisy signal and assigning each spike to the neuron that most likely generated it. Done well, it transforms raw voltage traces into a structured dataset of identified single-unit activity, an essential step for understanding how individual neurons encode information.

The Three Core Stages of Spike Sorting

Stage 1: Spike Detection

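The first task is identifying moments in the raw recording where a spike likely occurred. The most common approach is threshold crossing: a spike is detected whenever the filtered voltage signal crosses a predefined amplitude threshold, usually set to a multiple of the estimated noise standard deviation (e.g., 4–5× the noise level). The noise level is often estimated robustly, for example from the median absolute value of the signal, so that the spikes themselves do not inflate the estimate.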

The signal is first bandpass filtered (typically 300–3000 Hz) to separate spike-band activity from the slower local field potential (LFP) and from high-frequency noise.
The threshold can be set manually or estimated automatically from the data.
Both positive and negative threshold crossings may be used depending on electrode geometry and neuron orientation.
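
The detection steps above can be sketched as follows. This is a minimal illustration, not a reference implementation: the function names, the filter order, the 4.5× multiplier, the 1 ms dead time, and the choice to detect only negative-going crossings are all assumptions made for the example. The noise level is estimated as median(|x|) / 0.6745, a common robust estimate of the standard deviation of Gaussian noise.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt


def bandpass(raw, fs, low=300.0, high=3000.0, order=3):
    """Zero-phase Butterworth bandpass filter for a 1-D voltage trace."""
    sos = butter(order, [low, high], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, raw)


def detect_spikes(filtered, fs, k=4.5, dead_time_ms=1.0):
    """Return (sample indices of negative threshold crossings, threshold).

    Assumption for this sketch: spikes are negative-going, so only the
    negative threshold is checked.
    """
    # Robust noise estimate: median absolute value scaled to a Gaussian std.
    sigma = np.median(np.abs(filtered)) / 0.6745
    threshold = -k * sigma

    # A crossing is a sample below threshold whose predecessor was not.
    below = filtered < threshold
    crossings = np.flatnonzero(below[1:] & ~below[:-1]) + 1

    # Enforce a dead time so one spike is not counted more than once.
    dead = int(round(dead_time_ms * fs / 1000.0))
    kept = []
    last = -dead - 1
    for idx in crossings:
        if idx - last > dead:
            kept.append(idx)
            last = idx
    return np.array(kept, dtype=int), threshold
```

For a 1-D trace `raw` sampled at `fs` Hz, `detect_spikes(bandpass(raw, fs), fs)` returns the crossing indices and the threshold that was applied. Detecting positive crossings as well would only require repeating the comparison with the sign flipped.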

Stage 2: Feature Extraction

Once spikes are detected, a short waveform snippet (usually 1–3 ms) is extracted around each crossing. The spike sorter then computes features that summarize the shape of each waveform, reducing the high-dimensional snippet to a compact, comparable representation. Common feature extraction methods include:

Principal Component Analysis (PCA): Projects waveforms onto their principal axes of variation. Simple, fast, and widely used.
Wavelet coefficients: Capture multi-scale shape features, often outperforming PCA for complex waveform populations.
Waveform amplitudes across channels: For tetrodes or dense probes, the amplitude profile across neighboring channels provides powerful discrimination.
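
As a rough sketch of snippet extraction followed by PCA, the example below cuts a fixed window around each detected sample index and projects the snippets onto their leading principal components using a singular value decomposition. The function names and the window of 0.5 ms before and 1.0 ms after each crossing are assumptions chosen for illustration; real sorters also align snippets to the spike peak, which is omitted here.

```python
import numpy as np


def extract_snippets(signal, spike_idx, fs, pre_ms=0.5, post_ms=1.0):
    """Cut a fixed window around each spike index from a 1-D signal.

    Returns (snippets, kept_idx); spikes too close to the edges are dropped.
    """
    pre = int(round(pre_ms * fs / 1000.0))
    post = int(round(post_ms * fs / 1000.0))
    spike_idx = np.asarray(spike_idx, dtype=int)
    fits = (spike_idx - pre >= 0) & (spike_idx + post <= len(signal))
    kept_idx = spike_idx[fits]
    offsets = np.arange(-pre, post)
    # Broadcasting builds an (n_spikes, window_length) index matrix.
    return signal[kept_idx[:, None] + offsets[None, :]], kept_idx


def pca_features(snippets, n_components=3):
    """Project snippets onto their first principal components.

    Returns (scores, explained_variance_ratio).
    """
    centered = snippets - snippets.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[:n_components].T
    explained = s ** 2 / np.sum(s ** 2)
    return scores, explained[:n_components]
```

At a 30 kHz sampling rate these defaults give 45-sample (1.5 ms) snippets, and each spike ends up described by three numbers instead of 45.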

Stage 3: Clustering

With feature vectors computed, the algorithm groups similar waveforms together — each cluster ideally corresponding to a single neuron. Clustering methods range from classical to modern:

K-means clustering: Fast and intuitive but requires specifying the number of clusters in advance.
Gaussian Mixture Models (GMM): Provide probabilistic cluster assignments and soft boundaries.
Template matching (e.g., Kilosort, SpyKING CIRCUS): Fits detected spikes to a library of neuron templates, updated iteratively. Handles overlapping spikes from nearby neurons much better than distance-based clustering.
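
To make the simplest of these methods concrete, here is a small k-means sketch written with NumPy only. It is an illustration under stated assumptions, not what any particular sorter does: the function name is hypothetical, and the centroids are seeded with a deterministic farthest-point rule (start from the first sample, then repeatedly pick the point farthest from the centroids chosen so far) rather than the more common randomized k-means++ initialization.

```python
import numpy as np


def kmeans(features, k, n_iter=100):
    """Cluster feature vectors with Lloyd's algorithm.

    features: array of shape (n_spikes, n_features).
    Returns (labels, centroids).
    """
    features = np.asarray(features, dtype=float)

    # Farthest-point initialization (deterministic, but outlier-sensitive).
    centroids = [features[0]]
    for _ in range(1, k):
        d2 = np.min(
            [((features - c) ** 2).sum(axis=1) for c in centroids], axis=0
        )
        centroids.append(features[d2.argmax()])
    centroids = np.array(centroids)

    labels = np.full(len(features), -1)
    for _ in range(n_iter):
        # Squared distance from every point to every centroid.
        d2 = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        new_labels = d2.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = features[labels == j]
            if len(members):  # an empty cluster keeps its old centroid
                centroids[j] = members.mean(axis=0)
    return labels, centroids
```

The need to pass `k` explicitly is the limitation noted above: if the recording contains more or fewer neurons than `k`, units are merged or split, which is one reason later curation matters.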

Manual Curation: The Human Step

Automated spike sorting is never perfect. Most pipelines are followed by a manual curation step, typically performed in a GUI such as Phy (used with Kilosort output). Curators inspect clusters, assess quality metrics, merge over-split units, and split contaminated ones. Key quality metrics include:

Isolation distance / L-ratio: Quantifies cluster separation in feature space.
Refractory period violations: A single neuron cannot fire twice within ~1–2 ms, so clusters with many such violations likely contain spikes from multiple cells.
Waveform stability: Drift in waveform shape over time indicates electrode movement or gradual tissue response.
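
The refractory-period check is the easiest of these metrics to compute by hand. The sketch below reports the fraction of inter-spike intervals (ISIs) in one cluster that are shorter than an assumed refractory period; the function name and the 1.5 ms default are illustrative choices, and published metrics often normalize this count differently.

```python
import numpy as np


def refractory_violation_rate(spike_times_s, refractory_ms=1.5):
    """Fraction of inter-spike intervals shorter than the refractory period.

    spike_times_s: spike times of one cluster, in seconds (any order).
    """
    t = np.sort(np.asarray(spike_times_s, dtype=float))
    if t.size < 2:
        return 0.0  # no intervals to evaluate
    isi_ms = np.diff(t) * 1000.0
    return float(np.mean(isi_ms < refractory_ms))
```

For example, spike times of 0, 0.5, 100, 200, and 200.8 ms give four intervals, two of which fall below 1.5 ms, so the function returns 0.5. A well-isolated unit should score close to zero.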

Popular Spike Sorting Software

Software	Algorithm	Best For	Cost (Language)
Kilosort 2/3/4	Template matching + drift correction	High-density probes (Neuropixels)	Free (MATLAB for 2/3, Python for 4)
MountainSort 5	Density-based clustering	Tetrodes, moderate channel counts	Free (Python)
SpyKING CIRCUS	Template matching	Large MEAs, retinal recordings	Free (Python)
IronClust	Density peak clustering	Flexible probe geometries	Free (MATLAB)

Getting Started: Practical Tips

If you are new to spike sorting, begin with a well-characterized dataset and a community-standard tool like Kilosort or MountainSort via the SpikeInterface Python framework, which provides a unified API for most major sorters. Always validate your sorted units with quality metrics before drawing scientific conclusions. Spike sorting is as much an art as a science — iterative refinement and familiarity with your specific probe geometry will improve results over time.