Roadmap
This document describes a possible roadmap for future developments of the KlustaSuite.
Most of these developments are expected to occur in the second half of 2014.
The goal of spike sorting is to extract single-unit spiking activity from raw extracellular recordings. Those recordings are now increasingly acquired through high-density multielectrode arrays. The number of recording channels is currently growing at a fast rate. Whereas probes with tens of channels are routinely used today, probes containing several hundreds or even thousands of channels are expected in the coming months and years. This deluge of data requires novel algorithmic and software developments.
In our group, we have developed a software suite tackling the spike sorting problem with a particular set of algorithms. We also designed a new file format based on HDF5.
Our software suite is designed around a rigid workflow: filtering, spike detection, waveform extraction, automatic clustering, and manual clustering. This workflow has been widely used for years in many research teams.
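As a rough illustration of the first three stages of this workflow (filtering, spike detection, waveform extraction), here is a toy single-channel sketch in pure Python. It is not the KlustaSuite implementation: the function names, the moving-average filter, the threshold rule, and all parameter values are assumptions chosen for readability, and the clustering stages are left out entirely.

```python
# Toy sketch of filtering, spike detection, and waveform extraction
# on one channel. All names and parameter values are hypothetical.

def highpass(signal, window=5):
    """Crude high-pass filter: subtract a centered moving average."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        out.append(signal[i] - sum(signal[lo:hi]) / float(hi - lo))
    return out


def detect_spikes(filtered, threshold):
    """Return indices of local minima lying below -threshold."""
    spikes = []
    for i in range(1, len(filtered) - 1):
        if (filtered[i] < -threshold
                and filtered[i] <= filtered[i - 1]
                and filtered[i] < filtered[i + 1]):
            spikes.append(i)
    return spikes


def extract_waveforms(filtered, spike_indices, before=2, after=3):
    """Cut a fixed-size snippet around each spike, skipping the edges."""
    waveforms = []
    for i in spike_indices:
        if i - before >= 0 and i + after <= len(filtered):
            waveforms.append(filtered[i - before:i + after])
    return waveforms


# Synthetic recording: a flat baseline with two negative deflections.
raw = [0.0] * 40
raw[10] = -8.0
raw[25] = -6.0

filtered = highpass(raw)
spike_indices = detect_spikes(filtered, threshold=3.0)
waveforms = extract_waveforms(filtered, spike_indices)
```

Each stage consumes only the output of the previous one, which is what makes the workflow rigid: changing the detection step, for instance, changes everything downstream. A real implementation would operate on many channels at once and would most likely rely on array libraries rather than Python lists.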
However, other workflows exist and might prove more powerful. For example, retinal data is increasingly being sorted with algorithms based on template matching. These methods allow extracted waveforms to be semi-automatically matched to the original data. They often yield greater sorting quality, notably by mitigating the problem of spike overlap. Whether this approach can be beneficial on cortical data remains to be tested.
Further approaches and workflows might also be considered. Spike sorting thus remains an open research question. A generic, fully automatic, real-time method for spike sorting is out of reach in the medium term, but we might get closer in the coming years. Yet progress in this respect is hindered by the variety of workflows, algorithms, experimental protocols, file formats, implementations, and programming languages used by research teams around the world.
Our own software suite is difficult to extend, which makes experimentation hard.
For these reasons, we propose to develop the foundations of a modern open framework for large-scale electrophysiology. We will focus on the Python language, because we believe it is a very strong candidate for such a framework. Instead of providing a monolithic graphical application, we will create reusable components in Python (high-performance visualization views, graphical widgets, algorithms). These components will be organized around the IPython notebook. This innovative platform provides a modern, dynamic, Web-based framework for performing computational experiments in a reproducible way.
In the following, we will motivate and describe all those elements.
Python has the following strengths:
- Python is an open-source language, which is a strong benefit compared with commercial, closed-source alternatives.
- Python provides a high-performance environment for numerical computing.
- There are many solid and actively maintained libraries for scientific computing.
- Scientific Python has a very dynamic and active community, particularly in neuroscience. It is supported by many researchers, research institutions, and industries around the world.
- The language itself is expressive, multi-paradigm and easy to learn.
- Python can very easily integrate with non-Python code and libraries (C/C++, FORTRAN, etc.).
- MATLAB users can move to Python relatively easily, since those two languages share many programming concepts and syntax paradigms.
Python also has a few weaknesses. For each of them, we describe how we plan to address it.
- Two incompatible versions of Python coexist at this time: Python 2 and Python 3. Python 2 is minimally maintained, whereas all recent developments have only concerned Python 3. Today, most scientific computing libraries are compatible with both Python 2 and Python 3. However, many people are still using Python 2, some libraries remain incompatible with Python 3, and some computing environments have not upgraded to Python 3 yet. For all these reasons, it is not recommended to support only one branch of Python. Fortunately, several solid solutions exist to write a single codebase compatible with Python 2 and Python 3 (notably six.py, a popular method that we choose to rely upon). Robust software engineering methods make it possible to ensure perfect compatibility of the code (testing suite, code coverage, continuous integration).
- Although Python is multi-platform, distributing Python software is known to be particularly painful. Many incompatible packaging systems have been developed for Python, and none of them is perfect. However, there are currently many efforts in the direction of an "ultimate" packaging system for Python, and things are really improving. One of the most promising efforts is conducted by a company, Continuum Analytics. They developed an open-source system, conda, for building and distributing multi-platform Python packages. This system might be combined with methods for building standalone installers and executables. Finally, Python programs might be able to run in the browser in the near future, which would considerably simplify the distribution of Python-based applications.
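To make the single-codebase idea for Python 2 and Python 3 more concrete, here is a minimal sketch of the kind of compatibility shim that six.py provides. It deliberately does not import six so that it stays self-contained; it only mirrors a few names that six exposes (`PY2`, `string_types`, `text_type`, `iteritems`). The `describe` function and the `channel_gains` dictionary are hypothetical and exist only for illustration.

```python
import sys

# True when running under Python 2, False under Python 3.
PY2 = sys.version_info[0] == 2

if PY2:
    # These built-ins exist only on Python 2.
    string_types = (basestring,)  # noqa: F821
    text_type = unicode  # noqa: F821
else:
    string_types = (str,)
    text_type = str


def iteritems(d):
    """Iterate over (key, value) pairs lazily on both Python versions."""
    return d.iteritems() if PY2 else iter(d.items())


def describe(value):
    """Hypothetical helper: classify a value without version checks."""
    if isinstance(value, string_types):
        return "text"
    return "other"


# Hypothetical data used only to exercise the shim.
channel_gains = {"ch0": 1, "ch1": 3}
total = sum(gain for _, gain in iteritems(channel_gains))
```

The version check lives in one place, and the rest of the code calls `iteritems` or tests against `string_types` without caring which interpreter is running. With six itself, the same code would import these names from the library instead of defining them.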