1. Resources and references#

I’ve put together some resources and references using Python (but keep in mind, R is another popular route into data science).

1.1. Some Python packages#

First, I list some indispensable Python libraries used in data science. Beyond core Python, you should start getting familiar with a few other tools:

  • Python’s classics:

    • NumPy – numerical computing and array manipulation.

    • SciPy – scientific computing and statistics.

    • Matplotlib – basic plotting library.

  • Data Manipulation:

    • Pandas – data structures and analysis tools.

  • Statistical Analysis:

    • statsmodels – estimation of statistical models, statistical tests, and data exploration.

  • Machine Learning:

    • Scikit-learn – machine learning library built on NumPy, SciPy, and Matplotlib.

  • Natural Language Processing (NLP):

    • NLTK – platform for working with human language data.

    • spaCy – industrial-strength library for NLP tasks.

    • Gensim – topic modeling library.

  • Data Visualization:

    • Seaborn – statistical data visualization.

    • Plotly – interactive graphing library.

  • Web Scraping:

    • Beautiful Soup – parsing data out of HTML and XML documents.

    • Scrapy – framework for crawling websites and extracting structured data.

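To give a feel for how the classics fit together, here is a minimal, self-contained sketch using NumPy, SciPy, and pandas on synthetic data (the group names and parameters are illustrative, not from a real dataset):

```python
# A minimal tour of the core stack: NumPy for arrays, SciPy for statistics,
# pandas for labeled tabular data. All values are synthetic.
import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(seed=0)

# NumPy: vectorized generation of two simulated groups
control = rng.normal(loc=10.0, scale=2.0, size=100)
treated = rng.normal(loc=11.0, scale=2.0, size=100)

# SciPy: a two-sample t-test comparing the group means
t_stat, p_value = stats.ttest_ind(control, treated)

# pandas: organize the data in a DataFrame and summarize it
df = pd.DataFrame({"control": control, "treated": treated})
print(df.describe().loc[["mean", "std"]])
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```

A Matplotlib or Seaborn call (e.g., a histogram of each column) would be the natural next step for inspecting these distributions visually.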
1.2. Resources and references for general data sciences#

Books#

I do not provide references on the basic mathematical foundations of data science, which usually include linear algebra, calculus (with a focus on optimization), probability theory, statistics (both elementary and inferential), discrete mathematics (graphs, combinatorics, logic), and sometimes numerical methods. Nor do I include general references on statistics, machine learning, or Python programming itself, or on database topics such as SQL, relational database design, and NoSQL systems. High-quality resources for all these areas are plentiful.

Jupyter (note)books#

Among the references above, the following come with companion Jupyter notebooks:

  • Jake VanderPlas’ Python Data Science Handbook

  • Aurélien Géron’s notebooks, a series of Jupyter notebooks that walk you through the fundamentals of Machine Learning and Deep Learning in Python using Scikit-Learn, Keras and TensorFlow 2.

  • Wes McKinney’s “Python for Data Analysis” open edition and notebooks.

1.3. Resources and references for data sciences in neurosciences#

References#

Spike Train and Electrophysiology Data Analysis#

Math books#

Python packages#

  • syncopy – Systems Neuroscience Computing in Python: a package for the large-scale analysis of electrophysiological data, described in an accompanying article.

  • MNE – open-source Python package for exploring, visualizing, and analyzing human neurophysiological data (MEG, EEG, sEEG, ECoG, NIRS, and more).

  • Elephant – Electrophysiology Analysis Toolkit: an open-source, community-centered library for the analysis of electrophysiological data in Python. Elephant focuses on generic analysis functions for spike train data and time series recordings from electrodes (GitHub repository).

  • NeuralEnsemble – a community-based initiative to promote and coordinate open-source software development in neuroscience. Inactive since 2022.
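
To give an idea of what toolkits like Elephant compute, here is a sketch in plain NumPy (deliberately not using Elephant's own API) of two standard spike-train statistics, the mean firing rate and the coefficient of variation of the inter-spike intervals, on a synthetic Poisson spike train:

```python
# Two standard spike-train statistics computed from spike times, sketched in
# plain NumPy. The spike train is synthetic: exponential inter-spike intervals
# (ISIs) yield a homogeneous Poisson process, so CV(ISI) should be close to 1.
import numpy as np

rng = np.random.default_rng(seed=42)
t_stop = 10.0   # recording duration, seconds
rate = 5.0      # target firing rate, Hz

# Simulate spike times by accumulating exponential ISIs, then clip to t_stop
isis = rng.exponential(scale=1.0 / rate, size=200)
spike_times = np.cumsum(isis)
spike_times = spike_times[spike_times < t_stop]

# Mean firing rate: spike count divided by duration
mean_rate = spike_times.size / t_stop

# Coefficient of variation of the ISIs: std / mean of successive differences
observed_isis = np.diff(spike_times)
cv_isi = np.std(observed_isis) / np.mean(observed_isis)

print(f"{spike_times.size} spikes, rate = {mean_rate:.2f} Hz, CV(ISI) = {cv_isi:.2f}")
```

Libraries such as Elephant wrap computations like these (and many more) behind tested, documented functions that operate on standardized spike-train objects rather than bare arrays.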

Jupyter (note)book(s)#

Blog(s) and blog posts#

Misc.#

Other tools#

Before analyzing data, we first need to read electrophysiology recordings and handle the different standards used.

  • The pyABF library was created by Scott Harden. We will return to that package in a future section.

1.4. Sometimes we don’t even know what we’re talking about#

Data science, statistics, math, machine learning: all great when applied to modeling and analyzing spikes and bursts. But let’s not forget that we also need to paddle upstream to the very source of those signals. Where do the spike and burst recordings come from? The experimental lab. And what do they actually represent? The wild, real dynamics of living neurons.