# Resources and references

I’ve put together some resources and references using Python (but keep in mind, R is another popular route into data science).

## Some Python packages 

First, I list some indispensable Python libraries used in data science.
In addition to core Python, you should also start getting familiar with a few other tools:


- Python's classics:
    - [NumPy](https://numpy.org/) – numerical computing and array manipulation.
    - [SciPy](https://scipy.org/) – scientific computing and statistics.
    - [Matplotlib](https://matplotlib.org/) – basic plotting library.
- Data Manipulation:
    - [Pandas](https://pandas.pydata.org) – data structures and analysis tools.
- Statistical Analysis:
    - [statsmodels](https://www.statsmodels.org/stable/index.html) – estimation of statistical models, statistical tests, and data exploration.
- Machine Learning:
    - [Scikit-learn](https://scikit-learn.org/stable/) – widely used machine learning library.
- Natural Language Processing (NLP):
    - [NLTK](https://www.nltk.org) – platform for working with human language data.
    - [spaCy](https://spacy.io) – widely used library for NLP pipelines (tokenization, tagging, parsing, named-entity recognition).
    - [Gensim](https://radimrehurek.com/gensim/intro.html) – topic modeling library.
- Data Visualization:
    - [Seaborn](https://seaborn.pydata.org) – statistical data visualization.
    - [Plotly](https://plotly.com/python/) – interactive graphing library.
- Web Scraping:
    - [BeautifulSoup](https://beautiful-soup-4.readthedocs.io/en/latest/) – extracting data from HTML files.


## Resources and references for general data sciences

### Books

- **[An Introduction to Statistical Learning with Applications in Python](https://link.springer.com/book/10.1007/978-3-031-38747-0)**, Springer 2023, by Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani, and Jonathan Taylor. See the [book homepage and resources](https://www.statlearning.com), the [PDF](https://hastie.su.domains/ISLP/ISLP_website.pdf.download.html), and the associated [YouTube videos](https://youtube.com/playlist?list=PLoROMvodv4rNHU1-iPeDRH-J0cL-CrIda&feature=shared) with Trevor Hastie and Jonathan Taylor (which start with Trevor complimenting Jonathan on his new haircut, why not...). [Trevor Hastie](https://hastie.su.domains) is one of the big names in statistics (see his [book](https://hastie.su.domains/ElemStatLearn/) "The Elements of Statistical Learning: Data Mining, Inference, and Prediction"), and [Jonathan Taylor](https://jtaylor.su.domains) is a younger statistician with a nice new haircut.

- **[Python for Data Analysis](https://www.oreilly.com/library/view/python-for-data/9781491957653/)** (3rd ed.), O’Reilly 2022, by [Wes McKinney](https://wesmckinney.com), the creator of pandas. The [open edition](https://wesmckinney.com/book/) is available, along with the [code](https://github.com/wesm/pydata-book/tree/3rd-edition); see his [GitHub](https://github.com/wesm) for other resources.

- **[Python Data Science Handbook](https://www.oreilly.com/library/view/python-data-science/9781491912126/)** (2nd ed.), O’Reilly 2022, by [Jake VanderPlas](http://vanderplas.com), with the [full text](https://jakevdp.github.io/PythonDataScienceHandbook/) and the associated [Jupyter notebooks](https://github.com/jakevdp/PythonDataScienceHandbook) (very nice!); see his [GitHub](https://github.com/jakevdp) for other resources.

- **[Practical Statistics for Data Scientists: 50+ Essential Concepts Using R and Python](https://www.oreilly.com/library/view/practical-statistics-for/9781492072935/)** (2nd ed.), O'Reilly 2020, by Peter Bruce, Andrew Bruce, and Peter Gedeck. See the [GitHub](https://github.com/gedeck/practical-statistics-for-data-scientists/tree/master) with the Python [code and notebooks](https://github.com/gedeck/practical-statistics-for-data-scientists/tree/master/python).

- **[Python for Probability, Statistics, and Machine Learning](https://link.springer.com/book/10.1007/978-3-031-04648-3)** (3rd ed.), Springer 2022, by José Unpingco. See his [GitHub](https://github.com/unpingco) for other resources.

- **[Hands-On Machine Learning with Scikit-Learn, Keras & TensorFlow](https://www.oreilly.com/library/view/hands-on-machine-learning/9781492032632/)** (3rd ed.), O’Reilly 2022, by [Aurélien Géron](https://github.com/ageron), with the associated [notebooks](https://github.com/ageron/handson-ml2); see his [GitHub](https://github.com/ageron) for other resources. It covers both machine learning and deep learning.

- **[Deep Learning Illustrated](https://www.deeplearningillustrated.com)**, Addison-Wesley 2019, by [Jon Krohn](https://www.jonkrohn.com), with the associated [notebooks](https://github.com/the-deep-learners/deep-learning-illustrated). The concept is quite interesting. See also this [GitHub repository](https://github.com/jonkrohn/DLTFpT).

I do not provide references on the basic mathematical foundations of data science, which usually include linear algebra, calculus (with a focus on optimization), probability theory, statistics (both elementary and inferential), discrete mathematics (graphs, combinatorics, logic), and sometimes numerical methods. I also do not include general references on statistics, machine learning, or Python programming itself, nor on database topics such as SQL, relational database design, and NoSQL systems. Numerous high-quality resources are available for all these areas.

### Jupyter (note)books

Among the previous references:

- Jake VanderPlas' [Python Data Science Handbook](https://github.com/jakevdp/PythonDataScienceHandbook)
- Aurélien Géron's [notebooks](https://github.com/ageron/handson-ml2), *a series of Jupyter notebooks that walk you through the fundamentals of Machine Learning and Deep Learning in Python using Scikit-Learn, Keras and TensorFlow 2.*
- Wes McKinney's "Python for Data Analysis" [open edition](https://wesmckinney.com/book/) and [notebooks](https://github.com/wesm/pydata-book/tree/3rd-edition).


## Resources and references for data sciences in neurosciences

### References

- **[Python in Neuroscience](https://www.frontiersin.org/journals/neuroinformatics/articles/10.3389/fninf.2015.00011/full)**, E. Muller, J. A. Bednar, M. Diesmann, M.-O. Gewaltig, M. Hines, and A. P. Davison  
Frontiers in Neuroinformatics, 9, 2015.

- **[Case Studies in Neural Data Analysis](https://mitpress.ublish.com/book/case-studies-neural-data-analysis)**, by Mark A. Kramer and Uri T. Eden (MIT Press, 2016). The book presents MATLAB tools, but there is an associated [GitHub repository](https://github.com/Mark-Kramer/Case-Studies-Kramer-Eden) for Python. The book primarily covers extracranial data, except Chapter 8: *Basic Visualizations and Descriptive Statistics of Spike Train Data*.

- **[Neural Data Science](https://neuraldatascience.io/)** (2020–23), [Aaron J. Newman](https://www.dal.ca/faculty/science/psychology_neuroscience/faculty-staff/our-faculty/aaron-newman.html) from the [NeuroCognitive Imaging Lab](https://www.ncilab.ca) (Dalhousie University, Halifax).  
Starts from scratch, especially in Python. Includes a section on [Single Unit Data](https://neuraldatascience.io/6-single_unit/introduction.html). See the [GitHub repository](https://github.com/neural-data-science/NESC_3505_textbook) for the Jupyter Book and the YouTube channel [Neural Data Science with Python](https://www.youtube.com/playlist?list=PLtfEWMIgWS22MMZjPIzBRE2cHhMcvEKwp).

- **Neural Data Science: A Primer with MATLAB and Python**, Erik Lee Nylen and Pascal Wallisch, 2017.  
See the [table of contents](https://www.sciencedirect.com/book/9780128040430/neural-data-science).



### Spike Train and Electrophysiology Data Analysis

#### Math books

- The contributions of **[Robert E. Kass](https://www.stat.cmu.edu/~kass/)** are noteworthy. Rob Kass is a renowned statistician who has also contributed to the modeling and statistical analysis of neural spike train data, and to machine learning. One can refer to his book [Analysis of Neural Data](https://www.stat.cmu.edu/~kass/research.html#and), which is an excellent introduction to probability and statistics through the lens of neural data. His page [Contributions to Analysis of Neural Spike Train Data](https://www.stat.cmu.edu/~kass/contrib.html) gives an overview of his work in the field.

- **[Analysis of Parallel Spike Trains](https://link.springer.com/book/10.1007/978-1-4419-5675-0)** edited by S. Grün and S. Rotter (Springer, 2010).
- **[Stochastic Models for Spike Trains of Single Neurons](https://link.springer.com/book/10.1007/978-3-642-48302-8)** by G. Sampath and S. K. Srinivasan


#### Python packages

- **[syncopy](https://github.com/esi-neuroscience/syncopy)** - Systems Neuroscience Computing in Python: a Python package for large-scale analysis of electrophysiological data, with the following [article](https://www.frontiersin.org/journals/neuroinformatics/articles/10.3389/fninf.2024.1448161/full).
```{index} syncopy 
```

- **[MNE](https://mne.tools/stable/index.html)** - Open-source Python package for exploring, visualizing, and analyzing human neurophysiological data: MEG, EEG, sEEG, ECoG, NIRS, and more.
```{index} MNE 
```

- **[pynapple](https://pynapple.org)** – Python Neural Analysis Package. Pynapple is a lightweight Python library for neurophysiological data analysis. See the article: [Pynapple, a toolbox for data analysis in neuroscience](https://elifesciences.org/reviewed-preprints/85786), 2023.
```{index} pynapple 
```

- **[osl-ephys](https://osl-ephys.readthedocs.io/en/latest/)** - This package contains models for analysing electrophysiology data. It builds on top of the widely used MNE-Python package and contains analysis tools for M/EEG sensor and source space analysis. From the [Oxford Centre for Human Brain Activity Analysis Group](https://www.psych.ox.ac.uk/research/ohba-analysis-group), with this [GitHub repository](https://github.com/OHBA-analysis/OHBA-Examples/tree/main) and this 2025 paper: [osl-ephys: a Python toolbox for the analysis of electrophysiology data](https://www.frontiersin.org/journals/neuroscience/articles/10.3389/fnins.2025.1522675/full).
```{index} osl-ephys 
```

- **[Elephant - Electrophysiology Analysis Toolkit](https://elephant.readthedocs.io/en/latest/)** is an emerging open-source, community-centered library for the analysis of electrophysiological data in the Python programming language. Elephant focuses on generic analysis functions for spike train data and time series recordings from electrodes. See the [GitHub repository](https://github.com/NeuralEnsemble/elephant).
```{index} elephant 
```

- **[NeuralEnsemble](http://neuralensemble.org)** – a community-based initiative to promote and coordinate open-source software development in neuroscience. **Inactive since 2022.**
```{index} NeuralEnsemble 
```

#### Jupyter (note)book(s)

- **[Spike sorting the 'Do It Yourself' way](https://c_pouzat.gitlab.io/spike-sorting-the-diy-way/)**, a Jupyter book by [Christophe Pouzat](https://xtof.perso.math.cnrs.fr), with the [GitLab repository](https://gitlab.com/c_pouzat/spike-sorting-the-diy-way). See also the [Probabilistic Spiking Neuronal Nets: Companion](https://probabilistic-spiking-neuronal-nets-c-pouzat-491a1ca82ffec5679d.gitlab.io/index.html) associated with the book [Probabilistic Spiking Neuronal Nets](https://link.springer.com/book/10.1007/978-3-031-68409-8), co-authored with Antonio Galves and Eva Löcherbach.

### Blog(s) and blog posts

- [Spikes and Bursts](https://spikesandbursts.wordpress.com) — an interesting blog by [David Cabrera-Garcia](https://scholar.google.com/citations?user=Dmwnwb4AAAAJ&hl=en), where he explores various [concepts](https://spikesandbursts.wordpress.com/neuroscience-contents/).  
  He also runs a [YouTube channel](https://www.youtube.com/@spikesandbursts/videos) and shares projects on [GitHub](https://github.com/dav1dcg). An interesting post:

     - [Patch-clamp data analysis in Python: animate time series data](https://spikesandbursts.wordpress.com/tag/patch-clamp/).

- [Patch clamp electrophysiology analysis with Python](https://www.scientifica.cn/neurowire/patch-clamp-electrophysiology-analysis-with-python) (2023) by [Vincenzo Mastrolia](https://devneuro.org/cdn/people-detail.php?personID=2242)


### Misc.

- **[ElecFeX](https://github.com/XinyueMa-neuro/ElecFeX)** - A MATLAB-based Electrophysiological Feature eXtraction toolbox for single-cell intracellular recordings. See the article: *[ElecFeX is a user-friendly toolbox for efficient feature extraction from single-cell electrophysiological recordings](https://www.sciencedirect.com/science/article/pii/S2667237524001437)*

### Other tools

Before analyzing data, we first need to read electrophysiology recordings and handle the different standards used.

- The [pyABF](https://pypi.org/project/pyabf/) library, created by [Scott Harden](https://swharden.com/about/), reads Axon Binary Format (ABF) files. We will return to that package in a future section.


## Sometimes we don’t even know what we’re talking about


> Data science, statistics, math, machine learning: sure, they’re all great when applied to modeling and analyzing spikes and bursts. But let’s not forget: we also need to paddle upstream to the very source of those signals. Where do the spike and burst recordings come from? The experimental lab. And what do they actually represent? The wild and real dynamics of real neurons.


- **[Guide to Research Techniques in Neuroscience](https://shop.elsevier.com/books/guide-to-research-techniques-in-neuroscience/carter/978-0-12-818646-6)** by Matt Carter, Rachel Essner, Nitsan Goldstein, and Manasi Iyer (2022, 3rd Edition)
- **[Electrophysiological Recording Techniques](https://link.springer.com/book/10.1007/978-1-0716-2631-3)**  edited by Robert P. Vertes and Timothy Allen (Springer, 2022).
- **[Introduction to Electrophysiological Methods and Instrumentation](https://www.sciencedirect.com/book/9780128142103/introduction-to-electrophysiological-methods-and-instrumentation#book-info)** by Franklin Bretschneider and Jan R. de Weille (Academic Press, Second edition, 2019).
- **[Basic Electrophysiological Methods](https://academic.oup.com/book/25187)** edited by Matt Carter and Ellen Covey (Oxford University Press, 2015)
- **[The Laboratory Computer: A Practical Guide for Physiologists and Neuroscientists](https://www.sciencedirect.com/book/9780122095511/the-laboratory-computer)** by John Dempster (Academic Press, 2001).
