
neuroglia's Introduction

neuroglia: more than just brain glue

Neuroglia is a Python machine learning library for neurophysiology data. It features scikit-learn compatible transformers for extracting features from extracellular electrophysiology & optical physiology data for machine learning pipelines.


Installation

pip install git+https://github.com/AllenInstitute/neuroglia.git

Level of Support

We plan to update this tool occasionally, with no fixed schedule. Community involvement is encouraged through both issues and pull requests.

License

BSD-3-Clause

Authors

Development Lead

  • Justin Kiggins

Contributors

  • Nicholas Cain
  • Michael Oliver
  • Sahar Manavi
  • Johannes Friedrich
  • Christopher Mochizuki


neuroglia's Issues

refactor spike inference

currently, spike inference is implemented with a transformer called OASISInferer, which takes the OASIS arguments as parameters.

this transformer should be replaced with an algorithm-agnostic transformer that accepts more intuitive arguments.

e.g.:

inferer = ng.calcium.EventInferer(
    penalty='l0',
    method='oasis',
)

deployment plan

i think we need a deployment plan for pushing to pypi:

  • checkout master (will use it as dev since we've already started doing that?)
  • update CHANGELOG.rst
  • bumpversion
  • (wait for ci to pass)
  • push tag
  • merge to production
  • (wait for ci to pass and publish to pypi)

implement epoch reducers

implement EpochSpikeReducer and EpochTraceReducer transformers, which perform "reduce" operations by applying a user-defined function to all of the data within each epoch's time range.

e.g.

epoch_reducer = ng.epoch.EpochTraceReducer(
    traces=TRACES,
    agg_func=np.mean,
)

mean_responses = epoch_reducer.fit_transform(EPOCHS)

open question: what should the expected format of EPOCHS be?

option 1: a 'time' column and a 'duration' column

this maintains consistency with the EVENTS dataframes expected by other transformers in this package.

option 2: a 'start' column and an 'end' column

this is likely closer to the native representation of this kind of data

option 3: both
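
For concreteness, the two candidate EPOCHS layouts might look like this (column names as proposed above; the values are made up for illustration):

import pandas as pd

# option 1: matches the EVENTS convention used by other transformers
epochs_opt1 = pd.DataFrame({
    'time': [0.0, 2.5, 5.0],       # epoch start times (s)
    'duration': [1.0, 1.0, 2.0],   # epoch lengths (s)
})

# option 2: closer to the native representation of epoch data
epochs_opt2 = pd.DataFrame({
    'start': [0.0, 2.5, 5.0],
    'end': [1.0, 3.5, 7.0],
})

# converting between the two is cheap, so option 3 (accept both) mostly
# amounts to normalizing to whichever form the reducer uses internally
epochs_opt1['end'] = epochs_opt1['time'] + epochs_opt1['duration']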

should Smoother accept an explicit list of neurons?

currently, Smoother automatically discovers the IDs of the neurons through a groupby operation on the X dataframe of spike times that it accepts.

these then become the columns of the output dataframe

however, if a given neuron is unobserved in the data passed into X, it will not get a column.

this could be supported by accepting a "neurons" kwarg when initializing the object & replacing the groupby operation with an explicit loop over the `neurons` values, building a mask for each one.

this approach would also let the user use the kwarg to ignore any neurons in X that they don't want to consider (that is, which neurons get smoothed could become a hyperparameter to optimize)
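
A minimal sketch of the proposed behavior (function and column names are assumptions for illustration, not the current Smoother API; a causal exponential kernel stands in for whichever kernel is configured):

import numpy as np
import pandas as pd

def smooth_spikes(X, sample_times, neurons=None, tau=0.05):
    # X: dataframe of spike times with 'time' and 'neuron' columns
    # neurons: optional explicit list of neuron IDs; unobserved neurons get an
    #          all-zero column, and neurons not in the list are ignored
    if neurons is None:
        neurons = X['neuron'].unique()  # current groupby-style discovery

    sample_times = np.asarray(sample_times)
    out = {}
    for neuron in neurons:
        # replace the groupby with an explicit mask per requested neuron
        spike_times = X.loc[X['neuron'] == neuron, 'time'].values
        lags = sample_times[:, None] - spike_times[None, :]
        kernel = np.where(lags >= 0, np.exp(-lags / tau), 0.0)
        out[neuron] = kernel.sum(axis=1)

    return pd.DataFrame(out, index=sample_times)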

EventTraceTensorizer fails if `bins` is an integer

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-17-90797e9adccb> in <module>()
      2     traces,
      3     bins=30,
----> 4     range=(0,1)
      5 )

c:\users\justink\code\neuroglia\neuroglia\event.py in __init__(self, traces, bins, range)
     13         super(EventTraceTensorizer, self).__init__()
     14         self.traces = traces
---> 15         self.bins = bins[:-1]
     16         self.range = range
     17 

TypeError: 'int' object is not subscriptable
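
A possible fix, assuming the intent is to mirror np.histogram's bins/range semantics (helper name is hypothetical):

import numpy as np

def _resolve_bin_edges(bins, range):
    # accept either an integer bin count (with `range` giving the window)
    # or an explicit array of bin edges
    if np.isscalar(bins):
        edges = np.linspace(range[0], range[1], int(bins) + 1)
    else:
        edges = np.asarray(bins)
    return edges  # __init__ can then keep edges[:-1] as the left bin edges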

add example: spike inference on spikefinder datasets

add a script to examples/ that does the following:

  1. loads data from one spikefinder dataset (see http://spikefinder.codeneuro.org) & fits it
  2. tests the prediction against the spikefinder metrics (import spikefinder; spikefinder.score(y,y_pred) or something like that should work. see https://github.com/codeneuro/spikefinder-python)
  3. compactly repeats 1 & 2 on all spikefinder datasets, generating a colorized table of results as in http://spikefinder.codeneuro.org

If the results are any good, consider submitting them to spikefinder, if it is still accepting new submissions :D

[demo] reliability

implement Dan Denman's "reliability" analysis in a neuroglia pipeline

Bin at 0.5 ms, smooth with a 5 ms boxcar, extract trials, then for each trial (e.g. stimulus presentation) calculate reliability. Need to double-check with Dan what the reliability metric was.
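
A rough numpy-only sketch of that pipeline, using mean pairwise trial correlation as a placeholder until the actual reliability metric is confirmed:

import numpy as np

def reliability(spike_times, trial_starts, trial_duration,
                bin_size=0.0005, boxcar=0.005):
    spike_times = np.asarray(spike_times)
    edges = np.arange(0, trial_duration + bin_size, bin_size)
    width = int(boxcar / bin_size)
    kernel = np.ones(width) / width

    trials = []
    for start in trial_starts:
        counts, _ = np.histogram(spike_times - start, bins=edges)  # 0.5 ms bins
        trials.append(np.convolve(counts, kernel, mode='same'))    # 5 ms boxcar
    trials = np.asarray(trials)

    # placeholder metric: mean pairwise correlation across trials
    corr = np.corrcoef(trials)
    return np.nanmean(corr[np.triu_indices_from(corr, k=1)])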

Bug: tox.ini

  • neuroglia version: b20ee5c
  • Python version: N/A
  • Operating System: N/A

Description

Looks like a typo in tox.ini:

deps = -rrequirements.txt

rrequirements.txt --> requirements.txt

@neuromusic maybe triage this to a new user as a learning example?

PeriEventSpikeTensorizer should accept multiple data structures for spikes

pandas.DataFrame

  • rows: observed spikes
  • columns: time, source, *spike_features

np.ndarray

  • can also accept each column as an array
  • need to pass additional args to indicate time and neuron column indices

dict

  • keys: cluster ids
  • values: timestamps

or should the dict representation be a separate transformation step, as in nwb.SpikeTablizer?
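
A sketch of how the three representations could be normalized to a single dataframe layout before tensorizing (the 'time'/'neuron' column names are assumptions; the dict branch is essentially what nwb.SpikeTablizer already does):

import numpy as np
import pandas as pd

def _coerce_spikes(spikes, time_col=0, neuron_col=1):
    if isinstance(spikes, pd.DataFrame):
        return spikes
    if isinstance(spikes, np.ndarray):
        # additional args indicate which columns hold times and neuron IDs
        return pd.DataFrame({
            'time': spikes[:, time_col],
            'neuron': spikes[:, neuron_col],
        })
    if isinstance(spikes, dict):
        # keys are cluster IDs, values are arrays of timestamps
        frames = [
            pd.DataFrame({'time': np.asarray(times), 'neuron': cluster})
            for cluster, times in spikes.items()
        ]
        return pd.concat(frames, ignore_index=True)
    raise TypeError('unsupported spike representation: %r' % type(spikes))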

fix readthedocs build

readthedocs is currently failing due to trouble building the documentation (in particular, sphinx gallery examples)

implement TraceTensorizer

the TraceTensorizer should be initialized with a dataframe of events and a time axis, relative to event times, over which the traces will be sampled.

a key challenge is that if event times are "in between" times on the trace time axis, then a decision needs to be made:

  • do we align to the nearest time bin?
  • do we resample the trace?
  • if we resample, what method should we use?

my first thoughts:

  • if the trace is integers, then it is likely spike counts and we should NOT interpolate. grab nearest?
  • if the trace is continuous, then we should interpolate. cubic spline is an obvious default. might want to look into others. kriging? https://en.m.wikipedia.org/wiki/Kriging
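
A minimal sketch of that decision as a standalone helper (not the real TraceTensorizer):

import numpy as np
from scipy.interpolate import CubicSpline

def sample_trace(trace_times, trace_values, query_times):
    trace_times = np.asarray(trace_times)
    trace_values = np.asarray(trace_values)
    query_times = np.asarray(query_times)

    if np.issubdtype(trace_values.dtype, np.integer):
        # likely spike counts: do NOT interpolate, align to the nearest bin
        nearest = np.abs(trace_times[:, None] - query_times[None, :]).argmin(axis=0)
        return trace_values[nearest]

    # continuous trace: resample, with a cubic spline as the obvious default
    return CubicSpline(trace_times, trace_values)(query_times)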

ResponseExtractor needs numpy

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-16-e2b82dfff346> in <module>()
      1 from neuroglia.tensor import ResponseExtractor
----> 2 extractor = ResponseExtractor()
      3 X = extractor.fit_transform(X)

c:\users\justink\code\neuroglia\neuroglia\tensor.py in __init__(self, method, dim)
      7 
      8         if method == 'mean':
----> 9             self.method = np.mean
     10         elif method == 'max':
     11             self.method = np.max

NameError: name 'np' is not defined
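
The fix is presumably just the missing import at the top of neuroglia/tensor.py:

import numpy as np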

documentation

  • Introduction
  • Installation
  • Examples
    • PSTH
    • Allen Brain Observatory - Natural Images Decoding
    • Spike Inference from Calcium
    • Canonical Polyadic Tensor Decomposition
  • Tutorial
    • Traces
    • Spikes
    • Events
    • Tensors
  • API reference

allow for neuron-specific smoothing kernels

Currently, Smoother takes a single tau parameter, which sets the time constant of whichever kernel is selected.

In some instances, it may be preferable to give different neurons different kernel parameters and/or kernels.
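
One possible interface (a sketch only; the parameter handling is an assumption): let tau and/or the kernel be either a single value applied to every neuron, or a dict keyed by neuron ID.

def _resolve_tau(tau, neurons, default=0.05):
    # a scalar applies to every neuron; a dict gives per-neuron overrides
    if isinstance(tau, dict):
        return {neuron: tau.get(neuron, default) for neuron in neurons}
    return {neuron: tau for neuron in neurons}

# hypothetical usage: Smoother(tau={'neuron_a': 0.01, 'neuron_b': 0.1})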

make test fails

  • neuroglia version: b20ee5c
  • Python version: 2.7
  • Operating System: Ubuntu 16.04.2

Description

Looks like the test rule needs to be fixed.

What I Did

Tried to run make test, and received:

make test
test.sh
make: test.sh: Command not found
Makefile:2: recipe for target 'test' failed
make: *** [test] Error 127

MNT: Stop using ci-helpers in appveyor.yml

To whom it may concern,

If you are using https://github.com/astropy/ci-helpers in your appveyor.yml, please know that the Astropy project has dropped active development/support for Appveyor CI. If it still works, good for you, because we did not remove the relevant files (yet). But if it ever stops working, we have no plans to fix anything for Appveyor CI. Please consider using the native Windows support of other CI providers, e.g., Travis CI (see https://docs.travis-ci.com/user/reference/windows/). We apologize for any inconvenience caused.

If this issue is opened in error or irrelevant to you, feel free to close. Thank you.

xref astropy/ci-helpers#464

add "datasets" module

this module should implement a similar api as sklearn.datasets and nilearn.datasets

  • use sklearn.datasets.base (including sklearn datasets cache folder) to store downloaded data
  • use logic as in crcnsget to download data (optionally using environment variables for passwords)
  • functions should return data ready to analyze with neuroglia (e.g. event dataframes, spike dataframes, or trace dataframes/xarrays)

first candidate datasets, depending on needs for examples:

  • cai-1, for the calcium inference example

  • Allen Institute Brain Observatory experiment, for the decoding example
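
A rough sketch of what such a fetcher could look like (the function name, URL, and file layout are placeholders, not an existing endpoint):

import os
from urllib.request import urlretrieve

import pandas as pd
from sklearn.datasets import get_data_home
from sklearn.utils import Bunch

def fetch_example_events(data_home=None, download_if_missing=True):
    data_home = get_data_home(data_home)  # reuse the sklearn cache folder
    path = os.path.join(data_home, 'neuroglia', 'example_events.csv')
    if not os.path.exists(path):
        if not download_if_missing:
            raise IOError('dataset not found and download_if_missing is False')
        os.makedirs(os.path.dirname(path), exist_ok=True)
        urlretrieve('https://example.org/example_events.csv', path)  # placeholder URL
    events = pd.read_csv(path)  # ready-to-analyze event dataframe
    return Bunch(events=events, DESCR='placeholder example dataset')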

improve test coverage

Description

Test coverage is low (~67%), so getting it above 90% would be nice and not too much effort.

@j-friedrich I'm not very familiar with your contributions; could you potentially review the test code I submit?

implement more interpolation methods for PeriEventTraceSampler

with keyword arguments

  • cubic spline (default)
  • sinc
  • kriging

further, the user should be able to pass any of the univariate functions from scipy.interpolate that take x & y as arguments and return a function that can be applied to new x values to return interpolated y values
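
A sketch of how such a kwarg could dispatch on either a named method or a user-supplied scipy.interpolate-style factory (names here are illustrative, not the existing PeriEventTraceSampler signature):

from scipy import interpolate

NAMED_INTERPOLATORS = {
    'cubic': interpolate.CubicSpline,  # default
    'linear': lambda x, y: interpolate.interp1d(x, y, kind='linear'),
    # 'sinc' and 'kriging' would need dedicated implementations / extra deps
}

def resample(trace_times, trace_values, new_times, interpolator='cubic'):
    if callable(interpolator):
        # any factory taking (x, y) and returning a callable on new x values
        factory = interpolator
    else:
        factory = NAMED_INTERPOLATORS[interpolator]
    return factory(trace_times, trace_values)(new_times)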

identify open datasets for demos

look at crcns.org for spiking data (since neuropixels is unreleased)

and brain observatory is an obvious candidate for calcium traces

Feature request: source extraction for calcium images

Currently, neuroglia's calcium module works with already-extracted fluorescence traces. It would be useful to integrate the ability to extract fluorescence traces from raw calcium imaging movies for downstream processing.
This could look something like:

from dask.array.image import imread
from neuroglia.calcium import SourceExtraction

image = imread('image.tif')
se = SourceExtraction(method='some_method')  # plus algorithm-specific args/kwargs

fluorescence_traces = se.transform(image)

Some libraries already exist for this (SIMA, CaImAn, Thunder), but an integrated solution with a consistent API would allow for more efficient processing. Which algorithms to use and how to implement or wrap them are up for discussion. Dask is used in the example above because it would support both in-memory processing of small images and out-of-memory processing of large images, and because it integrates naturally with xarray for downstream analysis.

merge junk in API reference

  • neuroglia version: 0.2.9
  • Python version: 3.6
  • Operating System: ubuntu 17.10

Description

The API reference docs have a bit of merge junk:

 40 <<<<<<< HEAD
 41 
 42 =======
 43 
 44 >>>>>>> 14a9cab... :memo: typo in API docs

What I Did

cd docs
python -m sphinx . _build
