
trackpy's Introduction

trackpy


What is it?

trackpy is a Python package for particle tracking in 2D, 3D, and higher dimensions. Read the walkthrough to skim or study an example project from start to finish.
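
For orientation, here is a minimal sketch of the core workflow; the function names follow the documentation, but treat the exact arguments as assumptions:

    import trackpy as tp
    import pims

    frames = pims.ImageSequence('images/*.png')  # lazy reader for a stack of images
    f = tp.batch(frames, 11, minmass=200)        # locate features ~11 px across in every frame
    t = tp.link_df(f, search_range=5, memory=3)  # link features into trajectories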

Documentation

Read the documentation for

  • an introduction
  • tutorials on the basics, 3D tracking, and much, much more
  • easy installation instructions
  • the reference guide

If you use trackpy for published research, please cite the release both to credit the contributors, and to direct your readers to the exact version of trackpy they could use to reproduce your results.

trackpy's People

Contributors

ahmadia, anntzer, apiszcz, bruot, caspervdw, charlesreid1, crisp-snakey, danielballan, dwieker, freemansw1, hadim, hugovk, ivanovmg, jankatins, kevin-duclos, krrk, lagru, leouieda, magnunor, marcocaggioni, nkeim, pfigliozzi, prashnts, rbnvrw, rebeccawperry, sciunto, tacaswell, thierrybottaro, veramtitze, zoeith


trackpy's Issues

numba help

@nkeim I know you are busy this week, but file this away for when you find a moment....

I have to process a bunch of long videos, and now I'm dreaming of numba-accelerated refine. I have a working version on this branch. It passes all the same tests as my original _refine.

It is 2D-only, but written so that copying the code for 1D or 3D variants would be straightforward. I think I have avoided any generic slicing syntax that would cause numba to fall back on numpy.

I made every loop explicit. Using numba.typeof(...), I checked that all the variables are interpreted as numerical arrays, not object. However, the speed increase over Python is not large. If and when you have time, can you improve on this or point out any glaring shortcomings?

dallan@dielectric-pc:~/trackpy/benchmarks$ ipython simple_benchmarks.ipy 
Compiling Numba...
Locate using Python Engine with Default Settings (Accurate)
1 loops, best of 3: 1.02 s per loop
Locate using Python Engine with Fast Settings (Sloppy)
1 loops, best of 3: 325 ms per loop
Locate using Numba Engine with Default Settings (Accurate)
1 loops, best of 3: 1.02 s per loop
Locate using Numba Engine with Fast Settings (Sloppy)
1 loops, best of 3: 295 ms per loop

You can switch between using my original _refine and _numba_refine using the engine keyword. See benchmarks/simple_benchmarks.ipy for an example.

Notice that, in the benchmarks, I do the band pass ahead of time. By default, locate and batch will preprocess in the manner of Crocker/Weeks epretrack, but that can be shut off using preprocess=False.
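
For reference, a sketch of how those two keywords combine (keyword names as described in this issue; treat the call as illustrative):

    # band pass ahead of time, then skip the built-in preprocessing
    f = tp.locate(bandpassed_image, 11, preprocess=False, engine='numba')  # or engine='python'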

Merge Nathan's code.

FYI, @nkeim , Tom Caswell is over my shoulder, and we're serious about merging trackpy in the next few weeks. We're going to reconcile my (Dan's) branch first and then yours. I'm also going to pull in all my feature-finding, tests, and analysis code. See related issues.

"Easy"

  • Allow access to nonrecursive_link via keyword link_strategy.
  • Add KDTree alternative. Access HashTable vs. KDTree via keyword neighbor_strategy. (See the sketch after these lists.)
  • Make Tom document this.

Hard

  • Add Nathan's linker.
  • Allow it to work with numba if numba can be imported; otherwise, without. (Maybe this just works, once we add the function.) Cherry-pick if this is practical, and either way, attribute Nathan in the README.
  • Make sure Nathan's API is supported.
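
A sketch of how the proposed keywords might look from the user's side (keyword names and values are assumptions until the work above lands):

    # choose the candidate-search structure and the subnet-solving strategy
    t = tp.link_df(f, search_range=5,
                   neighbor_strategy='KDTree',     # or 'BTree', the original hash table
                   link_strategy='nonrecursive')   # or 'recursive'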

Re-work `hash_generator` API

So that it takes in a list of points, matching the API of the KDTree wrappers.

A bunch of logic can be removed from link_iter if this is done.
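
Roughly, the reworked call might read (hypothetical sketch):

    # mirror the KDTree wrappers: hand the points over wholesale
    hash_table = hash_generator(points)   # points: a list of Point objects
    # link_iter then no longer needs to populate the hash itself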

Sloppy `_refine_numba()` much slower than Python

I'm not sure why we didn't see this before, but running locate() with max_iterations=0 is actually much faster without Numba. (numba v0.11 or v0.12.1)

This is from benchmarks/numba_benchmarks.ipy, with the first case added by me.

10x: Locate using Python Engine with Fast Settings (Sloppy)
1 loops, best of 3: 3.65 s per loop
10x: Locate using Numba Engine with Fast Settings (Sloppy)
1 loops, best of 3: 14.6 s per loop

When accuracy is desired, numba is the way to go:

10x: Locate using Python Engine with Default Settings (Accurate)
1 loops, best of 3: 38.4 s per loop
10x: Locate using Numba Engine with Default Settings (Accurate)
1 loops, best of 3: 23 s per loop

I am not in favor of adding "smart" logic to the code to automatically switch back to pure-Python. But I am still working on special documentation for users who need performance, and I will make a note of this there.
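
Until then, users who need sloppy-settings speed can select the engine explicitly (a sketch, using the engine keyword described elsewhere in this tracker):

    # sloppy settings: pure Python currently wins
    f = tp.locate(frame, 11, max_iterations=0, engine='python')
    # accurate settings, with iterative refinement: numba wins
    f = tp.locate(frame, 11, engine='numba')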

As I see it, this issue can be ignored or even closed until we get around to making _refine_numba() take full advantage of future Numba releases. But I wanted to give you a chance to comment.

thumbs.db file crashed batch particle locate

I took my images on a Windows machine, which created a thumbs.db file in my directory. When I used:

frames = tp.ImageSequence()
f = tp.batch(frames, 5, minmass=540)

it handled 840 images properly before reaching the thumbs.db file, at which point it objected that thumbs.db is not an image file -- so I had to remove the file and run the whole thing again. A suggested enhancement: just skip files that don't appear to be image files.
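
Until such a skip is implemented, one workaround is to pass a glob pattern that matches only image files (a sketch; adjust the path and extension to your data):

    import pims
    import trackpy as tp

    # a pattern that matches only .png files never touches thumbs.db
    frames = pims.ImageSequence('path/to/images/*.png')
    f = tp.batch(frames, 5, minmass=540)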

another dependency: ffmpeg

I ran into another dependency that I didn't have installed: ffmpeg. It came up when I tried to load some images and locate the particles. I would add this to the list of things to install to get trackpy running.

Optimize bandpass

  • Re-examine ndimage.uniform_filter, which is not exactly a boxcar and which may actually be slower than a boxcar filter. Check out the boxcar filter in scipy 0.7 and the (1D-only?) version in modern scipy. Where did the 2D version go? Is it somewhere in astropy?
  • Can we implement a 2D version as two 1D passes? (See the sketch after this list.)
  • Look into whether letting FFTW do more optimization (write more "wisdom") would be worthwhile.
  • Also look into performance of FFTW in 2d, nd, "multiple 1D."
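
On the separability question: a boxcar (uniform) filter is separable, so a 2D version can be built from two 1D passes. A sketch using scipy.ndimage, where the equivalence holds up to floating-point error:

    import numpy as np
    from scipy import ndimage

    img = np.random.rand(512, 512)
    a = ndimage.uniform_filter(img, size=7)                        # one 2D pass
    b = ndimage.uniform_filter1d(
            ndimage.uniform_filter1d(img, 7, axis=0), 7, axis=1)  # two 1D passes
    assert np.allclose(a, b)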

4D/5D functionality

Hello all.

I was wondering about how to, in a memory efficient way, batch process 4D data (3D + t) and track cell nuclei.

I was also wondering how to use these tracks to track a periodic nuclear fluorescent marker, which would be observed through a second channel, which would be 5D.

For 4D functionality, here is some test data from the Megason lab in the MB range.

http://www.gofigure2.org/index.php/users.html

GoFigureSampleData.zip - the nuclei are in Channel 00 ("ch00")

Note that I have data in the GB to TB range that far exceeds my RAM.
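
One memory-friendly pattern is to read and process one 3D volume at a time, sketched here under the assumption that each time point fits in RAM (load_volume and n_timepoints are hypothetical placeholders):

    import trackpy as tp

    features = []
    for t in range(n_timepoints):
        volume = load_volume(t)                    # one 3D array (z, y, x) per time point
        f = tp.locate(volume, diameter=(5, 9, 9))  # one odd diameter per dimension
        f['frame'] = t
        features.append(f)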

4D and 5D functionality

Hello all!

I've seen very encouraging internal documentation, confirmed by the great and very helpful Dan Allan (who suggested that I open this as an "issue"), that trackpy can in principle handle n-dimensional data.

In practice, how best to get this working is, for now, somewhat unclear to me.

In principle, getting the information in as a numpy array of the proper dimensionality will work.

However, I am also interested in the batch processing available through pims and trackpy, since experimental data sets can extend into the GB and TB range.

Here is sample data from the Megason lab (in the MB range), which is 4D (x, y, z, and t) if you only look at Channel 00, which has blob-like nuclei in three time points. I would like to track 3D nuclei in xyz space through time:

http://www.gofigure2.org/index.php/users.html ("GoFigureSampleData.zip")

5D functionality - this is the real reason I am interested in trackpy. Imagine that I have nuclei, which are being tracked in 4D, in one channel, and in another, I have some periodic fluorescent marker. I would like to use the (constant) nuclei tracks to connect the on/off states of the periodic marker, so that I could track and do data analysis in those periodic cycles.

How could this be done?
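
For the 4D part, once features are located per time point, linking in 3D is a matter of telling the linker which position columns to use (a sketch; the pos_columns keyword is an assumption based on the docs):

    import pandas as pd
    import trackpy as tp

    features = pd.concat(features_per_timepoint)  # hypothetical: one DataFrame per time point
    t = tp.link_df(features, search_range=5, pos_columns=['x', 'y', 'z'])
    # the resulting 'particle' labels can then index the second channel's
    # intensity at each nucleus over time -- the 5D question above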

Consider changing the default value of filter_before.

The filter_before=True parameter of locate is for performance. It eliminates candidate features based on mass (and size if minsize is set) to avoid refining the position of features that will be thrown away. When numba is enabled, it does not actually improve performance (as @nkeim recognized) because the computation is faster if performed in the refine step as a numba-optimized loop over the image.

It is possible that filter_before could still be helpful in some use cases, especially for images that are not 2D and thus cannot use numba, so it should remain available.

Should it be:

  • False by default in all cases
  • False by default if the numba engine is used
  • False by default according to some more subtle criterion?
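
Whatever the default becomes, both behaviors would remain one keyword away (a sketch, assuming the keyword as described above):

    # with the numba engine, filtering inside refine is faster:
    f = tp.locate(frame, 11, minmass=200, engine='numba', filter_before=False)
    # without numba (e.g. for 3D images), pre-filtering can still pay off:
    f = tp.locate(volume, (5, 9, 9), minmass=200, filter_before=True)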

Frames only containing background

Hello all,
Thanks for trackpy!

Everything works great for me, but I have a little problem. When I have particles (deconvolved files from a widefield microscope), everything works great. However, if I use an image containing only background (a negative control, for example), I get a lot of particles detected all over the place, even with high mass. For example, with minmass=30000 in frames that contain particles, it detects even really small particles close to the background level with no problem whatsoever. But if there are no particles in the picture, even with minmass=1e5 I get 600 detected "particles" that are only background. I am using the software to detect two colours, one of which is supposed to disappear, so this sometimes bugs me. Thanks for any help, appreciated.
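
One way to see what is happening is to compare the mass distributions of a signal frame and a control frame (a diagnostic sketch, not a fix; signal_frame and blank_frame stand for your two images):

    f_signal = tp.locate(signal_frame, 11, minmass=30000)
    f_blank = tp.locate(blank_frame, 11, minmass=30000)
    print(f_signal['mass'].describe())
    print(f_blank['mass'].describe())
    # if the two distributions overlap, minmass alone cannot separate them;
    # filtering on size or eccentricity, or subtracting a background image, may help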

`import trackpy` imports matplotlib

While plots.py is in the news... one frustration I encountered when using trackpy with the IPython parallel computing facilities is that it always starts up matplotlib. If the (headless) cluster workers are running on a remote host, this requires a workaround, because there is no X server for DISPLAY to point at. It seems like moving import matplotlib into each function definition in plots.py would result in fewer net surprises.
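
Concretely, the deferred-import pattern would look like this (a sketch of the suggestion, not the current code):

    # plots.py: import matplotlib inside each function, so that merely
    # importing trackpy never touches the display machinery
    def annotate(features, image, ax=None):
        import matplotlib.pyplot as plt
        if ax is None:
            ax = plt.gca()
        ax.imshow(image, cmap='gray')
        ax.plot(features['x'], features['y'], 'r+')
        return ax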

Try out recent releases of numba

Numba releases have been coming fast. I was told at SciPy that they have achieved big performance gains. At some point we should check whether the upgrade breaks anything. Provisionally tagging 0.2.3.

Is numba worth the trouble if numpy can do most of the work?

This S.O. question, in particular the original poster's comment on the answer, sums up my feelings as I try to fit numba into my code. It seems like a lot of common numpy idioms are not supported.

Explicit loops, which seem to be what numba wants, make code less readable and obviously slower for users who might not have numba available. Disappointing.
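
For the record, here is the kind of rewrite in question, on a toy center-of-mass computation (a generic sketch; decorator spelling varies across numba versions):

    import numba
    import numpy as np

    def center_of_mass_numpy(img):
        # concise numpy idiom -- but possibly unsupported by numba
        y, x = np.mgrid[:img.shape[0], :img.shape[1]]
        m = img.sum()
        return (y * img).sum() / m, (x * img).sum() / m

    @numba.jit
    def center_of_mass_loops(img):
        # the explicit loops that numba compiles well
        m = cy = cx = 0.0
        for i in range(img.shape[0]):
            for j in range(img.shape[1]):
                m += img[i, j]
                cy += i * img[i, j]
                cx += j * img[i, j]
        return cy / m, cx / m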

On the other hand, the line profiler %lprun shows me that the refine function in feature.py takes 95% of the runtime during feature location, so making it faster is worth the trouble, even if it does take a lot of trouble.

Thoughts?

Let `setup.py` and sphinx share version info

#111 and #125 concern the version string displayed in all of the sphinx documentation page titles. It's confusing when it doesn't reflect the actual version of the code. Currently, it needs to be updated by hand whenever there's a release.

There ought to be a better way! I've created this issue just so we remember the problem exists; otherwise we'll only encounter it when we're trying to get a release out the door, which is of course a bad time to try to fix it.
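
One common approach is to make the package itself the single source of truth and have sphinx read it (a sketch; the file layout is an assumption):

    # docs/conf.py -- read the version from the installed package
    import trackpy
    version = trackpy.__version__  # short form shown in page titles
    release = trackpy.__version__  # full release string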

Use standard data sets for benchmarking and examples.

Low priority, but a note for later...

It would be nice to have examples at different densities to answer questions like, "Can our BTree mode ever beat KDTree, even though it's in pure Python?"

We could use a subset of the data from this 2012 particle tracking competition. They systematically cover a range of densities and signal-to-noise ratios. The results were published in Nature Methods. You can download the videos and the so-called "ground truth" trajectories against which entries were judged.

h/t to Ben Schuster for pointing me to the paper.

Implement interface to sm_core

I haven't spent more than five minutes looking at it, but I'm imagining some way of hooking the inputs and outputs of locate and link_df_iter into an SM_serial.

Roadmap: Enhancements to finish before or during SciPy conference

A tour of our walkthrough notebook alone could fill 15 minutes. There's no shortage of features to discuss. But I'd like to use the impending deadline and the conference's "Sprints session" as an excuse to finish work on some enhancements.

This is my list of ideas. If you have anything to offer or to add, please speak up.

  • Try to connect mpld3 to IPython widgets to build a fast, interactive, zoomable feature-filtering tool. No one has connected mpld3 to widgets yet, though it has been suggested. This will be of wide interest if it works. I have done preliminary work on this.
  • Display features or trajectories drawn onto HTML5 videos embedded in notebooks. I did the hard part -- the video -- for another project. I just have to connect it to trackpy.
  • Consider other ways that widgets and/or mpld3 could make life easier. (Suggestions?)
  • Consider making more use of the tools in scikit-image. Broadly, I want to have a clear understanding of how each of our imaging steps (e.g., bandpass) is related to and distinct from tools in scikit-image. That project will be a major focus at the meeting.

These almost definitely won't fit into the talk, but I'd like to finish them soon anyway.

  • Implement trajectory splitting/joining and a more sophisticated gap-closing (like "memory") that is globally optimized in time. See Jaqaman et al. This is about half-written.
  • Implement clustering of heterogeneous trajectories into homogeneous sub-ensembles using an F-statistic, following Valentine et al. This is written, but I want to test it on real data before I submit a PR.

Generate doi

It looks like the DOI generator does not support GitHub organization repos just yet. I sent an email to support, and then I found this:

https://twitter.com/ZENODO_ORG/status/467267473725992960

So much for the rush to do this in time for my paper. We can revisit once Zenodo supports organization repos. If someone sees news of this before I do, please bump this issue.

Subclass FramewiseData to read other formats.

Wish list:

  • GDF for compatibility with Crocker/Grier/Weeks
  • whatever format the Blair code uses (I have never touched that code myself, I have no idea)
  • non-pandas hdf5

I originally thought of CSV for newbies, but you can already do DataFrame.to_csv, and if your data doesn't fit in memory, you shouldn't be saving it to a CSV!

Others?
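
For anyone picking this up, a rough skeleton of such a subclass (the method names follow FramewiseData as I understand it; treat them, and any further abstract members, as assumptions):

    import trackpy as tp

    class GDFStore(tp.FramewiseData):
        """Hypothetical reader/writer for Crocker/Grier/Weeks GDF files."""

        def __init__(self, filename):
            self.filename = filename

        def put(self, df):
            ...  # append one frame's worth of features to the GDF file

        def get(self, frame_no):
            ...  # return a DataFrame of the features in one frame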

Publish a contribution policy

It would be helpful, sooner rather than later, to have guidelines for version numbering, PR etiquette, etc. on the Github site. Perhaps we can just open a project wiki and slap it up on the front page.

The actual content could be along the lines of Tom's email on 1/28. Most salient point: don't merge your own PR.

Better testing for predictor diagnostics

Tests for predict.py should check that the instrumented() class decorator works properly with each of the built-in predictor classes. This should involve a simple linking task, followed by cursory inspection of the output of dump().

Since the diagnostic output depends on the predictor class it was used on, checking the numerical correctness of that information would be difficult.
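
In outline, such a test might read as follows (a sketch; instrumented() and dump() per this issue's description, and frame_iter stands for a small synthetic linking task):

    import trackpy.predict as predict

    @predict.instrumented()
    class DiagVelocity(predict.NearestVelocityPredict):
        pass

    pred = DiagVelocity()
    list(pred.link_df_iter(frame_iter, search_range=5))  # simple linking task
    diag = pred.dump()
    assert len(diag) > 0  # cursory check that diagnostics were recorded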

Roadmap: Upcoming releases

A while back, @danielballan suggested a post-SciPy v0.2.2 release to clean things up. Before we do much more merging, I'm starting this issue to help us come to consensus on what goes into v0.2.2 and, hypothetically, v0.2.3 and v0.3.

Here's what I can think of:

v0.2.2

This release would just be a more perfect v0.2.1, that we can get out soon.

v0.2.3

  • "Unofficial" py3 support #122 (Involves a bunch of little changes to lots of files)
  • (Your PR here?)

Please share your ideas. Finally, are there any open or imminent PRs that should be targeted for v0.3?

Choose a better default colormap in plot_traj.

Thanks to Kristen Thyng's talk at SciPy 2014 (not yet online) I now know that cm.winter is the very worst colormap in matplotlib, perceptually speaking. It should not be the default for coloring-by-time in trackpy.plots.plot_traj.
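
In the meantime, a better map can be passed explicitly (a sketch; the cmap keyword is an assumption):

    import matplotlib.cm as cm
    tp.plot_traj(t, colorby='frame', cmap=cm.viridis)  # nearly anything beats cm.winter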

Rethink batch

Something to mull over, as we aim for sharing data between groups and reading GDF files...

batch takes a bunch of kwargs to manage output to a variable or a pandas HDFStore or a SQL database. This gets the job done and offers some flexibility, but it would be better to have an (abstract?) base class that can accept frames one at a time, save them in whatever format, and later regurgitate them one at a time for link_iter / link_df_iter.

In mr, I had a half-baked Trajectories class for this, which -- without looking -- sounds like it does the same thing as Nathan's BigTracks.

Thoughts and questions for your consideration:

  • Is this exactly what SM_serial is for? It seems hdf5-specific, but if I replace the word "group" with the word "frame," it could be more general. Is there a role for some intermediate class, or should I be thinking of using SM_serial directly?
  • I want the class to be general enough that it can output to a simple format like text, CSV, or even Excel. If we want to win any converts from MATLAB, or those who still do it by hand (!), we can't assume everyone will climb the learning curve to HDF5 or appreciate its benefits. Managing output formats is outside the scope of trackpy, but pandas does it all -- we just need to manage chunking data sets by frame.

numba 0.12: extremely slow `_refine_numba()`

You may have noticed that just doing import trackpy with numba 0.12 installed takes upwards of 30 seconds, and execution of _refine_numba() also seems to take forever. After extensive study I've concluded that getting _refine_numba() back up to full speed will require a major rewrite. It's pretty unbelievable that the new Anaconda release would include a version of one of their flagship projects that is such a drastic regression in performance, but there you go.

For future reference: All of the trouble seems to be with _refine_numba; the numba subnet code is thankfully safe because it does not create any arrays and does not need any external math functions (such as sqrt).

Our best bet is to stick to numba 0.11 until the numba team can bring their new releases back up to that standard.
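
For Anaconda users, pinning would look like this (an assumption about your environment):

    conda install numba=0.11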

test images? test video?

I can't find the test video referenced in the walkthrough. If only I could get ONE image with located particles, I would feel like I was on my way to being able to use this package.
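
In the meantime, you can synthesize a test image and confirm the pipeline end to end (a sketch using only numpy and trackpy):

    import numpy as np
    import trackpy as tp

    # one bright Gaussian spot on a noisy background
    img = np.random.poisson(10, (128, 128)).astype(np.float64)
    y, x = np.mgrid[:128, :128]
    img += 200 * np.exp(-((x - 64.0)**2 + (y - 64.0)**2) / (2 * 2.0**2))

    f = tp.locate(img, 11)
    print(f[['x', 'y', 'mass']])  # should report one feature near (64, 64)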

BUG: C_fallback_python is not working as intended

This only matters if the user does not have a Python-compatible C compiler.

The pure Python fallback for the one piece of C code in mr is not quite right: 5 of the feature tests are showing poor precision.
