
trackpy's Introduction

trackpy


What is it?

trackpy is a Python package for particle tracking in 2D, 3D, and higher dimensions. Read the walkthrough to skim or study an example project from start to finish.
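
For orientation, here is a minimal sketch of the core workflow; the function names follow the documentation, but treat the exact arguments as assumptions:

    import trackpy as tp
    import pims

    frames = pims.ImageSequence('images/*.png')  # lazy reader for a stack of images
    f = tp.batch(frames, 11, minmass=200)        # locate features ~11 px across in every frame
    t = tp.link_df(f, search_range=5, memory=3)  # link features into trajectories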

Documentation

Read the documentation for

  • an introduction
  • tutorials on the basics, 3D tracking, and much, much more
  • easy installation instructions
  • the reference guide

If you use trackpy for published research, please cite the release both to credit the contributors, and to direct your readers to the exact version of trackpy they could use to reproduce your results.

trackpy's People

Contributors

ahmadia, anntzer, apiszcz, bruot, caspervdw, charlesreid1, crisp-snakey, danielballan, dwieker, freemansw1, hadim, hugovk, ivanovmg, jankatins, kevin-duclos, krrk, lagru, leouieda, magnunor, marcocaggioni, nkeim, pfigliozzi, prashnts, rbnvrw, rebeccawperry, sciunto, tacaswell, thierrybottaro, veramtitze, zoeith


trackpy's Issues

numba help

@nkeim I know you are busy this week, but file this away for when you find a moment....

I have to process a bunch of long videos, and now I'm dreaming of numba-accelerated refine. I have a working version on this branch. It passes all the same tests as my original _refine.

It is 2D-only, but written so that copying the code for 1D or 3D variants would be straightforward. I think I have avoided any generic slicing syntax that would cause numba to fall back on numpy.

I made every loop explicit. Using numba.typeof(...), I checked that all the variables are interpreted as numerical arrays, not object. However, the speed increase over Python is not large. If and when you have time, can you improve on this or point out any glaring shortcomings?

dallan@dielectric-pc:~/trackpy/benchmarks$ ipython simple_benchmarks.ipy 
Compiling Numba...
Locate using Python Engine with Default Settings (Accurate)
1 loops, best of 3: 1.02 s per loop
Locate using Python Engine with Fast Settings (Sloppy)
1 loops, best of 3: 325 ms per loop
Locate using Numba Engine with Default Settings (Accurate)
1 loops, best of 3: 1.02 s per loop
Locate using Numba Engine with Fast Settings (Sloppy)
1 loops, best of 3: 295 ms per loop

You can switch between using my original _refine and _numba_refine using the engine keyword. See benchmarks/simple_benchmarks.ipy for an example.

Notice that, in the benchmarks, I do the band pass ahead of time. By default, locate and batch will preprocess in the manner of Crocker/Weeks epretrack, but that can be shut off using preprocess=False.
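
For reference, a sketch of how those two keywords combine (keyword names as described in this issue; treat the call as illustrative):

    # band pass ahead of time, then skip the built-in preprocessing
    f = tp.locate(bandpassed_image, 11, preprocess=False, engine='numba')  # or engine='python'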

Merge Nathan's code.

FYI, @nkeim , Tom Caswell is over my shoulder, and we're serious about merging trackpy in the next few weeks. We're going to reconcile my (Dan's) branch first and then yours. I'm also going to pull in all my feature-finding, tests, and analysis code. See related issues.

"Easy"

  • Allow access to nonrecursive_link via keyword link_strategy.
  • Add KDTree alternative. Access HashTable vs. KDTree via keyword neighbor_strategy. (See the sketch after these lists.)
  • Make Tom document this.

Hard

  • Add Nathan's linker.
  • Allow it to work with numba if numba can be imported; otherwise, without. (Maybe this just works, once we add the function.) Cherry-pick if this is practical, and either way, attribute Nathan in the README.
  • Make sure Nathan's API is supported.
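
A sketch of how the proposed keywords might look from the user's side (keyword names and values are assumptions until the work above lands):

    # choose the candidate-search structure and the subnet-solving strategy
    t = tp.link_df(f, search_range=5,
                   neighbor_strategy='KDTree',     # or 'BTree', the original hash table
                   link_strategy='nonrecursive')   # or 'recursive'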

Re-work `hash_generator` API

So that it takes in a list of points, matching the API of the KDTree wrappers.

A bunch of logic can be removed from link_iter if this is done.
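
Roughly, the reworked call might read (hypothetical sketch):

    # mirror the KDTree wrappers: hand the points over wholesale
    hash_table = hash_generator(points)   # points: a list of Point objects
    # link_iter then no longer needs to populate the hash itself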

Sloppy `_refine_numba()` much slower than Python

I'm not sure why we didn't see this before, but running locate() with max_iterations=0 is actually much faster without Numba. (numba v0.11 or v0.12.1)

This is from benchmarks/numba_benchmarks.ipy, with the first case added by me.

10x: Locate using Python Engine with Fast Settings (Sloppy)
1 loops, best of 3: 3.65 s per loop
10x: Locate using Numba Engine with Fast Settings (Sloppy)
1 loops, best of 3: 14.6 s per loop

When accuracy is desired, numba is the way to go:

10x: Locate using Python Engine with Default Settings (Accurate)
1 loops, best of 3: 38.4 s per loop
10x: Locate using Numba Engine with Default Settings (Accurate)
1 loops, best of 3: 23 s per loop

I am not in favor of adding "smart" logic to the code to automatically switch back to pure-Python. But I am still working on special documentation for users who need performance, and I will make a note of this there.
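
Until then, users who need sloppy-settings speed can select the engine explicitly (a sketch, using the engine keyword described elsewhere in this tracker):

    # sloppy settings: pure Python currently wins
    f = tp.locate(frame, 11, max_iterations=0, engine='python')
    # accurate settings, with iterative refinement: numba wins
    f = tp.locate(frame, 11, engine='numba')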

As I see it, this issue can be ignored or even closed until we get around to making _refine_numba() take full advantage of future Numba releases. But I wanted to give you a chance to comment.

thumbs.db file crashed batch particle locate

I took my images on a Windows machine, which created a thumbs.db file in my directory. When I used:

frames = tp.ImageSequence()
f = tp.batch(frames, 5, minmass=540)

it handled 840 images properly before reaching the thumbs.db file, at which point it objected that thumbs.db is not an image file -- so I had to remove the file and run the whole thing again. A suggested enhancement: just skip files that don't appear to be image files.
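
Until such a skip is implemented, one workaround is to pass a glob pattern that matches only image files (a sketch; adjust the path and extension to your data):

    import pims
    import trackpy as tp

    # a pattern that matches only .png files never touches thumbs.db
    frames = pims.ImageSequence('path/to/images/*.png')
    f = tp.batch(frames, 5, minmass=540)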

another dependency: ffmpeg

I ran into another dependency that I didn't have installed: ffmpeg. It came up when I tried to load some images and locate the particles. I would add this to the list of things to install to get trackpy running.

Optimize bandpass

  • Re-examine ndimage.uniform_filter, which is not exactly a boxcar and which may actually be slower than a boxcar filter. Check out the boxcar filter in scipy 0.7 and the (1D-only?) version in modern scipy. Where did the 2D version go? Is it somewhere in astropy?
  • Can we implement a 2D version as two 1D passes? (See the sketch after this list.)
  • Look into whether letting FFTW do more optimization (write more "wisdom") would be worthwhile.
  • Also look into performance of FFTW in 2d, nd, "multiple 1D."
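
On the separability question: a boxcar (uniform) filter is separable, so a 2D version can be built from two 1D passes. A sketch using scipy.ndimage, where the equivalence holds up to floating-point error:

    import numpy as np
    from scipy import ndimage

    img = np.random.rand(512, 512)
    a = ndimage.uniform_filter(img, size=7)                        # one 2D pass
    b = ndimage.uniform_filter1d(
            ndimage.uniform_filter1d(img, 7, axis=0), 7, axis=1)  # two 1D passes
    assert np.allclose(a, b)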

4D/5D functionality

Hello all.

I was wondering about how to, in a memory efficient way, batch process 4D data (3D + t) and track cell nuclei.

I was also wondering how to use these tracks to track a periodic nuclear fluorescent marker, which would be observed through a second channel, which would be 5D.

For 4D functionality, here is some test data from the Megason lab in the MB range.

http://www.gofigure2.org/index.php/users.html

GoFigureSampleData.zip - the nuclei are in Channel 00 ("ch00")

Note that I have data in the GB to TB range that far exceeds my RAM.
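
One memory-friendly pattern is to read and process one 3D volume at a time, sketched here under the assumption that each time point fits in RAM (load_volume and n_timepoints are hypothetical placeholders):

    import trackpy as tp

    features = []
    for t in range(n_timepoints):
        volume = load_volume(t)                    # one 3D array (z, y, x) per time point
        f = tp.locate(volume, diameter=(5, 9, 9))  # one odd diameter per dimension
        f['frame'] = t
        features.append(f)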

4D and 5D functionality

Hello all!

I've seen very encouraging internal documentation, confirmed by the great and very helpful Dan Allan (who suggested that I open this as an "issue"), that trackpy can in principle handle n-dimensional data.

In practice, how best to get this working is, for now, somewhat unclear to me.

In principle, getting the information in as a numpy array of the proper dimensionality will work.

However, I am also interested in the batch processing available through pims and trackpy, since experimental data sets can extend into the GB and TB range.

Here is sample data from the Megason lab (in the MB range), which is 4D (x, y, z, and t) if you only look at Channel 00, which has blob-like nuclei in three time points. I would like to track 3D nuclei in xyz space through time:

http://www.gofigure2.org/index.php/users.html ("GoFigureSampleData.zip")

5D functionality - this is the real reason I am interested in trackpy. Imagine that I have nuclei, which are being tracked in 4D, in one channel, and in another, I have some periodic fluorescent marker. I would like to use the (constant) nuclei tracks to connect the on/off states of the periodic marker, so that I could track and do data analysis in those periodic cycles.

How could this be done?
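
For the 4D part, once features are located per time point, linking in 3D is a matter of telling the linker which position columns to use (a sketch; the pos_columns keyword is an assumption based on the docs):

    import pandas as pd
    import trackpy as tp

    features = pd.concat(features_per_timepoint)  # hypothetical: one DataFrame per time point
    t = tp.link_df(features, search_range=5, pos_columns=['x', 'y', 'z'])
    # the resulting 'particle' labels can then index the second channel's
    # intensity at each nucleus over time -- the 5D question above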

Consider changing the default value of filter_before.

The filter_before=True parameter of locate is for performance. It eliminates candidate features based on mass (and size if minsize is set) to avoid refining the position of features that will be thrown away. When numba is enabled, it does not actually improve performance (as @nkeim recognized) because the computation is faster if performed in the refine step as a numba-optimized loop over the image.

It is possible that filter_before could still be helpful in some use cases, especially for images that are not 2D and thus cannot use numba, so it should remain available.

Should it be:

  • False by default in all cases
  • False by default if the numba engine is used
  • False by default according to some more subtle criterion?
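
Whatever the default becomes, both behaviors would remain one keyword away (a sketch, assuming the keyword as described above):

    # with the numba engine, filtering inside refine is faster:
    f = tp.locate(frame, 11, minmass=200, engine='numba', filter_before=False)
    # without numba (e.g. for 3D images), pre-filtering can still pay off:
    f = tp.locate(volume, (5, 9, 9), minmass=200, filter_before=True)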

Frames only containing background

Hello all,
Thanks for trackpy!

Everything works great for me, but I have a little problem. When I have particles (deconvolved files from a widefield microscope), everything works great. However, if I use an image containing only background (a negative control, for example), I get a lot of particles detected all over the place, even with high mass. For example, with minmass=30000 in frames that contain particles, it detects even really small particles close to the background level with no problem whatsoever. But if there are no particles in the picture, even with minmass=1e5 I get 600 detected "particles" that are only background. I am using the software to detect two colours, one of which is supposed to disappear, so this sometimes bugs me. Thanks for any help, appreciated.
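
One way to see what is happening is to compare the mass distributions of a signal frame and a control frame (a diagnostic sketch, not a fix; signal_frame and blank_frame stand for your two images):

    f_signal = tp.locate(signal_frame, 11, minmass=30000)
    f_blank = tp.locate(blank_frame, 11, minmass=30000)
    print(f_signal['mass'].describe())
    print(f_blank['mass'].describe())
    # if the two distributions overlap, minmass alone cannot separate them;
    # filtering on size or eccentricity, or subtracting a background image, may help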

`import trackpy` imports matplotlib

While plots.py is in the news... one frustration I encountered when using trackpy with the IPython parallel computing facilities is that it always starts up matplotlib. If the (headless) cluster workers are running on a remote host, this requires a workaround, because there is no X server for DISPLAY to point at. It seems like moving import matplotlib into each function definition in plots.py would result in fewer net surprises.
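
Concretely, the deferred-import pattern would look like this (a sketch of the suggestion, not the current code):

    # plots.py: import matplotlib inside each function, so that merely
    # importing trackpy never touches the display machinery
    def annotate(features, image, ax=None):
        import matplotlib.pyplot as plt
        if ax is None:
            ax = plt.gca()
        ax.imshow(image, cmap='gray')
        ax.plot(features['x'], features['y'], 'r+')
        return ax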

Try out recent releases of numba

Numba releases have been coming fast. I was told at SciPy that they have achieved big performance gains. At some point we should check whether the upgrade breaks anything. Provisionally tagging 0.2.3.

Is numba worth the trouble if numpy can do most of the work?

This S.O. question, in particular the original poster's comment on the answer, sums up my feelings as I try to fit numba into my code. It seems like a lot of common numpy idioms are not supported.

Explicit loops, which seem to be what numba wants, make code less readable and obviously slower for users who might not have numba available. Disappointing.
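
For the record, here is the kind of rewrite in question, on a toy center-of-mass computation (a generic sketch; decorator spelling varies across numba versions):

    import numba
    import numpy as np

    def center_of_mass_numpy(img):
        # concise numpy idiom -- but possibly unsupported by numba
        y, x = np.mgrid[:img.shape[0], :img.shape[1]]
        m = img.sum()
        return (y * img).sum() / m, (x * img).sum() / m

    @numba.jit
    def center_of_mass_loops(img):
        # the explicit loops that numba compiles well
        m = cy = cx = 0.0
        for i in range(img.shape[0]):
            for j in range(img.shape[1]):
                m += img[i, j]
                cy += i * img[i, j]
                cx += j * img[i, j]
        return cy / m, cx / m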

On the other hand, the line profiler %lprun shows me that the refine function in feature.py takes 95% of the runtime during feature location, so making it faster is worth the trouble, even if it does take a lot of trouble.

Thoughts?

Let `setup.py` and sphinx share version info

#111 and #125 concern the version string displayed in all of the sphinx documentation page titles. It's confusing when it doesn't reflect the actual version of the code. Currently, it needs to be updated by hand whenever there's a release.

There ought to be a better way! I've created this issue just so we remember the problem exists; otherwise we'll only encounter it when we're trying to get a release out the door, which is of course a bad time to try to fix it.
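
One common approach is to make the package itself the single source of truth and have sphinx read it (a sketch; the file layout is an assumption):

    # docs/conf.py -- read the version from the installed package
    import trackpy
    version = trackpy.__version__  # short form shown in page titles
    release = trackpy.__version__  # full release string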

Use standard data sets for benchmarking and examples.

Low priority, but a note for later...

It would be nice to have examples at different densities to answer questions like, "Can our BTree mode ever beat KDTree, even though it's in pure Python?"

We could use a subset of the data from this 2012 particle tracking competition. They systematically cover a range of densities and signal-to-noise ratios. The results were published in Nature Methods. You can download the videos and the so-called "ground truth" trajectories against which entries were judged.

h/t to Ben Schuster for pointing me to the paper.

Implement interface to sm_core

I haven't spent more than five minutes looking at it, but I'm imagining some way of hooking the inputs and outputs of locate and link_df_iter into an SM_serial.

Roadmap: Enhancements to finish before or during SciPy conference

A tour of our walkthrough notebook alone could fill 15 minutes. There's no shortage of features to discuss. But I'd like to use the impending deadline and the conference's "Sprints session" as an excuse to finish work on some enhancements.

This is my list of ideas. If you have anything to offer or to add, please speak up.

  • Try to connect mpld3 to IPython widgets to build a fast, interactive, zoomable feature-filtering tool. No one has connected mpld3 to widgets yet, though it has been suggested. This will be of wide interest if it works. I have done preliminary work on this.
  • Display features or trajectories drawn onto HTML5 videos embedded in notebooks. I did the hard part -- the video -- for another project. I just have to connect it to trackpy.
  • Consider other ways that widgets and/or mpld3 could make life easier. (Suggestions?)
  • Consider making more use of the tools in scikit-image. Broadly, I want to have a clear understanding of how each of our imaging steps (e.g., bandpass) is related to and distinct from tools in scikit-image. That project will be a major focus at the meeting.

These almost definitely won't fit into the talk, but I'd like to finish them soon anyway.

  • Implement trajectory splitting/joining and a more sophisticated gap-closing (like "memory") that is globally optimized in time. See Jaqaman et al. This is about half-written.
  • Implement clustering of heterogeneous trajectories into homogeneous sub-ensembles using an F-statistic, following Valentine et al. This is written, but I want to test it on real data before I submit a PR.

Generate doi

It looks like the DOI generator does not support GitHub organization repos just yet. I sent an email to support, and then I found this:

https://twitter.com/ZENODO_ORG/status/467267473725992960

So much for the rush to do this in time for my paper. We can revisit once Zenodo supports organization repos. If someone sees news of this before I do, please bump this issue.

Subclass FramewiseData to read other formats.

Wish list:

  • GDF for compatibility with Crocker/Grier/Weeks
  • whatever format the Blair code uses (I have never touched that code myself, I have no idea)
  • non-pandas hdf5

I originally thought of CSV for newbies, but you can already do DataFrame.to_csv, and if your data doesn't fit in memory, you shouldn't be saving it to a CSV!

Others?
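
For anyone picking this up, a rough skeleton of such a subclass (the method names follow FramewiseData as I understand it; treat them, and any further abstract members, as assumptions):

    import trackpy as tp

    class GDFStore(tp.FramewiseData):
        """Hypothetical reader/writer for Crocker/Grier/Weeks GDF files."""

        def __init__(self, filename):
            self.filename = filename

        def put(self, df):
            ...  # append one frame's worth of features to the GDF file

        def get(self, frame_no):
            ...  # return a DataFrame of the features in one frame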

Publish a contribution policy

It would be helpful, sooner rather than later, to have guidelines for version numbering, PR etiquette, etc. on the Github site. Perhaps we can just open a project wiki and slap it up on the front page.

The actual content could be along the lines of Tom's email on 1/28. Most salient point: don't merge your own PR.

Better testing for predictor diagnostics

Tests for predict.py should check that the instrumented() class decorator works properly with each of the built-in predictor classes. This should involve a simple linking task, followed by cursory inspection of the output of dump().

Since the diagnostic output depends on the predictor class it was used on, checking the numerical correctness of that information would be difficult.
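
In outline, such a test might read as follows (a sketch; instrumented() and dump() per this issue's description, and frame_iter stands for a small synthetic linking task):

    import trackpy.predict as predict

    @predict.instrumented()
    class DiagVelocity(predict.NearestVelocityPredict):
        pass

    pred = DiagVelocity()
    list(pred.link_df_iter(frame_iter, search_range=5))  # simple linking task
    diag = pred.dump()
    assert len(diag) > 0  # cursory check that diagnostics were recorded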

Roadmap: Upcoming releases

A while back, @danielballan suggested a post-SciPy v0.2.2 release to clean things up. Before we do much more merging, I'm starting this issue to help us come to consensus on what goes into v0.2.2 and, hypothetically, v0.2.3 and v0.3.

Here's what I can think of:

v0.2.2

This release would just be a more perfect v0.2.1, that we can get out soon.

v0.2.3

  • "Unofficial" py3 support #122 (Involves a bunch of little changes to lots of files)
  • (Your PR here?)

Please share your ideas. Finally, are there any open or imminent PRs that should be targeted for v0.3?

Choose a better default colormap in plot_traj.

Thanks to Kristen Thyng's talk at SciPy 2014 (not yet online) I now know that cm.winter is the very worst colormap in matplotlib, perceptually speaking. It should not be the default for coloring-by-time in trackpy.plots.plot_traj.
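
In the meantime, a better map can be passed explicitly (a sketch; the cmap keyword is an assumption):

    import matplotlib.cm as cm
    tp.plot_traj(t, colorby='frame', cmap=cm.viridis)  # nearly anything beats cm.winter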

Rethink batch

Something to mull over, as we aim for sharing data between groups and reading GDF files...

batch takes a bunch of kwargs to manage output to a variable or a pandas HDFStore or a SQL database. This gets the job done and offers some flexibility, but it would be better to have an (abstract?) base class that can accept frames one at a time, save them in whatever format, and later regurgitate them one at a time for link_iter / link_df_iter.

In mr, I had a half-baked Trajectories class for this, which -- without looking -- sounds like it does the same thing as Nathan's BigTracks.

Thoughts and questions for your consideration:

  • Is this exactly what SM_serial is for? It seems hdf5-specific, but if I replace the word "group" with the word "frame," it could be more general. Is there a role for some intermediate class, or should I be thinking of using SM_serial directly?
  • I want the class to be general enough that it can output to a simple format like text, CSV, or even Excel. If we want to win any converts from MATLAB, or those who still do it by hand (!), we can't assume everyone will climb the learning curve to HDF5 or appreciate its benefits. Managing output formats is outside the scope of trackpy, but pandas does it all -- we just need to manage chunking data sets by frame.

numba 0.12: extremely slow `_refine_numba()`

You may have noticed that just doing import trackpy with numba 0.12 installed takes upwards of 30 seconds, and execution of _refine_numba() also seems to take forever. After extensive study I've concluded that getting _refine_numba() back up to full speed will require a major rewrite. It's pretty unbelievable that the new Anaconda release would include a version of one of their flagship projects that is such a drastic regression in performance, but there you go.

For future reference: All of the trouble seems to be with _refine_numba; the numba subnet code is thankfully safe because it does not create any arrays and does not need any external math functions (such as sqrt).

Our best bet is to stick to numba 0.11 until the numba team can bring their new releases back up to that standard.
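
For Anaconda users, pinning would look like this (an assumption about your environment):

    conda install numba=0.11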

test images? test video?

I can't find the test video referenced in the walkthrough. If only I could get ONE image with located particles, I would feel like I was on my way to being able to use this package.
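
In the meantime, you can synthesize a test image and confirm the pipeline end to end (a sketch using only numpy and trackpy):

    import numpy as np
    import trackpy as tp

    # one bright Gaussian spot on a noisy background
    img = np.random.poisson(10, (128, 128)).astype(np.float64)
    y, x = np.mgrid[:128, :128]
    img += 200 * np.exp(-((x - 64.0)**2 + (y - 64.0)**2) / (2 * 2.0**2))

    f = tp.locate(img, 11)
    print(f[['x', 'y', 'mass']])  # should report one feature near (64, 64)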

BUG: C_fallback_python is not working as intended

This only matters if the user does not have a Python-compatible C compiler.

The pure Python fallback for the one piece of C code in mr is not quite right: 5 of the feature tests are showing poor precision.
