nel-lab / mesmerize-core
High level pandas-based API for batch analysis of Calcium Imaging data using CaImAn
License: Other
When removing a batch item, there should be an option (on by default, with an option to opt out) to delete all files with the same uuid.
Should convert the input tiff to a memmap.
Component evaluation should be done here within the backend, not in mesmerize-napari. Write a cnmf extension that wraps cnmf_obj.estimates.filter_components.
Should it store this as a new batch item in the dataframe, or update an existing item and just update the eval_kwargs of the params? @ArjunPutcha thoughts?
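A minimal sketch of what such an extension method could look like, assuming only that the loaded cnmf object exposes estimates.filter_components; the class and method names below are hypothetical, not the final API:

```python
class CNMFEvalSketch:
    """Hypothetical extension wrapping caiman's component evaluation."""

    def __init__(self, cnmf_obj):
        # cnmf_obj: any loaded CNMF results object exposing
        # estimates.filter_components(**kwargs)
        self.cnmf_obj = cnmf_obj

    def run_eval(self, **eval_kwargs):
        # delegate evaluation to caiman within the backend, so the
        # frontend (mesmerize-napari) never touches the estimates directly
        self.cnmf_obj.estimates.filter_components(**eval_kwargs)
        return self.cnmf_obj.estimates
```

Whether run_eval should also persist the modified estimates back to the hdf5 file depends on the new-item-vs-update question above.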
While trying to update the environment via mamba:
mamba env update -n mesmerize-core --file environment.yml
ERROR:
Encountered problems while solving:
Checking the installed python version with python --version gives 3.10.4. I then tried conda install python=3.10.0, but rerunning the env update yielded the same error. Next, conda install python=3.9 made the env update work.
Should I just continue with 3.9, or use something else?
args can be specific ranges of indices to use for F0; everything in between is dfof w.r.t. these ranges.
list of (fo_start, fo_end, (F_start, F_end))
get_chunked_dfof(temporal: np.ndarray, f0_ranges: List[np.ndarray], f_ranges: List[np.ndarray]) -> np.ndarray
temporal: 2d array of traces
f0_ranges: vstack of [f0_start, f0_end] for each temporal trace; each trace has 1 array, a list of arrays for all traces
f_ranges: vstack of [f_start, f_end] for each temporal trace; each trace has 1 array, a list of arrays for all traces
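A rough sketch of how this signature could be implemented, assuming each per-trace range array holds half-open [start, end) frame indices and frames outside every F range are left as NaN:

```python
import numpy as np
from typing import List

def get_chunked_dfof(
    temporal: np.ndarray,
    f0_ranges: List[np.ndarray],
    f_ranges: List[np.ndarray],
) -> np.ndarray:
    # temporal: 2d array of traces, shape (n_traces, n_frames)
    # f0_ranges[i] / f_ranges[i]: (n_chunks, 2) arrays of [start, end)
    # frame indices for trace i
    out = np.full(temporal.shape, np.nan)
    for i, trace in enumerate(temporal):
        for (f0_start, f0_end), (f_start, f_end) in zip(f0_ranges[i], f_ranges[i]):
            # baseline for this chunk, then dF/F0 over the paired F range
            f0 = trace[f0_start:f0_end].mean()
            out[i, f_start:f_end] = (trace[f_start:f_end] - f0) / f0
    return out
```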
label with experimental warning
datajoint within mesmerize-core as pandas extensions, or datajoint should make calls to mesmerize-core pandas extensions within their table make() methods. https://github.com/datajoint/element-calcium-imaging
Remove use_open_file_dialog, use_save_file_dialog, use_open_dir_dialog, and present_exceptions. All Qt code should be removed from mesmerize-core except for QProcess (maybe remove that too?).
@ArjunPutcha can you please confirm that none of these decorator functions are used in mesmerize-napari?
pandas.Series instances use the name attribute to store the index of the Series in the parent DataFrame.
@clewis7 posting this because it just popped into my head, don't worry about it until you're back
The return_copy kwarg is irrelevant for determining a cache hit/miss; therefore it should be dropped from the comparison.
something like:
raise UnsuccessfulItem(df.iloc[0].outputs["traceback"])
Running Windows 11 x64, running caiman gives a couple of errors:
I can confirm the file is there: os.path.isfile('C:/Users/gjb326/caiman_data/mesmerize-core-batch/f1f56d3d-53b9-4e80-a92c-a34a635b28da.runfile.ps1') returns True, but only when the '.ps1' is appended.
But os.path.isfile(r'C:\Users\..\..a.runfile.ps1') and os.path.isfile('C:\\Users\\..\\..a.runfile.ps1') return False.
---------------------------------Begin error------------------------------------------
OSError Traceback (most recent call last)
File c:\users\gjb326\mesmerize-core\mesmerize_core\caiman_extensions\common.py:248, in CaimanSeriesExtensions.run(self, backend, callbacks_finished, callback_std_out)
247 try:
--> 248 self.process = getattr(self, f"run{backend}")(
249 runfile, callbacks_finished, callback_std_out
250 )
251 except:
File c:\users\gjb326\mesmerize-core\mesmerize_core\caiman_extensions\common.py:184, in CaimanSeriesExtensions._run_subprocess(self, runfile_path, callbacks_finished, callback_std_out)
182 parent_path = self._series.paths.resolve(self._series.input_movie_path).parent
--> 184 self.process = Popen(runfile_path, cwd=parent_path)
185 return self.process
File ~\Anaconda3\envs\mesmerize-core\lib\subprocess.py:951, in Popen.init(self, args, bufsize, executable, stdin, stdout, stderr, preexec_fn, close_fds, shell, cwd, env, universal_newlines, startupinfo, creationflags, restore_signals, start_new_session, pass_fds, user, group, extra_groups, encoding, errors, text, umask)
948 self.stderr = io.TextIOWrapper(self.stderr,
949 encoding=encoding, errors=errors)
--> 951 self._execute_child(args, executable, preexec_fn, close_fds,
952 pass_fds, cwd, env,
953 startupinfo, creationflags, shell,
954 p2cread, p2cwrite,
955 c2pread, c2pwrite,
956 errread, errwrite,
957 restore_signals,
958 gid, gids, uid, umask,
959 start_new_session)
960 except:
961 # Cleanup if the child failed starting.
File ~\Anaconda3\envs\mesmerize-core\lib\subprocess.py:1420, in Popen._execute_child(self, args, executable, preexec_fn, close_fds, pass_fds, cwd, env, startupinfo, creationflags, shell, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite, unused_restore_signals, unused_gid, unused_gids, unused_uid, unused_umask, unused_start_new_session)
1419 try:
-> 1420 hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
1421 # no special security
1422 None, None,
1423 int(not close_fds),
1424 creationflags,
1425 env,
1426 cwd,
1427 startupinfo)
1428 finally:
1429 # Child is launched. Close the parent's copy of those pipe
1430 # handles that only the child should have open. You need
(...)
1433 # pipe will not close when the child process exits and the
1434 # ReadFile will hang.
OSError: [WinError 193] %1 is not a valid Win32 application
During handling of the above exception, another exception occurred:
FileNotFoundError Traceback (most recent call last)
Input In [5], in <cell line: 3>()
1 # run the first "batch item"
2 # this will run in a subprocess by default
----> 3 process = df.iloc[0].caiman.run()
4 process.wait()
File c:\users\gjb326\mesmerize-core\mesmerize_core\caiman_extensions\common.py:252, in CaimanSeriesExtensions.run(self, backend, callbacks_finished, callback_std_out)
248 self.process = getattr(self, f"run{backend}")(
249 runfile, callbacks_finished, callback_std_out
250 )
251 except:
--> 252 with open(runfile_path, "r") as f:
253 raise ValueError(f.read())
255 return self.process
FileNotFoundError: [Errno 2] No such file or directory: 'C:\Users\gjb326\caiman_data\mesmerize-core-batch\f1f56d3d-53b9-4e80-a92c-a34a635b28da.runfile'
This is extremely bizarre and has only started occurring on Windows recently; nothing has changed in the cache implementation since this test started failing.
https://github.com/nel-lab/mesmerize-core/runs/8042869907?check_suite_focus=true#step:5:2145
I might just disable the cache on windows and fix it in a later release
All data that relates to an item should be in a dir named using the uuid for that item. - Done in #51
add_item: params could be chunked by step. - Done in #50; algo params get put in the "main" key, allowing room to add more keys for other things like the Ain matrix, etc.
caiman.run() needs to take only backend as an arg; the rest are optional kwargs. _run_subprocess only takes runfile_path as an arg. Remove the callbacks args from everything; they should be passed as kwargs only within the napari stuff for QProcess and therefore removed from mesmerize-core.
add comments as default column
cnmf: ixs_frames for getting a single frame @clewis7
general: It should be possible to subclass CaimanSeriesExtensions in mesmerize-napari and append the _run_qprocess() method there; this way mesmerize-core gets rid of Qt completely!
callbacks_finished should be an optional argument in common.py caiman.run() extension
the output paths for cnmf max, mean, std projection are stored as absolute paths, not relative (like for the hdf5 and corr img paths)
If only a single frame index is requested for get_reconstructed_movie() or get_residuals(), the returned array should be of shape [x_pixels, y_pixels] and not [1, x_pixels, y_pixels].
Something like this at the end should do it:
if np.diff(ixs_frames).item() == 1:
    return residuals[0]
return residuals
@clewis7 wait until your current stuff is done and merged into master; this can be done afterwards. See mesmerize-viz for ready-to-use widgets.
There should be an extension for getting the dfof vals. It can take kwargs to pass to the caiman dfof function. If dfof has already been calculated with the given kwargs, it just returns those (perhaps just using the cache). If new kwargs are provided, it will use the caiman function to calculate them and then return the dfof vals.
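A hedged sketch of the kwargs-keyed reuse described above; the store, the function name, and the trivial baseline computation are all stand-ins (the real extension would call the caiman dfof function and probably use the existing cache):

```python
import numpy as np

# hypothetical in-memory store keyed by (item uuid, kwargs)
_dfof_store = {}

def get_dfof(item_uuid: str, temporal: np.ndarray, **dfof_kwargs) -> np.ndarray:
    # kwargs are sorted so the same arguments always produce the same key
    key = (item_uuid, tuple(sorted(dfof_kwargs.items())))
    if key in _dfof_store:
        # already calculated with these kwargs: return the stored values
        return _dfof_store[key]
    # stand-in computation; a real extension would call caiman here
    f0 = temporal.min(axis=1, keepdims=True)
    _dfof_store[key] = (temporal - f0) / f0
    return _dfof_store[key]
```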
Annotate CaimanDataFrameExtension.process with Any and a comment annotation stating it is one of QtCore.QProcess or subprocess.Popen.
QApplication: requirements.txt and environment.yml
Not all keys within kwargs are checked:
https://github.com/nel-lab/mesmerize-core/blob/master/mesmerize_core/caiman_extensions/cache.py#L125
This should instead call _check_args_equality():
https://github.com/nel-lab/mesmerize-core/blob/master/mesmerize_core/caiman_extensions/cache.py#L23
@_component_indices_parser
@cache.invalidate
run_eval()
get_good_components() before and after eval
get_bad_components() before and after eval
get_detrend_dfof() with diff args, such as dfof and just detrend
get_chunked_dfof()
get_rcm() with different temporal_components arg
get_rcm(), get_rcb() and get_residuals() w.r.t. the new #140
Add these to an existing test function that produces a lot of rows:
get_children()
get_parent()
save_to_disk() with safety checks
remove_item()
Makes it easier for downstream random-access handling of large movies
change behavior of the component_indices argument for CNMF extensions:
None: uses cnmf.estimates.idx_components, i.e. good components
"good": same as None, uses cnmf.estimates.idx_components
"bad": uses cnmf.estimates.idx_components_bad
"all": uses np.arange(cnmf.estimates.A.shape[1])
Can you increase the difference that's checked here? The tests sometimes fail because the GitHub CI pipeline computers aren't fast enough. Maybe something like 0.05? https://github.com/nel-lab/mesmerize-core/runs/7752247571?check_suite_focus=true#step:5:1652
Ain matrix, binary or sparse, masks to seed components
get_input_movie(), see #52
list of (fo_start, fo_end, (F_start, F_end))
class so that a downsampled average movie can be utilized like other arrays by making a class implementing __getitem__. Maybe subclass np.ndarray?
roughly:
class DSAvgMovie:
    def __init__(self, mcorr_memmap: np.ndarray, window_size: int):
        self.mcorr_memmap = mcorr_memmap
        self.window_size = window_size

    def __getitem__(self, ix: int):
        w = self.window_size
        # clamp the window start so small ix doesn't wrap around to the end
        return np.nanmean(self.mcorr_memmap[max(ix - w, 0):ix + w], axis=0)
OS - Windows 11
Hey guys, while installing mesmerize-core for development I got an error while updating the env with the environment file:
(mesmerize-core) C:\Users\gjb326\mesmerize-core>mamba env update -n mesmerize-core --file environment.yml
Traceback (most recent call last):
  File "C:\Users\gjb326\Anaconda3\Scripts\mamba-script.py", line 10, in <module>
    sys.exit(main())
  File "C:\Users\gjb326\Anaconda3\lib\site-packages\mamba\mamba.py", line 848, in main
    from conda.common.compat import ensure_text_type, init_std_stream_encoding
ImportError: cannot import name 'init_std_stream_encoding' from 'conda.common.compat' (C:\Users\gjb326\Anaconda3\lib\site-packages\conda\common\compat.py)
args:
idx_components -> component_indices
ixs_frames -> frame_indices
methods:
get_spatial_masks() -> get_masks()
get_spatial_contours() -> get_contours()
get_temporal_components() -> get_temporal()
get_reconstructed_movie() -> get_rcm()
get_reconstructed_background() -> get_rcb()
get_correlation_image() -> get_corr_image()
Stop using the kushalkolar branch for saving the mcorr memmaps with a specific filename; use this once the PR is merged into the caiman master branch.
Otherwise old outputs remain in RAM and new outputs can't be reloaded.
get_good_components() - returns indices of good components
get_bad_components() - returns indices of bad components
evaluate_components(<eval kwargs>) - performs eval, modifies df.iloc[i].params["eval"] in place, and modifies the hdf5 file on disk. Replaces the data in the hdf5 file. @clewis7 this would require invalidation of all cache entries with the uuid of this batch item.
make an example script where a user can add items to a batch dataframe from existing caiman-processed data.
algo: specified by the user
input_movie_path: the path to the movie used by the algo, if not a memmap convert it to an appropriate memmap
params: user either passes a dict manually or use the hdf5 file if algo is cnmf
output: dict containing paths to the output files; if the user does not provide the cn_image, projections, etc., calculate them
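A hedged sketch of constructing such a row; the exact column schema and output-dict keys below are assumptions made for illustration, not the final format:

```python
import uuid
import pandas as pd

def make_row_from_existing(algo: str, input_movie_path: str, params: dict, outputs: dict) -> dict:
    # build one batch-dataframe row from existing caiman-processed data;
    # a fresh uuid ties the row to its files on disk
    return {
        "algo": algo,
        "input_movie_path": input_movie_path,
        "params": params,
        "outputs": outputs,
        "uuid": str(uuid.uuid4()),
    }

df = pd.DataFrame([make_row_from_existing(
    "cnmf", "movie.tif", {"main": {"K": 10}}, {"cnmf-hdf5-path": "out.hdf5"})])
```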
Currently the CNMF extension get_reconstructed_movie() doesn't take any ixs_components arg. This should be implemented so that the reconstructed movie is, for example, made only using the good components.
Because:
CAIMAN_TEMP to be implemented for mcorr memmaps
h5py version issues; wait for a newer release of caiman with the latest h5py and the CAIMAN_TEMP
See cnmf.py, CNMFExtensions._get_spatial_contours
Passing np.ndarray for ixs_components yields the following error: "TypeError: unhashable type: 'numpy.ndarray'"
Allow passing custom temporal traces to get_reconstructed_movie() as a kwarg to use instead of estimates.C. Useful for making the reconstructed movie using dF/F0, detrended traces, z-scored traces, etc.
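For reference, the reconstruction is the spatial footprints multiplied by the temporal components; a minimal numpy sketch of swapping in a custom temporal array (function name and argument layout are assumptions):

```python
import numpy as np

def reconstruct(A: np.ndarray, temporal: np.ndarray, dims: tuple) -> np.ndarray:
    # A: (n_pixels, n_components) spatial footprints
    # temporal: (n_components, n_frames), e.g. estimates.C, or a custom
    # array such as dF/F0 or z-scored traces
    movie = (A @ temporal).T  # (n_frames, n_pixels)
    return movie.reshape((temporal.shape[1], *dims))
```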
cnmf extensions should not rely on any arguments that require accessing the hdf5 output file, since this defeats the whole purpose. For example, get_spatial_contour_coors() should not require idx_components; it should be a kwarg with a default value of None. If None, it uses cnmf.estimates.idx_components after loading the output within the function itself. If the user provides idx_components, it uses that.
ideas:
pandas API can be used to read from a SQL db
df = pd.read_sql("SELECT * FROM my_table", connector)
df_cached = df.query()
df.commit() # write back to db
Daniel suggested looking into MongoDB since large files will need to be lazy-loaded
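The read/write cycle above can be sketched with sqlite3 as a stand-in backend; note that df.commit() is not an existing pandas API, so to_sql plays that role here:

```python
import sqlite3
import pandas as pd

conn = sqlite3.connect(":memory:")  # stand-in for the eventual db backend
df = pd.DataFrame({"algo": ["mcorr"], "uuid": ["abc"]})
df.to_sql("batch", conn, index=False)               # "commit" to the db
df_back = pd.read_sql("SELECT * FROM batch", conn)  # read it back
```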
Create a single cache, probably using a class decorator, for CNMF outputs (contours, the hdf5 file etc.) because they can take a few seconds to load sometimes.
Set up auto-publish to PyPI and conda-forge; we're ready for v0.1 once pandas v1.5 is out.