contextlab / timecorr Goto Github PK

View Code? Open in Web Editor NEW

38.0 9.0 11.0 131.05 MB

Estimate dynamic high-order correlations in multivariate timeseries data

License: MIT License

Makefile 0.02% Python 25.73% Jupyter Notebook 73.84% Dockerfile 0.42%

timeseries network-dynamics multivariate-analysis dimensionality-reduction graph-theory

timecorr's People

Contributors

Stargazers

Watchers

Forkers

lucywowen htwangtw kauttoj f1lm1 snorkeldepth mafrasiabi chrjlong vishalbelsare paxtonfitzpatrick mccallw23 dal3006

timecorr's Issues

Provide a synthesis of what you did, what you found, and why it matters. Start by summarizing the methods and results. Then discuss how the results fit with the prior literature and/or advance our knowledge. For example:

Did you find some interesting results where decoding accuracy varies in a consistent way by level, across datasets? Did decoding accuracy vary with cognitive salience in the Pie Man dataset?
What worked? What didn't? For example, under what circumstances can we extract the "true" dynamic correlations from data, and under what circumstances does the approach break down?
Are there other domains to which your work might be relevant?
What are the broader implications of what you did/found?

Preprint to describe method

Write this stuff up!

Moving Forward

Hi Professor Manning and Emily!

Hope your weekend is off to a great start!

I have just created a new branch TomThesis with the code I used at the time of my thesis submission. Please take a look and let me know if you run into any problems.

My daily schedule has finally stabilized somewhat and I am ready to dive into this project again! (I have to move again in 2 weeks to a new place, but that should be a swift transition). So I am creating this issue to discuss what we should work on moving forward. As discussed at my thesis defence, there are three major parts that we can work on at the moment:

Improving Timecorr with HMM and Laplacian Distribution. We know that multi-subject brain correlation patterns in the datasets we have explored consists of random activities with peaks---time points of high brain patterns similarity between all subjects--spread throughout the time series. We can try to adapt the HMM method from Baldassano's paper to try to identify and isolate these peaks with timecorr to produce between recovery of brain activities. In addition, we also identified that using Laplacian Distribution in place of Gaussian distribution might help us gain more temporal accuracy, so we should also adapt that into Timecorr (should be an easy implementation).
Improving our level up process. As we identified that brain activities do not follow a normal Gaussian distribution, we have decided to switch from PCA to other factorization methods. What do you think we should use to replace PCA? Leveling up beyond the second level could also be tricky and might require another process. What do you think?
Machine Learning implementation. I know we gave up on this path early on in my research due to my failure in implementing Gaussian Process, but should we bring it back? I could try to implementing Gaussian Process for correlation calculation again and/or try out a neural network implementation for factorization (got to think about this one a little bit).

Let me know what you think about these ideas!

Best,
Tom

get laplace weights working

something seems messed up (maybe due to ignoring values close to zero)

identify common function across lab projects and put them in a single repo

Benchmark and optimize code

Using a synthetic dataset, plot performance (timing) curves showing how long it takes to:

compute timecorr for the "within" subjects analysis (i.e. individually for each subject)
compute timecorr for the "across" subjects analysis (i.e. the timecorr version of ISFC)
compute levelup

see how this changes as a function of:

number of subjects
number of voxels
number of timepoints

We want to show that the code can scale up reasonably well to something like 10000 voxels or features, 500 timepoints, and 100 subjects. (Or, at minimum, 1000 voxels/features.)

One thing we haven't tried yet is using threads or multiprocesses to parallelize the computations-- e.g. the computations for each individual subject could run in parallel threads/processes rather than being carried out in serial. In addition, the computations for each timepoint could be carried out in a separate thread/process rather than being carried out in serial.

So there are two potential levels of parallelization that could substantially speed up the code:

per-subject
per-timepoint

Analyses of fMRI datasets

Generate figures showing decoding accuracy as a function of "level" for several fMRI datasets. Suggestions:

Pie Man story listening dataset. Separately analyze three conditions (intact, paragraph, word, and rest) and make decoding-by-level figures for each condition. In our prior work we've found that decoding accuracy decreases as you move through those four conditions-- intact > paragraph > word > rest. (Details and intuitions are in the linked-to-paper.)
Sherlock movie watching dataset. Show decoding accuracy by level.
Raiders of the Lost Ark movie watching dataset. Show decoding accuracy by level.

Here's what I'm thinking for the figures

Start with the raw data and split the data (or for Pie Man, the data in each condition) into two groups. Compute the average within groups, and then use correlation-based decoding (like in timepoint_decoder) to compute decoding accuracy. Repeat 100 times (with a different random group assignment each time) to get a distribution of decoding accuracies. (This will be the "level 0" decoding accuracy.)
Now split the data into 2 groups and compute the "across" timecorr for each group. Use one group to decode the other (using correlation-based decoding) and compute the decoding accuracy. Repeat 100 times (with different group assignments each time) to get a distribution of decoding accuracies. (This will be the "level 1" decoding accuracy.)
Also compute the "within" timecorr for each individual subject. Now use hyp.tools.reduce to reduce the "level 1" data down to v voxels (where v is a parameter we will probably want to explore...but I'd start with something like 1000). Then split the subjects into two groups and use "across" timecorr on the level 1 data to get "level 2" decoding accuracy (again, use 100 random group assignments to get a distribution of accuracies).
For each subsequent level, L you can keep track of the "within" timecorr values for level L-1, and then to get the level L decoding accuracies use "across" timecorr for two randomly assigned groups. To level up, compute the "within" timecorr values for level L (using the L-1 "within" timecorr values), and then reduce to v features using hyp.tools.reduce).
I'd like to see decoding accuracy (with error bars) as a function of level, for levels 0 through 10.

Some key questions:

How does decoding accuracy change with level? Does this depend on our choice of v?
Is the pattern we observe consistent across these different datasets?
Using the Pie Man dataset to explore, does the pattern we observe change with the level of cognitive salience (i.e. the richness of the input stimulus-- intact > paragraph > word > rest)?

something's up with the weights functions

I'm getting the following error when I set weights_function=laplace_weights:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-111-5a253239e182> in <module>()
----> 1 recovered_covs_laplace = tc.timecorr(data, mode='across', weights_function=laplace_weights, weights_params=laplace_params)

/usr/local/lib/python3.6/site-packages/timecorr/timecorr.py in timecorr(data, weights_function, weights_params, mode, cfun)
     98 
     99     if (mode == 'across') or (type(data) != list) or (len(data) == 1):
--> 100         return cfun(data, weights)
    101     elif mode == 'within':
    102         return list(map(lambda x: cfun(x, weights), data))

/usr/local/lib/python3.6/site-packages/timecorr/helpers.py in isfc(data, timepoint_weights)
    132     else:
    133         subject_weights = None
--> 134     return wisfc(data, timepoint_weights, subject_weights=subject_weights)
    135 
    136 

/usr/local/lib/python3.6/site-packages/timecorr/helpers.py in wisfc(data, timepoint_weights, subject_weights)
    109                               axis=2, weights=subject_weights[s, other_inds])
    110 
--> 111             next = wcorr(a, b, timepoint_weights)
    112             for t in np.arange(T):
    113                 x = next[:, :, t]

/usr/local/lib/python3.6/site-packages/timecorr/helpers.py in wcorr(a, b, weights, tol)
     58             mb, varb, diffs_b = weighted_mean_var_diffs(b, weights[:, t])
     59 
---> 60         alpha = np.dot(diffs_a.T, diffs_b)
     61         beta = np.dot(vara[:, np.newaxis], varb[np.newaxis, :])
     62 

ValueError: shapes (100,714) and (691,100) not aligned: 714 (dim 1) != 691 (dim 0)

Thesis writeup: results

First, we should do a thorough exploration of how well we can recover "ground truth" parameters from several illustrative synthetic datasets (for example the blocked and ramping datasets we've been exploring). We'll also need good benchmark datasets for the "across" version of timecorr. For example, we could generate analogs of the the blocked and ramping datasets, but add a little noise to each "subject's" data to generate a simulated multi-subject dataset.

Second, we should have some performance benchmarks showing how well the approach scales as a function of dataset size, along the lines of what's described in this issue.

Third, we should report results on real fMRI datasets, along the lines of what's described in this issue. Specifically: how does decoding accuracy change as a function of level?

Provide enough description to clearly explain the results, including motivations for doing each analysis. Walk the reader through each analysis to provide clear intuitive explanations of why each analysis was done and what we found.

Turn jeremy's notebook into synthetic data functions

Thesis writeup: introduction/literature review

Review the relevant literature (e.g. other papers that have discussed dynamic correlations and/or that propose related ideas/methods). Some places to start:

Uri Hasson's papers
Dani Bassett's papers
Steve Peterson's papers
This recent Nature Neuroscience paper
Do a Google Scholar search and see what other people have done (focus on papers that are either recent and look cool, are in good journals (i.e. journals with high impact factors), or that are highly cited)

The goal is to come up with a comprehensive overview that synthesizes (a) why looking at dynamic correlations in brain data is interesting/important, (b) what other people have already done in this domain, and (c) why timecorr and levelup are good new approaches to take. In other words: why were we exploring this problem in the first place, and why is what we did a logical next step?

code cleanup tasks

the code has gotten a bit messy. some tasks:

remove no-longer-relevant files
clean up formatting throughout helpers.py
it doesn't look like isfc actually makes use of multiprocessing-- you open a Pool but then don't add jobs to it...???? (All actual computations happen in the map step.) So I'm now wondering why you saw a speedup with multiprocessing. Is the code on GitHub perhaps not up-to-date?
I thought you were going to parallelize per-timepoint correlations in wcorr, but the code doesn't reflect that.

Note: if you haven't been running your benchmarks using multiprocessing, you'll need to re-run them.

Implement different width parameters

-allow users to change Gaussian/kernel width parameters
-explore datasets using searchlight, changing radii, changing width parameters

ModuleNotFoundError: No module named '_shared'

Hi all,

I am encountering this error:

File "/Users/StephenSatterthwaite/anaconda/lib/python3.6/site-packages/timecorr/timecorr.py", line 5, in
from _shared.helpers import isfc, wcorr, sliding_window, sliding_window_isfc
ModuleNotFoundError: No module named '_shared'

when I attempt to import timecorr into a Python environment.

I am running Python 3.6.1, on OSX.

I have installed timecorr by downloading the package and running "pip install ."

I cannot figure out how to get Python to recognize the _shared folder.

I have tried to fix this by changing my PATH using sys.path.append to the "_shared" folder.

If anyone has any suggestions or workarounds, that would be very helpful.

Thanks,

Steve

Documentation: readthedocs/Sphinx website and extended README file

Create a readthedocs page for Timecorr that includes a Sphinx website with the API and also expand the readme file to include installation instructions and some basic usage instructions. Examples from the lab:

HyperTools readthedocs and readme.md
Quail readthedocs and readme.md

clean up docstrings

the descriptions of all functions need to be cleaned up for clarity, grammar, and typos

Tutorials and working examples

Also: synthetic data and/or sample datasets

in levelup make V a parameter

Instead of assuming the user wants to reduce the data back to the original shape, allow the user to specify their own size (possibly dependent on level). Another idea would be to have a threshold variance parameter, which tells the alg how much variance you are will to compromise in teh dimensionality reduction. could be accomplished with the hyp.describe function

test decoding using "ramping" synthetic dataset

Generate a multi-subject dataset as described in this issue

Then apply the timepoint decoding method (correlation-based) by dividing the dataset randomly into two groups and testing how well (accuracy) we can recover timepoints from each group, from synthetic data.

Thesis writeup: detailed methods

Write a detailed description of the methods underlying timecorr and levelup. Specifically:

All relevant math, and algorithms, explained succinctly with intuitive "plain English" descriptions about each step
Also explain relevant features of the timecorr toolbox that implement each of the main functions, with short code snippets (this format has worked well for our lab in the past, and you could consider using it as a rough guide)

You need enough detail for someone to (a) replicate our work from scratch, if they wanted to, and/or (b) get started using our toolbox if they were interested in our method but didn't want to write their own code.

accuracy calculation

When updating to python3 accuracy_calculation.py repeatedly produced an error, the file was commented out in its entirety so that the rest of the code could be updated

API documentation updates

The documentation for timepoint_decoder is out of sync. Please update:

the documentation lists a var keyword argument, but this isn't coded into the function (it should be)
the documentation says that a results object is returned containing accuracy, error, and rank performance information, but actually only the accuracy is returned.

add kalman filter-based predictions

for each timepoint, generate an estimate of the next timepoint. these predictions can be made in "raw" data space (level 0), or at a higher level and then mapped back to level 0 using the inverse PCA transforms.

mutable parameters passed to gaussian_weights and laplace weights

default parameters passed to python functions should be immutable. probably does not matter for this particular case but wanted to document in case a bug pops up related to this.

http://docs.python-guide.org/en/latest/writing/gotchas/

write up math

weighted correlation
weighting functions
levelup

Floor division

In the futurize branch in decoding_analysis, timecorr_job, and timecorr.py there are commented instances of "old division" or floor division being used. I am wondering if this rounding down is what we wanted?

testing: synthetic data figures

Generate figures for two synthetic datasets:

dataset 1: data drawn in 10 blocks of 100 samples, each with a different correlation matrix.
dataset 2: every sample gets a different correlation matrix, but give the dataset structure by gradually fading from one correlation matrix to another (the same analysis you had done with the time-varying covariance version, but with correlation matrices)

then use timecorr to recover the correlations. then generate four plots for each dataset and upload here in response to this issue:

same plots as you were making before-- correlations between the recovered correlation matrix and the actual correlation matrices. for the blocked version, show 10 lines (one per correlation matrix) and show that each block’s line fades in for that block, and then fades out when the block is done. for the ramping version, show 2 lines (one for the first timepoint’s correlation matrix and one for the last timepoint’s), and show that the two lines cross. purpose: show that we recover the temporal structure of the dataset.
alternative version of plot 1: instead of correlation, show the mean squared error between the recovered matrices and true matrices (the plot should have the same format — same number of lines — but the y-axis is now showing MSE instead of correlation). purpose: show that we are recovering the magnitudes of the correlations, not just the shapes of the correlation matrices.
just a single line-- for each timepoint, compute the correlation between the recovered correlation matrix at that timepoint and then true correlation matrix (for that timepoint only). the main difference from plot 1 will be for the ramping dataset. rather than showing crossing lines due to the recovered correlation matrices getting gradually more/less similar to the first and last correlation matrices, you should instead see a flat line showing the per-timepoint recovery.
MSE version of plot 3. (same plot, but show MSE rather than correlation on the y-axis)

Pool Multi Process

MaybeEncodingError: Error sending result: '[array(<map object at 0x7fe7ca0cdba8>, dtype=object), array(<map object at 0x7fe7ca0cdc50>, dtype=object)]'. Reason: 'AttributeError("Can't pickle local object 'isfc_helper..isfc_timepoint_helper'",)'

Thinking that the formatting of the answer is incorrect, should be in an array format
lines 89 in timecorr.py 81 in _shared/helpers, 260 & 608 multiprocessing/pool

Docker container for development

Implement and test the "levelup" function

Given a matrix or a list of matrices (each T by F dimensional), levelup performs two operations in sequence:

Use timecorr to compute moment-by-moment correlations
Use hyp.tools.reduce to reduce the correlations back to T by F dimensions

If mode="within" (default), call timecorr with the "within" flag. If mode="across", call timecorr with the "across" flag.

To test this function, compare the levelup answers for 2 levels (using timecorr) with the corresponding answers on a synthetic dataset (e.g. the ramping dataset) using sliding windows (shifting the sliding window answers along the x-axis so that the time sequences align).

Detect state changes using HMMs

Use brainiak implementation of event segmentation

Create Tests

Make tests throughout code following the CDL tutorial outline (https://github.com/ContextLab/CDL-tutorials/tree/master/testing)

pull multiprocessing into master

The multiprocessing performed best in your benchmark runs, right? If so, please initiate a pull request of the multiprocessing branch into the master branch, and delete all no-longer-relevant branches.

timecorr function returning correlations greater than 1

I discovered this issue when trying to apply Fisher Z-transform on the recovered correlation, and further verified this issue by printing out the maximum value in the correlation matrices. The values range between -2.5 and 2.5. Need to verify if this issue only exists for ISFC or also for single subject timecorr

error importing `timecorr` function

i installed timecorr using (within the timecorr directory)
pip install --upgrade .

then i tried (in python) importing the timecorr function using
from timecorr import timecorr as tc

I got the following error:
File "/Users/jmanning/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/timecorr/_shared/helpers.py", line 2
cimport numpy as np
^
SyntaxError: invalid syntax

Implement and check ISFC ("across" mode for timecorr)

To test the ISFC implementation, I propose starting with one of the synthetic datasets as a baseline (e.g. the "ramping" dataset). Then generate simulated data from multiple subjects (5?) by adding zero-mean Gaussian noise to the activation values from the dataset (each subject gets different noise added). You can generate several datasets like this (3?) with different variance parameters for the Gaussian noise that you add.

Now use "across" timecorr to compute the moment-by-moment ISFC matrices. Correlated these with the "ground truth" moment-by-moment correlation matrices from the original synthetic dataset (before adding any Gaussian noise).

For each of the three synthetic datasets (each with a different noise variance), plot for each timepoint the correlations between the ISFC matrix and the corresponding timepoint's correlation matrix in the synthetic data.

Key insight to test: the correlations between the recovered ISFC matrices and the true correlation matrices should decrease as the amount of noise (variance) increases.

Update Smooth Function

Update the Smooth function, below, to be functional with timecorr:

def smooth(w, windowsize):
kernel = np.ones(windowsize)
w /= kernel.sum()
x = np.zeros([w.shape[0] - windowsize + 1, w.shape[1]])
for i in range(0, w.shape[1]):
x[:, i] = np.convolve(kernel, w[:, i], mode='valid')
return x

Float object can not be interpreted as an integer

For the past bit I have been having issues with this section of the code:

I have tried removing the backlash and using np.divide (the line then turns green instead of red but both lines are still called in the error), also tried to make all of the variables in correlations_vector integers by places int() around them but that didn't do anything. Will still be working on this just wanted to crowdsources ideas and give an update.

more efficient computations with kernels or convolutions

There may be a very quick way to implement the main timecorr algorithm using some sort of kernel method or convolutions (using the Gaussian weight function as the kernel). Need to think more about this...and maybe talk to a kernel person.

Futurize 2

Can not delete branch

Synthetic Data

Create a synthetic dataset to test code

bugs found in first run through of TomThesis branch

timecorr_job-->line 42, map has no len()
timecorr--> no module named 'main._shared", main not a package
findmissing->not enough values to unpack--> expected 4 got 0 (might have been a problem because I didn't upload data?)
decoding_analysis--> attempted relative import with no known parent package
accuracycalculation-->attempted relative import with no known parent package
accuracy calc--> list index out of range

Examples missing datasets

Many of the examples are missing datasets

make some example datasets for debugging

Multi-level Mixture Analysis

I'm also flagging an alternative version of this analysis here for possible exploration, as a different way of exploring decoding accuracy by level. It could be that, even if data at higher levels didn't by itself lead to better decoding accuracy, mixing in information at higher levels could still improve the decoding. In other words, a mix of level 0 and level 1 might be better than level 0 alone, even if level 0 out-performs level 1.

To test this more formally, I suggest the following:

split the dataset into 2 parts, A and B. further subdivide each of those into two-- so we have A1, A2, B1, and B2. compute the level 0, 1, 2, ..., 10 representations of the data for each of those 4 subset of the data (using the "across" version). So for each subset, for each level, we should have one timepoints-by-features matrix.
using A1 and A2, find the vector of mixing proportions that maximizes decoding accuracy (in other words, find the mix of level 0, level 1, level 2, etc. produces the best decoding accuracy).
using this mixing proportion from A1 and A2, compute decoding accuracy for B1/B2.
repeat the above 100 times (with different random assignments of A1/A2/B1/B2 each time) to get a distribution of decoding accuracies.
Then plot:

the average optimal mixing proportions across those 100 runs (as a bar graph or violin plot)
the average decoding accuracy for B1/B2 across those 100 runs, plotted as a bar graph or violin plot. Also compare the decoding accuracy to the decoding accuracy obtained using only level 0 data.

After thinking through this process a little bit more, I have the following questions that I would like to clarify:

What is the exact argument we are trying to make with this analysis? What combination of data at each level will give us the best decoding accuracy? Why do we need to find this combination, as it does not tell us anything about the quality of the dataset nor about the level of noise at each timepoint...I am having some difficulty grasping the goal of this analysis
What will be the target of our optimization? I have been thinking about the equation we are trying to optimize, and since it's linear sum of the different levels, wouldn't gradient descent converge toward putting the most weight on the level with the highest decoding accuracy to begin with? Or maybe the model will converge toward the weight distribution that would make all the subjects look the most similar? In either case, I am not sure what kind of conclusion we can extrapolate from the results....
Will the optimal weight distribution for one dataset be the same for all datasets? I highly doubt this as in our optimization we really have no way of enforcing generalization...

In the end, I am just a little bit confused as to what purpose this branch will contribute to...could you give me a better picture of your vision? Thanks!