lolab-msm / pydream Goto Github PK


MT-DREAM(ZS) algorithm for model optimization, calibration, selection

License: GNU General Public License v3.0

Python 100.00%


pydream's Issues

DreamPool implementation seems incompatible with Python3.8 multiprocess?

I have been having trouble getting PyDream to work with Python 3.8, and I think I have tracked down the root of the problem: the DreamPool class integrates with a multiprocess function that changed in the Python 3.8 version of multiprocess.

What?

With PyDream installed in a clean Python 3.8 environment, a very simple test script kept failing with this error:

  File "/virtual_envs/py38_dream/lib/python3.8/site-packages/multiprocess/pool.py", line 212, in __init__
    self._repopulate_pool()
  File "/virtual_envs/py38_dream/lib/python3.8/site-packages/multiprocess/pool.py", line 303, in _repopulate_pool
    return self._repopulate_pool_static(self._ctx, self.Process,
  File "/virtual_envs/py38_dream/lib/python3.8/site-packages/multiprocess/pool.py", line 319, in _repopulate_pool_static
    w = Process(ctx, target=worker,
  File "/PyDREAM/pydream/Dream.py", line 996, in __init__
    mp.Process.__init__(self, group, target, name, args, kwargs)
  File "virtual_envs/py38_dream/lib/python3.8/site-packages/multiprocess/process.py", line 82, in __init__
    assert group is None, 'group argument must be None for now'
AssertionError: group argument must be None for now

When doing the exact same thing (environment, pip install ., try script) with an earlier Python version, it works as expected.

Why?

In previous implementations of multiprocess, the Pool class created processes with just args and kwargs (see here).
In the Python 3.8 version, Pool passes an extra argument, ctx (see here).
In PyDream, the Process member of Pool gets overwritten by the custom class NoDaemonProcess. NoDaemonProcess does not expect to be passed the ctx variable and interprets it as the group argument. group needs to be None, hence the error.

Ideas

In order to be compatible with multiprocess, I assume we need to make NoDaemonProcess depend on ctx (which is a multiprocess.context.ForkContext object, which incidentally hides another layer of indirection with another Process subclass). Unfortunately, I'm not really understanding what the ctx variable does.

I tried simply catching it with

    def Process(self, ctx, *args, **kwds):
        return NoDaemonProcess(*args, **kwds)

and that seemed to work, but I'm not sure whether this has unwanted consequences.
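A commonly used alternative to discarding ctx is to hand the pool a custom context whose Process attribute is the non-daemonic class. The sketch below uses the standard-library multiprocessing module rather than multiprocess, on the assumption that both share the Python 3.8+ Pool layout; NoDaemonContext and NestablePool are hypothetical names, not part of PyDream.

```python
import multiprocessing
import multiprocessing.pool


class NoDaemonProcess(multiprocessing.Process):
    """Process whose daemon flag always reads False, so workers may spawn children."""

    @property
    def daemon(self):
        return False

    @daemon.setter
    def daemon(self, value):
        # Pool tries to set daemon=True on its workers; silently ignore that.
        pass


class NoDaemonContext(type(multiprocessing.get_context())):
    """Same as the platform's default context, but creating NoDaemonProcess workers."""
    Process = NoDaemonProcess


class NestablePool(multiprocessing.pool.Pool):
    """Pool that builds its workers through NoDaemonContext (hypothetical name)."""

    def __init__(self, *args, **kwargs):
        kwargs["context"] = NoDaemonContext()
        super().__init__(*args, **kwargs)
```

With this layout the pool itself calls ctx.Process(...), so no positional argument is misread as group.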

I install pydream via: pip install pydream but it is not working properly

When I try to use the LogWrapper, I get this message:

No module named 'pydream.LogWrapper'

FEM error

Dear all,

I am using pydream to optimize a function that uses a Finite Element Method (FEM) model as forward operator. Before using pydream with the FEM operator, I tested it with a much simpler analytical solution. It works well.

The problem that I encounter when running the FEM code is that the parameters are becoming negative. However, that should be impossible since I am working with the logarithm of the parameters. So, my wild guess is that the forward operators are somehow interfering with each other.

Has anyone experienced this sort of issue?

Kind regards,

Juan

Confusion between likelihood and log likelihood

One of the parameters of pydream.core.run_dream is called "likelihood", and the documentation specifies it to be "A user-defined likelihood function".

It seems though that the code is actually expecting a log-likelihood function, as seen in pydream.model.total_logp line 30:

loglike = self.likelihood(q0)

This can be easily solved by changing the parameter name to be log_likelihood and changing the documentation accordingly.
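To make the distinction concrete, here is a minimal sketch of a function that returns a log-likelihood, which is what the total_logp line above appears to expect. The Gaussian error model, the data values, and sigma are illustrative assumptions, not taken from PyDREAM.

```python
import math


def log_likelihood(params, data=(0.9, 1.1, 1.0), sigma=0.1):
    """Gaussian log-likelihood of `data` given a mean stored in params[0].

    Returns the sum of log densities, not the product of densities.
    """
    mu = params[0]
    return sum(
        -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)
        for x in data
    )
```

A function like this would be passed as the "likelihood" argument even though it returns a log value.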

Errors when running the mixturemodel example

Hello:
I have installed the PyDREAM package on Windows 10. When I run the mixturemodel example, I get the following error:

multiprocess.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "C:\ProgramData\Anaconda3\Lib\site-packages\multiprocess\pool.py", line 119, in worker
result = (True, func(*args, **kwds))
File "C:\ProgramData\Anaconda3\Lib\site-packages\multiprocess\pool.py", line 44, in mapstar
return list(map(*args))
File "C:\ProgramData\Anaconda3\Lib\site-packages\pydream\core.py", line 119, in _sample_dream
raise e
File "C:\ProgramData\Anaconda3\Lib\site-packages\pydream\core.py", line 105, in _sample_dream
sampled_params[iteration], log_prior , log_like = step_fxn(q0)
File "C:\ProgramData\Anaconda3\Lib\site-packages\pydream\Dream.py", line 406, in astep
raise e
File "C:\ProgramData\Anaconda3\Lib\site-packages\pydream\Dream.py", line 254, in astep
self.last_prior, self.last_like = self.logp(q0)
File "C:\ProgramData\Anaconda3\Lib\site-packages\pydream\model.py", line 31, in total_logp
loglike = self.likelihood(q0)
File "D:\Python_codes\PyDREAM-master\pydream\examples\mixturemodel\mixturemodel.py", line 39, in likelihood
log_lh = np.zeros((k))
NameError: name 'np' is not defined
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "D:\Python_codes\PyDREAM-master\pydream\examples\mixturemodel\mixturemodel.py", line 64, in
sampled_params, log_ps = run_dream([params], likelihood, niterations=niterations, nchains=nchains, start=starts, start_random=False, save_history=True, history_file='mixturemodel_seed.npy', multitry=5, parallel=False)
File "C:\ProgramData\Anaconda3\Lib\site-packages\pydream\core.py", line 74, in run_dream
returned_vals = pool.map(_sample_dream, args)
File "C:\ProgramData\Anaconda3\Lib\site-packages\multiprocess\pool.py", line 266, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "C:\ProgramData\Anaconda3\Lib\site-packages\multiprocess\pool.py", line 644, in get
raise self._value
NameError: name 'np' is not defined

It seems that the errors are due to the multiprocessing pool.map(). Maybe this function behaves differently on Linux and Windows, but I do not know how to fix it. Would you please help me with this?
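One workaround sometimes used for this kind of NameError, assuming the cause is that module-level imports of the script are not visible in the worker process on Windows (where processes are spawned rather than forked), is to import numpy inside the likelihood function itself. The cause is not confirmed in this thread, and the function below is an illustrative stand-in, not the mixturemodel likelihood.

```python
def likelihood(params):
    # Importing here guarantees that `np` is bound in whichever process
    # ends up executing this function.
    import numpy as np

    params = np.asarray(params, dtype=float)
    # Illustrative model: independent standard-normal log density per parameter.
    log_lh = -0.5 * np.log(2 * np.pi) - 0.5 * params ** 2
    return float(np.sum(log_lh))
```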

Thank you very much for your help
Best regards

While loop needed instead of if statements in generate_proposal_points function for hardboundaries reflections?

In the generate_proposal_points function, there are locations in the code that adjust (reflect) parameters that are outside of their prior ranges to be within their specified ranges (when hardboundaries=True, the default condition). The code used to do this relies on if statements to check the conditions. There is a note that "#Occasionally reflection will result in points still outside of boundaries", and then reflection is only applied once more with another set of if statements. But, I don't think applying a reflection once more guarantees that the parameters will be in their prior boundaries. It seems like a while loop is instead needed for this whole section of code to continually reflect the points until the parameters are in their boundaries.

Line example: 728 - 744 of Dream.py

I have not tested this thought, but I wanted to report this perceived issue.
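The idea can be sketched for a single scalar parameter as follows. This is not the code in generate_proposal_points; the function name and the max_reflections safety cap are assumptions made for illustration.

```python
def reflect_into_bounds(x, lower, upper, max_reflections=1000):
    """Reflect x off the violated boundary until lower <= x <= upper."""
    if lower >= upper:
        raise ValueError("lower must be strictly less than upper")
    for _ in range(max_reflections):
        if x < lower:
            x = 2 * lower - x  # mirror about the lower bound
        elif x > upper:
            x = 2 * upper - x  # mirror about the upper bound
        else:
            return x
    raise RuntimeError("value still out of bounds after max_reflections")
```

For example, with bounds [0, 1] the value 2.5 reflects to -0.5 after one pass, which is still outside the range, and only reaches 0.5 after a second pass. Larger overshoots need more passes, which is why a fixed number of if statements cannot guarantee an in-range result.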

what is log_ps ?

A stupid question: how does log_ps relate to the number of identical samples in the output?

If Ni counts the number (frequency) of identical samples for each unique parameter set i, I expect that exp(log_ps) is the probability of finding sample i in the output, and that it is equal to Ni (once normalized, and after discarding the first half of the samples as burn-in). Is that correct?

If the two do not match, does it mean that convergence has not been reached? I have to admit that my GR stats are close to 1.2, but some are above.

A related question is how to select the "best" parameter set. Should I take the one with max(log_ps) or the most frequent one in the sampled set?

history_thin does not affect returned samples and log_ps in core.run_dream()

It may be by design, but history_thin does not thin the results returned by run_dream(). As a result, history_thin does not ameliorate memory issues that affect large runs, i.e., large niterations.

I was able to solve the problem by hacking core._sample_dream (or at least I think I solved it). I pass the variable step_instance.history_thin as a fifth argument and modified the array allocation of sampled_params and log_ps accordingly. I also ensure that sampled_params and log_ps are updated only every history_thin iterations.
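The approach described above might look roughly like the following sketch, which allocates storage only for the kept iterations. It is a simplified stand-in for core._sample_dream, not the actual patch; sample_with_thinning and its arguments are hypothetical.

```python
import math


def sample_with_thinning(step_fxn, q0, niterations, thin=1):
    """Run niterations steps but store only every `thin`-th state."""
    n_kept = math.ceil(niterations / thin)
    kept = [None] * n_kept  # storage sized for the thinned output only
    q = q0
    for iteration in range(niterations):
        q = step_fxn(q)
        if iteration % thin == 0:
            kept[iteration // thin] = q
    return kept
```

With thin=3 and 10 iterations, only the states from iterations 0, 3, 6 and 9 are stored, so memory scales with niterations / thin rather than niterations.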

Make Priors Optional

Sometimes, it is convenient to put priors into the likelihood function directly (for instance, if scipy distributions don't have the function type needed, or for multidimensional priors). It would be convenient to allow prior to be None and remove computation of priors from the core algorithm.

Invalid filename for chain history

Hi,

I'm getting an error because the filename generated for the chain history file contains a colon (:), as part of the timestamp.

My code and error message (for one thread) are added below.

cheers,
Maarten

from pydream.core import run_dream
from pydream.parameters import FlatParam
import numpy as np
from scipy.stats import norm, uniform

mu = 0
var = 1

def Latin_hypercube(minn, maxn, N):
    y = np.random.rand(N, len(minn))
    x = np.zeros((N, len(minn)))
    
    for j in range(len(minn)):
        idx = np.random.permutation(N)
        P = (idx - y[:,j])/N
        x[:,j] = minn[j] + P * (maxn[j] - minn[j])
    
    return x

if __name__ == '__main__':
    history_seed = Latin_hypercube([-1],[1],50)
    np.save('history_seed.npy',history_seed)
    param = FlatParam(np.array(0))
    sampled_params, log_ps = run_dream(param, likelihood, 
              start = np.array(0), 
              start_random = False, 
              niterations = 1000,
              verbose = False,
              history_file = 'history_seed.npy')

multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\multiprocessing\pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\multiprocessing\pool.py", line 44, in mapstar
    return list(map(*args))
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\pydream\core.py", line 127, in _sample_dream
    raise e
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\pydream\core.py", line 114, in _sample_dream
    sampled_params[iteration], log_prior , log_like = step_fxn(q0)
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\pydream\Dream.py", line 421, in astep
    raise e
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\pydream\Dream.py", line 362, in astep
    self.record_history(self.nseedchains, self.total_var_dimension, q_new, self.len_history)
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\pydream\Dream.py", line 945, in record_history
    self.save_history_to_disc(np.frombuffer(Dream_shared_vars.history.get_obj()), prefix)
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\pydream\Dream.py", line 959, in save_history_to_disc
    np.save(filename, history)
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\numpy\lib\npyio.py", line 517, in save
    fid = open(file, "wb")
OSError: [Errno 22] Invalid argument: '2020_04_26_18:58:29_DREAM_chain_history.npy'
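Windows does not allow colons in filenames, which is why the timestamped name is rejected. A minimal sketch of a colon-free naming scheme is shown below; history_filename is a hypothetical helper, not the function in Dream.py, and the "_DREAM_chain_history.npy" suffix is copied from the error message above.

```python
from datetime import datetime


def history_filename(prefix=None, now=None):
    """Build a chain-history filename that is valid on Windows as well."""
    if prefix is None:
        now = now or datetime.now()
        # Use '-' between time fields instead of ':'.
        prefix = now.strftime("%Y_%m_%d_%H-%M-%S")
    return f"{prefix}_DREAM_chain_history.npy"
```

Passing an explicit model_name to run_dream, as other reports on this page do, may also avoid the timestamp prefix altogether, though that is an assumption about how the prefix is chosen.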

Cannot interpret DEpairs as a list

Dear developers,

I am not able to pass DEpairs as a list (DEpairs=[1,2,3]): it throws the following error, suggesting that the DEpairs argument needs to be an integer. However, it worked perfectly fine in the Robertson_no_pysb example, where I also passed DEpairs as a list. Any help will be appreciated.

The code looks like:

run_dream(sampled_parameter_names, likelihood, niterations=niterations, nchains=nchains,
multitry=False, gamma_levels=4, adapt_gamma=True, history_thin=1, model_name='S_14_5chain', verbose=True,
DEpairs=[1,2,3],hardboundaries =True, start_random=False, start=starts)

Error message:

File "dreams_run.py", line 74, in
DEpairs=[1,2,3],hardboundaries =True, start_random=False, start=starts)
File "/home/7s2/.local/lib/python3.6/site-packages/pydream/core.py", line 64, in run_dream
step_instance = Dream(model=model, variables=parameters, verbose=verbose, mp_context=mp_context, **kwargs)
File "/home/7s2/.local/lib/python3.6/site-packages/pydream/Dream.py", line 150, in __init__
self.DEpairs = np.linspace(1, DEpairs, num=DEpairs, dtype=int) #This is delta in original Matlab code
File "<array_function internals>", line 6, in linspace
File "/home/7s2/.local/lib/python3.6/site-packages/numpy/core/function_base.py", line 121, in linspace
.format(type(num)))

Thanks
Saubhagya
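The traceback shows DEpairs being used as the num argument of np.linspace, which must be an integer. A hedged sketch of input handling that would accept either form follows; normalize_de_pairs is a hypothetical helper and not part of PyDREAM.

```python
def normalize_de_pairs(DEpairs):
    """Return the list of candidate pair counts for an int or a sequence.

    An int n becomes [1, 2, ..., n], matching
    np.linspace(1, n, num=n, dtype=int); a sequence is used as given.
    """
    if isinstance(DEpairs, int):
        if DEpairs < 1:
            raise ValueError("DEpairs must be at least 1")
        return list(range(1, DEpairs + 1))
    return [int(d) for d in DEpairs]
```

Until something like this exists in the installed version, passing DEpairs=3 should request the same set of values as [1, 2, 3].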

ValueError arises when using uniform priors

When I attempt to use a list of SampledParam objects with uniform distributions, I get the following error from inside the Dream object:

ValueError: zero-dimensional arrays cannot be concatenated

I initialized the parameters as seen in https://gist.github.com/ryants/b0da3eea4d6d515183fca8476916365b

absence of adapt_gamma raises error

When running the run_dream command without gamma adaptation, restarting the chains raises the following error:

Traceback (most recent call last):
  File "/scratch/anaconda2/envs/py3/lib/python3.6/site-packages/pydream/Dream.py", line 372, in astep
    if self.adapt_gamma and self.iter > 10 and self.iter < self.crossover_burnin and not np.any(np.array(self.gamma)==1.0) and not run_snooker:
AttributeError: 'Dream' object has no attribute 'adapt_gamma'

frequent dumps to file

I'd like to do relatively frequent dumps of the chain and log-likelihood to file such that the crossover burnin continues past multiple dumps. Would it work to set crossover_burnin=1 for multiple run_dream commands, followed by crossover_burnin=0 for subsequent run_dream commands once the appropriate number of iterations has been completed?

Let me know if this isn't clear, and I'll provide a code sample.

Also, if you'd like me to direct general questions somewhere other than this repository, let me know

Thanks!

Initial draws from prior are identical

When not specifying the "start" argument in the run_dream function, the initial parameters for each chain are identical. I assume they should be distinct?

Running chains in parallel

I am trying to link PyDREAM with my own program. The program reads the parameters from an input file and then writes an output file to disk. I need to read data from the output file and then calculate the likelihood. The problem is that PyDREAM always runs chains in parallel, which causes problems when parallel runs try to write the output values to the same file. Is it possible to disable the parallel run even if more than one chain is used, say nchains=5?

Hope I have made myself clear. Thank you.
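If the chains must stay parallel, one way to avoid file collisions is to give every likelihood call its own scratch directory. The sketch below assumes the external program can be told where to read and write; run_external_model is a hypothetical stand-in for that program, and the Gaussian misfit is only illustrative.

```python
import os
import tempfile


def run_external_model(input_path, output_path):
    """Hypothetical stand-in for the external program: writes the sum of the inputs."""
    with open(input_path) as f:
        values = [float(line) for line in f if line.strip()]
    with open(output_path, "w") as f:
        f.write(f"{sum(values)}\n")


def likelihood(params, observed=3.0, sigma=1.0):
    # A fresh directory per call means concurrent chains never share files.
    with tempfile.TemporaryDirectory(prefix=f"chain_{os.getpid()}_") as workdir:
        input_path = os.path.join(workdir, "input.txt")
        output_path = os.path.join(workdir, "output.txt")
        with open(input_path, "w") as f:
            f.writelines(f"{p}\n" for p in params)
        run_external_model(input_path, output_path)
        with open(output_path) as f:
            simulated = float(f.read())
    # Gaussian log-likelihood up to an additive constant.
    return -0.5 * ((simulated - observed) / sigma) ** 2
```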

discrepancy between stored values and calculated values

After I run PyDREAM and output the parameters and log-posteriors to a .npy file, I load the .npy files assuming that entry i of the parameter file corresponds to entry i of the log-posteriors file. However when I recalculate the log-posterior value manually, the values are different (often by multiple orders of magnitude).

In the files I have, the entries in the parameter file and the log-posterior file are consistent up to some arbitrary point (i.e. I can successfully recalculate the posterior), after which they become inconsistent (i.e. the log-likelihood blows up).

Any ideas what may be happening? My model (written in PySB) and PyDREAM file are attached.

rab_dream.zip

I'm using Python 3.6 on a Linux workstation and the master branches of PySB and PyDREAM (as of about two weeks ago)

Getting error AttributeError: 'FlatParam' object has no attribute 'dist'

I am getting this error despite the fact that I'm not putting any attribute 'dist' into the FlatParam function, just the test_value argument which it is supposed to have. Below is a simplified toy model which also creates this error. It is a random effects model, and initially I thought the problem was somehow due to the LLB line and its inclusion in the return line, but that can be removed and the error will still occur.

from pydream.parameters import FlatParam
from pydream.parameters import SampledParam
from pydream.core import run_dream
from pydream.convergence import Gelman_Rubin
import numpy as np
#from pysb.integrate import Solver
from scipy.stats import norm
from scipy.stats import uniform
from dill import dump_session
x=norm.rvs(size=10)
xb=np.repeat(x,10)
y=xb+norm.rvs(size=100)
explan=np.repeat(range(0,9),10)
pysb_sampled_parameter_names=['B','sigmaB','sigma']
sigma = SampledParam(uniform,loc=0,scale=1000)
sigmaB = SampledParam(uniform,loc=0,scale=1000)
Blst = np.linspace(0,0,num=10)
B = FlatParam(test_value=Blst)
sampled_parameter_names=[B,sigmaB,sigma]
def likFH(parameter_vector):
    param_dict = {pname: pvalue for pname, pvalue in zip(pysb_sampled_parameter_names, parameter_vector)}
    LLB = norm.logpdf(param_dict['B'],0,param_dict['sigmaB'])
    Mu = param_dict['B'][explan]
    LL1 = norm.logpdf(y,Mu,param_dict['sigma'])
    return (np.sum(LL1)+np.sum(LLB))

#Run model
sampled_params, log_ps = run_dream(sampled_parameter_names, likFH, model_name='ToyRE_5chain', verbose=True)

run_dream fails if start keyword is given a 2D numpy array instead of a list of 1D arrays

The code compares to None with the operators == and != instead of the correct (and more efficient) Pythonic idiom, "is" or "is not". As a result, an if clause fails on a numpy array, because == and != operate elementwise, while "is" and "is not" return a scalar bool.

The lines that require modifications are

regis@gatito ~/git/CO-Ori-VLTI % for file in /usr/local/lib/python3.7/dist-packages/pydream/*.py; do echo $file; egrep -n '[!=]=\s*None' $file; done 
/usr/local/lib/python3.7/dist-packages/pydream/convergence.py
/usr/local/lib/python3.7/dist-packages/pydream/core.py
43:        if start == None:
218:                #On first iteration without starting points this will fail because q0 == None
284:    if step_instance.crossover_burnin == None:
287:    if start_pt != None:
/usr/local/lib/python3.7/dist-packages/pydream/Dream.py
158:        if self.nseedchains == None:
227:            if last_loglike != None:
253:            if self.last_logp == None:
421:        if self.nchains == None:
425:        if self.chain_n == None:
/usr/local/lib/python3.7/dist-packages/pydream/Dream_shared_vars.py
/usr/local/lib/python3.7/dist-packages/pydream/__init__.py
/usr/local/lib/python3.7/dist-packages/pydream/model.py
/usr/local/lib/python3.7/dist-packages/pydream/parameters.py
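The difference between the two idioms can be reproduced in a few lines. The function names below are illustrative only and do not appear in PyDREAM.

```python
import numpy as np


def uses_equality(start):
    # `start == None` is evaluated elementwise for arrays, so the `if`
    # raises ValueError when the array has more than one element.
    return "random" if start == None else "given"  # noqa: E711


def uses_identity(start):
    # `is None` always yields a single bool, whatever the type of `start`.
    return "random" if start is None else "given"
```

Calling uses_equality(np.zeros((3, 2))) raises "ValueError: The truth value of an array with more than one element is ambiguous", while uses_identity handles both None and arrays.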
