Giter VIP home page Giter VIP logo

pysindy's People

Contributors

akaptano avatar billtubbs avatar briandesilva avatar jacob-stevens-haas avatar jcallaham avatar kopytjuk avatar kpchamp avatar ludgerpaehler avatar matteoettam09 avatar mikkelbue avatar ohjeah avatar oliviaz0826 avatar scliao47 avatar tarengorman avatar thomascoconnor avatar thomasillo avatar tuelwer avatar yb6599 avatar znicolaou avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

pysindy's Issues

Complexity property test

Sorry, I didn't see your comment before making the most recent commit. I relaxed the complexity test constraint a bit more.

I was able to reproduce the failed test locally by using the same parameters: n_samples=1124, n_features=27, n_informative=5, random_state=6. It looks like LASSO somehow finds 4 informative features, while the other heavily regularized models find 6. This seems like a bit of a fluke; for different random states this doesn't occur.

Maybe we could remove LASSO from this test? Or we could stick with the quick fix I made.

In any case, the bug isn't caused by this PR, so I think it's okay to accept it.

Originally posted by @briandesilva in #71 (comment)

check_is_fitted() missing 1 required positional argument: 'attributes'

### Reproducing code example: ### I tried run the example of lorenz system as it is, given in pysindy documentation, but I am getting this error and this is true for all the other examples as well.

image

import pysindy
<< your code here >>

Error message:

PySINDy/Python version information:

constrain on states

Is your feature request related to a problem? Please describe.

Hello,
I have started to learn pysindy and am going over the examples now.
One question that I have is, is it possible to impose a constrain on the states?
As an example, I would like to get a SIR - model from a data, but in general this model has a constraint: S+ I + R = 1.

Thank you

Complexity property for optimizers

We need to implement the complexity property for the optimizers. Previously we were erroneously using a method that computed the complexity assuming STLSQ was used, but each method has different ways of computing complexity. For now I am just having the different complexity methods all throw NotImplemetedError.

Once we implement complexity for each optimization class we will also need to update the appropriate unit tests. I have marked the unit tests that will need to be updated with TODO.

Have SINDyOptimizer test for optimizer attributes

The SINDyOptimizer object assumes that the optimizer that's passed in has a number of methods and properties, but we don't actually check for any of them. I think it would be better if we had a series of checks in the __init__ function so that users would see error messages as soon as they passed in their custom optimizers rather than when they call SINDy.fit. It would be nice to just check if the optimizer inherits from sklearn.linear_model._base.LinearModel, but they are discouraging users from importing from the _* modules.

Methods to check for:

  • fit
  • predict

Attributes to check for:

  • fit_intercept
  • normalize
  • coef_
  • intercept_

get_params update for PolynomialLibrary

PolynomialLibrary inherits from sklearn.preprocessing.PolynomialFeatures, granting it a get_parameters method, but it has extra parameters beyond those in sklearn's PolynomialFeatures (e.g. include_interaction). The class should be updated to reflect this difference.

Check requirements files

  • The requirements.txt should only contain third party dependencies used in the main package.
  • The requirements-dev.txt should contain everything needed for a development environment, including tests and examples.

[BUG] import fails due to ModuleNotFoundError in sklearn.preprocessing

Import of sklearn.preprocessing._csr_polynomial_expansion fails.

Reproducing code example:

import pysindy

Error message:

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-3-640311a760f6> in <module>
----> 1 import pysindy

c:\users\jordan\appdata\local\programs\python\python37\lib\site-packages\pysindy\__init__.py in <module>
     11 from pysindy.differentiation import *
     12 from pysindy.optimizers import *
---> 13 from pysindy.feature_library import *
     14 from pysindy.utils import *

c:\users\jordan\appdata\local\programs\python\python37\lib\site-packages\pysindy\feature_library\__init__.py in <module>
      2 from .fourier_library import FourierLibrary
      3 from .identity_library import IdentityLibrary
----> 4 from .polynomial_library import PolynomialLibrary

c:\users\jordan\appdata\local\programs\python\python37\lib\site-packages\pysindy\feature_library\polynomial_library.py in <module>
      6 from scipy import sparse
      7 from sklearn.preprocessing import PolynomialFeatures
----> 8 from sklearn.preprocessing._csr_polynomial_expansion import _csr_polynomial_expansion
      9 from sklearn.utils import check_array
     10 from sklearn.utils.validation import check_is_fitted

ModuleNotFoundError: No module named 'sklearn.preprocessing._csr_polynomial_expansion'

Version info:

import sys, sklearn; print(sklearn.__version__, sys.version) returns 0.20.3 3.7.2 (tags/v3.7.2:9a3ffc0492, Dec 23 2018, 23:09:28) [MSC v.1916 64 bit (AMD64)]

Published paper additions to the examples directory

Hey y'all, this is Alan from Steve's group. I am working on SINDy models in the field of plasma physics and have been using pySINDy. I am interested in contributing to the examples/ directory of the pySINDy with a jupyter notebook + all my other files unrelated to SINDy (4-5 python files for plasma-specific things which are compatible with the current pySINDy python requirements) which would reproduce the paper I am hoping to publish soon. This could be a cool thing for others to join as well, as we could accumulate a large bin of examples from disparate fields that reproduce published works.

One minor caveat is that my analysis relies on some potentially large files to read in. However, I know that github has several ways to link to dropbox or other large extraneous files so this is probably a surmountable problem.

Anyways, if you are interested in this idea as well, maybe we should meet soon to discuss a bit how we should formalize going about this process. I'm also happy to get more involved generally with the maintenance and oversight of this awesome github repo.

Cheers,
Alan

Control vector U with unkown function[BUG]

Hello,

I have a question if you could please help me. I tried to use the pysindy with control on my own dataset. However, I have only the U time serie without any function. Model.fit works but when i try to simulate it throws an error that an x_new is above the interpolation range? Have you encountered this error before and de you any idea how to solve it ?

Thanks in advance for your feedback

[BUG] model.score does not return correct values when using default r2_score

As per scikit-learn r2_score documentation, the function expects the arguments to be y_true followed by y_pred, however, in your implementation the order is reversed. Since R^2 is non-symmetric, it will produce a different score compared to correct order of arguments. This is hardly noticeable when predictions closely match the gt (score is close to 1.0), but will produce incorrect results otherwise.

Reproducing code example:

import numpy as np

from scipy.integrate import odeint
from sklearn.metrics import r2_score

import pysindy as ps

# Control input
def u_fun(t):
    return np.column_stack([np.sin(2 * t), t ** 2])

# Lorenz equations with control input
def lorenz_control(z, t):
    u = u_fun(t)
    return [
        10 * (z[1] - z[0]) + u[0, 0] ** 2,
        z[0] * (28 - z[2]) - z[1],
        z[0] * z[1] - 8 / 3 * z[2] - u[0, 1],
    ]

# Generate measurement data
dt = .002

t_train = np.arange(0, 10, dt)
x0_train = [-8, 8, 27]
x_train = odeint(lorenz_control, x0_train, t_train)
u_train = u_fun(t_train)

# Instantiate and fit the SINDYc model
model = ps.SINDy()
model.fit(x_train, u=u_train, t=dt)

print(model.score(x_train, u=u_train, t=dt))

x_dot_test_predicted = model.predict(x_train, u=u_train)
x_dot_test_computed = model.differentiate(x_train, t=dt)

print(r2_score(x_dot_test_computed, x_dot_test_predicted))

This should print 0.999999960658854 and 0.999999960658851 (sorry for not providing a better example, I would have to give away our internal dataset to reproduce the issue, but in our case the score changes from 0.41 to 0.05)

PySINDy/Python version information:

1.0.1 3.6.10 |Anaconda, Inc.| (default, Mar 25 2020, 23:51:54)
[GCC 7.3.0]

Matrices not rendered in sphinx documentation

The notebook 2_original_paper.ipynb contains multiple matrices typeset in LaTeX, but they are not rendered in the sphinx documentation.

The matrices were made using the bmatrix environment. From what I have read online, mathjax should be compatible with bmatrix, and we list mathjax as an extension in our configuration file, so I'm having trouble figuring out where the problem stems from. I couldn't find anything relevant in the nbexamples documentation either. @Ohjeah, do you have any insights into what the problem might be?

iSindy implementation

Hello,

I was looking for a python implementation of isindy, and found this repository. There there is a deprecation message pointing to here. However I was unable to find the isindy implementation here, there is also no mention of it in the documentation and it doesn't seem to be implemented at all.

Is that a work in progress and there is an intent to bring that implementation to this package as well? That deprecation warning in the previous package, am I safe to assume that that implementation is robust and I can use that package and get reliable results?

I need to use iSindy in a project I am working on, when I saw the deprecation warning I decided to implement my own version based on the matlab code available (that also helps better understanding the algorithm, in case some adaptation is required). That went well to the first tests but due to the interface I need to implement, testing has been a little trickier than I first assumed. If that implementation is robust, I could just wrap my interface around it and make it work.

Thank you,
Paulo

Error of big-sparsity parameter

  • I have x_train array of 801 x 7 (7 state variables, 801 time instances) , and same is shape of Derivate Dx_train array. I am facing following error of big-sparsity parameter error even with so small value of threshold, and also related to ill-conditioned matrix. I am new to pySINDy. Requesting help to understand the error in more detail.
    image

Optimizers with linear equality constraints

Hello again! Before discovering this wonderful repo, I was using Loiseau's hard-thresholded least-squares algorithm which allows for linear equality constraints. I am wondering if you have any suggestions about a good way to incorporate this into pysindy, and I can go ahead and implement that. I am having a bit of trouble currently with the sklearn syntax, and the syntax of the base optimizer class doesn't look like it supports constraint matrices, so I wanted to ask y'all if there is a nice solution here before I resort to bulldozing all the nice code y'all wrote.

Best,
Alan

Create option for column interdependence

In the current implementation, SINDy models are fit using multi-output regression (i.e. each column of the coefficient matrix is fit separately). Some advanced features that we may implement as extensions will require interdependence among columns. Examples include trimming corrupt data points and imposing constraints on the coefficients.

[BUG]AttributeError: module 'pytest' has no attribute 'lazy_fixture'

What version of pytest are you using? I get the following error trying to run the tests.

$ pytest
============================ test session starts =============================
platform darwin -- Python 3.7.7, pytest-3.10.1, py-1.8.1, pluggy-0.13.1
rootdir: /Users/billtubbs/pysindy, inifile:
collected 24 items / 4 errors                                                

=================================== ERRORS ===================================
___________________ ERROR collecting test/test_pysindy.py ____________________
test/test_pysindy.py:88: in <module>
    pytest.lazy_fixture("data_1d"),
E   AttributeError: module 'pytest' has no attribute 'lazy_fixture'

I tried going back as far as version 3.10.1 and can't get it working in any version (latest version on conda is 5.4.1).
Thanks.

I think it's related to this open issue with pytest.

[BUG] Cannot provide controls u for cross validation with scikit-learn

When you provide a control variable u to fit SINDy with cross validation from scikit-learn, there is an error:

TypeError: Model was fit using control variables, so u is required.

It seems like this error occurs in the predict and/or score functions: https://pysindy.readthedocs.io/en/latest/_modules/pysindy/pysindy.html

Possibly, the controls u, which are provided to the fit function for cross validation, are not passed on to these other functions (predict and/or score) during cross validation?

Reproducing code example:

This code example is taken from here: https://pysindy.readthedocs.io/en/latest/examples/4_scikit_learn_compatibility.html#cross-validation

However, I have added 1 constant control variable u to show that the code fails when I pass u as argument: search.fit(x_train, u = u_train). I have highlighted the relevant rows with the comment "RELEVANT ROW".

from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import TimeSeriesSplit

import numpy as np
from scipy.integrate import odeint
import pysindy as ps

def lorenz(z, t):
    return [
        10 * (z[1] - z[0]),
        z[0] * (28 - z[2]) - z[1],
        z[0] * z[1] - (8 / 3) * z[2]
    ]

# Generate training data
dt = .002

t_train = np.arange(0, 10, dt)
x0_train = [-8, 8, 27]
x_train = odeint(lorenz, x0_train, t_train)

# Make up control input
u_train = np.ones(x_train.shape[0]) # RELEVANT ROW

# Fit model
model = ps.SINDy(t_default=dt)

param_grid = {
    "optimizer__threshold": [0.001, 0.01, 0.1],
    "optimizer__alpha": [0.01, 0.05, 0.1],
    "feature_library": [ps.PolynomialLibrary(), ps.FourierLibrary()],
    "differentiation_method__order": [1, 2]
}

search = GridSearchCV(
    model,
    param_grid,
    cv=TimeSeriesSplit(n_splits=5)
)
search.fit(x_train, u = u_train) # RELEVANT ROW

print("Best parameters:", search.best_params_)
search.best_estimator_.print()

Error message:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-76-6c5d752be13a> in <module>
     38     cv=TimeSeriesSplit(n_splits=5)
     39 )
---> 40 search.fit(x_train, u = u_train) # RELEVANT ROW
     41 
     42 print("Best parameters:", search.best_params_)

~/.cache/pypoetry/virtualenvs/powerup-process-control-VY7sLJ9r-py3.8/lib/python3.8/site-packages/sklearn/utils/validation.py in inner_f(*args, **kwargs)
     70                           FutureWarning)
     71         kwargs.update({k: arg for k, arg in zip(sig.parameters, args)})
---> 72         return f(**kwargs)
     73     return inner_f
     74 

~/.cache/pypoetry/virtualenvs/powerup-process-control-VY7sLJ9r-py3.8/lib/python3.8/site-packages/sklearn/model_selection/_search.py in fit(self, X, y, groups, **fit_params)
    734                 return results
    735 
--> 736             self._run_search(evaluate_candidates)
    737 
    738         # For multi-metric evaluation, store the best_index_, best_params_ and

~/.cache/pypoetry/virtualenvs/powerup-process-control-VY7sLJ9r-py3.8/lib/python3.8/site-packages/sklearn/model_selection/_search.py in _run_search(self, evaluate_candidates)
   1186     def _run_search(self, evaluate_candidates):
   1187         """Search all candidates in param_grid"""
-> 1188         evaluate_candidates(ParameterGrid(self.param_grid))
   1189 
   1190 

~/.cache/pypoetry/virtualenvs/powerup-process-control-VY7sLJ9r-py3.8/lib/python3.8/site-packages/sklearn/model_selection/_search.py in evaluate_candidates(candidate_params)
    706                               n_splits, n_candidates, n_candidates * n_splits))
    707 
--> 708                 out = parallel(delayed(_fit_and_score)(clone(base_estimator),
    709                                                        X, y,
    710                                                        train=train, test=test,

~/.cache/pypoetry/virtualenvs/powerup-process-control-VY7sLJ9r-py3.8/lib/python3.8/site-packages/joblib/parallel.py in __call__(self, iterable)
   1027             # remaining jobs.
   1028             self._iterating = False
-> 1029             if self.dispatch_one_batch(iterator):
   1030                 self._iterating = self._original_iterator is not None
   1031 

~/.cache/pypoetry/virtualenvs/powerup-process-control-VY7sLJ9r-py3.8/lib/python3.8/site-packages/joblib/parallel.py in dispatch_one_batch(self, iterator)
    845                 return False
    846             else:
--> 847                 self._dispatch(tasks)
    848                 return True
    849 

~/.cache/pypoetry/virtualenvs/powerup-process-control-VY7sLJ9r-py3.8/lib/python3.8/site-packages/joblib/parallel.py in _dispatch(self, batch)
    763         with self._lock:
    764             job_idx = len(self._jobs)
--> 765             job = self._backend.apply_async(batch, callback=cb)
    766             # A job can complete so quickly than its callback is
    767             # called before we get here, causing self._jobs to

~/.cache/pypoetry/virtualenvs/powerup-process-control-VY7sLJ9r-py3.8/lib/python3.8/site-packages/joblib/_parallel_backends.py in apply_async(self, func, callback)
    206     def apply_async(self, func, callback=None):
    207         """Schedule a func to be run"""
--> 208         result = ImmediateResult(func)
    209         if callback:
    210             callback(result)

~/.cache/pypoetry/virtualenvs/powerup-process-control-VY7sLJ9r-py3.8/lib/python3.8/site-packages/joblib/_parallel_backends.py in __init__(self, batch)
    570         # Don't delay the application, to avoid keeping the input
    571         # arguments in memory
--> 572         self.results = batch()
    573 
    574     def get(self):

~/.cache/pypoetry/virtualenvs/powerup-process-control-VY7sLJ9r-py3.8/lib/python3.8/site-packages/joblib/parallel.py in __call__(self)
    250         # change the default number of processes to -1
    251         with parallel_backend(self._backend, n_jobs=self._n_jobs):
--> 252             return [func(*args, **kwargs)
    253                     for func, args, kwargs in self.items]
    254 

~/.cache/pypoetry/virtualenvs/powerup-process-control-VY7sLJ9r-py3.8/lib/python3.8/site-packages/joblib/parallel.py in <listcomp>(.0)
    250         # change the default number of processes to -1
    251         with parallel_backend(self._backend, n_jobs=self._n_jobs):
--> 252             return [func(*args, **kwargs)
    253                     for func, args, kwargs in self.items]
    254 

~/.cache/pypoetry/virtualenvs/powerup-process-control-VY7sLJ9r-py3.8/lib/python3.8/site-packages/sklearn/model_selection/_validation.py in _fit_and_score(estimator, X, y, scorer, train, test, verbose, parameters, fit_params, return_train_score, return_parameters, return_n_test_samples, return_times, return_estimator, error_score)
    558     else:
    559         fit_time = time.time() - start_time
--> 560         test_scores = _score(estimator, X_test, y_test, scorer)
    561         score_time = time.time() - start_time - fit_time
    562         if return_train_score:

~/.cache/pypoetry/virtualenvs/powerup-process-control-VY7sLJ9r-py3.8/lib/python3.8/site-packages/sklearn/model_selection/_validation.py in _score(estimator, X_test, y_test, scorer)
    603         scorer = _MultimetricScorer(**scorer)
    604     if y_test is None:
--> 605         scores = scorer(estimator, X_test)
    606     else:
    607         scores = scorer(estimator, X_test, y_test)

~/.cache/pypoetry/virtualenvs/powerup-process-control-VY7sLJ9r-py3.8/lib/python3.8/site-packages/sklearn/metrics/_scorer.py in __call__(self, estimator, *args, **kwargs)
     88                                       *args, **kwargs)
     89             else:
---> 90                 score = scorer(estimator, *args, **kwargs)
     91             scores[name] = score
     92         return scores

~/.cache/pypoetry/virtualenvs/powerup-process-control-VY7sLJ9r-py3.8/lib/python3.8/site-packages/sklearn/metrics/_scorer.py in _passthrough_scorer(estimator, *args, **kwargs)
    370 def _passthrough_scorer(estimator, *args, **kwargs):
    371     """Function that wraps estimator.score"""
--> 372     return estimator.score(*args, **kwargs)
    373 
    374 

~/.cache/pypoetry/virtualenvs/powerup-process-control-VY7sLJ9r-py3.8/lib/python3.8/site-packages/pysindy/pysindy.py in score(self, x, t, x_dot, u, multiple_trajectories, metric, **metric_kws)
    461         if u is None or self.n_control_features_ == 0:
    462             if self.n_control_features_ > 0:
--> 463                 raise TypeError(
    464                     "Model was fit using control variables, so u is required"
    465                 )

TypeError: Model was fit using control variables, so u is required

PySINDy/Python version information:

'1.2.0'

NaN, infinity for dtype float64 when using model.simulate()

I am using PySINDy with a model with 20 states and 2 inputs trying to identify the dynamical system of this dataset. The model was successfully identified using polynomial library of 3rd degree and STLSQ optimizer with a score of 0.975. I checked the derivatives and they look perfect compared to the ground truth.
Afterwards, I tried to simulate my model forward in time but I get the following error.

I tried to use another type of integrators from scipy.integrate but my trials failed.

Reproducing code example:

import pysindy as ps
stlsq_optimizer = ps.STLSQ(threshold=.01, alpha=10, max_iter=100, normalize=True)
poly_library = ps.PolynomialLibrary(degree=3, include_interaction=True)
model_nonlinear = ps.SINDy(optimizer=stlsq_optimizer, feature_library=poly_library)
model_nonlinear.fit(x=x_NL_train, x_dot=x_dot_NL_train, u=u_f_train, t=dt)
x_NL_test_sim = model_nonlinear.simulate(np.array(x_NL_test[0,:]), np.array(t_test), f_interp)

Error message:

Traceback (most recent call last)
in
----> 1 x_NL_test_sim = model_nonlinear.simulate(np.array(x_NL_test[0,:]), np.array(t_test), f_interp)

C:\Users\kamel\Anaconda3\envs\tf-gpu\lib\site-packages\pysindy\pysindy.py in simulate(self, x0, t, u, integrator, stop_condition, **integrator_kws)
712 return self.predict(x[newaxis, :], u(t))[0]
713
--> 714 return integrator(rhs, x0, t, **integrator_kws)
715
716 @Property

C:\Users\kamel\Anaconda3\envs\tf-gpu\lib\site-packages\scipy\integrate\odepack.py in odeint(func, y0, t, args, Dfun, col_deriv, full_output, ml, mu, rtol, atol, tcrit, h0, hmax, hmin, ixpr, mxstep, mxhnil, mxordn, mxords, printmessg, tfirst)
242 full_output, rtol, atol, tcrit, h0, hmax, hmin,
243 ixpr, mxstep, mxhnil, mxordn, mxords,
--> 244 int(bool(tfirst)))
245 if output[-1] < 0:
246 warning_msg = _msgs[output[-1]] + " Run with full_output = 1 to get quantitative information."

C:\Users\kamel\Anaconda3\envs\tf-gpu\lib\site-packages\pysindy\pysindy.py in rhs(x, t)
705
706 def rhs(x, t):
--> 707 return self.predict(x[newaxis, :], u(t).reshape(1, -1))[0]
708
709 else:

C:\Users\kamel\Anaconda3\envs\tf-gpu\lib\site-packages\pysindy\pysindy.py in predict(self, x, u, multiple_trajectories)
357 x = validate_input(x)
358 u = validate_control_variables(x, u)
--> 359 return self.model.predict(concatenate((x, u), axis=1))
360
361 def equations(self, precision=3):

C:\Users\kamel\Anaconda3\envs\tf-gpu\lib\site-packages\sklearn\utils\metaestimators.py in (*args, **kwargs)
117
118 # lambda, but not partial, allows help() to work with update_wrapper
--> 119 out = lambda *args, **kwargs: self.fn(obj, *args, **kwargs)
120 # update the docstring of the returned function
121 update_wrapper(out, self.fn)

C:\Users\kamel\Anaconda3\envs\tf-gpu\lib\site-packages\sklearn\pipeline.py in predict(self, X, **predict_params)
406 for _, name, transform in self._iter(with_final=False):
407 Xt = transform.transform(Xt)
--> 408 return self.steps[-1][-1].predict(Xt, **predict_params)
409
410 @if_delegate_has_method(delegate='_final_estimator')

C:\Users\kamel\Anaconda3\envs\tf-gpu\lib\site-packages\pysindy\optimizers\sindy_optimizer.py in predict(self, x)
83
84 def predict(self, x):
---> 85 prediction = self.optimizer.predict(x)
86 if prediction.ndim == 1:
87 return prediction[:, np.newaxis]

C:\Users\kamel\Anaconda3\envs\tf-gpu\lib\site-packages\sklearn\linear_model_base.py in predict(self, X)
234 Returns predicted values.
235 """
--> 236 return self._decision_function(X)
237
238 _preprocess_data = staticmethod(_preprocess_data)

C:\Users\kamel\Anaconda3\envs\tf-gpu\lib\site-packages\sklearn\linear_model_base.py in decision_function(self, X)
216 check_is_fitted(self)
217
--> 218 X = check_array(X, accept_sparse=['csr', 'csc', 'coo'])
219 return safe_sparse_dot(X, self.coef
.T,
220 dense_output=True) + self.intercept_

C:\Users\kamel\Anaconda3\envs\tf-gpu\lib\site-packages\sklearn\utils\validation.py in inner_f(*args, **kwargs)
71 FutureWarning)
72 kwargs.update({k: arg for k, arg in zip(sig.parameters, args)})
---> 73 return f(**kwargs)
74 return inner_f
75

C:\Users\kamel\Anaconda3\envs\tf-gpu\lib\site-packages\sklearn\utils\validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, estimator)
644 if force_all_finite:
645 _assert_all_finite(array,
--> 646 allow_nan=force_all_finite == 'allow-nan')
647
648 if ensure_min_samples > 0:

C:\Users\kamel\Anaconda3\envs\tf-gpu\lib\site-packages\sklearn\utils\validation.py in _assert_all_finite(X, allow_nan, msg_dtype)
98 msg_err.format
99 (type_err,
--> 100 msg_dtype if msg_dtype is not None else X.dtype)
101 )
102 # for object dtype data, we only check for NaNs (GH-13254)

ValueError: Input contains NaN, infinity or a value too large for dtype('float64')

PySINDy/Python version information:

PySINDy: 1.1.0
Python: 3.7.7

Cross-validation with SINDy

At the moment it is not straightforward to perform cross-validation on different parameters passed into the SINDy constructor. Most of the parameters relevant to a SINDy model's performance are passed into the objects expected by the optimizer, feature_library, and differentiation_method arguments rather than being fed directly into the SINDy constructor.

Ideally users would be able to run cross-validation in a manner similar to the following example, but it is currently not possible to do so (since threshold is a parameter of the SINDy.optimizer class object).

model = SINDy()
params = {"threshold" : [0.01, 0.1, 0.5, 1]}
cv = KFold(n_splits=3)
grid_search = GridSearchCV(model, params, cv=cv)
grid_search.fit(x_train)

Our original motivation for encompassing most of the arguments in the optimizer, differentiation_method, and feature_library was to avoid creating a base object with dozens of optional arguments.

Is there a relatively clean way for us to perform cross-validation with our current setup? I have a few ideas about how to do so, but they feel hacky:

  • Use the sklearn.pipeline strategy of appending argument names to the keys in params. These names help identify which objects the corresponding parameters are meant to be passed into.
  • Create some kind of wrapper object or helper function to aid the user in performing cross validation.

Attribute formatting in the documentation

The documentation site currently renders class attributes independently rather than grouping them in a list under the heading "Attributes". This issue is related to a comment in openjournals/joss-reviews/issues/2104.

Example (from CustomLibrary):
Selection_001

What we would like is for attributes to be displayed in a list, like the parameters in the above example.

It seems this is a known issue, and that one solution is to set napoleon_use_ivar = True in docs/conf.py, but this places the attributes under the heading "Variables" rather than "Attributes". Another workaround appears to be to patch the napoleon package itself, but I'm not sure that would work for us because the documentation site is automatically built when changes are pushed.

[BUG] Cannot provide controls u for model.simulate

After fitting a continuous-time SINDy model with controls u_train, I am trying to simulate with u_test. This raises an error message. TypeError: 'numpy.ndarray' object is not callable

It seems like something is wrong with the nested if-statements in the beginning of the simulate function in:
https://github.com/dynamicslab/pysindy/blob/master/pysindy/pysindy.py Somehow, I am ending up at a condition, where the code is trying to call my u_test as a function although it is an array.

Reproducing code example:

import numpy as np
from scipy.integrate import odeint
import pysindy as ps

def lorenz(z, t):
    return [
        10 * (z[1] - z[0]),
        z[0] * (28 - z[2]) - z[1],
        z[0] * z[1] - (8 / 3) * z[2]
    ]

# Generate training data
dt = .002
t_train = np.arange(0, 10, dt)
x0_train = [-8, 8, 27]
x_train = odeint(lorenz, x0_train, t_train)
u_train = np.ones(x_train.shape[0]) # RELEVANT ROW

# Evolve the Lorenz equations in time using a different initial condition
t_test = np.arange(0, 15, dt)
x0_test = np.array([8, 7, 15])
x_test = odeint(lorenz, x0_test, t_test)
u_test = np.ones(x_test.shape[0]) # RELEVANT ROW

# Initialize model
model = ps.SINDy()

# Fit model
model.fit(x = x_train, u = u_train, t = t_train)

# Simulate
model.simulate(x0 = x0_test, u = u_test, t = t_test)

Error message:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-99-29456ec7cb28> in <module>
     31 
     32 # Simulate
---> 33 model.simulate(x0 = x0_test, u = u_test, t = t_test)

~/.cache/pypoetry/virtualenvs/powerup-process-control-VY7sLJ9r-py3.8/lib/python3.8/site-packages/pysindy/pysindy.py in simulate(self, x0, t, u, integrator, stop_condition, **integrator_kws)
    700 
    701             else:
--> 702                 if ndim(u(1)) == 1:
    703 
    704                     def rhs(x, t):

TypeError: 'numpy.ndarray' object is not callable

PySINDy/Python version information:

1.2.0

PySINDy gui

Hi,

Thanks for making such an interesting package.

I made a GUI for it PySINDyGUI just in case if anyone's interested.

Best,
hyumo

fit_intercept option cannot be set to True

When a SINDy object is created using an optimizer with fit_intercept set to true, the individual estimators created by the SINDy object's MultiOutputClassifer have fit_intercept=False. Here's a minimal working example in two dimensions:

from numpy import ones
import pysindy as ps

x = ones((5, 2))

opt = ps.STLSQ(fit_intercept=True)
print('opt.fit_intercept:', opt.fit_intercept)

model = ps.SINDy(optimizer=opt)
model.fit(x)
for est in model.model.estimators_:
    print('inner_estimator.fit_intercept:', est.steps[1][1].fit_intercept)

I suspect this has something to do with the way MultiOutputClassifier is instantiating its sub-estimators.

Add examples to documentation

We should add examples to the docstrings of top-level functions likely to be called by users (e.g. SINDy, the optimizers, the feature libraries, and the differentiation methods. It's probably fine to borrow examples from our jupyter notebooks when possible.

Prepare a major release

After our JOSS submission has been accepted, we should release a major version of pysindy.

Stepwise sparse regression

It could be useful for us to implement the stepwise sparse regression optimization method proposed in Sparse learning of stochastic dynamical equations, Lorenzo Boninsegna et al., (2018) - link. See this page for similar methods. The method is similar to sequentially thresholded least squares except, at each iteration, rather than eliminating all coefficients of small magnitude, it eliminates the smallest one.

From the user's point of view, the biggest difference between this method and STLSQ is that this method allows one to forgo the sparsity threshold parameter and instead directly specify the desired number of nonzero coefficients. This number can also be chosen via cross-validation.

Failure to Sparsify Forced Linear System with Large Parameter Differences

Hello to whom it may concern,

Whilst I don't have a specific and identifiable issue with PySINDy functionality itself, I am unfortunately experiencing some issues with creating a sparse and representative model of a system that I perceive to be well within the bounds of SINDy's capability. The system at question is a simple spring-damper quarter car suspension model that is linear in four states. The state space representation of the system is:

Screenshot 2020-12-03 112407

I have simulated this system in MATLAB/SIMULINK (tried to solve this problem with the original MATLAB implementation before moving to Python) by providing a step input in u and using a fixed step (t = 0.001) ODE3 solver. My understanding of linear systems is that such a step input should be sufficient to characterise them.

I have subsequently moved to Python and imported the MATLAB array data using:

from scipy.io import loadmat
matcontents = loadmat('sim.mat')
simdata = matcontents['simdata']
u_train = simdata[:,4]
x_train = simdata[:,:4]

where the first four columns of simdata are the state variables and fifth is the input. I can also provide the original sim.mat if required. I have created a poly library up to 2nd degree + bias, and my default optimiser choice has been STLSQ, threshold = 0.5 and alpha = 0.

The resulting model is:
Screenshot 2020-12-03 114926

It should be clear that both x1' and x3' are correctly identified. However, x4', has failed to sparsify to the governing equation, despite having near perfectly captured the correct dynamics in the linear measurements. Addtionally, the solution for x2' is both completely incorrectly identified and not adequately sparsified. This result implies that the data characterising my system is correct (i.e. no errors in the data-generating model), but that there is something wrong with my attempt to create a data-driven model of it.

I have attempted myriad potential fixes, including:

  • Introducing a non-zero alpha value to the optimiser. This has produced an even less sparse solution in the problem states.
  • Normalising the measurement library. I thought this would instantly fix the problem but seems to have no effect on the output.
  • Thresholding incredibly small values (<10-5) of my states in the training data to 0 to reduce the range of values in the feature space. Even when I perform this and normalise there seems to be no effect.
  • Altering my excitation signal design from a step input to white noise. This also seems to have little effect on the result.
  • I have even attempted to increase the threshold of STLSQ to >1 (a value which regularises out all parameters in 2 out of 4 states) to attempt to sparsify x2', which has large parameter values. However, even this fails to converge to a true solution for any threshold values, regardless of how high I set it.
  • I have attempted to use SR3 with default values and some minor tuning, but the results appears to suffer from similar lack of sparsity. I will admit, however, to not having performed a full cross-validation to ascertain hyper-parameters for this optimiser, due to my lack of familiarity.
  • I've even thrown an Elastic Net at the problem, which has promoted better sparsity but produced some laughably inaccurate parameter values.

Beyond performing full hyper-parameter CV over multiple optimisers, I am at somewhat of a loss with how to move forward. This system was meant to be a demo case before moving onto a non-linear spring-damper version and subsequently a Discrepancy Model application inspired by the work of the Washington research team, but if I cannot get SINDy to work in this simple dynamic system with basic attributes I may have to abandon my work with it. Is the optimiser struggling because of the 10^3 order of magnitude difference between the largest and smallest parameter of the model, or are there more obvious faults in my approach to using SINDy that I havent yet realised?

Many thanks in advance for your assistance and kind regards,
Jack

[Bug] unbias is not applied to intercept

Looking at the implementation of pysindy.optimizers.SINDyOptimizer._unbias, it looks like we are only unbiasing the linear coefficients coef_ but not the intercept intercept_.

Support for Time-Delayed system ?

Thanks for the lib and the introduction paper! One thing I am curious is that does the system supports system ID for time-delayed system?

Is your feature request related to a problem? Please describe.

Describe the solution you'd like

Describe alternatives you've considered

Additional context

Genetic algorithm models

Is your feature request related to a problem? Please describe.

Could be possible extend lib with nonlinear genetic algorithms to discover non linear patterns? There’re some python libs today that could help

Describe the solution you'd like

Research some genetic methods, implement them

Describe alternatives you've considered

Implement a base generic example of model that will be used to fit a genetic algorithm “fitter”

Additional context

Extraneous code in top-level directory

There are a few files sitting around in the top level directory from when I first forked the project and I'm not sure whether we still need them. Specifically, the files I'm unsure about are

  • .gitmodules
  • .codecov.yml
  • .gitattributes
  • .readthedocs.yml
  • .travis.yml

[BUG]SINDy defaults are instances of classes

I don't think it's a good idea to use instances of optimizer, feature_library, and differentiation_method as default values in the init method of the SINDy class because they will only be instantiated once and then all future instances of SINDy will be using the same instances of these classes.

To illustrate the problem:

from pysindy import SINDy

model_1 = SINDy()
model_2 = SINDy()

print(model_1.optimizer.threshold, model_2.optimizer.threshold)

model_1.optimizer.threshold = 0.2  # This modifies the default instance

print(model_1.optimizer.threshold, model_2.optimizer.threshold)

Output:

0.1 0.1
0.2 0.2

(model_1 and model_2 are sharing the same default optimizer object, furthermore, all future instances of SINDy will also be).

The problem is in the init method for SINDy:

    def __init__(
        self,
        optimizer=STLSQ(),
        feature_library=PolynomialFeatures(),
        differentiation_method=FiniteDifference(),
        feature_names=None,
        discrete_time=False,
        n_jobs=1,
    ):

how to introduce control inputs for the system identificaiton?

Thanks for the excellent package for system identification.
It seems it is working perfectly for cases:
Y = A*X + B
where Y is a matrix of interested outputs, X is a matrix of state variables, B is a matrix of constant bias.
####################################################################################
However, In many cases, there are control variables involved in such a relationship:
Y = A * X + B * U + C
Where Y is a matrix of interested outputs, X is a matrix of state variables, U is a matrix of control inputs, C is a matrix of constant bias.
How do we do system identification problem with control inputs?
Thanks.

ValueError in model simulation due to wrong assumption

When calling model.simulate, it is assumed that time 1 is can be interpolated:

if ndim(u_fun(1)) == 1:

This assumption is wrong is t is a given array.

Example:

import numpy as np

import pysindy as ps

t = np.linspace(0, 1, 100)
u = np.array([0]*100)
x = np.power(t, 2)

model = ps.SINDy()
model.fit(
    x,
    t=t,
    u=u,
)
print(model.print())

def gen_u(t):
    return t

x0 = 6

t_test_success = np.arange(0, 1.1, 0.1)
u_test_success = gen_u(t_test_success)

t_test_fail = np.arange(0, 1, 0.1)
u_test_fail = gen_u(t_test_fail)

sim_success = model.simulate(
    [x0],
    t=t_test_success,
    u=u_test_success,
)
sim_fail = model.simulate(
    [x0],
    t=t_test_fail,
    u=u_test_fail,
)

In this example, sim_fail will raise a ValueError exception as 1 is out of the interpolation range (see linked line above).

Proposed fix:

Replace : if ndim(u_fun(1)) == 1: with if ndim(u_fun(t[0])) == 1: .

[BUG] Error while using model.simulate with unknown control input function

Hi there,

I downloaded the example code from the pysindy website. In the example file named "1_feature_overview.py", there is an error while I tried to run it in the function of simulate (line 568) as mentioned below.

Code from example:

x_test_sim = model.simulate(x0_test, t_test, u=u_test)

Error message:

ValueError: A value in x_new is above the interpolation range.

PySINDy/Python version information:

1.2.2

How can I fixed it:

From the "pysindy.py" at line 725, I added the further parameter to the function "interp1d" as suggested from https://stackoverflow.com/questions/45429831/valueerror-a-value-in-x-new-is-above-the-interpolation-range-what-other-re?answertab=votes#tab-top

Previous code

u_fun = interp1d(t, u, axis=0, kind="cubic")

Modified code

u_fun = interp1d(t, u, axis=0, kind="cubic", fill_value="extrapolate")

After I put it, the simulate function is work for me. Maybe you consider to put it into the update version of pysindy.

Regards,

minimization constraints

Is your feature request related to a problem? Please describe.

Hi, is there a way of doing constraints that are not equality constraints Bx=0, but minimizing a function like Bx.

Describe the solution you'd like

basically in the optimizer step that is using a sparsity to minimize the coefficients would be multiplied by a weighted matrix B

Describe alternatives you've considered

I looked at the features and there was an example using equality constraints, but I was wondering if it could also have minimizing or maximizing functions

Thank you!

Perform SINDy fitting on only a subset of the variables

It would be nice to have an option to only find equations for a subset of the input variables. This would be useful e.g. if you want to include forcing inputs in your library (you don't care about fitting a dynamical equation that explains the external forcing).

Improve STLSQ

Setting the alpha (L2 regularization) parameter for the STLSQ optimizer improves the condition number of the least-squares problems that are solved, leading to better overall results when features are close to linearly dependent. However, it adds some bias to the learned coefficients (toward 0). We should consider modifying STLSQ (and maybe some of our other methods) by changing it to a two-step procedure when alpha is positive:

  1. Use the existing algorithm with regularization to identify the support
  2. Perform one last unregularized least-squares fit using only the features/terms identified in step 1.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.