dynamicslab / pysindy Goto Github PK
View Code? Open in Web Editor NEWA package for the sparse identification of nonlinear dynamical systems from data
Home Page: https://pysindy.readthedocs.io/en/latest/
License: Other
A package for the sparse identification of nonlinear dynamical systems from data
Home Page: https://pysindy.readthedocs.io/en/latest/
License: Other
Sorry, I didn't see your comment before making the most recent commit. I relaxed the complexity test constraint a bit more.
I was able to reproduce the failed test locally by using the same parameters: n_samples=1124
, n_features=27
, n_informative=5
, random_state=6
. It looks like LASSO somehow finds 4 informative features, while the other heavily regularized models find 6. This seems like a bit of a fluke; for different random states this doesn't occur.
Maybe we could remove LASSO from this test? Or we could stick with the quick fix I made.
In any case, the bug isn't caused by this PR, so I think it's okay to accept it.
Originally posted by @briandesilva in #71 (comment)
Hello,
I have started to learn pysindy and am going over the examples now.
One question that I have is, is it possible to impose a constrain on the states?
As an example, I would like to get a SIR - model from a data, but in general this model has a constraint: S+ I + R = 1.
Thank you
We need to implement the complexity
property for the optimizers. Previously we were erroneously using a method that computed the complexity assuming STLSQ was used, but each method has different ways of computing complexity. For now I am just having the different complexity methods all throw NotImplemetedError
.
Once we implement complexity
for each optimization class we will also need to update the appropriate unit tests. I have marked the unit tests that will need to be updated with TODO
.
Instead of the
if hasattr(self, "model"):
else:
raise NotFittedError
we should make use of check_is_fitted(self, "model")
to comply with sklearns interface.
The SINDyOptimizer
object assumes that the optimizer that's passed in has a number of methods and properties, but we don't actually check for any of them. I think it would be better if we had a series of checks in the __init__
function so that users would see error messages as soon as they passed in their custom optimizers rather than when they call SINDy.fit
. It would be nice to just check if the optimizer inherits from sklearn.linear_model._base.LinearModel
, but they are discouraging users from importing from the _*
modules.
Methods to check for:
fit
predict
Attributes to check for:
fit_intercept
normalize
coef_
intercept_
PolynomialLibrary
inherits from sklearn.preprocessing.PolynomialFeatures
, granting it a get_parameters
method, but it has extra parameters beyond those in sklearn's PolynomialFeatures
(e.g. include_interaction
). The class should be updated to reflect this difference.
requirements.txt
should only contain third party dependencies used in the main package.requirements-dev.txt
should contain everything needed for a development environment, including tests and examples.Import of sklearn.preprocessing._csr_polynomial_expansion
fails.
import pysindy
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
<ipython-input-3-640311a760f6> in <module>
----> 1 import pysindy
c:\users\jordan\appdata\local\programs\python\python37\lib\site-packages\pysindy\__init__.py in <module>
11 from pysindy.differentiation import *
12 from pysindy.optimizers import *
---> 13 from pysindy.feature_library import *
14 from pysindy.utils import *
c:\users\jordan\appdata\local\programs\python\python37\lib\site-packages\pysindy\feature_library\__init__.py in <module>
2 from .fourier_library import FourierLibrary
3 from .identity_library import IdentityLibrary
----> 4 from .polynomial_library import PolynomialLibrary
c:\users\jordan\appdata\local\programs\python\python37\lib\site-packages\pysindy\feature_library\polynomial_library.py in <module>
6 from scipy import sparse
7 from sklearn.preprocessing import PolynomialFeatures
----> 8 from sklearn.preprocessing._csr_polynomial_expansion import _csr_polynomial_expansion
9 from sklearn.utils import check_array
10 from sklearn.utils.validation import check_is_fitted
ModuleNotFoundError: No module named 'sklearn.preprocessing._csr_polynomial_expansion'
import sys, sklearn; print(sklearn.__version__, sys.version)
returns 0.20.3 3.7.2 (tags/v3.7.2:9a3ffc0492, Dec 23 2018, 23:09:28) [MSC v.1916 64 bit (AMD64)]
Hey y'all, this is Alan from Steve's group. I am working on SINDy models in the field of plasma physics and have been using pySINDy. I am interested in contributing to the examples/ directory of the pySINDy with a jupyter notebook + all my other files unrelated to SINDy (4-5 python files for plasma-specific things which are compatible with the current pySINDy python requirements) which would reproduce the paper I am hoping to publish soon. This could be a cool thing for others to join as well, as we could accumulate a large bin of examples from disparate fields that reproduce published works.
One minor caveat is that my analysis relies on some potentially large files to read in. However, I know that github has several ways to link to dropbox or other large extraneous files so this is probably a surmountable problem.
Anyways, if you are interested in this idea as well, maybe we should meet soon to discuss a bit how we should formalize going about this process. I'm also happy to get more involved generally with the maintenance and oversight of this awesome github repo.
Cheers,
Alan
Hello,
I have a question if you could please help me. I tried to use the pysindy with control on my own dataset. However, I have only the U time serie without any function. Model.fit works but when i try to simulate it throws an error that an x_new is above the interpolation range? Have you encountered this error before and de you any idea how to solve it ?
Thanks in advance for your feedback
Triggered by a git tag, we should deploy a new version to PyPI (and maybe Github releases) using the CI.
As per scikit-learn r2_score documentation, the function expects the arguments to be y_true
followed by y_pred
, however, in your implementation the order is reversed. Since R^2 is non-symmetric, it will produce a different score compared to correct order of arguments. This is hardly noticeable when predictions closely match the gt (score is close to 1.0), but will produce incorrect results otherwise.
import numpy as np
from scipy.integrate import odeint
from sklearn.metrics import r2_score
import pysindy as ps
# Control input
def u_fun(t):
return np.column_stack([np.sin(2 * t), t ** 2])
# Lorenz equations with control input
def lorenz_control(z, t):
u = u_fun(t)
return [
10 * (z[1] - z[0]) + u[0, 0] ** 2,
z[0] * (28 - z[2]) - z[1],
z[0] * z[1] - 8 / 3 * z[2] - u[0, 1],
]
# Generate measurement data
dt = .002
t_train = np.arange(0, 10, dt)
x0_train = [-8, 8, 27]
x_train = odeint(lorenz_control, x0_train, t_train)
u_train = u_fun(t_train)
# Instantiate and fit the SINDYc model
model = ps.SINDy()
model.fit(x_train, u=u_train, t=dt)
print(model.score(x_train, u=u_train, t=dt))
x_dot_test_predicted = model.predict(x_train, u=u_train)
x_dot_test_computed = model.differentiate(x_train, t=dt)
print(r2_score(x_dot_test_computed, x_dot_test_predicted))
This should print 0.999999960658854
and 0.999999960658851
(sorry for not providing a better example, I would have to give away our internal dataset to reproduce the issue, but in our case the score changes from 0.41 to 0.05)
1.0.1 3.6.10 |Anaconda, Inc.| (default, Mar 25 2020, 23:51:54)
[GCC 7.3.0]
Presumably I can't use PySindy 'out-of-the-box' for a problem with actuated control inputs. Or is there a way?
If not, are there any plans to extend this to SINDY with control (SINDYc)?
thanks.
The notebook 2_original_paper.ipynb
contains multiple matrices typeset in LaTeX, but they are not rendered in the sphinx documentation.
The matrices were made using the bmatrix
environment. From what I have read online, mathjax should be compatible with bmatrix
, and we list mathjax as an extension in our configuration file, so I'm having trouble figuring out where the problem stems from. I couldn't find anything relevant in the nbexamples documentation either. @Ohjeah, do you have any insights into what the problem might be?
Hello,
I was looking for a python implementation of isindy, and found this repository. There there is a deprecation message pointing to here. However I was unable to find the isindy implementation here, there is also no mention of it in the documentation and it doesn't seem to be implemented at all.
Is that a work in progress and there is an intent to bring that implementation to this package as well? That deprecation warning in the previous package, am I safe to assume that that implementation is robust and I can use that package and get reliable results?
I need to use iSindy in a project I am working on, when I saw the deprecation warning I decided to implement my own version based on the matlab code available (that also helps better understanding the algorithm, in case some adaptation is required). That went well to the first tests but due to the interface I need to implement, testing has been a little trickier than I first assumed. If that implementation is robust, I could just wrap my interface around it and make it work.
Thank you,
Paulo
x_train
array of 801 x 7 (7 state variables, 801 time instances) , and same is shape of Derivate Dx_train
array. I am facing following error of big-sparsity parameter error even with so small value of threshold, and also related to ill-conditioned matrix. I am new to pySINDy. Requesting help to understand the error in more detail.Hello again! Before discovering this wonderful repo, I was using Loiseau's hard-thresholded least-squares algorithm which allows for linear equality constraints. I am wondering if you have any suggestions about a good way to incorporate this into pysindy, and I can go ahead and implement that. I am having a bit of trouble currently with the sklearn syntax, and the syntax of the base optimizer class doesn't look like it supports constraint matrices, so I wanted to ask y'all if there is a nice solution here before I resort to bulldozing all the nice code y'all wrote.
Best,
Alan
In the current implementation, SINDy models are fit using multi-output regression (i.e. each column of the coefficient matrix is fit separately). Some advanced features that we may implement as extensions will require interdependence among columns. Examples include trimming corrupt data points and imposing constraints on the coefficients.
What version of pytest are you using? I get the following error trying to run the tests.
$ pytest
============================ test session starts =============================
platform darwin -- Python 3.7.7, pytest-3.10.1, py-1.8.1, pluggy-0.13.1
rootdir: /Users/billtubbs/pysindy, inifile:
collected 24 items / 4 errors
=================================== ERRORS ===================================
___________________ ERROR collecting test/test_pysindy.py ____________________
test/test_pysindy.py:88: in <module>
pytest.lazy_fixture("data_1d"),
E AttributeError: module 'pytest' has no attribute 'lazy_fixture'
I tried going back as far as version 3.10.1 and can't get it working in any version (latest version on conda is 5.4.1).
Thanks.
I think it's related to this open issue with pytest.
When you provide a control variable u
to fit SINDy with cross validation from scikit-learn, there is an error:
TypeError: Model was fit using control variables, so u is required
.
It seems like this error occurs in the predict and/or score functions: https://pysindy.readthedocs.io/en/latest/_modules/pysindy/pysindy.html
Possibly, the controls u, which are provided to the fit function for cross validation, are not passed on to these other functions (predict and/or score) during cross validation?
This code example is taken from here: https://pysindy.readthedocs.io/en/latest/examples/4_scikit_learn_compatibility.html#cross-validation
However, I have added 1 constant control variable u to show that the code fails when I pass u as argument: search.fit(x_train, u = u_train)
. I have highlighted the relevant rows with the comment "RELEVANT ROW".
from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import TimeSeriesSplit
import numpy as np
from scipy.integrate import odeint
import pysindy as ps
def lorenz(z, t):
return [
10 * (z[1] - z[0]),
z[0] * (28 - z[2]) - z[1],
z[0] * z[1] - (8 / 3) * z[2]
]
# Generate training data
dt = .002
t_train = np.arange(0, 10, dt)
x0_train = [-8, 8, 27]
x_train = odeint(lorenz, x0_train, t_train)
# Make up control input
u_train = np.ones(x_train.shape[0]) # RELEVANT ROW
# Fit model
model = ps.SINDy(t_default=dt)
param_grid = {
"optimizer__threshold": [0.001, 0.01, 0.1],
"optimizer__alpha": [0.01, 0.05, 0.1],
"feature_library": [ps.PolynomialLibrary(), ps.FourierLibrary()],
"differentiation_method__order": [1, 2]
}
search = GridSearchCV(
model,
param_grid,
cv=TimeSeriesSplit(n_splits=5)
)
search.fit(x_train, u = u_train) # RELEVANT ROW
print("Best parameters:", search.best_params_)
search.best_estimator_.print()
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-76-6c5d752be13a> in <module>
38 cv=TimeSeriesSplit(n_splits=5)
39 )
---> 40 search.fit(x_train, u = u_train) # RELEVANT ROW
41
42 print("Best parameters:", search.best_params_)
~/.cache/pypoetry/virtualenvs/powerup-process-control-VY7sLJ9r-py3.8/lib/python3.8/site-packages/sklearn/utils/validation.py in inner_f(*args, **kwargs)
70 FutureWarning)
71 kwargs.update({k: arg for k, arg in zip(sig.parameters, args)})
---> 72 return f(**kwargs)
73 return inner_f
74
~/.cache/pypoetry/virtualenvs/powerup-process-control-VY7sLJ9r-py3.8/lib/python3.8/site-packages/sklearn/model_selection/_search.py in fit(self, X, y, groups, **fit_params)
734 return results
735
--> 736 self._run_search(evaluate_candidates)
737
738 # For multi-metric evaluation, store the best_index_, best_params_ and
~/.cache/pypoetry/virtualenvs/powerup-process-control-VY7sLJ9r-py3.8/lib/python3.8/site-packages/sklearn/model_selection/_search.py in _run_search(self, evaluate_candidates)
1186 def _run_search(self, evaluate_candidates):
1187 """Search all candidates in param_grid"""
-> 1188 evaluate_candidates(ParameterGrid(self.param_grid))
1189
1190
~/.cache/pypoetry/virtualenvs/powerup-process-control-VY7sLJ9r-py3.8/lib/python3.8/site-packages/sklearn/model_selection/_search.py in evaluate_candidates(candidate_params)
706 n_splits, n_candidates, n_candidates * n_splits))
707
--> 708 out = parallel(delayed(_fit_and_score)(clone(base_estimator),
709 X, y,
710 train=train, test=test,
~/.cache/pypoetry/virtualenvs/powerup-process-control-VY7sLJ9r-py3.8/lib/python3.8/site-packages/joblib/parallel.py in __call__(self, iterable)
1027 # remaining jobs.
1028 self._iterating = False
-> 1029 if self.dispatch_one_batch(iterator):
1030 self._iterating = self._original_iterator is not None
1031
~/.cache/pypoetry/virtualenvs/powerup-process-control-VY7sLJ9r-py3.8/lib/python3.8/site-packages/joblib/parallel.py in dispatch_one_batch(self, iterator)
845 return False
846 else:
--> 847 self._dispatch(tasks)
848 return True
849
~/.cache/pypoetry/virtualenvs/powerup-process-control-VY7sLJ9r-py3.8/lib/python3.8/site-packages/joblib/parallel.py in _dispatch(self, batch)
763 with self._lock:
764 job_idx = len(self._jobs)
--> 765 job = self._backend.apply_async(batch, callback=cb)
766 # A job can complete so quickly than its callback is
767 # called before we get here, causing self._jobs to
~/.cache/pypoetry/virtualenvs/powerup-process-control-VY7sLJ9r-py3.8/lib/python3.8/site-packages/joblib/_parallel_backends.py in apply_async(self, func, callback)
206 def apply_async(self, func, callback=None):
207 """Schedule a func to be run"""
--> 208 result = ImmediateResult(func)
209 if callback:
210 callback(result)
~/.cache/pypoetry/virtualenvs/powerup-process-control-VY7sLJ9r-py3.8/lib/python3.8/site-packages/joblib/_parallel_backends.py in __init__(self, batch)
570 # Don't delay the application, to avoid keeping the input
571 # arguments in memory
--> 572 self.results = batch()
573
574 def get(self):
~/.cache/pypoetry/virtualenvs/powerup-process-control-VY7sLJ9r-py3.8/lib/python3.8/site-packages/joblib/parallel.py in __call__(self)
250 # change the default number of processes to -1
251 with parallel_backend(self._backend, n_jobs=self._n_jobs):
--> 252 return [func(*args, **kwargs)
253 for func, args, kwargs in self.items]
254
~/.cache/pypoetry/virtualenvs/powerup-process-control-VY7sLJ9r-py3.8/lib/python3.8/site-packages/joblib/parallel.py in <listcomp>(.0)
250 # change the default number of processes to -1
251 with parallel_backend(self._backend, n_jobs=self._n_jobs):
--> 252 return [func(*args, **kwargs)
253 for func, args, kwargs in self.items]
254
~/.cache/pypoetry/virtualenvs/powerup-process-control-VY7sLJ9r-py3.8/lib/python3.8/site-packages/sklearn/model_selection/_validation.py in _fit_and_score(estimator, X, y, scorer, train, test, verbose, parameters, fit_params, return_train_score, return_parameters, return_n_test_samples, return_times, return_estimator, error_score)
558 else:
559 fit_time = time.time() - start_time
--> 560 test_scores = _score(estimator, X_test, y_test, scorer)
561 score_time = time.time() - start_time - fit_time
562 if return_train_score:
~/.cache/pypoetry/virtualenvs/powerup-process-control-VY7sLJ9r-py3.8/lib/python3.8/site-packages/sklearn/model_selection/_validation.py in _score(estimator, X_test, y_test, scorer)
603 scorer = _MultimetricScorer(**scorer)
604 if y_test is None:
--> 605 scores = scorer(estimator, X_test)
606 else:
607 scores = scorer(estimator, X_test, y_test)
~/.cache/pypoetry/virtualenvs/powerup-process-control-VY7sLJ9r-py3.8/lib/python3.8/site-packages/sklearn/metrics/_scorer.py in __call__(self, estimator, *args, **kwargs)
88 *args, **kwargs)
89 else:
---> 90 score = scorer(estimator, *args, **kwargs)
91 scores[name] = score
92 return scores
~/.cache/pypoetry/virtualenvs/powerup-process-control-VY7sLJ9r-py3.8/lib/python3.8/site-packages/sklearn/metrics/_scorer.py in _passthrough_scorer(estimator, *args, **kwargs)
370 def _passthrough_scorer(estimator, *args, **kwargs):
371 """Function that wraps estimator.score"""
--> 372 return estimator.score(*args, **kwargs)
373
374
~/.cache/pypoetry/virtualenvs/powerup-process-control-VY7sLJ9r-py3.8/lib/python3.8/site-packages/pysindy/pysindy.py in score(self, x, t, x_dot, u, multiple_trajectories, metric, **metric_kws)
461 if u is None or self.n_control_features_ == 0:
462 if self.n_control_features_ > 0:
--> 463 raise TypeError(
464 "Model was fit using control variables, so u is required"
465 )
TypeError: Model was fit using control variables, so u is required
'1.2.0'
I am using PySINDy with a model with 20 states and 2 inputs trying to identify the dynamical system of this dataset. The model was successfully identified using polynomial library of 3rd degree and STLSQ optimizer with a score of 0.975. I checked the derivatives and they look perfect compared to the ground truth.
Afterwards, I tried to simulate my model forward in time but I get the following error.
I tried to use another type of integrators from scipy.integrate but my trials failed.
import pysindy as ps
stlsq_optimizer = ps.STLSQ(threshold=.01, alpha=10, max_iter=100, normalize=True)
poly_library = ps.PolynomialLibrary(degree=3, include_interaction=True)
model_nonlinear = ps.SINDy(optimizer=stlsq_optimizer, feature_library=poly_library)
model_nonlinear.fit(x=x_NL_train, x_dot=x_dot_NL_train, u=u_f_train, t=dt)
x_NL_test_sim = model_nonlinear.simulate(np.array(x_NL_test[0,:]), np.array(t_test), f_interp)
Traceback (most recent call last)
in
----> 1 x_NL_test_sim = model_nonlinear.simulate(np.array(x_NL_test[0,:]), np.array(t_test), f_interp)
C:\Users\kamel\Anaconda3\envs\tf-gpu\lib\site-packages\pysindy\pysindy.py in simulate(self, x0, t, u, integrator, stop_condition, **integrator_kws)
712 return self.predict(x[newaxis, :], u(t))[0]
713
--> 714 return integrator(rhs, x0, t, **integrator_kws)
715
716 @Property
C:\Users\kamel\Anaconda3\envs\tf-gpu\lib\site-packages\scipy\integrate\odepack.py in odeint(func, y0, t, args, Dfun, col_deriv, full_output, ml, mu, rtol, atol, tcrit, h0, hmax, hmin, ixpr, mxstep, mxhnil, mxordn, mxords, printmessg, tfirst)
242 full_output, rtol, atol, tcrit, h0, hmax, hmin,
243 ixpr, mxstep, mxhnil, mxordn, mxords,
--> 244 int(bool(tfirst)))
245 if output[-1] < 0:
246 warning_msg = _msgs[output[-1]] + " Run with full_output = 1 to get quantitative information."
C:\Users\kamel\Anaconda3\envs\tf-gpu\lib\site-packages\pysindy\pysindy.py in rhs(x, t)
705
706 def rhs(x, t):
--> 707 return self.predict(x[newaxis, :], u(t).reshape(1, -1))[0]
708
709 else:
C:\Users\kamel\Anaconda3\envs\tf-gpu\lib\site-packages\pysindy\pysindy.py in predict(self, x, u, multiple_trajectories)
357 x = validate_input(x)
358 u = validate_control_variables(x, u)
--> 359 return self.model.predict(concatenate((x, u), axis=1))
360
361 def equations(self, precision=3):
C:\Users\kamel\Anaconda3\envs\tf-gpu\lib\site-packages\sklearn\utils\metaestimators.py in (*args, **kwargs)
117
118 # lambda, but not partial, allows help() to work with update_wrapper
--> 119 out = lambda *args, **kwargs: self.fn(obj, *args, **kwargs)
120 # update the docstring of the returned function
121 update_wrapper(out, self.fn)
C:\Users\kamel\Anaconda3\envs\tf-gpu\lib\site-packages\sklearn\pipeline.py in predict(self, X, **predict_params)
406 for _, name, transform in self._iter(with_final=False):
407 Xt = transform.transform(Xt)
--> 408 return self.steps[-1][-1].predict(Xt, **predict_params)
409
410 @if_delegate_has_method(delegate='_final_estimator')
C:\Users\kamel\Anaconda3\envs\tf-gpu\lib\site-packages\pysindy\optimizers\sindy_optimizer.py in predict(self, x)
83
84 def predict(self, x):
---> 85 prediction = self.optimizer.predict(x)
86 if prediction.ndim == 1:
87 return prediction[:, np.newaxis]
C:\Users\kamel\Anaconda3\envs\tf-gpu\lib\site-packages\sklearn\linear_model_base.py in predict(self, X)
234 Returns predicted values.
235 """
--> 236 return self._decision_function(X)
237
238 _preprocess_data = staticmethod(_preprocess_data)
C:\Users\kamel\Anaconda3\envs\tf-gpu\lib\site-packages\sklearn\linear_model_base.py in decision_function(self, X)
216 check_is_fitted(self)
217
--> 218 X = check_array(X, accept_sparse=['csr', 'csc', 'coo'])
219 return safe_sparse_dot(X, self.coef.T,
220 dense_output=True) + self.intercept_
C:\Users\kamel\Anaconda3\envs\tf-gpu\lib\site-packages\sklearn\utils\validation.py in inner_f(*args, **kwargs)
71 FutureWarning)
72 kwargs.update({k: arg for k, arg in zip(sig.parameters, args)})
---> 73 return f(**kwargs)
74 return inner_f
75
C:\Users\kamel\Anaconda3\envs\tf-gpu\lib\site-packages\sklearn\utils\validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, estimator)
644 if force_all_finite:
645 _assert_all_finite(array,
--> 646 allow_nan=force_all_finite == 'allow-nan')
647
648 if ensure_min_samples > 0:
C:\Users\kamel\Anaconda3\envs\tf-gpu\lib\site-packages\sklearn\utils\validation.py in _assert_all_finite(X, allow_nan, msg_dtype)
98 msg_err.format
99 (type_err,
--> 100 msg_dtype if msg_dtype is not None else X.dtype)
101 )
102 # for object dtype data, we only check for NaNs (GH-13254)
ValueError: Input contains NaN, infinity or a value too large for dtype('float64')
PySINDy: 1.1.0
Python: 3.7.7
At the moment it is not straightforward to perform cross-validation on different parameters passed into the SINDy constructor. Most of the parameters relevant to a SINDy model's performance are passed into the objects expected by the optimizer, feature_library, and differentiation_method arguments rather than being fed directly into the SINDy constructor.
Ideally users would be able to run cross-validation in a manner similar to the following example, but it is currently not possible to do so (since threshold
is a parameter of the SINDy.optimizer class object).
model = SINDy()
params = {"threshold" : [0.01, 0.1, 0.5, 1]}
cv = KFold(n_splits=3)
grid_search = GridSearchCV(model, params, cv=cv)
grid_search.fit(x_train)
Our original motivation for encompassing most of the arguments in the optimizer, differentiation_method, and feature_library was to avoid creating a base object with dozens of optional arguments.
Is there a relatively clean way for us to perform cross-validation with our current setup? I have a few ideas about how to do so, but they feel hacky:
@andgoldschmidt has implemented a number of numerical differentiation methods, some of which could easily be ported over to PySINDy. I'm creating this issue to create a discussion space for this project.
The documentation site currently renders class attributes independently rather than grouping them in a list under the heading "Attributes". This issue is related to a comment in openjournals/joss-reviews/issues/2104.
What we would like is for attributes to be displayed in a list, like the parameters in the above example.
It seems this is a known issue, and that one solution is to set napoleon_use_ivar = True
in docs/conf.py
, but this places the attributes under the heading "Variables" rather than "Attributes". Another workaround appears to be to patch the napoleon package itself, but I'm not sure that would work for us because the documentation site is automatically built when changes are pushed.
import pysindy as ps
error:-
ContextualVersionConflict: (scikit-learn 0.22.2.post1 (/usr/local/lib/python3.6/dist-packages), Requirement.parse('scikit-learn[alldeps]>=0.23'), {'pysindy'})
import pysindy
<< your code here >>
After fitting a continuous-time SINDy model with controls u_train
, I am trying to simulate with u_test
. This raises an error message. TypeError: 'numpy.ndarray' object is not callable
It seems like something is wrong with the nested if-statements in the beginning of the simulate function in:
https://github.com/dynamicslab/pysindy/blob/master/pysindy/pysindy.py Somehow, I am ending up at a condition, where the code is trying to call my u_test
as a function although it is an array.
import numpy as np
from scipy.integrate import odeint
import pysindy as ps
def lorenz(z, t):
return [
10 * (z[1] - z[0]),
z[0] * (28 - z[2]) - z[1],
z[0] * z[1] - (8 / 3) * z[2]
]
# Generate training data
dt = .002
t_train = np.arange(0, 10, dt)
x0_train = [-8, 8, 27]
x_train = odeint(lorenz, x0_train, t_train)
u_train = np.ones(x_train.shape[0]) # RELEVANT ROW
# Evolve the Lorenz equations in time using a different initial condition
t_test = np.arange(0, 15, dt)
x0_test = np.array([8, 7, 15])
x_test = odeint(lorenz, x0_test, t_test)
u_test = np.ones(x_test.shape[0]) # RELEVANT ROW
# Initialize model
model = ps.SINDy()
# Fit model
model.fit(x = x_train, u = u_train, t = t_train)
# Simulate
model.simulate(x0 = x0_test, u = u_test, t = t_test)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-99-29456ec7cb28> in <module>
31
32 # Simulate
---> 33 model.simulate(x0 = x0_test, u = u_test, t = t_test)
~/.cache/pypoetry/virtualenvs/powerup-process-control-VY7sLJ9r-py3.8/lib/python3.8/site-packages/pysindy/pysindy.py in simulate(self, x0, t, u, integrator, stop_condition, **integrator_kws)
700
701 else:
--> 702 if ndim(u(1)) == 1:
703
704 def rhs(x, t):
TypeError: 'numpy.ndarray' object is not callable
1.2.0
Hi,
Thanks for making such an interesting package.
I made a GUI for it PySINDyGUI just in case if anyone's interested.
Best,
hyumo
When a SINDy
object is created using an optimizer with fit_intercept
set to true, the individual estimators created by the SINDy
object's MultiOutputClassifer
have fit_intercept=False
. Here's a minimal working example in two dimensions:
from numpy import ones
import pysindy as ps
x = ones((5, 2))
opt = ps.STLSQ(fit_intercept=True)
print('opt.fit_intercept:', opt.fit_intercept)
model = ps.SINDy(optimizer=opt)
model.fit(x)
for est in model.model.estimators_:
print('inner_estimator.fit_intercept:', est.steps[1][1].fit_intercept)
I suspect this has something to do with the way MultiOutputClassifier
is instantiating its sub-estimators.
We should add examples to the docstrings of top-level functions likely to be called by users (e.g. SINDy, the optimizers, the feature libraries, and the differentiation methods. It's probably fine to borrow examples from our jupyter notebooks when possible.
Before pushing or merging on the master branch we should make sure to verify that the examples are still working as intended.
After our JOSS submission has been accepted, we should release a major version of pysindy.
It could be useful for us to implement the stepwise sparse regression optimization method proposed in Sparse learning of stochastic dynamical equations, Lorenzo Boninsegna et al., (2018) - link. See this page for similar methods. The method is similar to sequentially thresholded least squares except, at each iteration, rather than eliminating all coefficients of small magnitude, it eliminates the smallest one.
From the user's point of view, the biggest difference between this method and STLSQ is that this method allows one to forgo the sparsity threshold parameter and instead directly specify the desired number of nonzero coefficients. This number can also be chosen via cross-validation.
Hello to whom it may concern,
Whilst I don't have a specific and identifiable issue with PySINDy functionality itself, I am unfortunately experiencing some issues with creating a sparse and representative model of a system that I perceive to be well within the bounds of SINDy's capability. The system at question is a simple spring-damper quarter car suspension model that is linear in four states. The state space representation of the system is:
I have simulated this system in MATLAB/SIMULINK (tried to solve this problem with the original MATLAB implementation before moving to Python) by providing a step input in u and using a fixed step (t = 0.001) ODE3 solver. My understanding of linear systems is that such a step input should be sufficient to characterise them.
I have subsequently moved to Python and imported the MATLAB array data using:
from scipy.io import loadmat
matcontents = loadmat('sim.mat')
simdata = matcontents['simdata']
u_train = simdata[:,4]
x_train = simdata[:,:4]
where the first four columns of simdata
are the state variables and fifth is the input. I can also provide the original sim.mat if required. I have created a poly library up to 2nd degree + bias, and my default optimiser choice has been STLSQ, threshold = 0.5 and alpha = 0.
It should be clear that both x1'
and x3'
are correctly identified. However, x4'
, has failed to sparsify to the governing equation, despite having near perfectly captured the correct dynamics in the linear measurements. Addtionally, the solution for x2'
is both completely incorrectly identified and not adequately sparsified. This result implies that the data characterising my system is correct (i.e. no errors in the data-generating model), but that there is something wrong with my attempt to create a data-driven model of it.
I have attempted myriad potential fixes, including:
x2'
, which has large parameter values. However, even this fails to converge to a true solution for any threshold values, regardless of how high I set it.Beyond performing full hyper-parameter CV over multiple optimisers, I am at somewhat of a loss with how to move forward. This system was meant to be a demo case before moving onto a non-linear spring-damper version and subsequently a Discrepancy Model application inspired by the work of the Washington research team, but if I cannot get SINDy to work in this simple dynamic system with basic attributes I may have to abandon my work with it. Is the optimiser struggling because of the 10^3 order of magnitude difference between the largest and smallest parameter of the model, or are there more obvious faults in my approach to using SINDy that I havent yet realised?
Many thanks in advance for your assistance and kind regards,
Jack
Looking at the implementation of pysindy.optimizers.SINDyOptimizer._unbias
, it looks like we are only unbiasing the linear coefficients coef_
but not the intercept intercept_
.
Thanks for the lib and the introduction paper! One thing I am curious is that does the system supports system ID for time-delayed system?
Could be possible extend lib with nonlinear genetic algorithms to discover non linear patterns? There’re some python libs today that could help
Research some genetic methods, implement them
Implement a base generic example of model that will be used to fit a genetic algorithm “fitter”
There are a few files sitting around in the top level directory from when I first forked the project and I'm not sure whether we still need them. Specifically, the files I'm unsure about are
We should set up our online documentation on https://readthedocs.org/ .
I think a basic template is enough for now and we can iterate on the content as we go.
I don't think it's a good idea to use instances of optimizer
, feature_library
, and differentiation_method
as default values in the init method of the SINDy class because they will only be instantiated once and then all future instances of SINDy will be using the same instances of these classes.
To illustrate the problem:
from pysindy import SINDy
model_1 = SINDy()
model_2 = SINDy()
print(model_1.optimizer.threshold, model_2.optimizer.threshold)
model_1.optimizer.threshold = 0.2 # This modifies the default instance
print(model_1.optimizer.threshold, model_2.optimizer.threshold)
Output:
0.1 0.1
0.2 0.2
(model_1
and model_2
are sharing the same default optimizer object, furthermore, all future instances of SINDy will also be).
The problem is in the init method for SINDy:
def __init__(
self,
optimizer=STLSQ(),
feature_library=PolynomialFeatures(),
differentiation_method=FiniteDifference(),
feature_names=None,
discrete_time=False,
n_jobs=1,
):
Thanks for the excellent package for system identification.
It seems it is working perfectly for cases:
Y = A*X + B
where Y is a matrix of interested outputs, X is a matrix of state variables, B is a matrix of constant bias.
####################################################################################
However, In many cases, there are control variables involved in such a relationship:
Y = A * X + B * U + C
Where Y is a matrix of interested outputs, X is a matrix of state variables, U is a matrix of control inputs, C is a matrix of constant bias.
How do we do system identification problem with control inputs?
Thanks.
When calling model.simulate
, it is assumed that time 1
is can be interpolated:
Line 738 in 0f7ca74
This assumption is wrong is t
is a given array.
import numpy as np
import pysindy as ps
t = np.linspace(0, 1, 100)
u = np.array([0]*100)
x = np.power(t, 2)
model = ps.SINDy()
model.fit(
x,
t=t,
u=u,
)
print(model.print())
def gen_u(t):
return t
x0 = 6
t_test_success = np.arange(0, 1.1, 0.1)
u_test_success = gen_u(t_test_success)
t_test_fail = np.arange(0, 1, 0.1)
u_test_fail = gen_u(t_test_fail)
sim_success = model.simulate(
[x0],
t=t_test_success,
u=u_test_success,
)
sim_fail = model.simulate(
[x0],
t=t_test_fail,
u=u_test_fail,
)
In this example, sim_fail
will raise a ValueError
exception as 1
is out of the interpolation range (see linked line above).
Replace : if ndim(u_fun(1)) == 1:
with if ndim(u_fun(t[0])) == 1:
.
Hi there,
I downloaded the example code from the pysindy website. In the example file named "1_feature_overview.py", there is an error while I tried to run it in the function of simulate (line 568) as mentioned below.
x_test_sim = model.simulate(x0_test, t_test, u=u_test)
ValueError: A value in x_new is above the interpolation range.
1.2.2
From the "pysindy.py" at line 725, I added the further parameter to the function "interp1d" as suggested from https://stackoverflow.com/questions/45429831/valueerror-a-value-in-x-new-is-above-the-interpolation-range-what-other-re?answertab=votes#tab-top
Previous code
u_fun = interp1d(t, u, axis=0, kind="cubic")
Modified code
u_fun = interp1d(t, u, axis=0, kind="cubic", fill_value="extrapolate")
After I put it, the simulate function is work for me. Maybe you consider to put it into the update version of pysindy.
Regards,
We should create a Sphinx gallery so that all our examples are housed in one place on our documentation site.
Hi, is there a way of doing constraints that are not equality constraints Bx=0, but minimizing a function like Bx.
basically in the optimizer step that is using a sparsity to minimize the coefficients would be multiplied by a weighted matrix B
I looked at the features and there was an example using equality constraints, but I was wondering if it could also have minimizing or maximizing functions
Thank you!
It would be nice to have an option to only find equations for a subset of the input variables. This would be useful e.g. if you want to include forcing inputs in your library (you don't care about fitting a dynamical equation that explains the external forcing).
Setting the alpha
(L2 regularization) parameter for the STLSQ
optimizer improves the condition number of the least-squares problems that are solved, leading to better overall results when features are close to linearly dependent. However, it adds some bias to the learned coefficients (toward 0). We should consider modifying STLSQ
(and maybe some of our other methods) by changing it to a two-step procedure when alpha
is positive:
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.