redmod-team / profit
Probabilistic Response mOdel Fitting with Interactive Tools
Home Page: https://profit.readthedocs.io
License: MIT License
If the option ntask = n is used for parallel computing, check the number of available cores before starting the computation.
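A minimal sketch of such a check, using only the Python standard library. The helper name `check_ntask` is hypothetical and not part of proFit; it simply limits the requested number of tasks to the number of logical cores reported by the operating system.

```python
import os


def check_ntask(ntask):
    """Limit the requested number of tasks to the available logical cores.

    Hypothetical helper, not part of proFit. os.cpu_count() reports the
    logical CPUs of the machine and may return None, hence the fallback to 1.
    """
    available = os.cpu_count() or 1
    if ntask > available:
        return available
    return max(1, ntask)


requested = 64
print(f"requested {requested} tasks, using {check_ntask(requested)}")
```

On a shared cluster node the scheduler may grant fewer cores than the machine has, so the value returned here is only an upper bound.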
In proFit, the hyperparameter vector is [l, sigma^2], where l is the length scale and sigma^2 = (sigma_n/sigma_f)^2 is the noise variance normalized by the signal variance.
Adapt the written functions to this definition:
In the file profit.sur.backend the following functions do the same task:
The posterior covariance matrix (cov_f_star) is not perfectly symmetric.
The asymmetry is approximately 1e-14:
The command np.max(cov_f_star-np.transpose(cov_f_star)) returns a value around 1e-14.
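An asymmetry of this size is typical floating-point round-off. One common remedy, shown here as a sketch and not as proFit's own code, is to measure the asymmetry with an absolute value and to replace the matrix by its symmetric part. The matrix below is synthetic and only stands in for cov_f_star.

```python
import numpy as np


def asymmetry(cov):
    """Largest absolute difference between a matrix and its transpose."""
    return np.max(np.abs(cov - cov.T))


def symmetrize(cov):
    """Return the symmetric part 0.5 * (cov + cov^T)."""
    return 0.5 * (cov + cov.T)


# Synthetic stand-in for cov_f_star with a small round-off-like perturbation.
rng = np.random.default_rng(0)
a = rng.standard_normal((5, 5))
cov = a @ a.T
cov[0, 1] += 1e-14

cov_sym = symmetrize(cov)
print(asymmetry(cov), asymmetry(cov_sym))
```

Taking the absolute value matters because np.max of the signed difference can miss an asymmetry that happens to be negative.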
In profit.yaml and LocalCommand, trouble can arise with relative paths. The most logical behavior from the user's point of view would be to relate all occurrences of ../ to the study directory, i.e. replace ../ by ../../../ everywhere (because commands run in study/run/XX/ instead of study/). The best place to change this is directly in LocalCommand, since the place from which people access the Python API is usually also in study, as is profit.yaml.
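The path translation described above can be sketched with the standard library. Both function names and the example directories are hypothetical; the sketch only shows how a path written relative to the study directory can be re-expressed relative to a run directory.

```python
import os


def resolve_from_study(path, study_dir):
    """Interpret a relative path as relative to the study directory."""
    if os.path.isabs(path):
        return path
    return os.path.normpath(os.path.join(study_dir, path))


def relative_to_run(path, study_dir, run_dir):
    """Express the same target relative to a run directory."""
    target = resolve_from_study(path, study_dir)
    return os.path.relpath(target, run_dir)


# Hypothetical POSIX layout: study/run/000 lies two levels below study.
study = "/work/study"
run = "/work/study/run/000"
print(relative_to_run("../bin/simulation", study, run))
```

With this layout a leading ../ turns into ../../../, which is the replacement proposed above.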
After projecting to a low-order spectral basis (PCE for global UQ), one can model the residual by a GP with an additive kernel. This allows for modeling complex behavior and for sensitivity analysis (ANOVA / Sobol indices).
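As an illustration of what an additive kernel looks like, the sketch below sums one-dimensional squared-exponential kernels, one per input dimension (a first-order additive kernel). The choice of the squared-exponential base kernel and all names are assumptions made for this example, not taken from proFit.

```python
import numpy as np


def rbf_1d(xa, xb, length):
    """Squared-exponential kernel between two 1D arrays of points."""
    diff = xa[:, None] - xb[None, :]
    return np.exp(-0.5 * (diff / length) ** 2)


def additive_kernel(Xa, Xb, lengths):
    """First-order additive kernel: sum of 1D kernels over input dimensions."""
    return sum(rbf_1d(Xa[:, d], Xb[:, d], length)
               for d, length in enumerate(lengths))


rng = np.random.default_rng(1)
X = rng.uniform(size=(6, 3))          # 6 points in 3 dimensions
K = additive_kernel(X, X, lengths=[0.5, 1.0, 2.0])
print(K.shape)
```

Because each term depends on a single input, the contribution of each dimension can be inspected separately, which is what makes this structure convenient for ANOVA-type sensitivity analysis.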
Implement / revise test cases for Custom and GPy.
Also revise the Examples to match the cleaned-up project structure (also solves #16).
Many codes rely on a standardized directory structure for each run. To automatically generate run directories, the user provides a template file. Placeholders for input parameters in the template file are automatically replaced by values for a specific run. This feature should be usable for both online and offline runs, and also for dynamically generated parameter vectors.
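A minimal sketch of this workflow, assuming placeholders of the form {{name}} and a single template file per run. The function names, the file name input.txt, and the directory naming are hypothetical and chosen only for the example.

```python
import os
import tempfile


def fill_template(text, params):
    """Replace each {{name}} placeholder by the value of params[name]."""
    for name, value in params.items():
        text = text.replace("{{" + name + "}}", str(value))
    return text


def make_run_dirs(template_text, param_sets, base_dir, filename="input.txt"):
    """Create one run directory per parameter set and write the filled template."""
    run_dirs = []
    for index, params in enumerate(param_sets):
        run_dir = os.path.join(base_dir, str(index))
        os.makedirs(run_dir, exist_ok=True)
        with open(os.path.join(run_dir, filename), "w") as handle:
            handle.write(fill_template(template_text, params))
        run_dirs.append(run_dir)
    return run_dirs


template = "x = {{x}}\ny = {{y}}\n"
with tempfile.TemporaryDirectory() as base:
    dirs = make_run_dirs(template, [{"x": 1, "y": 2}, {"x": 3, "y": 4}], base)
    print([os.path.basename(d) for d in dirs])
```

Since the parameter sets are passed as a plain list of dictionaries, the same function works for a precomputed design and for parameter vectors generated on the fly.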
Redmod is too generic and SurUQ sounds too orcish. Instead of Surrogate the word parameter should be in focus. Suggestions:
Using scikit-learn, then scale up with PyTorch
See e.g. https://i-systems.github.io/teaching/ML/iNotes/15_Autoencoder.html
start with PCA and work towards more generic nonlinear methods (maybe based on local sensitivity analysis)
Double brackets? Configurable?
input_mode(json)
separator = '{{'
input.json :
{
'x': {{x}},
'y': 4
}
config.h
int return_config()
{
return {{u}}*x;
}
{{
'x': {x},
'y': 4
}}
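The trade-off between the two variants above can be sketched in Python. With a configurable double-bracket delimiter, literal braces in JSON or C files need no escaping, whereas Python's built-in str.format requires every literal brace to be doubled. The `render` function is a hypothetical helper written for this comparison.

```python
import re


def render(template, params, open_delim="{{", close_delim="}}"):
    """Replace <open>name<close> placeholders; delimiters are configurable."""
    pattern = re.compile(
        re.escape(open_delim) + r"\s*(\w+)\s*" + re.escape(close_delim)
    )
    return pattern.sub(lambda match: str(params[match.group(1)]), template)


json_template = "{\n'x': {{x}},\n'y': 4\n}"
c_template = "int return_config()\n{\nreturn {{u}}*x;\n}"
format_template = "{{\n'x': {x},\n'y': 4\n}}"   # str.format style

print(render(json_template, {"x": 3}))
print(render(c_template, {"u": 2}))
print(format_template.format(x=3))
```

Both JSON variants produce the same text, but only the first leaves the template a near-valid file in its own language. Passing other delimiters, for example `render("a = <<a>>", {"a": 1}, "<<", ">>")`, covers formats in which double braces already have a meaning.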
Include
In some cases, using a function in proFit requires its full import path starting from the package root (profit.profit. ...) instead of a path relative to the current file.
Implement the possibility to shift the x-axis of data so that two data sources are stitched together in the optimal way. This will require a hyperparameter that quantifies the relative shift.
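To make the idea concrete, the sketch below estimates the relative shift by a plain least-squares grid search over the overlap of the two sources. This is a simple stand-in, not the hyperparameter optimization inside a response model that the text proposes; the data are synthetic and all names are hypothetical.

```python
import numpy as np


def mismatch(shift, x1, y1, x2, y2):
    """Mean squared difference on the overlap after shifting source 2 by `shift`."""
    shifted = x2 + shift
    mask = (shifted >= x1.min()) & (shifted <= x1.max())
    if mask.sum() < 2:
        return np.inf  # not enough overlap to compare
    return np.mean((np.interp(shifted[mask], x1, y1) - y2[mask]) ** 2)


def best_shift(x1, y1, x2, y2, candidates):
    """Return the candidate shift with the smallest mismatch."""
    costs = [mismatch(s, x1, y1, x2, y2) for s in candidates]
    return candidates[int(np.argmin(costs))]


# Synthetic data: source 2 is recorded with its x-axis offset by 0.5.
x1 = np.linspace(0.0, 5.0, 101)
y1 = np.sin(x1)
x_true = np.linspace(3.0, 7.0, 81)
x2 = x_true - 0.5
y2 = np.sin(x_true)

shift = best_shift(x1, y1, x2, y2, np.linspace(-1.0, 1.0, 41))
print(shift)
```

In a probabilistic setting the same shift would enter the kernel inputs and be optimized together with the other hyperparameters via the likelihood.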
Bring Config class and user interface in a clear form.
Currently, only Uniform and LogUniform are supported after switching configuration backends to yaml.
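For reference, the two supported distributions can be sketched with the standard library. The function names are hypothetical and do not reflect proFit's configuration syntax; the log-uniform variant assumes it means "uniform in the logarithm of the parameter".

```python
import math
import random


def sample_uniform(low, high, rng):
    """Draw a value uniformly between low and high."""
    return rng.uniform(low, high)


def sample_log_uniform(low, high, rng):
    """Draw a value uniformly in log-space; requires 0 < low < high."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


rng = random.Random(42)
uniform_draws = [sample_uniform(0.0, 1.0, rng) for _ in range(5)]
log_draws = [sample_log_uniform(1e-12, 1e-6, rng) for _ in range(5)]
print(uniform_draws)
print(log_draws)
```

A log-uniform distribution is the natural choice for parameters such as tolerances that span several orders of magnitude.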
pip install -e . --user moves profit into %APPDATA%, which is usually not in %PATH% on Windows with Anaconda. Documentation should be updated to use pip install -e . on this setup.
This is related to #7 . Could be done via MPI and/or simpler local solution for the multi-process runner.
A user develops a new numerical method that is faster at the same accuracy than existing methods. He wants to produce plots of accuracy vs computation time for his new code as well as an existing one.
Input parameters:
1e-6 to 1e-12
Input parameters:
1e-1 to 1e-3
It should be possible to plot two outputs against each other here. So one would fit a response model with x = computation time and y = accuracy.
In the file profit.profit.sur.backend the following functions have the same definition:
When doing distributed runs on the cluster, all output must be written in a concurrency-safe way. HDF5 with MPI communication looks like a reasonable choice.
For the work with Ulrich Callies from HZG a tool was developed to explore conditional distributions with one or more variables fixed in a certain range. Then the marginal distributions of the remaining variables are plotted as histograms and/or with a kernel density estimator. This way a high-dimensional probability distribution can be explored in an intuitive way.
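The core of such an exploration can be sketched with NumPy: restrict the samples to those where the fixed variables lie in the chosen ranges, then build a histogram of each remaining variable. The function names and the synthetic samples are assumptions for this example and are not the tool mentioned above.

```python
import numpy as np


def conditional_samples(samples, fixed):
    """Keep rows whose fixed columns lie within the given (low, high) ranges."""
    mask = np.ones(len(samples), dtype=bool)
    for col, (low, high) in fixed.items():
        mask &= (samples[:, col] >= low) & (samples[:, col] <= high)
    return samples[mask]


def marginal_histograms(samples, fixed, bins=20):
    """Histogram each free variable, conditioned on the fixed ranges."""
    subset = conditional_samples(samples, fixed)
    free = [c for c in range(samples.shape[1]) if c not in fixed]
    return {c: np.histogram(subset[:, c], bins=bins, density=True) for c in free}


# Synthetic correlated samples in three dimensions.
rng = np.random.default_rng(3)
cov = [[1.0, 0.8, 0.0], [0.8, 1.0, 0.3], [0.0, 0.3, 1.0]]
samples = rng.multivariate_normal(np.zeros(3), cov, size=5000)

hists = marginal_histograms(samples, {0: (1.0, 2.0)})
print(sorted(hists))
```

Each entry holds the density values and bin edges returned by np.histogram, which can be passed to a plotting routine or replaced by a kernel density estimate.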
The user wants to specify points where the response should be evaluated. Based on a user-supplied template she tells profit to generate a set of directories and a batch submission script.
Remarks:
Dividing K_y by sigma_f^2 introduces an extra additive term -1/2 log(sigma_f^(-2)) in the NLL. This should be either justified or cancelled by adding it again.
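The origin of the extra term can be traced in a short derivation. It assumes the standard zero-mean Gaussian-process negative log-likelihood for n training points, which is an assumption here since the form used in proFit is not stated, and writes the normalized matrix as \tilde{K}_y = K_y / \sigma_f^2.

```latex
\begin{aligned}
K_y &= \sigma_f^2 \, \tilde{K}_y , \qquad
\det K_y = \sigma_f^{2n} \det \tilde{K}_y , \\
\mathrm{NLL}
 &= \tfrac{1}{2}\, y^\top K_y^{-1} y
  + \tfrac{1}{2} \log\det K_y
  + \tfrac{n}{2} \log 2\pi \\
 &= \frac{1}{2\sigma_f^2}\, y^\top \tilde{K}_y^{-1} y
  + \tfrac{1}{2} \log\det \tilde{K}_y
  + \tfrac{n}{2} \log \sigma_f^2
  + \tfrac{n}{2} \log 2\pi .
\end{aligned}
```

Under this assumption, evaluating the likelihood with \tilde{K}_y alone omits the offset (n/2) log sigma_f^2 = -(n/2) log(sigma_f^(-2)), i.e. the term named above counted once per data point, and it also rescales the quadratic term by 1/sigma_f^2.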
There needs to be one simple and clearly documented way to build the covariance matrices K(X_test, X_test), K(X_test, X_training), and K(X_training, X_training).
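One way to meet this requirement is a single function that takes two point sets and is called three times. The sketch below assumes a squared-exponential kernel with unit signal variance and a normalized noise term sigma^2 on the training diagonal; the names and the kernel choice are illustrative, not proFit's API.

```python
import numpy as np


def kernel_matrix(Xa, Xb, length):
    """Squared-exponential kernel matrix K(Xa, Xb) with unit signal variance."""
    diff = Xa[:, None, :] - Xb[None, :, :]
    sqdist = np.sum(diff ** 2, axis=-1)
    return np.exp(-0.5 * sqdist / length ** 2)


rng = np.random.default_rng(4)
X_training = rng.uniform(size=(8, 2))
X_test = rng.uniform(size=(3, 2))
length, sigma2 = 0.7, 1e-4   # hypothetical hyperparameter values

# The same function yields all three blocks; only the arguments change.
K_train = kernel_matrix(X_training, X_training, length) + sigma2 * np.eye(8)
K_cross = kernel_matrix(X_test, X_training, length)
K_test = kernel_matrix(X_test, X_test, length)
print(K_train.shape, K_cross.shape, K_test.shape)
```

Keeping the argument order fixed as (rows, columns) makes the shape of every block predictable: K(X_test, X_training) has one row per test point.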
The user would like to run his code independently from suruq. Therefore the user takes the following steps
Interfacing to input/output files should be easy and done by the user. For this purpose a txt and an HDF5 standard format will be supplied.
Standardize surrogates. For now only Custom and GPy.
Right now, run folders are created as "0, 1, 2, 3, ..., 10, 11, ...". For better sorting in the file manager and console, the default should be "000, 001, 002, ...", which supports up to 1000 run folders (000 to 999). More generally, one should add a configuration option ndigit in the run section of profit.yaml that defaults to three.
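A sketch of the proposed naming, with ndigit as an ordinary function argument. The helper name `run_dir_name` is hypothetical; how the option would be read from profit.yaml is left open here.

```python
def run_dir_name(index, ndigit=3):
    """Zero-pad the run index to at least `ndigit` digits."""
    return f"{index:0{ndigit}d}"


names = [run_dir_name(i) for i in (0, 1, 2, 10, 11, 100)]
print(names)
```

With zero padding, lexicographic order in a file manager coincides with numeric order as long as the index fits into ndigit digits; larger indices are still valid names but sort out of order.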
Generating the grid for 7 parameters takes 8 s for the sparse grid as opposed to 0.05 s for the full grid. The test case was the Gaussian quadrature rule in https://github.com/redmod-team/suruq/blob/1fc3689bf52eb1610dd450ee1363f388df761031/draft/test_sparsegrid.py#L21 .
add surface plots and slicing