icb-dcm / parpe

Parameter estimation for dynamical models using high-performance computing, batch and mini-batch optimizers, and dynamic load balancing.

License: MIT License

CMake 4.17% C++ 74.58% Shell 2.09% Python 18.25% R 0.51% Dockerfile 0.04% SWIG 0.37%
parameter-estimation optimization systems-biology dynamical-modeling high-performance-computing amici petab sbml hacktoberfest ode

parpe's Introduction

parPE

The parPE library provides functionality for solving large-scale parameter optimization problems that require up to thousands of simulations per objective function evaluation on high-performance computing (HPC) systems.

parPE offers easy integration with AMICI-generated ordinary differential equation (ODE) models.

Features

parPE offers the following features:

  • MPI-based load-balancing of individual simulations
  • improved load balancing by intermingling multiple optimization runs (multi-start local optimization)
  • simple integration with SBML models via AMICI and PEtab
  • interfaces to Ipopt, Ceres, FFSQP and SUMSL (CALGO/TOMS 611) optimizers
  • HDF5 I/O compatible with a wide variety of programming languages
  • good parallel scaling up to several thousand cores (highly problem-dependent)

Getting started

Although various modules of parPE can be used independently, the most meaningful and convenient use case is parameter optimization for an SBML model specified in the PEtab format. This is described in doc/petab_model_import.md.

Dependencies

For full functionality, parPE requires the following libraries:

  • CMake (>= 3.15)
  • MPI (OpenMPI, MPICH, ...)
  • Ipopt (>= 1.2.7) (requires coinhsl)
  • Ceres (>= 1.13) (requires Eigen)
  • Boost (serialization, thread)
  • HDF5 (>= 1.10)
  • CBLAS compatible BLAS (libcblas, Intel MKL, ...)
  • AMICI (included in this repository) (uses SuiteSparse, Sundials)
  • C++17 compiler
  • Python >= 3.9, including header files

On Debian-based systems, dependencies can be installed via:

sudo apt-get install \
  build-essential \
  cmake \
  cmake-curses-gui \
  coinor-libipopt-dev \
  curl \
  gfortran \
  libblas-dev \
  libboost-chrono-dev \
  libboost-serialization-dev \
  libboost-thread-dev \
  libceres-dev \
  libmpich-dev \
  libhdf5-dev \
  libpython3-dev \
  python3-pip

Scripts to fetch and build the remaining dependencies are provided in /ThirdParty/:

ThirdParty/installDeps.sh

NOTE: When using ThirdParty/installIpopt.sh to build Ipopt, you may have to download the HSL library separately as described at https://coin-or.github.io/Ipopt/INSTALL.html#DOWNLOAD_HSL. Place the HSL archive into ThirdParty before running ThirdParty/installIpopt.sh. If asked, type in your coinhsl version (e.g. 2019.05.21 if you have coinhsl-2019.05.21.tar.gz).

Building

After having taken care of the dependencies listed above, parPE can be built:

./buildAll.sh

Other sample build scripts are provided as /build*.sh.

Recently tested compilers

  • GCC 10.2.0
  • Intel icpc (ICC) 17.0.6

Docker

There is a Dockerfile available in container/charliecloud/ and images can be found on dockerhub.

Documentation & further information

Some high-level documentation is available at https://parpe.readthedocs.io/en/latest/ and in the GitHub issues. No extensive full-text documentation is available for the C++ interface yet. For usage of the C++ interface, see examples/ and */tests.

References

parPE is being used or has been used in the following projects:

  • Leonard Schmiester, Yannik Schälte, Fabian Fröhlich, Jan Hasenauer, Daniel Weindl. Efficient parameterization of large-scale dynamic models based on relative measurements. Bioinformatics, btz581, doi:10.1093/bioinformatics/btz581 (preprint: doi:10.1101/579045).

  • Stapor, P., Schmiester, L., Wierling, C. et al. Mini-batch optimization enables training of ODE models on large-scale datasets. Nat Commun 13, 34 (2022). doi:10.1038/s41467-021-27374-6 (preprint: doi:10.1101/859884).

  • Paul F. Lang, David R. Penas, Julio R. Banga, Daniel Weindl, Bela Novak. Reusable rule-based cell cycle model explains compartment-resolved dynamics of 16 observables in RPE-1 cells. bioRxiv (2023). doi:10.1101/2023.05.04.539349

  • CanPathPro

Funding

parPE has been developed within research projects receiving external funding:

  • Through the European Union's Horizon 2020 research and innovation programme under grant agreement no. 686282 (CanPathPro).

  • Computer resources for testing parPE have been provided, among others, by the Gauss Centre for Supercomputing / Leibniz Supercomputing Centre under grants pr62li and pn72go.

parpe's People

Contributors

dweindl, elgohr, katrinleinweber, leonardschmiester, merktsimon, pauljonasjost, paulstapor

parpe's Issues

Need to check for invalid optimizer options

Exception of type: OPTION_INVALID in file "IpAlgBuilder.cpp" at line 271:
 Exception message: Selected linear solver MA27 not available.
Tried to obtain MA27 from shared library "libhsl.so", but the following error occured:
libhsl.so: cannot open shared object file: No such file or directory

EXIT: Invalid option encountered.

Update: Should terminate if any unknown options are set
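
A pre-check along the following lines could reject unknown option names before the optimizer is started. This is only a minimal sketch: `requireKnownOptions` is a hypothetical helper, not an existing parPE or Ipopt function, and the set of accepted names would have to be supplied per optimizer.

```cpp
#include <map>
#include <set>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of parPE): throws if any user-supplied
// option name is missing from the set of names the optimizer understands.
inline void requireKnownOptions(
    const std::map<std::string, std::string>& userOptions,
    const std::set<std::string>& knownOptions)
{
    for (const auto& [name, value] : userOptions) {
        if (knownOptions.count(name) == 0) {
            throw std::invalid_argument(
                "Unknown optimizer option '" + name +
                "' (value '" + value + "')");
        }
    }
}
```

Calling it with `{{"max_itr", "100"}}` against a known set containing `max_iter` throws `std::invalid_argument`, so the typo is caught before any simulation is run.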

Document expected format for generateHDF5DataFileFromText.py

E.g.

  • finalize column names
  • rules for observable names
  • rules for scaling parameter names in model and data files (_offset_, scaling, sigma)
  • specification of timepoints, inf, units, ...
  • document hdf5 output
  • provide example data and sample measurements

Implement early stopping

In intermediate function, evaluate model on test set and see if prediction likelihood improves

-> need to add validation set to dataprovider

Should be usable for both batch and minibatch optimizers
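
One possible shape for such a check is a small monitor that is fed the validation cost once per iteration. The sketch below assumes a patience-based criterion (stop after a fixed number of iterations without improvement), which the issue does not prescribe. `EarlyStoppingMonitor` and its parameters are hypothetical names.

```cpp
#include <limits>

// Hypothetical monitor (not parPE code). Call update() once per optimizer
// iteration with the cost (e.g. negative log-likelihood) on the validation set.
class EarlyStoppingMonitor {
  public:
    EarlyStoppingMonitor(int patience, double minImprovement)
        : patience_(patience), minImprovement_(minImprovement) {}

    // Records the new validation cost; returns true if the optimizer
    // should stop because the cost has not improved for `patience` calls.
    bool update(double validationCost) {
        if (validationCost < bestCost_ - minImprovement_) {
            bestCost_ = validationCost;
            iterationsWithoutImprovement_ = 0;
        } else {
            ++iterationsWithoutImprovement_;
        }
        return iterationsWithoutImprovement_ >= patience_;
    }

    double bestCost() const { return bestCost_; }

  private:
    int patience_;
    double minImprovement_;
    double bestCost_ = std::numeric_limits<double>::infinity();
    int iterationsWithoutImprovement_ = 0;
};
```

Because the monitor only sees a scalar cost, the same object could be used from a batch or a mini-batch optimizer.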

Check for new_x in Ipopt

Ipopt may call Eval_F and Eval_Grad_F with identical parameters. Save the previous results and check new_x to avoid recomputation.
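
The caching idea can be sketched independently of Ipopt. The class below is a hypothetical wrapper, not parPE code: it assumes that cost and gradient come from one combined evaluation, and it takes a `new_x` flag with the same meaning as the one Ipopt passes to its evaluation callbacks.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Hypothetical cache (not parPE code): remembers the last parameter vector
// together with the cost and gradient computed for it.
class CachedObjective {
  public:
    // Computes the cost and writes the gradient into the second argument.
    using Evaluator =
        std::function<double(const std::vector<double>&, std::vector<double>&)>;

    explicit CachedObjective(Evaluator evaluate)
        : evaluate_(std::move(evaluate)) {}

    double cost(const std::vector<double>& x, bool new_x) {
        update(x, new_x);
        return cost_;
    }

    const std::vector<double>& gradient(const std::vector<double>& x, bool new_x) {
        update(x, new_x);
        return gradient_;
    }

    int numEvaluations() const { return numEvaluations_; }

  private:
    // Re-evaluates only if nothing is cached yet, or if the caller signals
    // a new point and that point really differs from the cached one.
    void update(const std::vector<double>& x, bool new_x) {
        if (valid_ && (!new_x || x == lastX_))
            return;
        gradient_.assign(x.size(), 0.0);
        cost_ = evaluate_(x, gradient_);
        lastX_ = x;
        valid_ = true;
        ++numEvaluations_;
    }

    Evaluator evaluate_;
    std::vector<double> lastX_;
    std::vector<double> gradient_;
    double cost_ = 0.0;
    bool valid_ = false;
    int numEvaluations_ = 0;
};
```

Requesting the cost and then the gradient at the same point triggers a single call to the underlying evaluator.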

Add minibatch optimizers

Need to reorganize training data

class MinibatchDataProvider : public MultiConditionDataProvider

Unlike MultiConditionDataProvider, need one instance per optimization to keep track of batches
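
The per-optimization state mentioned above could look roughly like the following. `MinibatchSchedule` is a hypothetical helper used only for illustration; it assumes that batches are drawn by shuffling the condition indices once per epoch, which is one common choice and not something the issue specifies.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical helper (not parPE code): keeps track of which conditions
// have already been used in the current epoch of one optimization run.
// Assumes batchSize > 0.
class MinibatchSchedule {
  public:
    MinibatchSchedule(int numConditions, int batchSize, unsigned seed)
        : order_(numConditions), batchSize_(batchSize), rng_(seed) {
        std::iota(order_.begin(), order_.end(), 0);
        std::shuffle(order_.begin(), order_.end(), rng_);
    }

    // Returns the condition indices of the next batch. Once every condition
    // has been visited, the order is reshuffled and a new epoch starts.
    std::vector<int> nextBatch() {
        if (position_ >= order_.size()) {
            std::shuffle(order_.begin(), order_.end(), rng_);
            position_ = 0;
            ++epoch_;
        }
        const std::size_t end =
            std::min(position_ + batchSize_, order_.size());
        std::vector<int> batch(order_.begin() + position_,
                               order_.begin() + end);
        position_ = end;
        return batch;
    }

    int epoch() const { return epoch_; }

  private:
    std::vector<int> order_;
    std::size_t batchSize_;
    std::mt19937 rng_;
    std::size_t position_ = 0;
    int epoch_ = 0;
};
```

Since position and epoch are members, two optimizations running side by side each need their own instance, which is the constraint described above.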

Add class Optimizer, class LocalOptimizer : public Optimizer, class MiniBatchOptimizer : public LocalOptimizer
Add Optimizer::optimize(OptimizationProblem*)
-> need to refactor Ceres and Ipopt wrappers
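
A rough sketch of the proposed hierarchy follows. Only the class names and `Optimizer::optimize(OptimizationProblem*)` come from the notes above; the reduced `OptimizationProblem` interface, the `QuadraticProblem` toy problem, and the return type are assumptions. The loop inside `MiniBatchOptimizer` is plain full-batch gradient descent standing in for a real mini-batch update.

```cpp
#include <cstddef>
#include <vector>

// Reduced, hypothetical stand-in for parPE's OptimizationProblem.
class OptimizationProblem {
  public:
    virtual ~OptimizationProblem() = default;
    // Returns the cost at `parameters` and writes the gradient.
    virtual double evaluate(const std::vector<double>& parameters,
                            std::vector<double>& gradient) const = 0;
    virtual std::vector<double> initialParameters() const = 0;
};

// Toy problem for illustration only: f(x) = (x - 3)^2.
class QuadraticProblem : public OptimizationProblem {
  public:
    double evaluate(const std::vector<double>& p,
                    std::vector<double>& g) const override {
        g[0] = 2.0 * (p[0] - 3.0);
        return (p[0] - 3.0) * (p[0] - 3.0);
    }
    std::vector<double> initialParameters() const override { return {0.0}; }
};

// Common entry point for all optimizers; returns the final parameters.
class Optimizer {
  public:
    virtual ~Optimizer() = default;
    virtual std::vector<double> optimize(OptimizationProblem* problem) = 0;
};

// Intermediate base for local methods (the Ceres and Ipopt wrappers
// would presumably derive from this as well).
class LocalOptimizer : public Optimizer {};

class MiniBatchOptimizer : public LocalOptimizer {
  public:
    MiniBatchOptimizer(double learningRate, int maxIterations)
        : learningRate_(learningRate), maxIterations_(maxIterations) {}

    // Placeholder update rule: fixed-step gradient descent.
    std::vector<double> optimize(OptimizationProblem* problem) override {
        std::vector<double> x = problem->initialParameters();
        std::vector<double> gradient(x.size());
        for (int it = 0; it < maxIterations_; ++it) {
            problem->evaluate(x, gradient);
            for (std::size_t i = 0; i < x.size(); ++i)
                x[i] -= learningRate_ * gradient[i];
        }
        return x;
    }

  private:
    double learningRate_;
    int maxIterations_;
};
```

Code that holds an `Optimizer*` can then run any optimizer through the same `optimize` call without knowing the concrete type.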

Adapt to new data format

  • hdf5 import (simulation<->optimization parameter mapping, scaling factors)
  • remove "genotypespecific" parameter, replace by parameter map

getLocalOptimum... to classes

Will make it easier to add minibatch methods later on

add factory method to OptimizationOptions

Subclass OptimizationOptions for each Optimizer to account for unique options?

Add support for timeseries data

HDF5 file: measurements/y(sigma): condition x t x y
-> should enable chunked storage / compression
-> need to label timepoints: /measurements/[attr]timepoints double; [infinity] for steady-state data
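
To make the proposed condition x t x y layout concrete, here is a small in-memory mirror of it. `MeasurementArray` is a hypothetical type that does not read or write HDF5; it only shows how a flat row-major buffer (the storage order HDF5 uses for multi-dimensional datasets) maps to the three indices, and how an infinite timepoint label can mark steady-state data.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical in-memory mirror of the proposed dataset layout:
// a dense condition x timepoint x observable array in row-major order.
struct MeasurementArray {
    std::size_t numConditions;
    std::size_t numTimepoints;
    std::size_t numObservables;
    std::vector<double> timepoints;  // one label per time index; infinity = steady state
    std::vector<double> y;           // flat storage, initialized to 0

    MeasurementArray(std::size_t conditions, std::vector<double> times,
                     std::size_t observables)
        : numConditions(conditions),
          numTimepoints(times.size()),
          numObservables(observables),
          timepoints(std::move(times)),
          y(numConditions * numTimepoints * numObservables, 0.0) {}

    // Row-major index: the observable index varies fastest.
    double& at(std::size_t condition, std::size_t timeIndex,
               std::size_t observable) {
        return y[(condition * numTimepoints + timeIndex) * numObservables
                 + observable];
    }

    bool isSteadyState(std::size_t timeIndex) const {
        return std::isinf(timepoints[timeIndex]);
    }
};
```

With 2 conditions, 3 timepoints and 4 observables, the entry for condition 1, time index 2, observable 3 sits at flat position (1 * 3 + 2) * 4 + 3 = 23, the last element of the buffer.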
