
gmpavanlab / swarm-cg


Swarm-CG: Automatic Parametrization of Bonded Terms in MARTINI-based Coarse-Grained Models of Simple to Complex Molecules via Fuzzy Self-Tuning Particle Swarm Optimization

Home Page: https://pubs.acs.org/doi/10.1021/acsomega.0c05469

License: MIT License

Python 99.81% Shell 0.19%
coarse-graining coarse-grained molecular-dynamics molecular-modeling gromacs optimization optimization-tools bonded-terms bonded-parameters swarm-cg

swarm-cg's Introduction

Swarm-CG

Swarm-CG is designed for automatically optimizing the bonded terms of a MARTINI-based coarse-grained (CG) molecular model, in explicit or implicit solvent, with respect to a reference all-atom (AA) trajectory and starting from a preliminary CG model (topology and non-bonded parameters). The package is designed for usage with Gromacs and contains 3 modules for:

  1. Evaluating the bonded parametrization of a CG model
  2. Optimizing bonded terms of a CG model
  3. Monitoring an optimization procedure


Swarm-CG works with MARTINI version 2 or 3. The AA-to-CG mapping can be interpreted as center of mass (COM) or center of geometry (COG). Virtual sites handling is under development and will be available soon.
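To make the COM/COG distinction concrete, here is a minimal sketch of how one bead position could be derived from the atoms mapped to it. The function name `bead_position` and the two-atom example are hypothetical and are not taken from the Swarm-CG code base.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def bead_position(coords: Sequence[Vec3], masses: Sequence[float], mode: str = "COM") -> Vec3:
    """Return the position of one CG bead from the atoms mapped to it.

    mode="COM" weights each atom by its mass, mode="COG" weights all atoms equally.
    """
    if not coords or len(coords) != len(masses):
        raise ValueError("coords and masses must be non-empty and of equal length")
    if mode == "COM":
        weights = list(masses)
    elif mode == "COG":
        weights = [1.0] * len(coords)
    else:
        raise ValueError("mode must be 'COM' or 'COG'")
    total = sum(weights)
    return tuple(
        sum(w * c[axis] for w, c in zip(weights, coords)) / total for axis in range(3)
    )


# Hypothetical bead made of one carbon and one hydrogen, coordinates in nm
atoms = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0)]
atom_masses = [12.011, 1.008]
com = bead_position(atoms, atom_masses, "COM")  # pulled towards the heavy atom
cog = bead_position(atoms, atom_masses, "COG")  # plain geometric average
```

With heavy and light atoms in the same bead, the two interpretations can place the bead at noticeably different positions, which in turn shifts the AA-mapped bond, angle and dihedral distributions.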

Publication

Empereur-mot, C.; Pesce, L.; Bochicchio, D.; Capelli, R.; Perego, C.; Pavan, G.M. (2020) Swarm-CG: Automatic Parametrization of Bonded Terms in MARTINI-based Coarse-Grained Models of Simple to Complex Molecules via Fuzzy Self-Tuning Particle Swarm Optimization. ACS Omega

Installation & Usage

Swarm-CG has been tested with Python versions >= 3.6.8 and Gromacs versions >= 2018.1.

yum install python3-devel        # python dev tools CentOS (optional)
pip3 install python-dev-tools    # python dev tools Ubuntu (optional)

pip3 install swarm-cg            # creates the 3 entrypoints/aliases below

scg_evaluate -h                  # see point 1
scg_optimize -h                  # see point 2
scg_monitor -h                   # see point 3

To better handle sampling in symmetrical molecules you can form groups of bonds/angles/dihedrals that Swarm-CG will consider identical, using line returns and/or comments in the topology (ITP) file. AA-mapped distributions will be averaged within groups to create the references used for evaluation (see point 1) or as target of the optimization procedure (see point 2). For optimization, identical parameters will be used for the bonds/angles/dihedrals within each group.

Here is an ITP file extract from the demonstration data of PAMAM G1:

[ bonds ]
;   i     j   funct   length   force.c.   
; bond group 1
    1     2       1        0         0           ; B1
; bond group 2
    1     3       1        0         0           ; B2
    1     9       1        0         0           ; B2
; bond group 3
    3     4       1        0         0           ; B3
    9    10       1        0         0           ; B3
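The grouping rule can be illustrated with a short sketch that reads a section like the extract above and starts a new group at every blank or commented line. This is an assumed reading of the rule as described here, not the parser used by Swarm-CG, and `parse_groups` is a hypothetical name.

```python
from typing import List

ITP_EXTRACT = """\
[ bonds ]
;   i     j   funct   length   force.c.
; bond group 1
    1     2       1        0         0           ; B1
; bond group 2
    1     3       1        0         0           ; B2
    1     9       1        0         0           ; B2
; bond group 3
    3     4       1        0         0           ; B3
    9    10       1        0         0           ; B3
"""


def parse_groups(section_text: str) -> List[List[List[str]]]:
    """Split the lines of one ITP section into groups of whitespace-separated fields."""
    groups: List[List[List[str]]] = []
    current: List[List[str]] = []
    for raw in section_text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            continue  # section header such as "[ bonds ]"
        if not line or line.startswith(";"):
            # A blank or commented line closes the group being built, if any
            if current:
                groups.append(current)
                current = []
            continue
        # Drop the trailing comment, keep the numeric fields
        current.append(line.split(";", 1)[0].split())
    if current:
        groups.append(current)
    return groups


bond_groups = parse_groups(ITP_EXTRACT)
for number, group in enumerate(bond_groups, start=1):
    print(number, [(fields[0], fields[1]) for fields in group])
```

On this extract the sketch yields three groups holding one, two and two bonds respectively, matching the B1, B2 and B3 labels.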

1. Evaluate bonded parametrization of a CG model

The module scg_evaluate enables quick evaluation of the fit of bond, angle and dihedral distributions between a CG model trajectory and a reference AA model trajectory of an identical molecule, by producing a single comprehensive figure.

scg_evaluate -aa_tpr G1_DATA/aa_topol.tpr -aa_traj G1_DATA/aa_traj.xtc -cg_map G1_DATA/cg_map.ndx -cg_itp G1_DATA/cg_model.itp -cg_tpr G1_DATA/cg_topol.tpr -cg_traj G1_DATA/cg_traj.xtc

It can also be used for inspecting AA-mapped distributions exclusively.

scg_evaluate -aa_tpr G1_DATA/aa_topol.tpr -aa_traj G1_DATA/aa_traj.xtc -cg_map G1_DATA/cg_map.ndx -cg_itp G1_DATA/cg_model.itp

This module is particularly useful for assessing whether an optimization procedure is needed (assuming one already has a CG model). It is also suited to assessing the geometrical changes triggered by a modification of CG bead types (which define the non-bonded parameters) or by manual edits of bonded parameters while working on a model. The command also produces publication-quality figures to support the parametrization of your models (vectorized formats are available). Radius of gyration (Rg) and solvent accessible surface area (SASA) are also calculated.

2. Optimize bonded terms of a CG model

The module scg_optimize automatically optimizes the bonded parameters of a CG model according to a reference AA trajectory. To this end, several simulations are run to explore and evaluate the relevance of different sets of bonded parameters, using 3 optimization cycles.

For example, using demonstration data of PAMAM G1:

scg_optimize -in_dir G1_DATA/ -gmx gmx_2018.6_p

This uses the default filenames of the software and is equivalent to the following command:

scg_optimize -aa_tpr G1_DATA/aa_topol.tpr -aa_traj G1_DATA/aa_traj.xtc -cg_map G1_DATA/cg_map.ndx -cg_itp G1_DATA/cg_model.itp -cg_gro G1_DATA/start_conf.gro -cg_top G1_DATA/system.top -cg_mdp_mini G1_DATA/mini.mdp -cg_mdp_equi G1_DATA/equi.mdp -cg_mdp_md G1_DATA/md.mdp -gmx gmx_2018.6_p
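The correspondence between the two commands can be sketched as a small helper that expands an input directory into the explicit argument list. The flag-to-filename pairs are copied from the command above; the helper `explicit_command` itself is hypothetical and only illustrates the mapping.

```python
from pathlib import PurePosixPath
from typing import List

# Flag / default filename pairs, in the order used by the explicit command above
DEFAULT_FILES = [
    ("-aa_tpr", "aa_topol.tpr"),
    ("-aa_traj", "aa_traj.xtc"),
    ("-cg_map", "cg_map.ndx"),
    ("-cg_itp", "cg_model.itp"),
    ("-cg_gro", "start_conf.gro"),
    ("-cg_top", "system.top"),
    ("-cg_mdp_mini", "mini.mdp"),
    ("-cg_mdp_equi", "equi.mdp"),
    ("-cg_mdp_md", "md.mdp"),
]


def explicit_command(in_dir: str, gmx: str) -> List[str]:
    """Build the long-form scg_optimize argument list for an input directory."""
    args = ["scg_optimize"]
    for flag, filename in DEFAULT_FILES:
        args += [flag, str(PurePosixPath(in_dir) / filename)]
    args += ["-gmx", gmx]
    return args


cmd = explicit_command("G1_DATA/", "gmx_2018.6_p")
print(" ".join(cmd))
```

Overriding a single file (for example a different MDP for the production runs) then only requires changing one entry rather than retyping the whole command.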

We recommend first preparing all input files in a single directory and passing it to Swarm-CG via the argument -in_dir.

The input is composed of:

  1. An AA reference trajectory (TPR + XTC/TRR)
  2. The AA to CG mapping (NDX)
  3. A preliminary CG model (ITP, equilibrium values and force constants can be initialized arbitrarily to e.g. 0)
  4. A CG configuration used as starting point of each iterative optimization run (GRO file, from a mapped AA frame and solvated if necessary)
  5. Other simulation files (TOP and MDP, notably with your barostat and thermostat choices)
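Before launching a run that may take hours, it can help to check that the input directory is complete. The sketch below assumes the default filenames used in the PAMAM G1 example; `missing_inputs` is a hypothetical helper and not part of Swarm-CG.

```python
from pathlib import Path
from typing import List

# Default filenames as they appear in the PAMAM G1 example (assumed here)
EXPECTED_FILES = [
    "aa_topol.tpr",    # 1. AA reference trajectory (TPR)
    "aa_traj.xtc",     # 1. AA reference trajectory (XTC)
    "cg_map.ndx",      # 2. AA to CG mapping
    "cg_model.itp",    # 3. preliminary CG model
    "start_conf.gro",  # 4. starting CG configuration
    "system.top",      # 5. topology
    "mini.mdp",        # 5. minimization settings
    "equi.mdp",        # 5. equilibration settings
    "md.mdp",          # 5. production settings
]


def missing_inputs(in_dir: str) -> List[str]:
    """Return the expected input files that are absent from in_dir."""
    base = Path(in_dir)
    return [name for name in EXPECTED_FILES if not (base / name).is_file()]


absent = missing_inputs("G1_DATA")
if absent:
    print("Missing input files:", ", ".join(absent))
else:
    print("All default input files found.")
```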

At all times during execution, the best parametrized model is accessible in the optimization output folder at out_dir/optimized_CG_model/cg_model.itp. The bonded parameters obtained via the Boltzmann inversion implemented in Swarm-CG with groups averaging (see paper sections 2.1 and 6.1) are also available at out_dir/boltzmann_inv_CG_model/cg_model.itp.
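For readers unfamiliar with Boltzmann inversion, the idea for a harmonic bond can be sketched as follows: if the bond length is sampled from exp(-V/kT) with V = 0.5 k (r - r0)^2, the distribution is Gaussian with mean r0 and variance kT/k, so both parameters can be read off the AA-mapped samples. This is a generic textbook version that ignores Jacobian corrections and group averaging; it is not the implementation used in Swarm-CG, and the synthetic samples are illustrative only.

```python
import random
import statistics
from typing import Sequence, Tuple

KB = 0.0083144626  # Boltzmann constant in kJ/(mol K)


def harmonic_boltzmann_inversion(
    samples: Sequence[float], temperature: float = 300.0
) -> Tuple[float, float]:
    """Estimate (r0, k) of a harmonic potential from sampled values.

    With lengths in nm, k is returned in kJ/(mol nm^2).
    """
    r0 = statistics.mean(samples)
    variance = statistics.pvariance(samples, mu=r0)
    k = KB * temperature / variance
    return r0, k


# Synthetic "AA-mapped" bond lengths: mean 0.47 nm, standard deviation 0.02 nm
random.seed(0)
bond_lengths = [random.gauss(0.47, 0.02) for _ in range(20000)]
r0, k = harmonic_boltzmann_inversion(bond_lengths)
print(f"r0 = {r0:.3f} nm, k = {k:.0f} kJ/(mol nm^2)")
```

Parameters obtained this way are a direct estimate from the reference distributions, whereas the optimized model is refined through actual CG simulations.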

The AA trajectory is mapped on-the-fly (if atoms are mapped to multiple CG beads, atom masses are split accordingly). The AA trajectory must contain box information for PBC handling; otherwise the molecule is assumed to be "unwrapped" already. Only the MDP file provided via arg -cg_mdp_md will be modified, to adjust nsteps according to arguments -cg_time_short and -cg_time_long, taking into account the timestep ts you provided. To minimize the execution time of scg_optimize, equilibration should stay short (e.g. 50-500 fs) and so should optimization cycles 1 and 2 (via arg -cg_time_short, e.g. 10-20 ns). To maximize the precision of scg_optimize, optimization cycle 3 must always use longer simulation times (via arg -cg_time_long, e.g. 25-100 ns). Execution times typically range from 4 h to 24 h depending on the parameters and hardware used.
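The nsteps adjustment amounts to dividing the requested simulation time by the timestep. The sketch below assumes the simulation time is given in ns and the MDP timestep (dt) in ps, as is conventional in Gromacs; `nsteps_for` is a hypothetical helper, not a Swarm-CG function.

```python
def nsteps_for(sim_time_ns: float, dt_ps: float) -> int:
    """Number of MD steps needed to cover sim_time_ns with a timestep of dt_ps."""
    if sim_time_ns <= 0 or dt_ps <= 0:
        raise ValueError("simulation time and timestep must be positive")
    return int(round(sim_time_ns * 1000.0 / dt_ps))


# Example values: a 20 fs timestep (dt = 0.02 ps), often used with MARTINI models
short_steps = nsteps_for(10, 0.02)  # cycles 1 and 2, 10 ns
long_steps = nsteps_for(25, 0.02)   # cycle 3, 25 ns
print(short_steps, long_steps)
```

Because the cost of each evaluation scales with nsteps, doubling -cg_time_short roughly doubles the duration of cycles 1 and 2.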

For information about the different execution modes, please see paper sections 2.4 and 4 and command help (-h).

3. Monitor an ongoing CG model optimization

Optimization procedures can be monitored at any point during execution. The module scg_monitor produces a visual summary (see paper Fig. 3) of the progress of an optimization procedure started with module scg_optimize. The plot will be produced in the directory provided via arg -opti_dir.

scg_monitor -opti_dir MODEL_OPTI__STARTED_03-07-2020_10h_12m_15s

See the help (-h) for a complete description of scg_monitor output. In particular, note that Rg and SASA may only be rough estimates in this display, as they are calculated from the short simulations used for optimization. These values should be validated using longer simulation times; scg_evaluate can help to this end.

Extended usage (untested)

In principle, the Swarm-CG workflow is general and can also be applied to tuning bonded terms in coarser CG models (by mapping more than 3-5 atoms to each CG bead and providing adequate non-bonded parameters). To this end, the reference for optimization can be an AA trajectory or, alternatively, a high-resolution (fine-grained) CG trajectory used to tune the coarser CG model (see paper section 4 for a more detailed discussion about crossing CG scales).

Another possible use case would be the tuning of elastic networks in CG models of proteins, although this still requires a well sampled AA or fine CG reference trajectory.

Please feel free to open an Issue or email us if you are interested in extended usage and need help.

Credits

Swarm-CG makes extensive use of FST-PSO and MDAnalysis. We thank Marco S. Nobile for his valuable insights.

swarm-cg's People

Contributors

charlyempereurmot, fiskissimo, giovannidoni


swarm-cg's Issues

Extended Usage Request: Adding GROMACS angle function 10 - Restricted Bending Angle Potential

Hello again,

I'm using your program and it is awesome!

I do have one suggestion that I think may help many CG users: adding the restricted bending angle potential (Gromacs function 10) to the options for angular potential optimization. The restricted bending angle potential stabilizes systems and prevents them from blowing up in certain cases, especially when the CG model includes a dihedral angle. The reasoning behind the stabilization is illustrated in this exchange on the MARTINI forum, where adding the restricted bending angle was needed to prevent my system from blowing up from LINCS errors related to co-linear atoms within the dihedral, which caused the potential to diverge.

http://www.cgmartini.nl/index.php/component/kunena/15-water/5894-yet-another-thread-about-lincs-warnings-in-polarized-martini-water#8632

Cheers!
Mike

scg_optimize fails on WSL due to permission denied

Dear Swarm-CG creators,

Running the scg_optimize executable on WSL crashes with the following error:

shutil.Error: [('.internal/input_CG_simulation_files', 'CG_sim_files_eval_step_1', "[Errno 13] Permission denied: 'CG_sim_files_eval_step_1'")]

We traced the error to line 2690 in swarmCG.py:

# create new directory for new parameters evaluation
current_eval_dir = f'{config.iteration_sim_files_dirname}_eval_step_{ns.nb_eval}'
shutil.copytree(config.input_sim_files_dirname, current_eval_dir)

We found that there is a bug in the copytree function on WSL (https://bugs.python.org/issue38633). A temporary fix could be to add the following lines at the beginning of the file:

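import errno
import shutil

# Workaround for https://bugs.python.org/issue38633: on WSL, copying extended
# attributes can fail with EACCES, which makes shutil.copytree raise.
orig_copyxattr = shutil._copyxattr

def patched_copyxattr(src, dst, *, follow_symlinks=True):
    try:
        orig_copyxattr(src, dst, follow_symlinks=follow_symlinks)
    except OSError as ex:
        # Ignore only "permission denied"; re-raise any other error
        if ex.errno != errno.EACCES:
            raise

shutil._copyxattr = patched_copyxattr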

Thanks to @franciscoadasme for his help in the debugging.

I hope this helps!

md run failing

Starting iteration 4 at 14:14:40 on 12-10-2022
MD run failed (simulation process terminated with error)
Iteration time: 0.5 min

Even the distribution plots are not appropriate.
Kindly tell me how to resolve this issue.

AttributeError: 'Namespace' object has no attribute 'gmx_cmd'

Hello,

I'm running into a pesky AttributeError: 'Namespace' object has no attribute 'gmx_cmd' when running scg_optimize on a polymer I am trying to map using your program.

Everything seems to run smoothly until the minimization step, but an error is triggered when it tries to run the equilibration step of the first CG iteration. I attached my input and output files here. I didn't include the atomistic trajectory as the file is large, and I think this might be a bug on the Swarm-CG side, given that the minimization runs fine and this is a Python error. If you need the trajectory to reproduce the error, let me know and I can drop a Google Drive link here.

I use slurm to run the bash script 'cg-swarm-run' in the zip file below. I think if you look there first, the rest of the files should be arranged according to your example.

cg_swarm_testing(2).zip

Please let me know if I made a mistake anywhere!

Cheers,
Mike Boyle

Constraints optimization

Dear Swarm-CG creators,
First of all, great project!
I have an issue with constraints optimization. Adding constraints to the ITP file always crashes the optimization with the following error:

-- ! ERROR ! -- In the provided CG ITP file constraints have been grouped, but constraints group 1 holds lines that have different parameters. Parameters should be identical within a group, only CG beads IDs should differ. Please correct the CG ITP file and separate groups using a blank or commented line.

However, all parameters within the group are identical in my ITP file, and even adding a single bond constraint results in the same error. Without any constraints everything works perfectly fine.

itp_file.zip

VirtualSites are silently dropped from itp

Hi,

First of all, I want to say that this is a very interesting project. I'm very much looking forward to using the program for high-throughput parametrization of models. However, it seems that the program fails to include virtual sites. I assume they are dropped because they cannot be optimized. However, the program does not issue a warning or error message at the pre-processing stage either.

From a martini centric perspective it would be very useful to have the code simply keep virtual sites, because a lot of martini3 models are using virtual sites. Can this be implemented in the current code?

EDIT: Actually after hacking in VS treating them like exclusions, I figured out that this will not be possible to have because MDAnalysis make whole does not support making whole molecules that have VS (i.e. not contiguous bonds).

Swarm-CG & Gromacs 2021

Hi, I'd like to use your model to optimize bonded parameters for a drug molecule. Just wanted to check if your tool is compatible with the latest version of gromacs (2021) and does it currently handle virtual sites?

Gromacs versions newer than 2019 have a different TPR format and are not supported?

Hoi,

I run into an MDAnalysis error due to a non-supported tpr format. I understand the issue with the ever changing format. Therefore I was wondering if you have some ideas to work around this limitation?

I would really like to use your software, but 2019 is 5 years ago and I guess it would be cool if we are not locked into outdated software?

Love to hear your idea on this issue.

Cheers,

Bart

ValueError: Failed to construct topology from file 4-prod.tpr with parser <class 'MDAnalysis.topology.TPRParser.TPRParser'>.
Error: Your tpx version is 133, which this parser does not support, yet
